diff mbox series

doc/cat-file: clarify description regarding various command forms

Message ID 20231003082513.3003520-1-stepnem@smrk.net (mailing list archive)
State New, archived

Commit Message

Štěpán Němec Oct. 3, 2023, 8:25 a.m. UTC
The DESCRIPTION's "first form" is actually the 1st, 2nd, 3rd and 5th
form in SYNOPSIS, the "second form" is the 4th one.

Interestingly, this state of affairs was introduced in
97fe7250753b (cat-file docs: fix SYNOPSIS and "-h" output, 2021-12-28)
with the claim of "Now the two will match again." ("the two" being
DESCRIPTION and SYNOPSIS)...

Ordinals are hard, let's try talking about batch and non-batch mode
instead.

Signed-off-by: Štěpán Němec <stepnem@smrk.net>
---
 Documentation/git-cat-file.txt | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


base-commit: d0e8084c65cbf949038ae4cc344ac2c2efd77415

Comments

Jeff King Oct. 3, 2023, 8:06 p.m. UTC | #1
On Tue, Oct 03, 2023 at 10:25:13AM +0200, Štěpán Němec wrote:

> The DESCRIPTION's "first form" is actually the 1st, 2nd, 3rd and 5th
> form in SYNOPSIS, the "second form" is the 4th one.
> 
> Interestingly, this state of affairs was introduced in
> 97fe7250753b (cat-file docs: fix SYNOPSIS and "-h" output, 2021-12-28)
> with the claim of "Now the two will match again." ("the two" being
> DESCRIPTION and SYNOPSIS)...
> 
> Ordinals are hard, let's try talking about batch and non-batch mode
> instead.

Thanks, I think this is a good direction. Two things I noticed:

>  DESCRIPTION
>  -----------
> -In its first form, the command provides the content or the type of an object in
> +This command can operate in two modes, depending on whether an option from
> +the `--batch` family is specified.
> +
> +In non-batch mode, the command provides the content or the type of an object in
>  the repository. The type is required unless `-t` or `-p` is used to find the
>  object type, or `-s` is used to find the object size, or `--textconv` or
>  `--filters` is used (which imply type "blob").

The existing text here is already a bit vague, considering the number of
operations it covers (like "-e", for example, which is not showing "the
content or the type" at all). But that is not new in your patch, and it
is maybe even OK to be a bit vague here, and let the OPTIONS section
cover the specifics.

> -In the second form, a list of objects (separated by linefeeds) is provided on
> +In batch mode, a list of objects (separated by linefeeds) is provided on
>  stdin, and the SHA-1, type, and size of each object is printed on stdout. The
>  output format can be overridden using the optional `<format>` argument. If
>  either `--textconv` or `--filters` was specified, the input is expected to

I think this got a bit inaccurate with "--batch-command", which is a
whole different mode itself compared to --batch and --batch-check. I
don't think your patch is really making anything worse, but arguably
there are three "major modes" here.

-Peff
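
For readers following along, the batch-mode behaviour described in the quoted paragraph (default output, overridable with `<format>`) can be tried with a minimal sketch like the one below. It assumes a `git` binary on PATH and works in a throwaway repository; the variable names `repo`, `oid`, `default_line` and `custom_line` are illustrative and not part of the patch.

```shell
set -e
# Throwaway repository holding a single blob (illustrative setup only).
repo=$(mktemp -d)
git init -q "$repo"
oid=$(printf 'hello\n' | git -C "$repo" hash-object -w --stdin)

# Default --batch-check output: "<object name> <type> <size>".
default_line=$(echo "$oid" | git -C "$repo" cat-file --batch-check)
echo "$default_line"

# The optional <format> argument overrides that default.
custom_line=$(echo "$oid" |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize)')
echo "$custom_line"
```

The placeholders `%(objecttype)` and `%(objectsize)` are documented in the BATCH OUTPUT section of git-cat-file(1), which is where `<format>` is actually explained.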
Štěpán Němec Oct. 5, 2023, 11:20 a.m. UTC | #2
On Tue, 3 Oct 2023 16:06:59 -0400
Jeff King wrote:

>>  DESCRIPTION
>>  -----------
>> -In its first form, the command provides the content or the type of an object in
>> +This command can operate in two modes, depending on whether an option from
>> +the `--batch` family is specified.
>> +
>> +In non-batch mode, the command provides the content or the type of an object in
>>  the repository. The type is required unless `-t` or `-p` is used to find the
>>  object type, or `-s` is used to find the object size, or `--textconv` or
>>  `--filters` is used (which imply type "blob").
>
> The existing text here is already a bit vague, considering the number of
> operations it covers (like "-e", for example, which is not showing "the
> content or the type" at all). But that is not new in your patch, and it
> is maybe even OK to be a bit vague here, and let the OPTIONS section
> cover the specifics.

So how about we just butcher the DESCRIPTION completely; after all, the
information it gives is not quite correct (other than what you already
mentioned, e.g., -e is omitted in the "type not required" part; one is
left to wonder what <format> refers to: you have to go read the OPTIONS
and BATCH OUTPUT sections anyway), and the correct parts only duplicate
information given in the following sections, providing opportunities to
become out of date when the command and its documentation evolve:

Changes (if we agree this is the way to go, I'll also update the --help
output):
  synopsis:
    - move the (--textconv | --filters) form before --batch, closer
      to the other non-batch forms
    - cosmetics: swap -t and -s, --filters and --textconv (sort
      alphabetically)
  description:
    - reformulate, omit vague/imprecise information better
      provided in the detailed options list

SYNOPSIS
    git cat-file <type> <object>
    git cat-file (-e | -p) <object>
    git cat-file (-s | -t) [--allow-unknown-type] <object>
    git cat-file (--filters | --textconv)
                 [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
    git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]
                 [--buffer] [--follow-symlinks] [--unordered]
                 [--filters | --textconv] [-Z]

DESCRIPTION
    This command can operate in two modes, depending on whether an
    option from the --batch family is specified.

    In non-batch mode, the command provides information on an object
    named on the command line.

    In batch mode, arguments are read from standard input.

[That's all for a summary, read the following sections for details.]

>> -In the second form, a list of objects (separated by linefeeds) is provided on
>> +In batch mode, a list of objects (separated by linefeeds) is provided on
>>  stdin, and the SHA-1, type, and size of each object is printed on stdout. The
>>  output format can be overridden using the optional `<format>` argument. If
>>  either `--textconv` or `--filters` was specified, the input is expected to
>
> I think this got a bit inaccurate with "--batch-command", which is a
> whole different mode itself compared to --batch and --batch-check. I
> don't think your patch is really making anything worse, but arguably
> there are three "major modes" here.

This is not obvious to me (the "three major modes" part).  AIUI it's
still mainly a batch (read from stdin) vs. non-batch (args on command
line) dichotomy.  The details differ (just args vs. command + args), but
so does e.g. -e differ in providing information via exit code rather
than stdout.

(But please note I'm not trying to pose as an expert here: this all
started with me coming to git-cat-file(1) to _learn_ about cat-file
and finding the description more than a little confusing.)
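
The batch vs. non-batch dichotomy discussed above, including the point that `-e` reports through its exit code, can be illustrated with a small sketch. It assumes `git` is on PATH and uses a temporary repository; all variable names and the `no-such-object` input are made up for the example.

```shell
set -e
# Throwaway repository holding a single blob (illustrative setup only).
repo=$(mktemp -d)
git init -q "$repo"
oid=$(printf 'hello\n' | git -C "$repo" hash-object -w --stdin)

# Non-batch mode: the object is named on the command line.
obj_type=$(git -C "$repo" cat-file -t "$oid")   # object type
obj_size=$(git -C "$repo" cat-file -s "$oid")   # object size in bytes
obj_body=$(git -C "$repo" cat-file -p "$oid")   # pretty-printed content

# -e prints nothing on success; the answer is the exit status.
if git -C "$repo" cat-file -e "$oid"; then exists=yes; else exists=no; fi

# Batch mode: object names are read from standard input, one per line,
# and a missing object is reported on stdout rather than via exit status.
batch_out=$(printf '%s\nno-such-object\n' "$oid" |
    git -C "$repo" cat-file --batch-check)

echo "$obj_type $obj_size $obj_body $exists"
echo "$batch_out"
```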
Jeff King Oct. 5, 2023, 5:18 p.m. UTC | #3
On Thu, Oct 05, 2023 at 01:20:18PM +0200, Štěpán Němec wrote:

> So how about we just butcher the DESCRIPTION completely;
> [...]
> DESCRIPTION
>     This command can operate in two modes, depending on whether an
>     option from the --batch family is specified.
> 
>     In non-batch mode, the command provides information on an object
>     named on the command line.
> 
>     In batch mode, arguments are read from standard input.
> 
> [That's all for a summary, read the following sections for details.]

Yeah, I think that is a big improvement over the status quo. It might
also be worth starting with a single-sentence overview of what is common
to both modes. Something like:

  Output the contents or details of one or more objects. This command
  can operate in two modes, depending on whether an option from the
  --batch family is specified.

  In non-batch mode, the command provides information on a single object
  given on the command line.

  In batch mode, arguments are read from standard input.

> > I think this got a bit inaccurate with "--batch-command", which is a
> > whole different mode itself compared to --batch and --batch-check. I
> > don't think your patch is really making anything worse, but arguably
> > there are three "major modes" here.
> 
> This is not obvious to me (the "three major modes" part).  AIUI it's
> still mainly a batch (read from stdin) vs. non-batch (args on command
> line) dichotomy.  The details differ (just args vs. command + args), but
> so does e.g. -e differ in providing information via exit code rather
> than stdout.

Yeah, I think you understand it correctly. But what the current text
(both before and after your proposed patch) says about batch mode is:

  In batch mode, a list of objects (separated by linefeeds) is provided
  on stdin, [...]

which I think is not really true of --batch-command. But the rewrite you
suggest above takes care of that nicely, I think.

> (But please note I'm not trying to pose as an expert here: this all
> started with me coming to git-cat-file(1) to _learn_ about cat-file
> and finding the description more than a little confusing.)

That is a very valuable perspective. I am probably too much an expert in
cat-file, and it has rotted my brain. ;)

-Peff
Štěpán Němec Oct. 5, 2023, 5:48 p.m. UTC | #4
On Thu, 5 Oct 2023 13:18:27 -0400
Jeff King wrote:

> On Thu, Oct 05, 2023 at 01:20:18PM +0200, Štěpán Němec wrote:
>
>> So how about we just butcher the DESCRIPTION completely;
>> [...]
>> DESCRIPTION
>>     This command can operate in two modes, depending on whether an
>>     option from the --batch family is specified.
>> 
>>     In non-batch mode, the command provides information on an object
>>     named on the command line.
>> 
>>     In batch mode, arguments are read from standard input.
>> 
>> [That's all for a summary, read the following sections for details.]
>
> Yeah, I think that is a big improvement over the status quo. It might
> also be worth starting with a single-sentence overview of what is common
> to both modes. Something like:
>
>   Output the contents or details of one or more objects. [...]

I thought about that when proposing the rewrite, but feel that it would
again just duplicate what's said elsewhere, in this case even before,
not after, in the very first line of the man page:

    git-cat-file - Provide content or type and size information for
    repository objects

>   This command can operate in two modes, depending on whether an
>   option from the --batch family is specified.
>
>   In non-batch mode, the command provides information on a single object
>   given on the command line.
    ^^^^^
Any particular reason you prefer "given" to "named"?  However absurd a
notion of giving an actual object on the command line might seem, to me
"named" is better in that it leaves no room for such misinterpretation.
And the <object> description in OPTIONS talks about "ways to spell
object names", building on the same concept.
Jeff King Oct. 5, 2023, 9:01 p.m. UTC | #5
On Thu, Oct 05, 2023 at 07:48:52PM +0200, Štěpán Němec wrote:

> > Yeah, I think that is a big improvement over the status quo. It might
> > also be worth starting with a single-sentence overview of what is common
> > to both modes. Something like:
> >
> >   Output the contents or details of one or more objects. [...]
> 
> I thought about that when proposing the rewrite, but feel that it would
> again just duplicate what's said elsewhere, in this case even before,
> not after, in the very first line of the man page:
> 
>     git-cat-file - Provide content or type and size information for
>     repository objects

Ah, true, I was thinking that the DESCRIPTION section would be the first
thing users would read, but I didn't notice the headline. I agree that
what it says is probably sufficient (though arguably "type and size" is
slightly inaccurate there; I said "details" in my proposed text but
maybe that is too vague).

> >   This command can operate in two modes, depending on whether an
> >   option from the --batch family is specified.
> >
> >   In non-batch mode, the command provides information on a single object
> >   given on the command line.
>     ^^^^^
> Any particular reason you prefer "given" to "named"?  However absurd a
> notion of giving an actual object on the command line might seem, to me
> "named" is better in that it leaves no room for such misinterpretation.
> And the <object> description in OPTIONS talks about "ways to spell
> object names", building on the same concept.

Nope, I didn't even do that replacement consciously (I was just fleshing
out my example, and ended up deciding nothing else needed to be
changed). So "named" is fine by me.

Thanks.

-Peff
Štěpán Němec Oct. 9, 2023, 8:36 a.m. UTC | #6
On Thu, 5 Oct 2023 17:01:25 -0400
Jeff King wrote:

> On Thu, Oct 05, 2023 at 07:48:52PM +0200, Štěpán Němec wrote:
>
>> > Yeah, I think that is a big improvement over the status quo. It might
>> > also be worth starting with a single-sentence overview of what is common
>> > to both modes. Something like:
>> >
>> >   Output the contents or details of one or more objects. [...]
>> 
>> I thought about that when proposing the rewrite, but feel that it would
>> again just duplicate what's said elsewhere, in this case even before,
>> not after, in the very first line of the man page:
>> 
>>     git-cat-file - Provide content or type and size information for
>>     repository objects
>
> Ah, true, I was thinking that the DESCRIPTION section would be the first
> thing users would read, but I didn't notice the headline. I agree that
> what it says is probably sufficient (though arguably "type and size" is
> slightly inaccurate there; I said "details" in my proposed text but
> maybe that is too vague).

We could also leave the NAME vague(r) and put an additional sentence at
the beginning of DESCRIPTION:

NAME
    git-cat-file - Provide contents or details of repository objects

SYNOPSIS
    [...]

DESCRIPTION
    Output the contents or other properties such as size, type or delta
    information of one or more objects.

    The command can operate [...]
Jeff King Oct. 9, 2023, 3:56 p.m. UTC | #7
On Mon, Oct 09, 2023 at 10:36:51AM +0200, Štěpán Němec wrote:

> > Ah, true, I was thinking that the DESCRIPTION section would be the first
> > thing users would read, but I didn't notice the headline. I agree that
> > what it says is probably sufficient (though arguably "type and size" is
> > slightly inaccurate there; I said "details" in my proposed text but
> > maybe that is too vague).
> 
> We could also leave the NAME vague(r) and put an additional sentence at
> the beginning of DESCRIPTION:

Yup, that is a good suggestion. Do you want to wrap all of this
discussion up as a patch?

-Peff

Patch

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 0e4936d18263..1957f90335a4 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -20,12 +20,15 @@  SYNOPSIS
 
 DESCRIPTION
 -----------
-In its first form, the command provides the content or the type of an object in
+This command can operate in two modes, depending on whether an option from
+the `--batch` family is specified.
+
+In non-batch mode, the command provides the content or the type of an object in
 the repository. The type is required unless `-t` or `-p` is used to find the
 object type, or `-s` is used to find the object size, or `--textconv` or
 `--filters` is used (which imply type "blob").
 
-In the second form, a list of objects (separated by linefeeds) is provided on
+In batch mode, a list of objects (separated by linefeeds) is provided on
 stdin, and the SHA-1, type, and size of each object is printed on stdout. The
 output format can be overridden using the optional `<format>` argument. If
 either `--textconv` or `--filters` was specified, the input is expected to