diff mbox series

[2/5] cat-file: mention --unordered along with --batch-all-objects

Message ID YVy2DNd+XemykKE0@coredump.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series cat-file replace handling and optimization | expand

Commit Message

Jeff King Oct. 5, 2021, 8:31 p.m. UTC
The note on ordering for --batch-all-objects was written when that was
the only possible ordering. These days we have --unordered, too, so
let's point to it.

Signed-off-by: Jeff King <peff@peff.net>
---
Not strictly related to this series, but I noticed it while I was in the
area, and I'm about to touch these same lines, so it seemed better than
spinning it off into its own series.

 Documentation/git-cat-file.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Oct. 5, 2021, 9:02 p.m. UTC | #1
On Tue, Oct 05 2021, Jeff King wrote:

> The note on ordering for --batch-all-objects was written when that was
> the only possible ordering. These days we have --unordered, too, so
> let's point to it.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Not strictly related to this series, but I noticed it while I was in the
> area, and I'm about to touch these same lines, so it seemed better than
> spinning it off into its own series.
>
>  Documentation/git-cat-file.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 4eb0421b3f..6467707c6e 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -94,8 +94,9 @@ OPTIONS
>  	Instead of reading a list of objects on stdin, perform the
>  	requested batch operation on all objects in the repository and
>  	any alternate object stores (not just reachable objects).
> -	Requires `--batch` or `--batch-check` be specified. Note that
> -	the objects are visited in order sorted by their hashes.
> +	Requires `--batch` or `--batch-check` be specified. By default,
> +	the objects are visited in order sorted by their hashes; see
> +	also `--unordered` below.
>  
>  --buffer::
>  	Normally batch output is flushed after each object is output, so

Since you're doing while-you're-at-it anyway: Isn't the --unordered
documentation also incorrect to reference --batch, which I take as it
lazily using as a shorthand for --batch-all-objects.
Jeff King Oct. 5, 2021, 9:41 p.m. UTC | #2
On Tue, Oct 05, 2021 at 11:02:38PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> > index 4eb0421b3f..6467707c6e 100644
> > --- a/Documentation/git-cat-file.txt
> > +++ b/Documentation/git-cat-file.txt
> > @@ -94,8 +94,9 @@ OPTIONS
> >  	Instead of reading a list of objects on stdin, perform the
> >  	requested batch operation on all objects in the repository and
> >  	any alternate object stores (not just reachable objects).
> > -	Requires `--batch` or `--batch-check` be specified. Note that
> > -	the objects are visited in order sorted by their hashes.
> > +	Requires `--batch` or `--batch-check` be specified. By default,
> > +	the objects are visited in order sorted by their hashes; see
> > +	also `--unordered` below.
> >  
> >  --buffer::
> >  	Normally batch output is flushed after each object is output, so
> 
> Since you're doing while-you're-at-it anyway: Isn't the --unordered
> documentation also incorrect to reference --batch, which I take as it
> lazily using as a shorthand for --batch-all-objects.

I don't think so. It says:

--unordered::
        When `--batch-all-objects` is in use, visit objects in an
        order which may be more efficient for accessing the object
        contents than hash order. The exact details of the order are
        unspecified, but if you do not require a specific order, this
        should generally result in faster output, especially with
        `--batch`.  Note that `cat-file` will still show each object
        only once, even if it is stored multiple times in the
        repository.

So it correctly mentions that it is affecting --batch-all-objects in the
first sentence. The "especially with --batch" is correct, too. The
ordering has more of an effect if you are accessing the full object,
since there we are increasing the locality which the delta-base cache
relies on.  Whereas with --batch-check, even with size or type, that
locality is much less important (it might help disk or even RAM caches a
bit, but we are examining each object independently, even if it's a
delta, and not caching the intermediate results in any way ourselves).

Does that answer your question, or were you thinking of something else?

-Peff
Ævar Arnfjörð Bjarmason Oct. 6, 2021, 9:02 a.m. UTC | #3
On Tue, Oct 05 2021, Jeff King wrote:

> On Tue, Oct 05, 2021 at 11:02:38PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
>> > index 4eb0421b3f..6467707c6e 100644
>> > --- a/Documentation/git-cat-file.txt
>> > +++ b/Documentation/git-cat-file.txt
>> > @@ -94,8 +94,9 @@ OPTIONS
>> >  	Instead of reading a list of objects on stdin, perform the
>> >  	requested batch operation on all objects in the repository and
>> >  	any alternate object stores (not just reachable objects).
>> > -	Requires `--batch` or `--batch-check` be specified. Note that
>> > -	the objects are visited in order sorted by their hashes.
>> > +	Requires `--batch` or `--batch-check` be specified. By default,
>> > +	the objects are visited in order sorted by their hashes; see
>> > +	also `--unordered` below.
>> >  
>> >  --buffer::
>> >  	Normally batch output is flushed after each object is output, so
>> 
>> Since you're doing while-you're-at-it anyway: Isn't the --unordered
>> documentation also incorrect to reference --batch, which I take as it
>> lazily using as a shorthand for --batch-all-objects.
>
> I don't think so. It says:
>
> --unordered::
>         When `--batch-all-objects` is in use, visit objects in an
>         order which may be more efficient for accessing the object
>         contents than hash order. The exact details of the order are
>         unspecified, but if you do not require a specific order, this
>         should generally result in faster output, especially with
>         `--batch`.  Note that `cat-file` will still show each object
>         only once, even if it is stored multiple times in the
>         repository.
>
> So it correctly mentions that it is affecting --batch-all-objects in the
> first sentence. The "especially with --batch" is correct, too. The
> ordering has more of an effect if you are accessing the full object,
> since there we are increasing the locality which the delta-base cache
> relies on.  Whereas with --batch-check, even with size or type, that
> locality is much less important (it might help disk or even RAM caches a
> bit, but we are examining each object independently, even if it's a
> delta, and not caching the intermediate results in any way ourselves).
>
> Does that answer your question, or were you thinking of something else?

Urgh.

This is the Nth time I've completely failed to mentally model how this
command works, after having hacked on it extensively.

I thought that --batch-all-objects and --batch were mutually exclusive,
but they're obviously not.

In my defense I think the help/code is very confusing. I hacked up some
WIP changes to change it from:
    
    $ git cat-file -h
    usage: git cat-file (-t [--allow-unknown-type] | -s [--allow-unknown-type] | -e | -p | <type> | --textconv | --filters) [--path=<path>] <object>
       or: git cat-file (--batch[=<format>] | --batch-check[=<format>]) [--follow-symlinks] [--textconv | --filters]
    
    <type> can be one of: blob, tree, commit, tag
        -t                    show object type
        -s                    show object size
        -e                    exit with zero when there's no error
        -p                    pretty-print object's content
        --textconv            for blob objects, run textconv on object's content
        --filters             for blob objects, run filters on object's content
        --path <blob>         use a specific path for --textconv/--filters
        --allow-unknown-type  allow -s and -t to work with broken/corrupt objects
        --buffer              buffer --batch output
        --batch[=<format>]    show info and content of objects fed from the standard input
        --batch-check[=<format>]
                              show info about objects fed from the standard input
        --follow-symlinks     follow in-tree symlinks (used with --batch or --batch-check)
        --batch-all-objects   show all objects with --batch or --batch-check
        --unordered           do not order --batch-all-objects output
    
To:
    
    usage: git cat-file (-e | -p) <object>
       or: git cat-file ( -t | -s ) [--allow-unknown-type] <object>
       or: git cat-file [--batch-all-objects] [--batch | --batch-check] [--buffer] [--follow-symlinks] [--unordered]
       or: git cat-file [--textconv|--filters] [--path=<path|tree-ish> <rev> | <rev>:<path|tree-ish>]
    
    Check object existence or emit object contents
        -e                    check if <object> exists
        -p                    pretty-print <object> content
    
    Emit [broken] object attributes
        -t                    show object type (one of 'blob', 'tree', 'commit', 'tag', ...)
        -s                    show object size
        --allow-unknown-type  allow -s and -t to work with broken/corrupt objects
    
    Run <rev>:<blobs|tree> via conversion or filter
        --textconv            for blob objects, run textconv on object's content
        --filters             for blob objects, run filters on object's content
        --path <blob>         use a specific path for --textconv/--filters
    
    Emit objects in batch via requests on STDIN, or --batch-all-objects
        --batch-all-objects   Emit all objects in the repository, instead of taking requests on STDIN
        --buffer              buffer --batch output
        --batch[=<format>]    show info and content of objects fed from the standard input
        --batch-check[=<format>]
                              show info about objects fed from the standard input
        --follow-symlinks     follow in-tree symlinks
        --unordered           do not order objects before emitting them

Which actually reflects reality, i.e. half of the options aren't
accepted by the batch mode, so grouping them makes sense, and the
current help gives the impression that e.g. --textconv can be used with
--batch, but it's effectively a CMDMODE.

Then we don't document that [--textconv|--filters] will fallback to -p
if given <tree-ish>, I'm not sure if that's intentional (but a
"fallthrough" comment suggests so).

And we silentnly accept --buffer etc. without the batch mode, but it
should die.

Looking at the history it seems you added --batch-all-objects around the
same time as the OPT_CMDMODE() was used in the command, so we should
probably have something like this to start with:

-- >8 --
Subject: [PATCH] cat-file: make --batch-all-objects a CMDMODE

The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
the development of[3] the --batch-all-objects option[4], so we've
since grown[5] checks that it can't be combined with other command
modes, when it should just be made a top-level command-mode
instead. It doesn't combine with --filters, --textconv etc.

1. b48158ac94c (cat-file: make the options mutually exclusive, 2015-05-03)
2. https://lore.kernel.org/git/xmqqtwspgusf.fsf@gitster.dls.corp.google.com/
3. https://lore.kernel.org/git/20150622104559.GG14475@peff.net/
4. 6a951937ae1 (cat-file: add --batch-all-objects option, 2015-06-22)
5. 321459439e1 (cat-file: support --textconv/--filters in batch mode, 2016-09-09)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/cat-file.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 243fe6844bc..50861bb02da 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -643,6 +643,8 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			    N_("for blob objects, run textconv on object's content"), 'c'),
 		OPT_CMDMODE(0, "filters", &opt,
 			    N_("for blob objects, run filters on object's content"), 'w'),
+		OPT_CMDMODE(0, "batch-all-objects", &opt,
+			    N_("show all objects with --batch or --batch-check"), 'b'),
 		OPT_STRING(0, "path", &force_path, N_("blob"),
 			   N_("use a specific path for --textconv/--filters")),
 		OPT_BOOL(0, "allow-unknown-type", &unknown_type,
@@ -658,8 +660,6 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			batch_option_callback),
 		OPT_BOOL(0, "follow-symlinks", &batch.follow_symlinks,
 			 N_("follow in-tree symlinks (used with --batch or --batch-check)")),
-		OPT_BOOL(0, "batch-all-objects", &batch.all_objects,
-			 N_("show all objects with --batch or --batch-check")),
 		OPT_BOOL(0, "unordered", &batch.unordered,
 			 N_("do not order --batch-all-objects output")),
 		OPT_END()
@@ -669,28 +669,25 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 
 	batch.buffer_output = -1;
 	argc = parse_options(argc, argv, prefix, options, cat_file_usage, 0);
-
-	if (opt) {
+	if (argc && batch.enabled)
+		usage_with_options(cat_file_usage, options);
+	if (opt == 'b') {
+		batch.all_objects = 1;
+	} else if (opt) {
 		if (batch.enabled && (opt == 'c' || opt == 'w'))
 			batch.cmdmode = opt;
 		else if (argc == 1)
 			obj_name = argv[0];
 		else
 			usage_with_options(cat_file_usage, options);
-	}
-	if (!opt && !batch.enabled) {
+	} else if (!opt && !batch.enabled) {
 		if (argc == 2) {
 			exp_type = argv[0];
 			obj_name = argv[1];
 		} else
 			usage_with_options(cat_file_usage, options);
-	}
-	if (batch.enabled) {
-		if (batch.cmdmode != opt || argc)
-			usage_with_options(cat_file_usage, options);
-		if (batch.cmdmode && batch.all_objects)
-			die("--batch-all-objects cannot be combined with "
-			    "--textconv nor with --filters");
+	} else if (batch.enabled && batch.cmdmode != opt) {
+		usage_with_options(cat_file_usage, options);
 	}
 
 	if ((batch.follow_symlinks || batch.all_objects) && !batch.enabled) {
Jeff King Oct. 6, 2021, 4:15 p.m. UTC | #4
On Wed, Oct 06, 2021 at 11:02:59AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I thought that --batch-all-objects and --batch were mutually exclusive,
> but they're obviously not.

Right. In fact, the former is useless without the latter (or
--batch-check).

> In my defense I think the help/code is very confusing. I hacked up some
> WIP changes to change it from:

That's fair, but...

> Looking at the history it seems you added --batch-all-objects around the
> same time as the OPT_CMDMODE() was used in the command, so we should
> probably have something like this to start with:
> 
> -- >8 --
> Subject: [PATCH] cat-file: make --batch-all-objects a CMDMODE
> 
> The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
> the development of[3] the --batch-all-objects option[4], so we've
> since grown[5] checks that it can't be combined with other command
> modes, when it should just be made a top-level command-mode
> instead. It doesn't combine with --filters, --textconv etc.

This is not right. --batch-all-objects does not provide a mode exclusive
with "-t", etc, by itself. It is --batch or --batch-check that does so.
Those in theory should be OPT_CMDMODE, but I don't think they can be,
because they also take an argument. So we'd need some OPT_CMDMODE_ARG()
or something, but then it needs _two_ value fields. So I think it would
require major surgery to parse-options.

Using --batch-all-objects without --batch or --batch-check would be an
error, and we do flag it as such.

So you are not wrong that using --batch-all-objects with -t is nonsense,
and we do indeed error on it currently. But it is not because the two
are themselves exclusive, but because of the chaining of the two rules.


The groupings you showed in your larger output mostly make sense, but...

>     Run <rev>:<blobs|tree> via conversion or filter
>         --textconv            for blob objects, run textconv on object's content
>         --filters             for blob objects, run filters on object's content
>         --path <blob>         use a specific path for --textconv/--filters
>     
>     Emit objects in batch via requests on STDIN, or --batch-all-objects
>         --batch-all-objects   Emit all objects in the repository, instead of taking requests on STDIN
>         --buffer              buffer --batch output
>         --batch[=<format>]    show info and content of objects fed from the standard input
>         --batch-check[=<format>]
> [...]

These groups aren't mutually exclusive. You can use --textconv in batch
mode. Which further muddies the CMDMODE waters; --batch is a mode that
overrides "-t", but _not_ "--textconv", where it is a modifier.

-Peff
Ævar Arnfjörð Bjarmason Oct. 7, 2021, 10:18 a.m. UTC | #5
On Wed, Oct 06 2021, Jeff King wrote:

> On Wed, Oct 06, 2021 at 11:02:59AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I thought that --batch-all-objects and --batch were mutually exclusive,
>> but they're obviously not.
>
> Right. In fact, the former is useless without the latter (or
> --batch-check).
>
>> In my defense I think the help/code is very confusing. I hacked up some
>> WIP changes to change it from:
>
> That's fair, but...
>
>> Looking at the history it seems you added --batch-all-objects around the
>> same time as the OPT_CMDMODE() was used in the command, so we should
>> probably have something like this to start with:
>> 
>> -- >8 --
>> Subject: [PATCH] cat-file: make --batch-all-objects a CMDMODE
>> 
>> The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
>> the development of[3] the --batch-all-objects option[4], so we've
>> since grown[5] checks that it can't be combined with other command
>> modes, when it should just be made a top-level command-mode
>> instead. It doesn't combine with --filters, --textconv etc.
>
> This is not right. --batch-all-objects does not provide a mode exclusive
> with "-t", etc, by itself.

Yes it does. See the "if (opt) {" branch on master. We just don't
implement it via a cmdmode, but --batch-all-objects can definitely be a
CMDMODE (I see you found that out below...)

> Those in theory should be OPT_CMDMODE, but I don't think they can be,
> because they also take an argument. So we'd need some OPT_CMDMODE_ARG()
> or something, but then it needs _two_ value fields. So I think it would
> require major surgery to parse-options.

Aside: I think it would be worth it to teach it a general concept that
option X is incompatible with option Y, or group X, Y, Z and declare
those as incompatible with another group.

The current CMDMODE check is rather cryptic seemingly because it's
trying to avoid re-looping over the list while it emits an error.

> Using --batch-all-objects without --batch or --batch-check would be an
> error, and we do flag it as such.

*nod*

> So you are not wrong that using --batch-all-objects with -t is nonsense,
> and we do indeed error on it currently. But it is not because the two
> are themselves exclusive, but because of the chaining of the two rules.

Isn't it nonsense? I think so. I suppose we could make it a a synonym of
some --batch=<fmt> in that context, but that just seems like complexity
for no good reason.

> The groupings you showed in your larger output mostly make sense, but...
>
>>     Run <rev>:<blobs|tree> via conversion or filter
>>         --textconv            for blob objects, run textconv on object's content
>>         --filters             for blob objects, run filters on object's content
>>         --path <blob>         use a specific path for --textconv/--filters
>>     
>>     Emit objects in batch via requests on STDIN, or --batch-all-objects
>>         --batch-all-objects   Emit all objects in the repository, instead of taking requests on STDIN
>>         --buffer              buffer --batch output
>>         --batch[=<format>]    show info and content of objects fed from the standard input
>>         --batch-check[=<format>]
>> [...]
>
> These groups aren't mutually exclusive. You can use --textconv in batch
> mode. Which further muddies the CMDMODE waters; --batch is a mode that
> overrides "-t", but _not_ "--textconv", where it is a modifier.

Indeed, I conflated --batch* there again, those two are mutually
exclusive with --batch-all-objects, but not the other two. Will update
if I get to submitting this...
Jeff King Oct. 8, 2021, 2:30 a.m. UTC | #6
On Thu, Oct 07, 2021 at 12:18:45PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
> >> the development of[3] the --batch-all-objects option[4], so we've
> >> since grown[5] checks that it can't be combined with other command
> >> modes, when it should just be made a top-level command-mode
> >> instead. It doesn't combine with --filters, --textconv etc.
> >
> > This is not right. --batch-all-objects does not provide a mode exclusive
> > with "-t", etc, by itself.
> 
> Yes it does. See the "if (opt) {" branch on master. We just don't
> implement it via a cmdmode, but --batch-all-objects can definitely be a
> CMDMODE (I see you found that out below...)

I agree that if you make it a CMDMODE it does not introduce any bugs.
But it is semantically confusing. You would not make, say, --buffer a
CMDMODE option. It is a flag which only takes effect under certain
modes. And the same is true of --batch-all-objects, which modifies the
batch cmd modes.

In fact, it _would_ be a bug to make it a CMDMODE if --batch were
correctly marked as one (but it is not sufficient to reason the other
way; --batch without --batch-all-objects is still mutually exclusive
with -t, etc).

What really makes things confusing, IMHO, is the --textconv and --filter
options. They are marked as CMDMODEs, and they are indeed mutually
exclusive with -t, etc. But they also work with --batch, which is itself
a different mode.

So I don't think OPT_CMDMODE could ever present this complete set of
rules, because they are not all mutually exclusive with each other. But
I think calling "--batch-all-objects" a mode is just muddying the waters
even further.

-Peff
Ævar Arnfjörð Bjarmason Oct. 8, 2021, 7:54 a.m. UTC | #7
On Thu, Oct 07 2021, Jeff King wrote:

> On Thu, Oct 07, 2021 at 12:18:45PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
>> >> the development of[3] the --batch-all-objects option[4], so we've
>> >> since grown[5] checks that it can't be combined with other command
>> >> modes, when it should just be made a top-level command-mode
>> >> instead. It doesn't combine with --filters, --textconv etc.
>> >
>> > This is not right. --batch-all-objects does not provide a mode exclusive
>> > with "-t", etc, by itself.
>> 
>> Yes it does. See the "if (opt) {" branch on master. We just don't
>> implement it via a cmdmode, but --batch-all-objects can definitely be a
>> CMDMODE (I see you found that out below...)
>
> I agree that if you make it a CMDMODE it does not introduce any bugs.
> But it is semantically confusing. You would not make, say, --buffer a
> CMDMODE option. It is a flag which only takes effect under certain
> modes. And the same is true of --batch-all-objects, which modifies the
> batch cmd modes.
>
> In fact, it _would_ be a bug to make it a CMDMODE if --batch were
> correctly marked as one (but it is not sufficient to reason the other
> way; --batch without --batch-all-objects is still mutually exclusive
> with -t, etc).
>
 > What really makes things confusing, IMHO, is the --textconv and --filter
> options. They are marked as CMDMODEs, and they are indeed mutually
> exclusive with -t, etc. But they also work with --batch, which is itself
> a different mode.
>
> So I don't think OPT_CMDMODE could ever present this complete set of
> rules, because they are not all mutually exclusive with each other. But
> I think calling "--batch-all-objects" a mode is just muddying the waters
> even further.

I think we've got some different understanding of what a CMDMODE
means. --batch-all-objects should be a cmdmode, but --batch, --buffer
etc. can't be. Similarly it's not a bug that --filters and --textconv
are cmdmodes, but you think that's bad.

I think it's fair to say that you think this because "the batch mode" is
a ... mode, that cat-file operates in, therefore it's weird that we
don't call it a "command", but that when you're doing "cat-file --batch
--textconv[...]" you're in "textconv mode" or whatever.

I agree that it's a bit weird. But OPT_CMDMODE() under the hood is just
a way to label N options as being mutually exclusive with each other,
and not needing to follow-up parse_options() with a bunch of "die() if x
&& y" etc.

So both in terms of code clarity and to enable later clever use of
parse_options() I think just squinting a bit and reading it as
s/OPT_CMDMODE/OPT_MUTUALLY_EXCLUSIVE_WITH_OTHER_SUCH/ makes the most
sense.

E.g. for any command-mode we should be able to teach the completion that:

    git cat-file --batch-all-objects --<tab>

Should only complete those options that go with it. We don't have that
full picture yet (since we just have cmdmode, but no way to say A goes
with B and C, but not D), but a rough working start at that is to
exclude all the other cmdmode options.
Junio C Hamano Oct. 8, 2021, 8:34 p.m. UTC | #8
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> In fact, it _would_ be a bug to make it a CMDMODE if --batch were
>> correctly marked as one (but it is not sufficient to reason the other
>> way; --batch without --batch-all-objects is still mutually exclusive
>> with -t, etc).
>>
>  > What really makes things confusing, IMHO, is the --textconv and --filter
>> options. They are marked as CMDMODEs, and they are indeed mutually
>> exclusive with -t, etc. But they also work with --batch, which is itself
>> a different mode.
>>
>> So I don't think OPT_CMDMODE could ever present this complete set of
>> rules, because they are not all mutually exclusive with each other. But
>> I think calling "--batch-all-objects" a mode is just muddying the waters
>> even further.
>
> I think we've got some different understanding of what a CMDMODE
> means. --batch-all-objects should be a cmdmode, but --batch, --buffer
> etc. can't be. Similarly it's not a bug that --filters and --textconv
> are cmdmodes, but you think that's bad.

Among options[] elements, "batch" and "batch-check" take &batch, and
they are obviously mutually exclusive.  "batch-all-objects" can flip
the batch.all_objects flag to affect operations that use &batch
(namely, these two), so it is more like a modifier and can never be
a cmdmode.

Is "git cat-file -t tag --batch" a valid way to invoke the command?
Are there options (like "-t" in the above example) that are marked
with OPT_CMDMODE that can be used with "--batch" or "--batch-check"?

If the answer is "no", then "--batch" and "--batch-check" could also
be command modes, but I suspect OPT_CMDMODE() does not have enough
flexibility to say "use the &opt to record which command mode is
requested, and by the way, there is this extra pointer &batch to
stuff necessary information in and use this callback to fill it", so
even if "--batch" and "--batch-check" are incompatible with existing
command modes, it needs a bit or preparatory work to make them.
Jeff King Oct. 8, 2021, 9:44 p.m. UTC | #9
On Fri, Oct 08, 2021 at 01:34:26PM -0700, Junio C Hamano wrote:

> >> So I don't think OPT_CMDMODE could ever present this complete set of
> >> rules, because they are not all mutually exclusive with each other. But
> >> I think calling "--batch-all-objects" a mode is just muddying the waters
> >> even further.
> >
> > I think we've got some different understanding of what a CMDMODE
> > means. --batch-all-objects should be a cmdmode, but --batch, --buffer
> > etc. can't be. Similarly it's not a bug that --filters and --textconv
> > are cmdmodes, but you think that's bad.
> 
> Among options[] elements, "batch" and "batch-check" take &batch, and
> they are obviously mutually exclusive.  "batch-all-objects" can flip
> the batch.all_objects flag to affect operations that use &batch
> (namely, these two), so it is more like a modifier and can never be
> a cmdmode.
> 
> Is "git cat-file -t tag --batch" a valid way to invoke the command?
> Are there options (like "-t" in the above example) that are marked
> with OPT_CMDMODE that can be used with "--batch" or "--batch-check"?
> 
> If the answer is "no", then "--batch" and "--batch-check" could also
> be command modes, but I suspect OPT_CMDMODE() does not have enough
> flexibility to say "use the &opt to record which command mode is
> requested, and by the way, there is this extra pointer &batch to
> stuff necessary information in and use this callback to fill it", so
> even if "--batch" and "--batch-check" are incompatible with existing
> command modes, it needs a bit or preparatory work to make them.

Yes, it would definitely need that extension. But it's also weirder than
that. --textconv is an OPT_CMDMODE(), because it is mutually exclusive
with "-t", etc. But it is not exclusive with "--batch". So there is not
a single set of mutually exclusive cmdmodes. There are three sets:

  1. single-object options like "-t", "-p", etc

  2. --textconv and --filters, which can take a single object or batch
     input

  3. batch options like --batch-check --batch

(1) is mutually exclusive with (2) and (3), but the latter two are not
exclusive with each other.

The current code uses OPT_CMDMODE() for (1) and (2), and then manually
enforces the exclusion between (1) and (3). But IMHO it is (2) that is
the odd-man out, in that it can be its own mode or a modifier, and it
probably should not be OPT_CMDMODE() (but from the end-user perspective,
that is OK, though it may influence how we document or group things).

I do not think talking about --batch-all-objects makes sense here. As
you said above (and I said already several times), it is a modifier for
--batch/--batch-check. Making it into a CMDMODE does not break anything
(currently), but:

  - it does not really help; it only makes sense with batch options, and
    we already must manually catch those interacting with single-object
    options (because making them OPT_CMDMODE does not work for the
    reasons above).

  - if we _were_ to make the batch options CMDMODEs, then it would be
    doing the wrong thing (though I don't foresee that happening because
    of those same reasons)

  - if we allowed "--textconv --batch --batch-all-objects", then it
    would likewise do the wrong thing (by conflicting with --textconv's
    CMDMODE). We disallow that now, but it is a potentially well-formed
    request (show all objects, textconv-ing blobs). It's not useful
    enough for anybody to have cared, but I think it shows why the
    semantic distinction of treating --batch-all-objects as a
    batch-option and not itself a mode is important.

-Peff
Junio C Hamano Oct. 8, 2021, 10:04 p.m. UTC | #10
Jeff King <peff@peff.net> writes:

> Yes, it would definitely need that extension. But it's also weirder than
> that. --textconv is an OPT_CMDMODE(), because it is mutually exclusive
> with "-t", etc.

Yeah, in hindsight, we should have made "--textconv" a modifier for
"-p", because it is not a true cmdmode.  It is much easier to
understand if you imagine "--textconv", without a command mode,
implies the "-p" mode, but when a command mode like "--batch" is
given, that would apply.  And it is job of other individual command
modes to notice that "--textconv" modifier does not make sense in
their context and issue a warning.

> The current code uses OPT_CMDMODE() for (1) and (2), and then manually
> enforces the exclusion between (1) and (3). But IMHO it is (2) that is
> the odd-man out, in that it can be its own mode or a modifier, and it
> probably should not be OPT_CMDMODE() (but from the end-user perspective,
> that is OK, though it may influence how we document or group things).

I guess we are exactly on the same page (see "'textconv' is a
modifier, which implies '-p' command mode unless otherwise
specified" above).
diff mbox series

Patch

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 4eb0421b3f..6467707c6e 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -94,8 +94,9 @@  OPTIONS
 	Instead of reading a list of objects on stdin, perform the
 	requested batch operation on all objects in the repository and
 	any alternate object stores (not just reachable objects).
-	Requires `--batch` or `--batch-check` be specified. Note that
-	the objects are visited in order sorted by their hashes.
+	Requires `--batch` or `--batch-check` be specified. By default,
+	the objects are visited in order sorted by their hashes; see
+	also `--unordered` below.
 
 --buffer::
 	Normally batch output is flushed after each object is output, so