diff mbox series

[v4,8/8] cat-file: add --batch-command remote-object-info command

Message ID 20220502170904.2770649-9-calvinwan@google.com (mailing list archive)
State New, archived
Headers show
Series cat-file: add --batch-command remote-object-info command | expand

Commit Message

Calvin Wan May 2, 2022, 5:09 p.m. UTC
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.

While `info` takes object ids one at a time, this creates overhead when
making requests to a server so `remote-object-info` instead can take
multiple object ids at once.

cat-file --batch-command is generally implemented in the following
manner:

 - Receive and parse input from user
 - Call respective function attached to command
 - Set batch mode state, get object info, print object info

In --buffer mode, this changes to:

 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue
    - Call respective function attached to command
    - Set batch mode state, get object info, print object info

Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:

 - Receive and parse input from user
 If command is `remote-object-info`:
    - Get object info from remote
    - Loop through object info
        - Call respective function attached to `info`
        - Set batch mode state, use passed in object info, print object
          info
 Else:
    - Call respective function attached to command
    - Parse input, get object info, print object info

And finally for --buffer mode `remote-object-info`:
 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue:
    If command is `remote-object-info`:
        - Get object info from remote
        - Loop through object info
            - Call respective function attached to `info`
            - Set batch mode state, use passed in object info, print
              object info
    Else:
        - Call respective function attached to command
        - Set batch mode state, get object info, print object info

To summarize, `remote-object-info` gets object info from the remote and
then generates multiples `info` commands with the object info passed in.
It cannot be implemented similarly to `info` and `content` because of
the overhead with remote commenication.

---
 Documentation/git-cat-file.txt |  16 +-
 builtin/cat-file.c             | 200 +++++++++++++++----
 t/t1006-cat-file.sh            | 347 +++++++++++++++++++++++++++++++++
 3 files changed, 522 insertions(+), 41 deletions(-)

Comments

Jonathan Tan May 4, 2022, 9:27 p.m. UTC | #1
Thanks for the patch. I'll limit myself to high-level comments for now,
since the design may need to change if we want to support mixing
remote-object-info with other commands in a --batch-command --buffer
invocation.

Calvin Wan <calvinwan@google.com> writes:
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
> Add `remote-object-info` to cat-file --batch-command.
> 
> While `info` takes object ids one at a time, this creates overhead when
> making requests to a server so `remote-object-info` instead can take
> multiple object ids at once.
> 
> cat-file --batch-command is generally implemented in the following
> manner:
> 
>  - Receive and parse input from user
>  - Call respective function attached to command
>  - Set batch mode state, get object info, print object info

You mention below that this only works in --buffer mode, so I don't
think that this is relevant.

> In --buffer mode, this changes to:
> 
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue
>     - Call respective function attached to command
>     - Set batch mode state, get object info, print object info
> 
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:

If you do want this command to work with --buffer and without, probably
clearer to replace "Notice" with "Notice that when cat-file is invoked
both without --buffer and with it". (Having said that, it makes sense to
me to only make it work with --buffer - making one request per OID would
probably not be something that the user would want.)

>  - Receive and parse input from user
>  If command is `remote-object-info`:
>     - Get object info from remote
>     - Loop through object info
>         - Call respective function attached to `info`
>         - Set batch mode state, use passed in object info, print object
>           info
>  Else:
>     - Call respective function attached to command
>     - Parse input, get object info, print object info
> 
> And finally for --buffer mode `remote-object-info`:
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue:
>     If command is `remote-object-info`:
>         - Get object info from remote
>         - Loop through object info
>             - Call respective function attached to `info`
>             - Set batch mode state, use passed in object info, print
>               object info
>     Else:
>         - Call respective function attached to command
>         - Set batch mode state, get object info, print object info
> 
> To summarize, `remote-object-info` gets object info from the remote and
> then generates multiples `info` commands with the object info passed in.
> It cannot be implemented similarly to `info` and `content` because of
> the overhead with remote commenication.

What happens if remote-object-info is mixed with other commands?

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 24a811f0ef..0d9e8e6214 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -115,6 +115,10 @@ info <object>::
>  	Print object info for object reference `<object>`. This corresponds to the
>  	output of `--batch-check`.
>  
> +remote-object-info <remote> [<object>]::
> +	Print object info for object references `<object>` at specified <remote>.
> +	This command may only be combined with `--buffer`.

Here you mention that we need --buffer.

> @@ -253,13 +258,14 @@ newline. The available atoms are:
>  
>  `objectsize:disk`::
>  	The size, in bytes, that the object takes up on disk. See the
> -	note about on-disk sizes in the `CAVEATS` section below.
> +	note about on-disk sizes in the `CAVEATS` section below. Not
> +	supported by `remote-object-info`.
>  
>  `deltabase`::
>  	If the object is stored as a delta on-disk, this expands to the
>  	full hex representation of the delta base object name.
>  	Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> -	below.
> +	below. Not supported by `remote-object-info`.

OK - makes sense that these are not supported by remote-object-info,
because the remote may not even store objects in delta format.

> @@ -32,11 +35,14 @@ struct batch_options {
>  	int unordered;
>  	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
>  	const char *format;
> +	int use_remote_info;
>  };
>  
>  #define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
>  
>  static const char *force_path;
> +static struct object_info *remote_object_info;
> +static struct oid_array object_info_oids = OID_ARRAY_INIT;

Hmm...I would have expected the information to appear in the same data
structure used to store other command invocations like "contents",
"info", and "flush", not separately like this. This design doesn't seem
to support mixing commands.

> @@ -538,6 +611,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
>  struct queued_cmd {
>  	parse_cmd_fn_t fn;
>  	char *line;
> +	const char *name;
>  };

Rather than store a pointer like this (and thus needing to keep track of
the lifetime of the memory pointed to by this pointer), if we only need
to know whether this queued_cmd is a remote-object-info call, we can
just use "unsigned is_remote_object_info : 1".

> @@ -565,9 +639,49 @@ static const struct parse_cmd {
>  } commands[] = {
>  	{ "contents", parse_cmd_contents, 1},
>  	{ "info", parse_cmd_info, 1},
> +	{ "remote-object-info", parse_cmd_info, 1},
>  	{ "flush", NULL, 0},
>  };

Double parse_cmd_info?
Calvin Wan May 5, 2022, 6:13 p.m. UTC | #2
> You mention below that this only works in --buffer mode, so I don't
> think that this is relevant.

It works in non-buffer mode as well. I meant that if you wanted to use
`remote-object-info` with another option, then --filter is the only
one that it works with. I tried to match the documentation for
--batch-command that says something similar:

--batch-command=<format>
Enter a command mode that reads commands and arguments from stdin. May
only be combined with --buffer, --textconv or --filters

> What happens if remote-object-info is mixed with other commands?
It works with other commands. The idea is that you would enter a line like:

remote-object-info <remote> <oid> <oid> ... <oid>

rather than

remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
remote-object-info <remote> <oid>

since each invocation of remote-object-info would cause another round
trip to the server.

> Hmm...I would have expected the information to appear in the same data
> structure used to store other command invocations like "contents",
> "info", and "flush", not separately like this. This design doesn't seem
> to support mixing commands.

Here's an example of how it works with other commands. Let's say I
call git cat-file --batch-command --buffer and input the following:

remote-object-info remote1 oid1 oid2 oid3
contents oid4
info oid5
remote-object-info remote2 oid6 oid7 oid8
flush

This would result in the following:

(call transport and store info in separate structure)
info oid1 (using passed in object-info)
info oid2
info oid3
contents oid4
info oid5
(call transport and store info in separate structure)
info oid6
info oid7
info oid8

> Rather than store a pointer like this (and thus needing to keep track of
> the lifetime of the memory pointed to by this pointer), if we only need
> to know whether this queued_cmd is a remote-object-info call, we can
> just use "unsigned is_remote_object_info : 1".

OK

>> @@ -565,9 +639,49 @@ static const struct parse_cmd {
>>  } commands[] = {
>>       { "contents", parse_cmd_contents, 1},
>>       { "info", parse_cmd_info, 1},
>> +     { "remote-object-info", parse_cmd_info, 1},
>>       { "flush", NULL, 0},
>>  };

> Double parse_cmd_info?

See above for why remote-object-info becomes info



On Wed, May 4, 2022 at 2:27 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Thanks for the patch. I'll limit myself to high-level comments for now,
> since the design may need to change if we want to support mixing
> remote-object-info with other commands in a --batch-command --buffer
> invocation.
>
> Calvin Wan <calvinwan@google.com> writes:
> > Since the `info` command in cat-file --batch-command prints object info
> > for a given object, it is natural to add another command in cat-file
> > --batch-command to print object info for a given object from a remote.
> > Add `remote-object-info` to cat-file --batch-command.
> >
> > While `info` takes object ids one at a time, this creates overhead when
> > making requests to a server so `remote-object-info` instead can take
> > multiple object ids at once.
> >
> > cat-file --batch-command is generally implemented in the following
> > manner:
> >
> >  - Receive and parse input from user
> >  - Call respective function attached to command
> >  - Set batch mode state, get object info, print object info
>
> You mention below that this only works in --buffer mode, so I don't
> think that this is relevant.
>
> > In --buffer mode, this changes to:
> >
> >  - Receive and parse input from user
> >  - Store respective function attached to command in a queue
> >  - After flush, loop through commands in queue
> >     - Call respective function attached to command
> >     - Set batch mode state, get object info, print object info
> >
> > Notice how the getting and printing of object info is accomplished one
> > at a time. As described above, this creates a problem for making
> > requests to a server. Therefore, `remote-object-info` is implemented in
> > the following manner:
>
> If you do want this command to work with --buffer and without, probably
> clearer to replace "Notice" with "Notice that when cat-file is invoked
> both without --buffer and with it". (Having said that, it makes sense to
> me to only make it work with --buffer - making one request per OID would
> probably not be something that the user would want.)
>
> >  - Receive and parse input from user
> >  If command is `remote-object-info`:
> >     - Get object info from remote
> >     - Loop through object info
> >         - Call respective function attached to `info`
> >         - Set batch mode state, use passed in object info, print object
> >           info
> >  Else:
> >     - Call respective function attached to command
> >     - Parse input, get object info, print object info
> >
> > And finally for --buffer mode `remote-object-info`:
> >  - Receive and parse input from user
> >  - Store respective function attached to command in a queue
> >  - After flush, loop through commands in queue:
> >     If command is `remote-object-info`:
> >         - Get object info from remote
> >         - Loop through object info
> >             - Call respective function attached to `info`
> >             - Set batch mode state, use passed in object info, print
> >               object info
> >     Else:
> >         - Call respective function attached to command
> >         - Set batch mode state, get object info, print object info
> >
> > To summarize, `remote-object-info` gets object info from the remote and
> > then generates multiples `info` commands with the object info passed in.
> > It cannot be implemented similarly to `info` and `content` because of
> > the overhead with remote commenication.
>
> What happens if remote-object-info is mixed with other commands?
>
> > diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> > index 24a811f0ef..0d9e8e6214 100644
> > --- a/Documentation/git-cat-file.txt
> > +++ b/Documentation/git-cat-file.txt
> > @@ -115,6 +115,10 @@ info <object>::
> >       Print object info for object reference `<object>`. This corresponds to the
> >       output of `--batch-check`.
> >
> > +remote-object-info <remote> [<object>]::
> > +     Print object info for object references `<object>` at specified <remote>.
> > +     This command may only be combined with `--buffer`.
>
> Here you mention that we need --buffer.
>
> > @@ -253,13 +258,14 @@ newline. The available atoms are:
> >
> >  `objectsize:disk`::
> >       The size, in bytes, that the object takes up on disk. See the
> > -     note about on-disk sizes in the `CAVEATS` section below.
> > +     note about on-disk sizes in the `CAVEATS` section below. Not
> > +     supported by `remote-object-info`.
> >
> >  `deltabase`::
> >       If the object is stored as a delta on-disk, this expands to the
> >       full hex representation of the delta base object name.
> >       Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> > -     below.
> > +     below. Not supported by `remote-object-info`.
>
> OK - makes sense that these are not supported by remote-object-info,
> because the remote may not even store objects in delta format.
>
> > @@ -32,11 +35,14 @@ struct batch_options {
> >       int unordered;
> >       int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
> >       const char *format;
> > +     int use_remote_info;
> >  };
> >
> >  #define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
> >
> >  static const char *force_path;
> > +static struct object_info *remote_object_info;
> > +static struct oid_array object_info_oids = OID_ARRAY_INIT;
>
> Hmm...I would have expected the information to appear in the same data
> structure used to store other command invocations like "contents",
> "info", and "flush", not separately like this. This design doesn't seem
> to support mixing commands.
>
> > @@ -538,6 +611,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
> >  struct queued_cmd {
> >       parse_cmd_fn_t fn;
> >       char *line;
> > +     const char *name;
> >  };
>
> Rather than store a pointer like this (and thus needing to keep track of
> the lifetime of the memory pointed to by this pointer), if we only need
> to know whether this queued_cmd is a remote-object-info call, we can
> just use "unsigned is_remote_object_info : 1".
>
> > @@ -565,9 +639,49 @@ static const struct parse_cmd {
> >  } commands[] = {
> >       { "contents", parse_cmd_contents, 1},
> >       { "info", parse_cmd_info, 1},
> > +     { "remote-object-info", parse_cmd_info, 1},
> >       { "flush", NULL, 0},
> >  };
>
> Double parse_cmd_info?
Junio C Hamano May 5, 2022, 6:44 p.m. UTC | #3
Calvin Wan <calvinwan@google.com> writes:

> It works with other commands. The idea is that you would enter a line like:
>
> remote-object-info <remote> <oid> <oid> ... <oid>
>
> rather than
>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
> remote-object-info <remote> <oid>
>
> since each invocation of remote-object-info would cause another round
> trip to the server.

Hmm.  It looks unfortunate, but if we do not care about hitting the
pkt-line length limit, that might be OK.  I dunno.

I would have expected to see something like

	start batch
	request 1
	request 2
	...
	request 2000
	flush batch

during which the other side patiently listens to our requests.  They
(meaning the local process that reads the above and talks to a
remote as needed) can coalesce the requests of the same kind
(e.g. going to the same remote) while buffering and optimize their
operation without having the caller of them to worry about it that
way, no?
Junio C Hamano May 5, 2022, 7:09 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> I would have expected to see something like
>
> 	start batch
> 	request 1
> 	request 2
> 	...
> 	request 2000
> 	flush batch
>
> during which the other side patiently listens to our requests.  They
> (meaning the local process that reads the above and talks to a
> remote as needed) can coalesce the requests of the same kind
> (e.g. going to the same remote) while buffering and optimize their
> operation without having the caller of them to worry about it that
> way, no?

Ah, that is in the batch mode, and you were syaing that "one request
with multiple objects" would allow multiple object-info requests to
be hanled efficiently even in non-batch mode.  If that was what you
were talking about, I think that does make sense.  Thanks.
Calvin Wan May 5, 2022, 7:19 p.m. UTC | #5
> Ah, that is in the batch mode, and you were syaing that "one request
> with multiple objects" would allow multiple object-info requests to
> be hanled efficiently even in non-batch mode.  If that was what you
> were talking about, I think that does make sense.  Thanks.

Correct, also I think you mean "buffer" mode rather than "batch" mode.

On Thu, May 5, 2022 at 12:09 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > I would have expected to see something like
> >
> >       start batch
> >       request 1
> >       request 2
> >       ...
> >       request 2000
> >       flush batch
> >
> > during which the other side patiently listens to our requests.  They
> > (meaning the local process that reads the above and talks to a
> > remote as needed) can coalesce the requests of the same kind
> > (e.g. going to the same remote) while buffering and optimize their
> > operation without having the caller of them to worry about it that
> > way, no?
>
> Ah, that is in the batch mode, and you were syaing that "one request
> with multiple objects" would allow multiple object-info requests to
> be hanled efficiently even in non-batch mode.  If that was what you
> were talking about, I think that does make sense.  Thanks.
>
ZheNing Hu July 31, 2022, 3:24 p.m. UTC | #6
Calvin Wan <calvinwan@google.com> 于2022年5月3日周二 06:33写道:
>
> Since the `info` command in cat-file --batch-command prints object info
> for a given object, it is natural to add another command in cat-file
> --batch-command to print object info for a given object from a remote.
> Add `remote-object-info` to cat-file --batch-command.
>
> While `info` takes object ids one at a time, this creates overhead when
> making requests to a server so `remote-object-info` instead can take
> multiple object ids at once.
>
> cat-file --batch-command is generally implemented in the following
> manner:
>
>  - Receive and parse input from user
>  - Call respective function attached to command
>  - Set batch mode state, get object info, print object info
>
> In --buffer mode, this changes to:
>
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue
>     - Call respective function attached to command
>     - Set batch mode state, get object info, print object info
>
> Notice how the getting and printing of object info is accomplished one
> at a time. As described above, this creates a problem for making
> requests to a server. Therefore, `remote-object-info` is implemented in
> the following manner:
>
>  - Receive and parse input from user
>  If command is `remote-object-info`:
>     - Get object info from remote
>     - Loop through object info
>         - Call respective function attached to `info`
>         - Set batch mode state, use passed in object info, print object
>           info
>  Else:
>     - Call respective function attached to command
>     - Parse input, get object info, print object info
>
> And finally for --buffer mode `remote-object-info`:
>  - Receive and parse input from user
>  - Store respective function attached to command in a queue
>  - After flush, loop through commands in queue:
>     If command is `remote-object-info`:
>         - Get object info from remote
>         - Loop through object info
>             - Call respective function attached to `info`
>             - Set batch mode state, use passed in object info, print
>               object info
>     Else:
>         - Call respective function attached to command
>         - Set batch mode state, get object info, print object info
>
> To summarize, `remote-object-info` gets object info from the remote and
> then generates multiples `info` commands with the object info passed in.
> It cannot be implemented similarly to `info` and `content` because of
> the overhead with remote commenication.
>
> ---
>  Documentation/git-cat-file.txt |  16 +-
>  builtin/cat-file.c             | 200 +++++++++++++++----
>  t/t1006-cat-file.sh            | 347 +++++++++++++++++++++++++++++++++
>  3 files changed, 522 insertions(+), 41 deletions(-)
>
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 24a811f0ef..0d9e8e6214 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -115,6 +115,10 @@ info <object>::
>         Print object info for object reference `<object>`. This corresponds to the
>         output of `--batch-check`.
>
> +remote-object-info <remote> [<object>]::
> +       Print object info for object references `<object>` at specified <remote>.
> +       This command may only be combined with `--buffer`.
> +
>  flush::
>         Used with `--buffer` to execute all preceding commands that were issued
>         since the beginning or since the last flush was issued. When `--buffer`
> @@ -245,7 +249,8 @@ newline. The available atoms are:
>         The full hex representation of the object name.
>
>  `objecttype`::
> -       The type of the object (the same as `cat-file -t` reports).
> +       The type of the object (the same as `cat-file -t` reports). See
> +       `CAVEATS` below.
>
>  `objectsize`::
>         The size, in bytes, of the object (the same as `cat-file -s`
> @@ -253,13 +258,14 @@ newline. The available atoms are:
>
>  `objectsize:disk`::
>         The size, in bytes, that the object takes up on disk. See the
> -       note about on-disk sizes in the `CAVEATS` section below.
> +       note about on-disk sizes in the `CAVEATS` section below. Not
> +       supported by `remote-object-info`.
>
>  `deltabase`::
>         If the object is stored as a delta on-disk, this expands to the
>         full hex representation of the delta base object name.
>         Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
> -       below.
> +       below. Not supported by `remote-object-info`.
>
>  `rest`::
>         If this atom is used in the output string, input lines are split

Why not forbidden %(rest) here too?

> +static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
> +{
> +       int retval = 0;
> +       size_t i;
> +       struct remote *remote = NULL;
> +       struct object_id oid;
> +       struct string_list object_info_options = STRING_LIST_INIT_NODUP;
> +       static struct transport *gtransport;
> +       char *format = DEFAULT_FORMAT;
> +
> +       if (opt->format)
> +               format = xstrdup(opt->format);
> +
> +       remote = remote_get(argv[0]);
> +       if (!remote)
> +               die(_("must supply valid remote when using --object-info"));
> +       oid_array_clear(&object_info_oids);
> +       for (i = 1; i < argc; i++) {
> +               if (get_oid(argv[i], &oid))
> +                       die(_("malformed object id '%s'"), argv[i]);
> +               oid_array_append(&object_info_oids, &oid);
> +       }
> +
> +       gtransport = transport_get(remote, NULL);
> +       if (gtransport->smart_options) {
> +               size_t j;
> +               int include_size = 0, include_type = 0;
> +
> +               remote_object_info = xcalloc(object_info_oids.nr, sizeof(struct object_info));
> +               gtransport->smart_options->object_info = 1;
> +               gtransport->smart_options->object_info_oids = &object_info_oids;
> +               /**
> +                * 'size' is the only option currently supported.
> +                * Other options that are passed in the format will default to a
> +                * standard fetch request rather than object-info.
> +                */
> +               if (strstr(format, "%(objectsize)")) {
> +                       string_list_append(&object_info_options, "size");
> +                       include_size = 1;
> +               }
> +               if (strstr(format, "%(objecttype)")) {
> +                       string_list_append(&object_info_options, "type");
> +                       include_type = 1;
> +               }
> +               if (strstr(format, "%(objectsize:disk)"))
> +                       die(_("objectsize:disk is currently not supported with remote-object-info"));
> +               if (strstr(format, "%(deltabase)"))
> +                       die(_("deltabase is currently not supported with remote-object-info"));

%(rest) too?

> +               if (object_info_options.nr > 0) {
> +                       gtransport->smart_options->object_info_options = &object_info_options;
> +                       for (j = 0; j < object_info_oids.nr; j++) {
> +                               if (include_size)
> +                                       remote_object_info[j].sizep = xcalloc(1, sizeof(long));
> +                               if (include_type)
> +                                       remote_object_info[j].typep = xcalloc(1, sizeof(enum object_type));
> +                       }
> +                       gtransport->smart_options->object_info_data = &remote_object_info;
> +                       retval = transport_fetch_refs(gtransport, NULL);
> +               }
> +       } else {
> +               retval = -1;
> +       }
> +       return retval;
> +}

Maybe such code style will be better?

        if (!gtransport->smart_options) {
               return -1;
        }
        ...
        return transport_fetch_refs(gtransport, NULL);

> +
>  struct object_cb_data {
>         struct batch_options *opt;
>         struct expand_data *expand;
> @@ -538,6 +611,7 @@ typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
>  struct queued_cmd {
>         parse_cmd_fn_t fn;
>         char *line;
> +       const char *name;
>  };
>
>  static void parse_cmd_contents(struct batch_options *opt,
> @@ -565,9 +639,49 @@ static const struct parse_cmd {
>  } commands[] = {
>         { "contents", parse_cmd_contents, 1},
>         { "info", parse_cmd_info, 1},
> +       { "remote-object-info", parse_cmd_info, 1},
>         { "flush", NULL, 0},
>  };
>
> +static void parse_remote_info(struct batch_options *opt,
> +                          char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          const struct parse_cmd *p_cmd,
> +                          struct queued_cmd *q_cmd)
> +{
> +       int count;
> +       size_t i;
> +       const char **argv;
> +
> +       count = split_cmdline(line, &argv);
> +       if (get_remote_info(opt, count, argv))
> +               goto cleanup;
> +       opt->use_remote_info = 1;
> +       data->skip_object_info = 1;
> +       data->mark_query = 0;
> +       for (i = 0; i < object_info_oids.nr; i++) {
> +               if (remote_object_info[i].sizep)
> +                       data->size = *remote_object_info[i].sizep;
> +               if (remote_object_info[i].typep)
> +                       data->type = *remote_object_info[i].typep;
> +
> +               data->oid = object_info_oids.oid[i];
> +               if (p_cmd)
> +                       p_cmd->fn(opt, argv[i+1], output, data);
> +               else
> +                       q_cmd->fn(opt, argv[i+1], output, data);
> +       }
> +       opt->use_remote_info = 0;
> +       data->skip_object_info = 0;
> +       data->mark_query = 1;
> +
> +cleanup:
> +       for (i = 0; i < object_info_oids.nr; i++)
> +               free_object_info_contents(&remote_object_info[i]);
> +       free(remote_object_info);
> +}
> +
>  static void dispatch_calls(struct batch_options *opt,
>                 struct strbuf *output,
>                 struct expand_data *data,
> @@ -579,8 +693,12 @@ static void dispatch_calls(struct batch_options *opt,
>         if (!opt->buffer_output)
>                 die(_("flush is only for --buffer mode"));
>
> -       for (i = 0; i < nr; i++)
> -               cmd[i].fn(opt, cmd[i].line, output, data);
> +       for (i = 0; i < nr; i++) {
> +               if (!strcmp(cmd[i].name, "remote-object-info"))
> +                       parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
> +               else
> +                       cmd[i].fn(opt, cmd[i].line, output, data);
> +       }
>
>         fflush(stdout);
>  }
> @@ -640,11 +758,17 @@ static void batch_objects_command(struct batch_options *opt,
>                         dispatch_calls(opt, output, data, queued_cmd, nr);
>                         free_cmds(queued_cmd, &nr);
>                 } else if (!opt->buffer_output) {
> -                       cmd->fn(opt, p, output, data);
> +                       if (!strcmp(cmd->name, "remote-object-info")) {
> +                               char *line = xstrdup_or_null(p);
> +                               parse_remote_info(opt, line, output, data, cmd, NULL);

memory leak: "line".

> +                       } else {
> +                               cmd->fn(opt, p, output, data);
> +                       }
>                 } else {
>                         ALLOC_GROW(queued_cmd, nr + 1, alloc);
>                         call.fn = cmd->fn;
>                         call.line = xstrdup_or_null(p);
> +                       call.name = cmd->name;
>                         queued_cmd[nr++] = call;
>                 }
>         }
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (objectsize:disk)' '
> +       (
> +               cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> +               test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
> +               remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> +               EOF
> +               test_i18ngrep "objectsize:disk is currently not supported with remote-object-info" err
> +       )
> +'
> +
> +test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' '
> +       (
> +               cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> +
> +               test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
> +               remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> +               EOF
> +               test_i18ngrep "deltabase is currently not supported with remote-object-info" err
> +       )
> +'
> +

%(rest) too?

> +set_transport_variables "server"
> +
> +test_expect_success 'batch-command remote-object-info file://' '
> +       (
> +               cd server &&
> +
> +               echo "$hello_sha1 $hello_size" >expect &&
> +               echo "$tree_sha1 $tree_size" >>expect &&
> +               echo "$commit_sha1 $commit_size" >>expect &&
> +               echo "$tag_sha1 $tag_size" >>expect &&
> +               git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> +               remote-object-info "file://$(pwd)" $hello_sha1
> +               remote-object-info "file://$(pwd)" $tree_sha1
> +               remote-object-info "file://$(pwd)" $commit_sha1
> +               remote-object-info "file://$(pwd)" $tag_sha1
> +               EOF
> +               test_cmp expect actual
> +       )

Can we support <rev> instead of only <oid> here?

$ git cat-file --batch-check
HEAD
28583b8d8ca72730d7c9e0ea50861ad431a6dea4 commit 3038
master
ab336e8f1c8009c8b1aab8deb592148e69217085 commit 281
origin/master
23b219f8e3f2adfb0441e135f0a880e6124f766c commit 282
origin/master:git.c
e5d62fa5a92e95e1ede041ebf913d841744c31f8 blob 28398

So cat-file --batch-check can support it.

$git cat-file --batch-commands
remote-object-info "file://$(pwd)" master:git.c

I guess it cannot work now, right?
Calvin Wan Aug. 8, 2022, 5:43 p.m. UTC | #7
> >  `rest`::
> >         If this atom is used in the output string, input lines are split
>
> Why not forbidden %(rest) here too?

Good catch. This should definitely be forbidden. I thought at first
this was inconsequential, but since I also have remote as part of
the input, this would no longer hold true.

> > +               if (strstr(format, "%(objectsize:disk)"))
> > +                       die(_("objectsize:disk is currently not supported with remote-object-info"));
> > +               if (strstr(format, "%(deltabase)"))
> > +                       die(_("deltabase is currently not supported with remote-object-info"));
>
> %(rest) too?

ditto

> Maybe such code style will be better?
>
>         if (!gtransport->smart_options) {
>                return -1;
>         }
>         ...
>         return transport_fetch_refs(gtransport, NULL);

That does look better!

> > -                       cmd->fn(opt, p, output, data);
> > +                       if (!strcmp(cmd->name, "remote-object-info")) {
> > +                               char *line = xstrdup_or_null(p);
> > +                               parse_remote_info(opt, line, output, data, cmd, NULL);
>
> memory leak: "line".

ack

> > +test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' '
> > +       (
> > +               cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
> > +
> > +               test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
> > +               remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
> > +               EOF
> > +               test_i18ngrep "deltabase is currently not supported with remote-object-info" err
> > +       )
> > +'
> > +
>
> %(rest) too?

ditto

> > +test_expect_success 'batch-command remote-object-info file://' '
> > +       (
> > +               cd server &&
> > +
> > +               echo "$hello_sha1 $hello_size" >expect &&
> > +               echo "$tree_sha1 $tree_size" >>expect &&
> > +               echo "$commit_sha1 $commit_size" >>expect &&
> > +               echo "$tag_sha1 $tag_size" >>expect &&
> > +               git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
> > +               remote-object-info "file://$(pwd)" $hello_sha1
> > +               remote-object-info "file://$(pwd)" $tree_sha1
> > +               remote-object-info "file://$(pwd)" $commit_sha1
> > +               remote-object-info "file://$(pwd)" $tag_sha1
> > +               EOF
> > +               test_cmp expect actual
> > +       )
>
> Can we support <rev> instead of only <oid> here?

Not at the current moment. The server is unable to handle
anything besides oids.

>
> $ git cat-file --batch-check
> HEAD
> 28583b8d8ca72730d7c9e0ea50861ad431a6dea4 commit 3038
> master
> ab336e8f1c8009c8b1aab8deb592148e69217085 commit 281
> origin/master
> 23b219f8e3f2adfb0441e135f0a880e6124f766c commit 282
> origin/master:git.c
> e5d62fa5a92e95e1ede041ebf913d841744c31f8 blob 28398
>
> So cat-file --batch-check can support it.
>
> $git cat-file --batch-commands
> remote-object-info "file://$(pwd)" master:git.c
>
> I guess it cannot work now, right?

Correct.
diff mbox series

Patch

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 24a811f0ef..0d9e8e6214 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -115,6 +115,10 @@  info <object>::
 	Print object info for object reference `<object>`. This corresponds to the
 	output of `--batch-check`.
 
+remote-object-info <remote> [<object>]::
+	Print object info for object references `<object>` at specified <remote>.
+	This command may only be combined with `--buffer`.
+
 flush::
 	Used with `--buffer` to execute all preceding commands that were issued
 	since the beginning or since the last flush was issued. When `--buffer`
@@ -245,7 +249,8 @@  newline. The available atoms are:
 	The full hex representation of the object name.
 
 `objecttype`::
-	The type of the object (the same as `cat-file -t` reports).
+	The type of the object (the same as `cat-file -t` reports). See
+	`CAVEATS` below.
 
 `objectsize`::
 	The size, in bytes, of the object (the same as `cat-file -s`
@@ -253,13 +258,14 @@  newline. The available atoms are:
 
 `objectsize:disk`::
 	The size, in bytes, that the object takes up on disk. See the
-	note about on-disk sizes in the `CAVEATS` section below.
+	note about on-disk sizes in the `CAVEATS` section below. Not
+	supported by `remote-object-info`.
 
 `deltabase`::
 	If the object is stored as a delta on-disk, this expands to the
 	full hex representation of the delta base object name.
 	Otherwise, expands to the null OID (all zeroes). See `CAVEATS`
-	below.
+	below. Not supported by `remote-object-info`.
 
 `rest`::
 	If this atom is used in the output string, input lines are split
@@ -346,6 +352,10 @@  directory name.
 CAVEATS
 -------
 
+Note that since object type is currently not supported by the
+object-info command request, git fetches the entire object instead to
+get the object info.
+
 Note that the sizes of objects on disk are reported accurately, but care
 should be taken in drawing conclusions about which refs or objects are
 responsible for disk usage. The size of a packed non-delta object may be
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index cfe74c36eb..7baa9bed61 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -16,6 +16,9 @@ 
 #include "packfile.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "alias.h"
+#include "remote.h"
+#include "transport.h"
 
 enum batch_mode {
 	BATCH_MODE_CONTENTS,
@@ -32,11 +35,14 @@  struct batch_options {
 	int unordered;
 	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
 	const char *format;
+	int use_remote_info;
 };
 
 #define DEFAULT_FORMAT "%(objectname) %(objecttype) %(objectsize)"
 
 static const char *force_path;
+static struct object_info *remote_object_info;
+static struct oid_array object_info_oids = OID_ARRAY_INIT;
 
 static int filter_object(const char *path, unsigned mode,
 			 const struct object_id *oid,
@@ -425,48 +431,115 @@  static void batch_one_object(const char *obj_name,
 	int flags = opt->follow_symlinks ? GET_OID_FOLLOW_SYMLINKS : 0;
 	enum get_oid_result result;
 
-	result = get_oid_with_context(the_repository, obj_name,
-				      flags, &data->oid, &ctx);
-	if (result != FOUND) {
-		switch (result) {
-		case MISSING_OBJECT:
-			printf("%s missing\n", obj_name);
-			break;
-		case SHORT_NAME_AMBIGUOUS:
-			printf("%s ambiguous\n", obj_name);
-			break;
-		case DANGLING_SYMLINK:
-			printf("dangling %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
-			break;
-		case SYMLINK_LOOP:
-			printf("loop %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
-			break;
-		case NOT_DIR:
-			printf("notdir %"PRIuMAX"\n%s\n",
-			       (uintmax_t)strlen(obj_name), obj_name);
-			break;
-		default:
-			BUG("unknown get_sha1_with_context result %d\n",
-			       result);
-			break;
+	if (!opt->use_remote_info) {
+		result = get_oid_with_context(the_repository, obj_name,
+						flags, &data->oid, &ctx);
+		if (result != FOUND) {
+			switch (result) {
+			case MISSING_OBJECT:
+				printf("%s missing\n", obj_name);
+				break;
+			case SHORT_NAME_AMBIGUOUS:
+				printf("%s ambiguous\n", obj_name);
+				break;
+			case DANGLING_SYMLINK:
+				printf("dangling %"PRIuMAX"\n%s\n",
+					(uintmax_t)strlen(obj_name), obj_name);
+				break;
+			case SYMLINK_LOOP:
+				printf("loop %"PRIuMAX"\n%s\n",
+					(uintmax_t)strlen(obj_name), obj_name);
+				break;
+			case NOT_DIR:
+				printf("notdir %"PRIuMAX"\n%s\n",
+					(uintmax_t)strlen(obj_name), obj_name);
+				break;
+			default:
+				BUG("unknown get_sha1_with_context result %d\n",
+					result);
+				break;
+			}
+			fflush(stdout);
+			return;
 		}
-		fflush(stdout);
-		return;
-	}
 
-	if (ctx.mode == 0) {
-		printf("symlink %"PRIuMAX"\n%s\n",
-		       (uintmax_t)ctx.symlink_path.len,
-		       ctx.symlink_path.buf);
-		fflush(stdout);
-		return;
+		if (ctx.mode == 0) {
+			printf("symlink %"PRIuMAX"\n%s\n",
+				(uintmax_t)ctx.symlink_path.len,
+				ctx.symlink_path.buf);
+			fflush(stdout);
+			return;
+		}
 	}
 
 	batch_object_write(obj_name, scratch, opt, data, NULL, 0);
 }
 
+static int get_remote_info(struct batch_options *opt, int argc, const char **argv)
+{
+	int retval = 0;
+	size_t i;
+	struct remote *remote = NULL;
+	struct object_id oid;
+	struct string_list object_info_options = STRING_LIST_INIT_NODUP;
+	static struct transport *gtransport;
+	char *format = DEFAULT_FORMAT;
+
+	if (opt->format)
+		format = xstrdup(opt->format);
+
+	remote = remote_get(argv[0]);
+	if (!remote)
+		die(_("must supply valid remote when using --object-info"));
+	oid_array_clear(&object_info_oids);
+	for (i = 1; i < argc; i++) {
+		if (get_oid(argv[i], &oid))
+			die(_("malformed object id '%s'"), argv[i]);
+		oid_array_append(&object_info_oids, &oid);
+	}
+
+	gtransport = transport_get(remote, NULL);
+	if (gtransport->smart_options) {
+		size_t j;
+		int include_size = 0, include_type = 0;
+
+		remote_object_info = xcalloc(object_info_oids.nr, sizeof(struct object_info));
+		gtransport->smart_options->object_info = 1;
+		gtransport->smart_options->object_info_oids = &object_info_oids;
+		/**
+		 * 'size' is the only option currently supported.
+		 * Other options that are passed in the format will default to a
+		 * standard fetch request rather than object-info.
+		 */
+		if (strstr(format, "%(objectsize)")) {
+			string_list_append(&object_info_options, "size");
+			include_size = 1;
+		}
+		if (strstr(format, "%(objecttype)")) {
+			string_list_append(&object_info_options, "type");
+			include_type = 1;
+		}
+		if (strstr(format, "%(objectsize:disk)"))
+			die(_("objectsize:disk is currently not supported with remote-object-info"));
+		if (strstr(format, "%(deltabase)"))
+			die(_("deltabase is currently not supported with remote-object-info"));
+		if (object_info_options.nr > 0) {
+			gtransport->smart_options->object_info_options = &object_info_options;
+			for (j = 0; j < object_info_oids.nr; j++) {
+				if (include_size)
+					remote_object_info[j].sizep = xcalloc(1, sizeof(long));
+				if (include_type)
+					remote_object_info[j].typep = xcalloc(1, sizeof(enum object_type));
+			}
+			gtransport->smart_options->object_info_data = &remote_object_info;
+			retval = transport_fetch_refs(gtransport, NULL);
+		}
+	} else {
+		retval = -1;
+	}
+	return retval;
+}
+
 struct object_cb_data {
 	struct batch_options *opt;
 	struct expand_data *expand;
@@ -538,6 +611,7 @@  typedef void (*parse_cmd_fn_t)(struct batch_options *, const char *,
 struct queued_cmd {
 	parse_cmd_fn_t fn;
 	char *line;
+	const char *name;
 };
 
 static void parse_cmd_contents(struct batch_options *opt,
@@ -565,9 +639,49 @@  static const struct parse_cmd {
 } commands[] = {
 	{ "contents", parse_cmd_contents, 1},
 	{ "info", parse_cmd_info, 1},
+	{ "remote-object-info", parse_cmd_info, 1},
 	{ "flush", NULL, 0},
 };
 
+static void parse_remote_info(struct batch_options *opt,
+			   char *line,
+			   struct strbuf *output,
+			   struct expand_data *data,
+			   const struct parse_cmd *p_cmd,
+			   struct queued_cmd *q_cmd)
+{
+	int count;
+	size_t i;
+	const char **argv;
+
+	count = split_cmdline(line, &argv);
+	if (get_remote_info(opt, count, argv))
+		goto cleanup;
+	opt->use_remote_info = 1;
+	data->skip_object_info = 1;
+	data->mark_query = 0;
+	for (i = 0; i < object_info_oids.nr; i++) {
+		if (remote_object_info[i].sizep)
+			data->size = *remote_object_info[i].sizep;
+		if (remote_object_info[i].typep)
+			data->type = *remote_object_info[i].typep;
+
+		data->oid = object_info_oids.oid[i];
+		if (p_cmd)
+			p_cmd->fn(opt, argv[i+1], output, data);
+		else
+			q_cmd->fn(opt, argv[i+1], output, data);
+	}
+	opt->use_remote_info = 0;
+	data->skip_object_info = 0;
+	data->mark_query = 1;
+
+cleanup:
+	for (i = 0; i < object_info_oids.nr; i++)
+		free_object_info_contents(&remote_object_info[i]);
+	free(remote_object_info);
+}
+
 static void dispatch_calls(struct batch_options *opt,
 		struct strbuf *output,
 		struct expand_data *data,
@@ -579,8 +693,12 @@  static void dispatch_calls(struct batch_options *opt,
 	if (!opt->buffer_output)
 		die(_("flush is only for --buffer mode"));
 
-	for (i = 0; i < nr; i++)
-		cmd[i].fn(opt, cmd[i].line, output, data);
+	for (i = 0; i < nr; i++) {
+		if (!strcmp(cmd[i].name, "remote-object-info"))
+			parse_remote_info(opt, cmd[i].line, output, data, NULL, &cmd[i]);
+		else
+			cmd[i].fn(opt, cmd[i].line, output, data);
+	}
 
 	fflush(stdout);
 }
@@ -640,11 +758,17 @@  static void batch_objects_command(struct batch_options *opt,
 			dispatch_calls(opt, output, data, queued_cmd, nr);
 			free_cmds(queued_cmd, &nr);
 		} else if (!opt->buffer_output) {
-			cmd->fn(opt, p, output, data);
+			if (!strcmp(cmd->name, "remote-object-info")) {
+				char *line = xstrdup_or_null(p);
+				parse_remote_info(opt, line, output, data, cmd, NULL);
+			} else {
+				cmd->fn(opt, p, output, data);
+			}
 		} else {
 			ALLOC_GROW(queued_cmd, nr + 1, alloc);
 			call.fn = cmd->fn;
 			call.line = xstrdup_or_null(p);
+			call.name = cmd->name;
 			queued_cmd[nr++] = call;
 		}
 	}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 1b85207694..69ee3f8330 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -2,6 +2,9 @@ 
 
 test_description='git cat-file'
 
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
 . ./test-lib.sh
 
 test_cmdmode_usage () {
@@ -1087,4 +1090,348 @@  test_expect_success 'batch-command flush without --buffer' '
 	grep "^fatal:.*flush is only for --buffer mode.*" err
 '
 
+# This section tests --batch-command with remote-object-info command
+# If a filter is not set, the filter defaults to "%(objectname) %(objectsize) %(objecttype)"
+# Since "%(objecttype)" is currently not supported by the command request, object-info,
+# the filters are set to "%(objectname) %(objectsize)".
+
+set_transport_variables () {
+    tree_sha1=$(git -C "$1" write-tree)
+	commit_sha1=$(echo_without_newline "$commit_message" | git -C "$1" commit-tree $tree_sha1)
+	tag_sha1=$(echo_without_newline "$tag_content" | git -C "$1" hash-object -t tag --stdin -w)
+	tag_size=$(strlen "$tag_content")
+}
+
+# Test --batch-command remote-object-info with 'git://' transport
+
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	echo_without_newline "$hello_content" > $daemon_parent/hello &&
+	git -C "$daemon_parent" update-index --add hello
+'
+
+set_transport_variables "$daemon_parent"
+
+test_expect_success 'batch-command remote-object-info git://' '
+	(
+		cd "$daemon_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1
+		remote-object-info "$GIT_DAEMON_URL/parent" $tree_sha1
+		remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1
+		remote-object-info "$GIT_DAEMON_URL/parent" $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info git:// multiple sha1 per line' '
+	(
+		cd "$daemon_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+	(
+		cd "$daemon_parent" &&
+
+		echo "$hello_sha1 blob $hello_size" >expect &&
+		echo "$tree_sha1 tree $tree_size" >>expect &&
+		echo "$commit_sha1 commit $commit_size" >>expect &&
+		echo "$tag_sha1 tag $tag_size" >>expect &&
+		git cat-file --batch-command >actual <<-EOF &&
+		remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+		remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command --buffer remote-object-info git://' '
+	(
+		cd "$daemon_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+		remote-object-info "$GIT_DAEMON_URL/parent" $hello_sha1 $tree_sha1
+		remote-object-info "$GIT_DAEMON_URL/parent" $commit_sha1 $tag_sha1
+		flush
+		EOF
+		test_cmp expect actual
+	)
+'
+
+stop_git_daemon
+
+# Test --batch-command remote-object-info with 'http://' transport
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	echo_without_newline "$hello_content" > $HTTPD_DOCUMENT_ROOT_PATH/http_parent/hello &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" update-index --add hello
+'
+set_transport_variables "$HTTPD_DOCUMENT_ROOT_PATH/http_parent"
+
+test_expect_success 'batch-command remote-object-info http://' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+		remote-object-info "$HTTPD_URL/smart/http_parent" $tree_sha1
+		remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1
+		remote-object-info "$HTTPD_URL/smart/http_parent" $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info http:// one line' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command --buffer remote-object-info http://' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+
+		git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+		remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+		flush
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info http:// default filter' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		echo "$hello_sha1 blob $hello_size" >expect &&
+		echo "$tree_sha1 tree $tree_size" >>expect &&
+		echo "$commit_sha1 commit $commit_size" >>expect &&
+		echo "$tag_sha1 tag $tag_size" >>expect &&
+
+		git cat-file --batch-command >actual <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1 $tree_sha1
+		remote-object-info "$HTTPD_URL/smart/http_parent" $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'remote-object-info fails on unspported filter option (objectsize:disk)' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		test_must_fail git cat-file --batch-command="%(objectsize:disk)" 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+		EOF
+		test_i18ngrep "objectsize:disk is currently not supported with remote-object-info" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on unspported filter option (deltabase)' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		test_must_fail git cat-file --batch-command="%(deltabase)" 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+		EOF
+		test_i18ngrep "deltabase is currently not supported with remote-object-info" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		test_must_fail git -c protocol.version=0 cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+		EOF
+		test_i18ngrep "object-info requires protocol v2" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on server with legacy protocol fallback' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+
+		test_must_fail git -c protocol.version=0 cat-file --batch-command 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $hello_sha1
+		EOF
+		test_i18ngrep "object-info requires protocol v2" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on malformed OID' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+		malformed_object_id="this_id_is_not_valid" &&
+
+		test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+		EOF
+		test_i18ngrep "malformed object id '$malformed_object_id'" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on malformed OID fallback' '
+	(
+		cd "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+		malformed_object_id="this_id_is_not_valid" &&
+
+		test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $malformed_object_id
+		EOF
+		test_i18ngrep "malformed object id '$malformed_object_id'" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on missing OID' '
+	git clone "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" missing_oid_repo &&
+	test_commit -C missing_oid_repo message1 c.txt &&
+	(
+		cd missing_oid_repo &&
+		object_id=$(git rev-parse message1:c.txt) &&
+		test_must_fail git cat-file --batch-command="%(objectname) %(objectsize)" 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+		EOF
+		test_i18ngrep "object-info: not our ref $object_id" err
+	)
+'
+
+test_expect_success 'remote-object-info fails on missing OID fallback' '
+	(
+		cd missing_oid_repo &&
+		object_id=$(git rev-parse message1:c.txt) &&
+		test_must_fail git cat-file --batch-command 2>err <<-EOF &&
+		remote-object-info "$HTTPD_URL/smart/http_parent" $object_id
+		EOF
+		test_i18ngrep "fatal: remote error: upload-pack: not our ref $object_id" err
+	)
+'
+
+# Test --batch-command remote-object-info with 'file://' transport
+
+test_expect_success 'create repo to be served by file:// transport' '
+	git init server &&
+	git -C server config protocol.version 2 &&
+	echo_without_newline "$hello_content" > server/hello &&
+	git -C server update-index --add hello
+'
+
+set_transport_variables "server"
+
+test_expect_success 'batch-command remote-object-info file://' '
+	(
+		cd server &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "file://$(pwd)" $hello_sha1
+		remote-object-info "file://$(pwd)" $tree_sha1
+		remote-object-info "file://$(pwd)" $commit_sha1
+		remote-object-info "file://$(pwd)" $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info file:// multiple sha1 per line' '
+	(
+		cd server &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" >actual <<-EOF &&
+		remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1 $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command --buffer remote-object-info file://' '
+	(
+		cd server &&
+
+		echo "$hello_sha1 $hello_size" >expect &&
+		echo "$tree_sha1 $tree_size" >>expect &&
+		echo "$commit_sha1 $commit_size" >>expect &&
+		echo "$tag_sha1 $tag_size" >>expect &&
+		git cat-file --batch-command="%(objectname) %(objectsize)" --buffer >actual <<-EOF &&
+		remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
+		remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
+		flush
+		EOF
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'batch-command remote-object-info file:// default filter' '
+	(
+		cd server &&
+
+		echo "$hello_sha1 blob $hello_size" >expect &&
+		echo "$tree_sha1 tree $tree_size" >>expect &&
+		echo "$commit_sha1 commit $commit_size" >>expect &&
+		echo "$tag_sha1 tag $tag_size" >>expect &&
+		git cat-file --batch-command >actual <<-EOF &&
+		remote-object-info "file://$(pwd)" $hello_sha1 $tree_sha1
+		remote-object-info "file://$(pwd)" $commit_sha1 $tag_sha1
+		EOF
+		test_cmp expect actual
+	)
+'
+
 test_done