[v11,0/8] cat-file: add remote-object-info to batch-command

Message ID 20250221190451.12536-1-eric.peijian@gmail.com (mailing list archive)

Eric Ju Feb. 21, 2025, 7:04 p.m. UTC
This patch series is a continuation of Calvin Wan’s (calvinwan@google.com)
patch series [PATCH v5 0/6] cat-file: add --batch-command remote-object-info
command at [1].

Sometimes it is beneficial to retrieve information about an object without
having to download it completely. The server logic for retrieving size has
already been implemented and merged in "a2ba162cda (object-info: support for
retrieving object info, 2021-04-20)"[2]. This patch series implements the client
option for it.

This patch series adds the `remote-object-info` command to
`cat-file --batch-command`. This command allows the client to make an
object-info command request to a server that supports protocol v2.

If the server uses protocol v2 but does not support the object-info capability,
`cat-file --batch-command` will die.

If a user attempts to use `remote-object-info` with protocol v1,
`cat-file --batch-command` will die.

Currently, only the size (%(objectsize)) is supported in this implementation.
The type (%(objecttype)) is not included in this patch series, as it is not yet
supported on the server side either. The plan is to implement the necessary
logic for both the server and client in a subsequent series.

The default format for remote-object-info is set to %(objectname) %(objectsize).
Once %(objecttype) is supported, the default format will be unified accordingly.

If the batch command format includes unsupported fields such as %(objecttype),
%(objectsize:disk), or %(deltabase), the command will terminate with an error.
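
To make the format rule above concrete, here is a minimal sketch of such a
check. It is not the code from this series: the helper name
remote_format_is_supported() and the supported_atoms table are hypothetical,
and the sketch only assumes that atoms are written as %(name), with
%(objectname) and %(objectsize) being the two that are accepted today.

```c
#include <stddef.h>
#include <string.h>

/* Atoms the cover letter says remote-object-info can serve today. */
static const char *const supported_atoms[] = { "objectname", "objectsize" };

/*
 * Hypothetical helper (not taken from the patches): return 1 if every
 * %(atom) in `format` is in supported_atoms, 0 otherwise.
 */
static int remote_format_is_supported(const char *format)
{
	const char *p = format;

	while ((p = strstr(p, "%(")) != NULL) {
		const char *start = p + 2;
		const char *end = strchr(start, ')');
		size_t len, i;
		int found = 0;

		if (!end)
			return 0; /* unterminated atom */
		len = (size_t)(end - start);
		for (i = 0; i < sizeof(supported_atoms) / sizeof(supported_atoms[0]); i++) {
			if (strlen(supported_atoms[i]) == len &&
			    !strncmp(supported_atoms[i], start, len)) {
				found = 1;
				break;
			}
		}
		if (!found)
			return 0; /* e.g. objecttype, objectsize:disk, deltabase */
		p = end + 1;
	}
	return 1;
}
```

A caller would report an error and stop when this returns 0, which matches the
behavior described above for %(objecttype), %(objectsize:disk), and
%(deltabase).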

Changes since V10
================
- Add a check on command input to prevent overflow.
- Add other checks to prevent potential abuse.
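
For reference, the input bound behind the first item works out as in the
sketch below. The three MAX_* macros mirror the ones shown in the range-diff
for patch 8; the value 64 for GIT_MAX_HEXSZ is an assumption made here (the
hex length of a SHA-256 object ID) so that the example is self-contained, and
remote_obj_info_line_ok() is a hypothetical helper, not a function from the
series.

```c
#include <stddef.h>
#include <string.h>

/* Assumed for this sketch: hex length of a SHA-256 object ID. */
#define GIT_MAX_HEXSZ 64

/* Limits as they appear in the range-diff for builtin/cat-file.c. */
#define MAX_REMOTE_URL_LEN (8 * 1024)
#define MAX_ALLOWED_OBJ_LIMIT 10000
#define MAX_REMOTE_OBJ_INFO_LINE \
	(MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))

/*
 * Hypothetical helper: return 1 if `line` is short enough to be split
 * into a remote plus object IDs, 0 if it should be rejected.
 */
static int remote_obj_info_line_ok(const char *line)
{
	return strlen(line) < (size_t)MAX_REMOTE_OBJ_INFO_LINE;
}
```

Under these assumptions the bound is 8192 + 10000 * 65 = 658192 bytes: one
remote URL of up to 8K, plus up to 10000 object IDs at 64 hex digits and one
separator each.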

Calvin Wan (4):
  fetch-pack: refactor packet writing
  fetch-pack: move fetch initialization
  serve: advertise object-info feature
  transport: add client support for object-info

Eric Ju (4):
  git-compat-util: add strtoul_ul() with error handling
  cat-file: add declaration of variable i inside its for loop
  t1006: split test utility functions into new "lib-cat-file.sh"
  cat-file: add remote-object-info to batch-command

 Documentation/git-cat-file.adoc        |  24 +-
 Makefile                               |   1 +
 builtin/cat-file.c                     | 125 ++++-
 connect.c                              |  34 ++
 connect.h                              |   8 +
 fetch-object-info.c                    |  85 ++++
 fetch-object-info.h                    |  22 +
 fetch-pack.c                           |  51 +-
 fetch-pack.h                           |   2 +
 git-compat-util.h                      |  20 +
 object-file.c                          |  11 +
 object-store-ll.h                      |   3 +
 serve.c                                |   4 +-
 t/lib-cat-file.sh                      |  16 +
 t/t1006-cat-file.sh                    |  13 +-
 t/t1017-cat-file-remote-object-info.sh | 664 +++++++++++++++++++++++++
 transport-helper.c                     |  11 +-
 transport.c                            |  28 +-
 transport.h                            |  11 +
 19 files changed, 1065 insertions(+), 68 deletions(-)
 create mode 100644 fetch-object-info.c
 create mode 100644 fetch-object-info.h
 create mode 100644 t/lib-cat-file.sh
 create mode 100755 t/t1017-cat-file-remote-object-info.sh

Range-diff against v10:
1:  a4a5aefa3e = 1:  814c53b402 git-compat-util: add strtoul_ul() with error handling
2:  c67e79804e = 2:  04f41100c4 cat-file: add declaration of variable i inside its for loop
3:  7f0b824714 = 3:  3af67e6648 t1006: split test utility functions into new "lib-cat-file.sh"
4:  0d22d6af6e = 4:  cb1088e436 fetch-pack: refactor packet writing
5:  34c34c7464 = 5:  614daac4bb fetch-pack: move fetch initialization
6:  54dd237c45 = 6:  4bc403fa2c serve: advertise object-info feature
7:  90a3d987d5 ! 7:  adae08d5a8 transport: add client support for object-info
    @@ transport.c: static int fetch_refs_via_pack(struct transport *transport,
      	args.reject_shallow_remote = transport->smart_options->reject_shallow;
     +	args.object_info = transport->smart_options->object_info;
     +
    -+	if (transport->smart_options && transport->smart_options->object_info
    ++	if (transport->smart_options->object_info
     +	    && transport->smart_options->object_info_oids->nr > 0) {
     +		struct packet_reader reader;
     +		struct object_info_args obj_info_args = { 0 };
8:  9d932c2cb2 ! 8:  975d39cb6a cat-file: add remote-object-info to batch-command
    @@ builtin/cat-file.c
     +#include "alias.h"
     +#include "remote.h"
     +#include "transport.h"
    ++
    ++/* Maximum length for a remote URL. While no universal standard exists,
    ++ * 8K is assumed to be a reasonable limit.
    ++ */
    ++#define MAX_REMOTE_URL_LEN (8*1024)
    ++/* Maximum number of objects allowed in a single remote-object-info request. */
    ++#define MAX_ALLOWED_OBJ_LIMIT 10000
    ++/* Maximum input size permitted for the remote-object-info command. */
    ++#define MAX_REMOTE_OBJ_INFO_LINE (MAX_REMOTE_URL_LEN + MAX_ALLOWED_OBJ_LIMIT * (GIT_MAX_HEXSZ + 1))
      
      enum batch_mode {
      	BATCH_MODE_CONTENTS,
    @@ builtin/cat-file.c: static void parse_cmd_info(struct batch_options *opt,
     +{
     +	int count;
     +	const char **argv;
    ++	char *line_to_split;
    ++
    ++	if (strlen(line) >= MAX_REMOTE_OBJ_INFO_LINE)
    ++		die(_("remote-object-info command input overflow "
    ++			"(no more than %d objects are allowed)"),
    ++			MAX_ALLOWED_OBJ_LIMIT);
     +
    -+	char *line_to_split = xstrdup_or_null(line);
    ++	line_to_split = xstrdup(line);
     +	count = split_cmdline(line_to_split, &argv);
    ++	if (count < 0)
    ++		die(_("split remote-object-info command"));
    ++
     +	if (get_remote_info(opt, count, argv))
     +		goto cleanup;
     +