[v4,0/8] cat-file: add --batch-command remote-object-info command

Message ID 20220502170904.2770649-1-calvinwan@google.com (mailing list archive)

Calvin Wan May 2, 2022, 5:08 p.m. UTC
Sometimes it is useful to get information about an object without having
to download it completely. The server logic has already been implemented
as “a2ba162cda (object-info: support for retrieving object info,
2021-04-20)”. This patch implements the client option for it.

Add a `remote-object-info` command to `cat-file --batch-command`. This
command allows the client to make an object-info command request to a
server that supports protocol v2. If the server speaks v2 but does not
allow the object-info command request, the entire object is fetched and
the relevant object info is returned.
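
The fallback described above can be sketched roughly as follows. This is
only an illustration of the control flow, not code from the series: the
names `remote_meta`, `meta_fn`, `get_remote_meta`, and the two demo
callbacks are hypothetical stand-ins for the real transport functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the metadata a caller wants back. */
struct remote_meta {
	unsigned long size;
};

/* Hypothetical callback type: fill *out for oid, return true on success. */
typedef bool (*meta_fn)(const char *oid, struct remote_meta *out);

/*
 * Prefer the cheap object-info request. If the server does not
 * support it, or the request fails, fall back to fetching the whole
 * object and inspecting it locally. *fetched_whole_object records
 * which path was taken.
 */
bool get_remote_meta(bool server_has_object_info,
		     meta_fn ask_object_info,
		     meta_fn fetch_and_inspect,
		     const char *oid,
		     struct remote_meta *out,
		     bool *fetched_whole_object)
{
	*fetched_whole_object = false;
	if (server_has_object_info && ask_object_info(oid, out))
		return true;

	*fetched_whole_object = true;
	return fetch_and_inspect(oid, out);
}

/* Demo stand-ins for the two network paths; both report 11 bytes. */
bool demo_object_info(const char *oid, struct remote_meta *out)
{
	(void)oid;
	out->size = 11;
	return true;
}

bool demo_fetch(const char *oid, struct remote_meta *out)
{
	(void)oid;
	out->size = 11;
	return true;
}
```

Either path yields the same metadata; the difference is only in how
much data has to cross the network to obtain it.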

Summary of changes
==================

Patches 1, 2, 3, and 7 are small changes that set up the main
implementation. Patch 4 sets up object-info to be backwards compatible
for a future patch series that adds additional attributes.  Patch 5 adds
internal transport functions to send and receive object-info command
request packets. Patch 6 adds the fallback if object-info is not
supported or fails.  Patch 8 adds the cat-file implementation.
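
To make the "object-info command request packets" from Patch 5 more
concrete, here is a minimal sketch of what such a request looks like on
the wire, following the grammar in the protocol v2 documentation
(pkt-lines carrying `command=object-info`, a delim-pkt, the `size`
attribute, one `oid` line per object, and a flush-pkt). The helper names
`pkt_line`, `pkt_raw`, and `build_object_info_request` are hypothetical
and are not the functions added by the series; capability lines such as
`agent=` are left out for brevity.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line to buf at offset off: four hex digits giving
 * the total length (header included), then the payload and an LF.
 * Returns the new offset, or 0 if the line does not fit.
 */
size_t pkt_line(char *buf, size_t cap, size_t off, const char *payload)
{
	size_t len = strlen(payload) + 1 + 4;

	if (len > 65520 || off + len + 1 > cap)
		return 0;
	snprintf(buf + off, cap - off, "%04zx%s\n", len, payload);
	return off + len;
}

/* Append a special packet such as "0000" (flush) or "0001" (delim). */
size_t pkt_raw(char *buf, size_t cap, size_t off, const char *marker)
{
	size_t len = strlen(marker);

	if (off + len + 1 > cap)
		return 0;
	memcpy(buf + off, marker, len + 1);
	return off + len;
}

/* Build a complete request for nr object ids; returns bytes written or 0. */
size_t build_object_info_request(char *buf, size_t cap,
				 const char *const *oids, size_t nr)
{
	char line[128];
	size_t off, i;

	off = pkt_line(buf, cap, 0, "command=object-info");
	if (off)
		off = pkt_raw(buf, cap, off, "0001");	/* delim-pkt */
	if (off)
		off = pkt_line(buf, cap, off, "size");
	for (i = 0; off && i < nr; i++) {
		int n = snprintf(line, sizeof(line), "oid %s", oids[i]);

		if (n < 0 || (size_t)n >= sizeof(line))
			return 0;
		off = pkt_line(buf, cap, off, line);
	}
	if (off)
		off = pkt_raw(buf, cap, off, "0000");	/* flush-pkt */
	return off;
}
```

For a single 40-character object id, the resulting buffer starts with
`0018command=object-info`, followed by `0001`, `0009size`, a `0031oid ...`
line, and the closing `0000`.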

Changes since V3
================

 * Object-info is now implemented in cat-file --batch-command rather
   than fetch (new base commit)
 * Removed config option to advertise object-info
 * Added forwards and backwards compatibility for object-info
 * Split up some patches to better describe and visualize changes

Signed-off-by: Calvin Wan <calvinwan@google.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>

Calvin Wan (8):
  fetch-pack: refactor packet writing
  fetch-pack: move fetch default settings
  object-store: add function to free object_info contents
  object-info: send attribute packet regardless of object ids
  transport: add client side capability to request object-info
  transport: add object-info fallback to fetch
  cat-file: move parse_cmd and DEFAULT_FORMAT up
  cat-file: add --batch-command remote-object-info command

 Documentation/git-cat-file.txt |  16 +-
 builtin/cat-file.c             | 225 ++++++++++++++++-----
 fetch-pack.c                   |  61 ++++--
 fetch-pack.h                   |  10 +
 object-file.c                  |  16 ++
 object-store.h                 |   3 +
 protocol-caps.c                |  14 +-
 t/t1006-cat-file.sh            | 347 +++++++++++++++++++++++++++++++++
 transport-helper.c             |   7 +-
 transport.c                    |  97 ++++++++-
 transport.h                    |  11 ++
 11 files changed, 728 insertions(+), 79 deletions(-)

Comments

ZheNing Hu July 31, 2022, 3:02 p.m. UTC | #1
Hi, Calvin Wan,

On Tue, May 3, 2022 at 08:14, Calvin Wan <calvinwan@google.com> wrote:
>
> Sometimes it is useful to get information about an object without having
> to download it completely. The server logic has already been implemented
> as “a2ba162cda (object-info: support for retrieving object info,
> 2021-04-20)”. This patch implements the client option for it.
>
> Add a `remote-object-info` command to `cat-file --batch-command`. This
> command allows the client to make an object-info command request to a
> server that supports protocol v2. If the server speaks v2 but does not
> allow the object-info command request, the entire object is fetched and
> the relevant object info is returned.
>
> Summary of changes
> ==================
>
> Patches 1, 2, 3, and 7 are small changes that set up the main
> implementation. Patch 4 sets up object-info to be backwards compatible
> for a future patch series that adds additional attributes.  Patch 5 adds
> internal transport functions to send and receive object-info command
> request packets. Patch 6 adds the fallback if object-info is not
> supported or fails.  Patch 8 adds the cat-file implementation.
>

I have to say I am very curious about this feature, since the current
git partial-clone interface supports only a few filters:

blob:limit
blob:none
sparse:oid
tree:0

These filters reduce the number of objects downloaded each time, but
sometimes I need *only* one blob object, and git partial-clone will still
download some additional commit and tree objects from the remote.

This patch can get the remote-object-info through git cat-file, so could
we go further and get the remote-object-content of the object as well?

Something like:

$ git cat-file --batch-command
remote-object-content origin <oid>

> Changes since V3
> ================
>
>  * Object-info is now implemented in cat-file --batch-command rather
>    than fetch (new base commit)
>  * Removed config option to advertise object-info
>  * Added forwards and backwards compatibility for object-info
>  * Split up some patches to better describe and visualize changes
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> Helped-by: Jonathan Tan <jonathantanmy@google.com>
>
> Calvin Wan (8):
>   fetch-pack: refactor packet writing
>   fetch-pack: move fetch default settings
>   object-store: add function to free object_info contents
>   object-info: send attribute packet regardless of object ids
>   transport: add client side capability to request object-info
>   transport: add object-info fallback to fetch
>   cat-file: move parse_cmd and DEFAULT_FORMAT up
>   cat-file: add --batch-command remote-object-info command
>
>  Documentation/git-cat-file.txt |  16 +-
>  builtin/cat-file.c             | 225 ++++++++++++++++-----
>  fetch-pack.c                   |  61 ++++--
>  fetch-pack.h                   |  10 +
>  object-file.c                  |  16 ++
>  object-store.h                 |   3 +
>  protocol-caps.c                |  14 +-
>  t/t1006-cat-file.sh            | 347 +++++++++++++++++++++++++++++++++
>  transport-helper.c             |   7 +-
>  transport.c                    |  97 ++++++++-
>  transport.h                    |  11 ++
>  11 files changed, 728 insertions(+), 79 deletions(-)
>
> --
> 2.36.0.rc2.10170.gb555eefa6f
>

Thanks!

ZheNing Hu
Calvin Wan Aug. 8, 2022, 5:32 p.m. UTC | #2
> I have to say I am very curious about this feature. Since the current
> git partial-clone interface supports only a few filters:
>
> blob:limit
> blob:none
> sparse:oid
> tree:0
>
> Though these filters reduce the number of objects downloaded each time,
> sometimes I just need *only* one blob object, git partial-clone will still
> download some additional commits and tree objects from the remote.
>
> This patch can get the remote-object-info by git cat-file, so can we go
> further to get the remote-object-content of the object?
>
> Something like:
>
> $ git cat-file --batch-command
> remote-object-content origin <oid>

I think this could potentially be future work. Part of my hesitation stems
from the fact that there is no need to print out any object info when
downloading an object. Therefore, it seems better suited to be
implemented first in git fetch (or added as a filter to partial-clone).
remote-object-content could then be a combination of fetch +
remote-object-info.
Junio C Hamano Aug. 13, 2022, 10:17 p.m. UTC | #3
Style and coccinelle fixes; please squash in when you reroll.

 * var = xcalloc(count, sizeof(*var)) --> CALLOC_ARRAY(var, count)
   for count != 1

 * sizeof (type) --> sizeof(type)

Thanks.
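
For readers unfamiliar with the first rule, here is a minimal
self-contained sketch of why CALLOC_ARRAY is preferred: the element size
is taken from the pointer itself, so it cannot drift out of sync with
the declared type. The macro has the same shape as the one in
git-compat-util.h; the xcalloc() below is a simplified stand-in for
git's, and `object_info_demo` and `alloc_infos` are hypothetical names
used only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for git's xcalloc(): never returns NULL. */
static void *xcalloc(size_t nmemb, size_t size)
{
	void *p = calloc(nmemb, size);

	/* A zero-sized request may legally yield NULL; retry with one byte. */
	if (!p && (!nmemb || !size))
		p = calloc(1, 1);
	if (!p) {
		fputs("fatal: out of memory\n", stderr);
		exit(128);
	}
	return p;
}

/* Element size is derived from the pointer, not spelled out by hand. */
#define CALLOC_ARRAY(x, alloc) (x) = xcalloc((alloc), sizeof(*(x)))

/* Hypothetical element type, for illustration only. */
struct object_info_demo {
	unsigned long size;
	int type;
};

/* Allocate nr zero-initialized entries; the caller frees the result. */
struct object_info_demo *alloc_infos(size_t nr)
{
	struct object_info_demo *infos;

	CALLOC_ARRAY(infos, nr);
	return infos;
}
```

If the type of `infos` later changes, the allocation stays correct
without touching the CALLOC_ARRAY line, which is not true of a
hand-written `sizeof(struct ...)`.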

 builtin/cat-file.c | 2 +-
 transport.c        | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 57c090f249..4afe82322f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -515,7 +515,7 @@ static int get_remote_info(struct batch_options *opt, int argc, const char **arg
 		size_t j;
 		int include_size = 0, include_type = 0;
 
-		remote_object_info = xcalloc(object_info_oids.nr, sizeof(struct object_info));
+		CALLOC_ARRAY(remote_object_info, object_info_oids.nr);
 		gtransport->smart_options->object_info = 1;
 		gtransport->smart_options->object_info_oids = &object_info_oids;
 		/**
diff --git a/transport.c b/transport.c
index 64bcc311ff..87197f0ec7 100644
--- a/transport.c
+++ b/transport.c
@@ -442,7 +442,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	struct ref *refs = NULL;
 	struct fetch_pack_args args;
 	struct ref *refs_tmp = NULL;
-	struct ref *object_info_refs = xcalloc(1, sizeof (struct ref));
+	struct ref *object_info_refs = xcalloc(1, sizeof(struct ref));
 
 	memset(&args, 0, sizeof(args));
 	args.uploadpack = data->options.uploadpack;
@@ -479,7 +479,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		args.quiet = 1;
 		args.no_progress = 1;
 		for (i = 0; i < transport->smart_options->object_info_oids->nr; i++) {
-			struct ref *temp_ref = xcalloc(1, sizeof (struct ref));
+			struct ref *temp_ref = xcalloc(1, sizeof(struct ref));
 			temp_ref->old_oid = *(transport->smart_options->object_info_oids->oid + i);
 			temp_ref->exact_oid = 1;
 			ref->next = temp_ref;