diff mbox series

[RFC] transport: list refs before fetch if necessary

Message ID 20180925225355.74237-1-jonathantanmy@google.com (mailing list archive)
State New, archived
Headers show
Series [RFC] transport: list refs before fetch if necessary | expand

Commit Message

Jonathan Tan Sept. 25, 2018, 10:53 p.m. UTC
The built-in bundle transport and the transport helper interface do not
work when transport_fetch_refs() is called immediately after transport
creation.

Evidence: fetch_refs_from_bundle() relies on data->header being
initialized in get_refs_from_bundle(), and fetch() in transport-helper.c
relies on either data->fetch or data->import being set by get_helper(),
but neither transport_helper_init() nor fetch() calls get_helper().

Up until the introduction of the partial clone feature, this has not
been a problem, because transport_fetch_refs() is always called after
transport_get_remote_refs(). With the introduction of the partial clone
feature, which involves calling transport_fetch_refs() (to fetch objects
by their OIDs) without transport_get_remote_refs(), this is still not a
problem, but only coincidentally - we do not support partially cloning a
bundle, and as for cloning using a transport-helper-using protocol, it
so happens that before transport_fetch_refs() is called, fetch_refs() in
fetch-object.c calls transport_set_option(), which means that the
aforementioned get_helper() is invoked through set_helper_option() in
transport-helper.c. In the future, though, there may be other use cases
in which we want to fetch without requiring listing of remote refs, so
this is still worth fixing.

This could be fixed by fixing the transports themselves, but it doesn't
seem like a good idea to me to open up previously untested code paths;
also, there may be transport helpers in the wild that assume that "list"
is always called before "fetch". Instead, fix this by having
transport_fetch_refs() call transport_get_remote_refs() to ensure that
the latter is always called at least once, unless the transport
explicitly states that it supports fetching without listing refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
I discovered this while investigating the possibility of taking
advantage of the fact that protocol v2 allows us to fetch without first
invoking ls-refs. This is useful both when lazily fetching to a partial
clone, and when invoking "git fetch --no-tags <remote> <sha-1>" (note
that tag following must be disabled).

Any comments on this (for or against) is appreciated, and suggestions of
better approaches are appreciated too.
---
 transport-helper.c   |  1 +
 transport-internal.h |  6 ++++++
 transport.c          | 12 ++++++++++++
 3 files changed, 19 insertions(+)

Comments

Junio C Hamano Sept. 25, 2018, 11:09 p.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> diff --git a/transport-helper.c b/transport-helper.c
> index 143ca008c8..7213fa0d32 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -1105,6 +1105,7 @@ static struct ref *get_refs_list(struct transport *transport, int for_push,
>  }
>  
>  static struct transport_vtable vtable = {
> +	0,
>  	set_helper_option,
>  	get_refs_list,
>  	fetch,
> diff --git a/transport-internal.h b/transport-internal.h
> index 1cde6258a7..004bee5e36 100644
> --- a/transport-internal.h
> +++ b/transport-internal.h
> @@ -6,6 +6,12 @@ struct transport;
>  struct argv_array;
>  
>  struct transport_vtable {
> +	/**
> +	 * This transport supports the fetch() function being called
> +	 * without get_refs_list() first being called.
> +	 */
> +	unsigned fetch_without_list : 1;
> +
>  	/**
>  	 * Returns 0 if successful, positive if the option is not
>  	 * recognized or is inapplicable, and negative if the option
> diff --git a/transport.c b/transport.c
> index 1c76d64aba..ee8a78ff37 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -703,6 +703,7 @@ static int disconnect_git(struct transport *transport)
>  }
>  
>  static struct transport_vtable taken_over_vtable = {
> +	1,
>  	NULL,
>  	get_refs_via_connect,
>  	fetch_refs_via_pack,
> @@ -852,6 +853,7 @@ void transport_check_allowed(const char *type)
>  }
>  
>  static struct transport_vtable bundle_vtable = {
> +	0,
>  	NULL,
>  	get_refs_from_bundle,
>  	fetch_refs_from_bundle,
> @@ -861,6 +863,7 @@ static struct transport_vtable bundle_vtable = {
>  };
>  
>  static struct transport_vtable builtin_smart_vtable = {
> +	1,
>  	NULL,
>  	get_refs_via_connect,
>  	fetch_refs_via_pack,

Up to this point I think I understand the change.  We gain one new
trait for each transport, many of the transport cannot run fetch
without first seeing the advertisement, some are OK, so we have 0 or
1 in these vtables as appropriately.

> @@ -1224,6 +1227,15 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
>  	struct ref **heads = NULL;
>  	struct ref *rm;
>  
> +	if (!transport->vtable->fetch_without_list)
> +		/*
> +		 * Some transports (e.g. the built-in bundle transport and the
> +		 * transport helper interface) do not work when fetching is
> +		 * done immediately after transport creation. List the remote
> +		 * refs anyway (if not already listed) as a workaround.
> +		 */
> +		transport_get_remote_refs(transport, NULL);
> +

But this I do not quite understand.  It looks saying "when asked to
fetch, if the transport does not allow us to do so without first
getting the advertisement, lazily do that", and that may be a good
thing to do, but then aren't the current set of callers already
calling transport-get-remote-refs elsewhere before they call
transport-fetch-refs?  IOW, I would have expected to see a matching
removal, or at least a code that turns an unconditional call to
get-remote-refs to a conditional one that is done only for the
transport that lacks the capability, or something along that line.

... ah, do you mean that this is not a new feature, but is a bugfix
for some callers that are not calling get-remote-refs before calling
fetch-refs, and the bit is to work around the fact that some
transport not just can function without get-remote-refs first but do
not want to call it?

IOW, I am a bit confused by this comment (copied from an earlier part)

> +	/**
> +	 * This transport supports the fetch() function being called
> +	 * without get_refs_list() first being called.
> +	 */

Shouldn't it read more like "this transport does not want its
get-refs-list called when fetch-refs is done"?

I dunno.


>  	for (rm = refs; rm; rm = rm->next) {
>  		nr_refs++;
>  		if (rm->peer_ref &&
diff mbox series

Patch

diff --git a/transport-helper.c b/transport-helper.c
index 143ca008c8..7213fa0d32 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1105,6 +1105,7 @@  static struct ref *get_refs_list(struct transport *transport, int for_push,
 }
 
 static struct transport_vtable vtable = {
+	0,
 	set_helper_option,
 	get_refs_list,
 	fetch,
diff --git a/transport-internal.h b/transport-internal.h
index 1cde6258a7..004bee5e36 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -6,6 +6,12 @@  struct transport;
 struct argv_array;
 
 struct transport_vtable {
+	/**
+	 * This transport supports the fetch() function being called
+	 * without get_refs_list() first being called.
+	 */
+	unsigned fetch_without_list : 1;
+
 	/**
 	 * Returns 0 if successful, positive if the option is not
 	 * recognized or is inapplicable, and negative if the option
diff --git a/transport.c b/transport.c
index 1c76d64aba..ee8a78ff37 100644
--- a/transport.c
+++ b/transport.c
@@ -703,6 +703,7 @@  static int disconnect_git(struct transport *transport)
 }
 
 static struct transport_vtable taken_over_vtable = {
+	1,
 	NULL,
 	get_refs_via_connect,
 	fetch_refs_via_pack,
@@ -852,6 +853,7 @@  void transport_check_allowed(const char *type)
 }
 
 static struct transport_vtable bundle_vtable = {
+	0,
 	NULL,
 	get_refs_from_bundle,
 	fetch_refs_from_bundle,
@@ -861,6 +863,7 @@  static struct transport_vtable bundle_vtable = {
 };
 
 static struct transport_vtable builtin_smart_vtable = {
+	1,
 	NULL,
 	get_refs_via_connect,
 	fetch_refs_via_pack,
@@ -1224,6 +1227,15 @@  int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	struct ref **heads = NULL;
 	struct ref *rm;
 
+	if (!transport->vtable->fetch_without_list)
+		/*
+		 * Some transports (e.g. the built-in bundle transport and the
+		 * transport helper interface) do not work when fetching is
+		 * done immediately after transport creation. List the remote
+		 * refs anyway (if not already listed) as a workaround.
+		 */
+		transport_get_remote_refs(transport, NULL);
+
 	for (rm = refs; rm; rm = rm->next) {
 		nr_refs++;
 		if (rm->peer_ref &&