diff mbox series

fetch: delay fetch_if_missing=0 until after config

Message ID 20191007181825.13463-1-jonathantanmy@google.com (mailing list archive)
State New, archived
Headers show
Series fetch: delay fetch_if_missing=0 until after config | expand

Commit Message

Jonathan Tan Oct. 7, 2019, 6:18 p.m. UTC
When running "git fetch" in a partial clone with no blobs, for example,
by:

  git clone --filter=blob:none --no-checkout \
    https://kernel.googlesource.com/pub/scm/git/git
  git -C git fetch

"git fetch" will fail to load the config blob object, printing "unable
to load config blob object".

This is because fetch_if_missing is set to 0 before the config is
processed. Git must set fetch_if_missing to 0 before the fetch because
as part of the fetch, packfile negotiation happens (and we do not want
to fetch any missing objects when checking existence of objects), but we
do not need to set it so early. Move the setting of fetch_if_missing to
the earliest possible point in cmd_fetch(), right before any fetching
happens.
---
This is not a full solution, but this helps in the use case described in
the commit message. The full solution probably will involve teaching the
fetch mechanism to support arbitrary struct repository objects, and by
moving fetch_if_missing into the repository object. (Alternatively, we
could add the equivalent of OBJECT_INFO_SKIP_FETCH_OBJECT to functions
like parse_commit() that are used by files like negotiator/default.c, or
split up commit parsing into object reading - which already has that
flag - and commit parsing.)
---
 builtin/fetch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Emily Shaffer Oct. 23, 2019, 9:03 p.m. UTC | #1
On Mon, Oct 07, 2019 at 11:18:25AM -0700, Jonathan Tan wrote:
> When running "git fetch" in a partial clone with no blobs, for example,
> by:
> 
>   git clone --filter=blob:none --no-checkout \
>     https://kernel.googlesource.com/pub/scm/git/git
>   git -C git fetch
> 
> "git fetch" will fail to load the config blob object, printing "unable
> to load config blob object".

I'm having some trouble figuring out which object is actually missing.
Is this the .git/config object? (That doesn't make much sense to me...)
Is it .gitmodules?

> 
> This is because fetch_if_missing is set to 0 before the config is
> processed. Git must set fetch_if_missing to 0 before the fetch because
> as part of the fetch, packfile negotiation happens (and we do not want
> to fetch any missing objects when checking existence of objects), but we
> do not need to set it so early. Move the setting of fetch_if_missing to
> the earliest possible point in cmd_fetch(), right before any fetching
> happens.

Doubts aside about what's actually failing, I definitely agree with the
premise of not setting this until the last moment we need it. Plus, I
may be alone here, but it'd make it easier for me to understand the code
if I saw a note explaining *why* we don't want to fetch_if_missing in
this case.

By the way, I think I understand that this is OK to go in
unconditionally because:
 - In the full clone case, it's a no-op; we haven't got anything that's
   missing, so who cares.
 - In the filter case, it's as you said - we don't want to
   fetch_if_missing because that will turn someone's partial clone into
   a a full clone.
   - This probably applies to bare checkout, too.

Of course if I'm wrong I'd like to know, but that's how I understand it
at the moment.

> ---
> This is not a full solution, but this helps in the use case described in
> the commit message. The full solution probably will involve teaching the
> fetch mechanism to support arbitrary struct repository objects, and by
> moving fetch_if_missing into the repository object. (Alternatively, we
> could add the equivalent of OBJECT_INFO_SKIP_FETCH_OBJECT to functions
> like parse_commit() that are used by files like negotiator/default.c, or
> split up commit parsing into object reading - which already has that
> flag - and commit parsing.)

Ah, I remember this was listed as one of the potential intern projects -
I think we dismissed it as being too tech-debt-y for an intern to feel
good about. :(

> ---
>  builtin/fetch.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 24d382b2fb..865ae6677d 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1666,8 +1666,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  
>  	packet_trace_identity("fetch");
>  
> -	fetch_if_missing = 0;
> -
>  	/* Record the command line for the reflog */
>  	strbuf_addstr(&default_rla, "fetch");
>  	for (i = 1; i < argc; i++)
> @@ -1734,6 +1732,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  		}
>  	}
>  
> +	fetch_if_missing = 0;
> +
>  	if (remote) {
>  		if (filter_options.choice || has_promisor_remote())
>  			fetch_one_setup_partial(remote);
> -- 
> 2.23.0.581.g78d2f28ef7-goog
> 

The diff, though, looks fine for me.

 - Emily
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 24d382b2fb..865ae6677d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1666,8 +1666,6 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	packet_trace_identity("fetch");
 
-	fetch_if_missing = 0;
-
 	/* Record the command line for the reflog */
 	strbuf_addstr(&default_rla, "fetch");
 	for (i = 1; i < argc; i++)
@@ -1734,6 +1732,8 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	fetch_if_missing = 0;
+
 	if (remote) {
 		if (filter_options.choice || has_promisor_remote())
 			fetch_one_setup_partial(remote);