diff mbox series

[3/7] remote-curl: rediscover repository when fetching refs

Message ID a1b86a0cbbedcc6610b2c563e9e38d439338869d.1701863960.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series clone: fix init of refdb with wrong object format | expand

Commit Message

Patrick Steinhardt Dec. 6, 2023, 12:39 p.m. UTC
We're about to change git-clone(1) so that we set up the reference
database at a later point. This change will cause git-remote-curl(1) to
not detect the repository anymore due to "HEAD" not having been created
yet at the time it spawns, and thus causes it to error out once it is
asked to fetch the references.

We can address this issue by trying to re-discover the Git repository in
case none was detected at startup time. With this change, the clone will
look as following:

  1. git-clone(1) sets up the initial repository, excluding the
     reference database.

  2. git-clone(1) spawns git-remote-curl(1), which will be unable to
     detect the repository due to a missing "HEAD".

  3. git-clone(1) asks git-remote-curl(1) to list remote references.
     This works just fine as this step does not require a local
     repository

  4. git-clone(1) creates the reference database as it has now learned
     about the object format.

  5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
     The latter notices that it doesn't have a repository available, but
     it now knows to try and re-discover it.

If the re-discovery succeeds in the last step we can continue with the
clone.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 remote-curl.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Dec. 8, 2023, 11:09 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> We're about to change git-clone(1) so that we set up the reference
> database at a later point. This change will cause git-remote-curl(1) to
> not detect the repository anymore due to "HEAD" not having been created
> yet at the time it spawns, and thus causes it to error out once it is
> asked to fetch the references.
>
> We can address this issue by trying to re-discover the Git repository in
> case none was detected at startup time. With this change, the clone will
> look as following:
>
>   1. git-clone(1) sets up the initial repository, excluding the
>      reference database.
>
>   2. git-clone(1) spawns git-remote-curl(1), which will be unable to
>      detect the repository due to a missing "HEAD".
>
>   3. git-clone(1) asks git-remote-curl(1) to list remote references.
>      This works just fine as this step does not require a local
>      repository
>
>   4. git-clone(1) creates the reference database as it has now learned
>      about the object format.

Sorry, but I am not sure I understand this step.  I assume you mean
by "the object format" what hash function is used to index the
objects (which can be learned from the remote "origin" in step 2 and
we can choose to use the one they use), not what ref backend is used
(which is purely a local matter and we do not need to know what is
used at the "origin").  Why do we need to wait initializing ref
backend until we learn what hash is being in use?

>   5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
>      The latter notices that it doesn't have a repository available, but
>      it now knows to try and re-discover it.
>
> If the re-discovery succeeds in the last step we can continue with the
> clone.

OK.
Patrick Steinhardt Dec. 11, 2023, 11:35 a.m. UTC | #2
On Sat, Dec 09, 2023 at 08:09:32AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We're about to change git-clone(1) so that we set up the reference
> > database at a later point. This change will cause git-remote-curl(1) to
> > not detect the repository anymore due to "HEAD" not having been created
> > yet at the time it spawns, and thus causes it to error out once it is
> > asked to fetch the references.
> >
> > We can address this issue by trying to re-discover the Git repository in
> > case none was detected at startup time. With this change, the clone will
> > look as following:
> >
> >   1. git-clone(1) sets up the initial repository, excluding the
> >      reference database.
> >
> >   2. git-clone(1) spawns git-remote-curl(1), which will be unable to
> >      detect the repository due to a missing "HEAD".
> >
> >   3. git-clone(1) asks git-remote-curl(1) to list remote references.
> >      This works just fine as this step does not require a local
> >      repository
> >
> >   4. git-clone(1) creates the reference database as it has now learned
> >      about the object format.
> 
> Sorry, but I am not sure I understand this step.  I assume you mean
> by "the object format" what hash function is used to index the
> objects (which can be learned from the remote "origin" in step 2 and
> we can choose to use the one they use), not what ref backend is used
> (which is purely a local matter and we do not need to know what is
> used at the "origin").

Yes, exactly. I'm never quite sure whether I should be saying "hash
function" or "object format". I'll convert the message to say "hash
function" instead to clarify.

> Why do we need to wait initializing ref backend until we learn what
> hash is being in use?

This is because of the reftable backend. With the files backend it never
mattered much, because we do not encode the object format anywhere. But
with the reftable backend we do indeed encode the object format in the
tables' header, so it's important to initialize it with the correct
format right from the start.

I'll amend the commit message.

Patrick
diff mbox series

Patch

diff --git a/remote-curl.c b/remote-curl.c
index ef05752ca5..fc29757b65 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1564,8 +1564,11 @@  int cmd_main(int argc, const char **argv)
 		if (buf.len == 0)
 			break;
 		if (starts_with(buf.buf, "fetch ")) {
-			if (nongit)
-				die(_("remote-curl: fetch attempted without a local repo"));
+			if (nongit) {
+				setup_git_directory_gently(&nongit);
+				if (nongit)
+					die(_("remote-curl: fetch attempted without a local repo"));
+			}
 			parse_fetch(&buf);
 
 		} else if (!strcmp(buf.buf, "list") || starts_with(buf.buf, "list ")) {