diff mbox series

[1/7] setup: extract function to create the refdb

Message ID b69c57d27269c9b40c9e4394861dffd8a8b9860c.1701863960.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series clone: fix init of refdb with wrong object format | expand

Commit Message

Patrick Steinhardt Dec. 6, 2023, 12:39 p.m. UTC
We're about to let callers skip creation of the reference database when
calling `init_db()`. Extract the logic into a standalone function so
that it becomes easier to do this refactoring.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 95 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 57 insertions(+), 38 deletions(-)

Comments

karthik nayak Dec. 6, 2023, 9:10 p.m. UTC | #1
On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> +static void create_reference_database(const char *initial_branch, int quiet)
> +{
> +       struct strbuf err = STRBUF_INIT;
> +       int reinit = is_reinit();
> +
> +       /*
> +        * We need to create a "refs" dir in any case so that older
> +        * versions of git can tell that this is a repository.
> +        */

How does this work though, even if an earlier version of git can tell
that this is a repository,
it still won't be able to read the reftable backend. In that sense,
what do we achieve here?

> +       safe_create_dir(git_path("refs"), 1);
> +       adjust_shared_perm(git_path("refs"));
> +

Not related to your commit per se, but we ignore the return value
here, shouldn't we die in this case?
Patrick Steinhardt Dec. 7, 2023, 7:22 a.m. UTC | #2
On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> > +static void create_reference_database(const char *initial_branch, int quiet)
> > +{
> > +       struct strbuf err = STRBUF_INIT;
> > +       int reinit = is_reinit();
> > +
> > +       /*
> > +        * We need to create a "refs" dir in any case so that older
> > +        * versions of git can tell that this is a repository.
> > +        */
> 
> How does this work though, even if an earlier version of git can tell
> that this is a repository, it still won't be able to read the reftable
> backend. In that sense, what do we achieve here?

This is a good question, and there is related ongoing discussion about
this topic in the thread starting at [1]. There are a few benefits to
letting clients discover such repos even if they don't understand the
new reference backend format:

  - They know to stop walking up the parent-directory chain. Otherwise a
    client might end up detecting a Git repository in the parent dir.

  - The user gets a proper error message why the repository cannot be
    accessed. Instead of failing to detect the repository altogether we
    instead say that we don't understand the "extensions.refFormat"
    extension.

Maybe there are other cases I can't think of right now.

> > +       safe_create_dir(git_path("refs"), 1);
> > +       adjust_shared_perm(git_path("refs"));
> > +
> 
> Not related to your commit per se, but we ignore the return value
> here, shouldn't we die in this case?

While the end result wouldn't be quite what the user asks for, the only
negative consequence is that the repository is inaccessible to others. I
think this failure mode is comparatively benign -- if it were the other
way round and we'd over-share the repository it would more severe.

So while I don't think that dying makes much sense here, I could
certainly see us adding a warning so that the user at least knows that
something went wrong. I'd rather want to keep this out of the current
patch series, but could certainly see such a warning added in a follow
up patch series.

Patrick

[1]: <ZWcOvjGPVS_CMUAk@tanuki>
Junio C Hamano Dec. 8, 2023, 10:54 p.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
>> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
>> > +static void create_reference_database(const char *initial_branch, int quiet)
>> > +{
>> > +       struct strbuf err = STRBUF_INIT;
>> > +       int reinit = is_reinit();
>> > +
>> > +       /*
>> > +        * We need to create a "refs" dir in any case so that older
>> > +        * versions of git can tell that this is a repository.
>> > +        */
>> 
>> How does this work though, even if an earlier version of git can tell
>> that this is a repository, it still won't be able to read the reftable
>> backend. In that sense, what do we achieve here?
>
> This is a good question, and there is related ongoing discussion about
> this topic in the thread starting at [1]. There are a few benefits to
> letting clients discover such repos even if they don't understand the
> new reference backend format:
>
>   - They know to stop walking up the parent-directory chain. Otherwise a
>     client might end up detecting a Git repository in the parent dir.
>
>   - The user gets a proper error message why the repository cannot be
>     accessed. Instead of failing to detect the repository altogether we
>     instead say that we don't understand the "extensions.refFormat"
>     extension.

Yup, both are very good reasons.  Would it help to sneak a condensed
version of it in the in-code comment, perhaps?
Patrick Steinhardt Dec. 11, 2023, 11:34 a.m. UTC | #4
On Sat, Dec 09, 2023 at 07:54:52AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
> >> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> >> > +static void create_reference_database(const char *initial_branch, int quiet)
> >> > +{
> >> > +       struct strbuf err = STRBUF_INIT;
> >> > +       int reinit = is_reinit();
> >> > +
> >> > +       /*
> >> > +        * We need to create a "refs" dir in any case so that older
> >> > +        * versions of git can tell that this is a repository.
> >> > +        */
> >> 
> >> How does this work though, even if an earlier version of git can tell
> >> that this is a repository, it still won't be able to read the reftable
> >> backend. In that sense, what do we achieve here?
> >
> > This is a good question, and there is related ongoing discussion about
> > this topic in the thread starting at [1]. There are a few benefits to
> > letting clients discover such repos even if they don't understand the
> > new reference backend format:
> >
> >   - They know to stop walking up the parent-directory chain. Otherwise a
> >     client might end up detecting a Git repository in the parent dir.
> >
> >   - The user gets a proper error message why the repository cannot be
> >     accessed. Instead of failing to detect the repository altogether we
> >     instead say that we don't understand the "extensions.refFormat"
> >     extension.
> 
> Yup, both are very good reasons.  Would it help to sneak a condensed
> version of it in the in-code comment, perhaps?

Sure, let's do so. I failed to condense this meaningfully, but hope that
the result will be okay regardless of that.

Patrick
karthik nayak Dec. 11, 2023, 3:50 p.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
>> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
>> > +       /*
>> > +        * We need to create a "refs" dir in any case so that older
>> > +        * versions of git can tell that this is a repository.
>> > +        */
>>
>> How does this work though, even if an earlier version of git can tell
>> that this is a repository, it still won't be able to read the reftable
>> backend. In that sense, what do we achieve here?
>
> This is a good question, and there is related ongoing discussion about
> this topic in the thread starting at [1]. There are a few benefits to
> letting clients discover such repos even if they don't understand the
> new reference backend format:
>
>   - They know to stop walking up the parent-directory chain. Otherwise a
>     client might end up detecting a Git repository in the parent dir.
>
>   - The user gets a proper error message why the repository cannot be
>     accessed. Instead of failing to detect the repository altogether we
>     instead say that we don't understand the "extensions.refFormat"
>     extension.
>
> Maybe there are other cases I can't think of right now.
> [1]: <ZWcOvjGPVS_CMUAk@tanuki>

Thank Patrick, this does indeed make a lot of sense now. +1 that this
would be super useful as a comment here.
diff mbox series

Patch

diff --git a/setup.c b/setup.c
index fc592dc6dd..9fcb64159f 100644
--- a/setup.c
+++ b/setup.c
@@ -1885,6 +1885,60 @@  void initialize_repository_version(int hash_algo, int reinit)
 		git_config_set_gently("extensions.objectformat", NULL);
 }
 
+static int is_reinit(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	char junk[2];
+	int ret;
+
+	git_path_buf(&buf, "HEAD");
+	ret = !access(buf.buf, R_OK) || readlink(buf.buf, junk, sizeof(junk) - 1) != -1;
+	strbuf_release(&buf);
+	return ret;
+}
+
+static void create_reference_database(const char *initial_branch, int quiet)
+{
+	struct strbuf err = STRBUF_INIT;
+	int reinit = is_reinit();
+
+	/*
+	 * We need to create a "refs" dir in any case so that older
+	 * versions of git can tell that this is a repository.
+	 */
+	safe_create_dir(git_path("refs"), 1);
+	adjust_shared_perm(git_path("refs"));
+
+	if (refs_init_db(&err))
+		die("failed to set up refs db: %s", err.buf);
+
+	/*
+	 * Point the HEAD symref to the initial branch with if HEAD does
+	 * not yet exist.
+	 */
+	if (!reinit) {
+		char *ref;
+
+		if (!initial_branch)
+			initial_branch = git_default_branch_name(quiet);
+
+		ref = xstrfmt("refs/heads/%s", initial_branch);
+		if (check_refname_format(ref, 0) < 0)
+			die(_("invalid initial branch name: '%s'"),
+			    initial_branch);
+
+		if (create_symref("HEAD", ref, NULL) < 0)
+			exit(1);
+		free(ref);
+	}
+
+	if (reinit && initial_branch)
+		warning(_("re-init: ignored --initial-branch=%s"),
+			initial_branch);
+
+	strbuf_release(&err);
+}
+
 static int create_default_files(const char *template_path,
 				const char *original_git_dir,
 				const char *initial_branch,
@@ -1896,10 +1950,8 @@  static int create_default_files(const char *template_path,
 	struct stat st1;
 	struct strbuf buf = STRBUF_INIT;
 	char *path;
-	char junk[2];
 	int reinit;
 	int filemode;
-	struct strbuf err = STRBUF_INIT;
 	const char *init_template_dir = NULL;
 	const char *work_tree = get_git_work_tree();
 
@@ -1919,6 +1971,8 @@  static int create_default_files(const char *template_path,
 	reset_shared_repository();
 	git_config(git_default_config, NULL);
 
+	reinit = is_reinit();
+
 	/*
 	 * We must make sure command-line options continue to override any
 	 * values we might have just re-read from the config.
@@ -1962,39 +2016,7 @@  static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
-	/*
-	 * We need to create a "refs" dir in any case so that older
-	 * versions of git can tell that this is a repository.
-	 */
-	safe_create_dir(git_path("refs"), 1);
-	adjust_shared_perm(git_path("refs"));
-
-	if (refs_init_db(&err))
-		die("failed to set up refs db: %s", err.buf);
-
-	/*
-	 * Point the HEAD symref to the initial branch with if HEAD does
-	 * not yet exist.
-	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
-	if (!reinit) {
-		char *ref;
-
-		if (!initial_branch)
-			initial_branch = git_default_branch_name(quiet);
-
-		ref = xstrfmt("refs/heads/%s", initial_branch);
-		if (check_refname_format(ref, 0) < 0)
-			die(_("invalid initial branch name: '%s'"),
-			    initial_branch);
-
-		if (create_symref("HEAD", ref, NULL) < 0)
-			exit(1);
-		free(ref);
-	}
-
+	create_reference_database(initial_branch, quiet);
 	initialize_repository_version(fmt->hash_algo, 0);
 
 	/* Check filemode trustability */
@@ -2158,9 +2180,6 @@  int init_db(const char *git_dir, const char *real_git_dir,
 				      prev_bare_repository,
 				      init_shared_repository,
 				      flags & INIT_DB_QUIET);
-	if (reinit && initial_branch)
-		warning(_("re-init: ignored --initial-branch=%s"),
-			initial_branch);
 
 	create_object_directory();