Message ID | b69c57d27269c9b40c9e4394861dffd8a8b9860c.1701863960.git.ps@pks.im (mailing list archive) |
---|---|
State | Superseded |
Series | clone: fix init of refdb with wrong object format |
On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> +static void create_reference_database(const char *initial_branch, int quiet)
> +{
> +	struct strbuf err = STRBUF_INIT;
> +	int reinit = is_reinit();
> +
> +	/*
> +	 * We need to create a "refs" dir in any case so that older
> +	 * versions of git can tell that this is a repository.
> +	 */

How does this work though, even if an earlier version of git can tell
that this is a repository, it still won't be able to read the reftable
backend. In that sense, what do we achieve here?

> +	safe_create_dir(git_path("refs"), 1);
> +	adjust_shared_perm(git_path("refs"));
> +

Not related to your commit per se, but we ignore the return value
here, shouldn't we die in this case?

On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> > +static void create_reference_database(const char *initial_branch, int quiet)
> > +{
> > +	struct strbuf err = STRBUF_INIT;
> > +	int reinit = is_reinit();
> > +
> > +	/*
> > +	 * We need to create a "refs" dir in any case so that older
> > +	 * versions of git can tell that this is a repository.
> > +	 */
>
> How does this work though, even if an earlier version of git can tell
> that this is a repository, it still won't be able to read the reftable
> backend. In that sense, what do we achieve here?

This is a good question, and there is related ongoing discussion about
this topic in the thread starting at [1]. There are a few benefits to
letting clients discover such repos even if they don't understand the
new reference backend format:

  - They know to stop walking up the parent-directory chain. Otherwise a
    client might end up detecting a Git repository in the parent dir.

  - The user gets a proper error message why the repository cannot be
    accessed. Instead of failing to detect the repository altogether we
    instead say that we don't understand the "extensions.refFormat"
    extension.

Maybe there are other cases I can't think of right now.

> > +	safe_create_dir(git_path("refs"), 1);
> > +	adjust_shared_perm(git_path("refs"));
> > +
>
> Not related to your commit per se, but we ignore the return value
> here, shouldn't we die in this case?

While the end result wouldn't be quite what the user asks for, the only
negative consequence is that the repository is inaccessible to others. I
think this failure mode is comparatively benign -- if it were the other
way round and we'd over-share the repository it would be more severe.

So while I don't think that dying makes much sense here, I could
certainly see us adding a warning so that the user at least knows that
something went wrong.

I'd rather keep this out of the current patch series, but could
certainly see such a warning added in a follow-up patch series.

Patrick

[1]: <ZWcOvjGPVS_CMUAk@tanuki>

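The "warn instead of die" idea from the message above could look roughly like the following. This is only a minimal, self-contained sketch and not git's code: `create_dir_with_mode()` is a hypothetical stand-in for the `safe_create_dir()` / `adjust_shared_perm()` pair, it uses plain `mkdir(2)` and `chmod(2)`, and it assumes the caller already knows which mode the directory should end up with.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of git: create `path` if it does not
 * exist yet and then apply `mode` to it.
 *
 * Returns -1 if the directory could not be created, 1 if it exists but
 * its permissions could not be adjusted (only a warning is printed),
 * and 0 on success. The warning-only path reflects the argument that an
 * under-shared directory is the comparatively benign failure.
 */
static int create_dir_with_mode(const char *path, mode_t mode)
{
	if (mkdir(path, 0777) < 0 && errno != EEXIST) {
		fprintf(stderr, "error: could not create '%s': %s\n",
			path, strerror(errno));
		return -1;
	}

	if (chmod(path, mode) < 0) {
		fprintf(stderr,
			"warning: could not adjust permissions of '%s': %s\n",
			path, strerror(errno));
		return 1;
	}

	return 0;
}
```

A caller would treat a return value of 1 as "continue, but the user has been told", which matches the suggestion to keep going rather than abort repository creation.
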
Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
>> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
>> > +static void create_reference_database(const char *initial_branch, int quiet)
>> > +{
>> > +	struct strbuf err = STRBUF_INIT;
>> > +	int reinit = is_reinit();
>> > +
>> > +	/*
>> > +	 * We need to create a "refs" dir in any case so that older
>> > +	 * versions of git can tell that this is a repository.
>> > +	 */
>>
>> How does this work though, even if an earlier version of git can tell
>> that this is a repository, it still won't be able to read the reftable
>> backend. In that sense, what do we achieve here?
>
> This is a good question, and there is related ongoing discussion about
> this topic in the thread starting at [1]. There are a few benefits to
> letting clients discover such repos even if they don't understand the
> new reference backend format:
>
>   - They know to stop walking up the parent-directory chain. Otherwise a
>     client might end up detecting a Git repository in the parent dir.
>
>   - The user gets a proper error message why the repository cannot be
>     accessed. Instead of failing to detect the repository altogether we
>     instead say that we don't understand the "extensions.refFormat"
>     extension.

Yup, both are very good reasons. Would it help to sneak a condensed
version of it in the in-code comment, perhaps?

On Sat, Dec 09, 2023 at 07:54:52AM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
> >> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> >> > +static void create_reference_database(const char *initial_branch, int quiet)
> >> > +{
> >> > +	struct strbuf err = STRBUF_INIT;
> >> > +	int reinit = is_reinit();
> >> > +
> >> > +	/*
> >> > +	 * We need to create a "refs" dir in any case so that older
> >> > +	 * versions of git can tell that this is a repository.
> >> > +	 */
> >>
> >> How does this work though, even if an earlier version of git can tell
> >> that this is a repository, it still won't be able to read the reftable
> >> backend. In that sense, what do we achieve here?
> >
> > This is a good question, and there is related ongoing discussion about
> > this topic in the thread starting at [1]. There are a few benefits to
> > letting clients discover such repos even if they don't understand the
> > new reference backend format:
> >
> >   - They know to stop walking up the parent-directory chain. Otherwise a
> >     client might end up detecting a Git repository in the parent dir.
> >
> >   - The user gets a proper error message why the repository cannot be
> >     accessed. Instead of failing to detect the repository altogether we
> >     instead say that we don't understand the "extensions.refFormat"
> >     extension.
>
> Yup, both are very good reasons. Would it help to sneak a condensed
> version of it in the in-code comment, perhaps?

Sure, let's do so. I failed to condense this meaningfully, but hope that
the result will be okay regardless of that.

Patrick

Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
>> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
>> > +	/*
>> > +	 * We need to create a "refs" dir in any case so that older
>> > +	 * versions of git can tell that this is a repository.
>> > +	 */
>>
>> How does this work though, even if an earlier version of git can tell
>> that this is a repository, it still won't be able to read the reftable
>> backend. In that sense, what do we achieve here?
>
> This is a good question, and there is related ongoing discussion about
> this topic in the thread starting at [1]. There are a few benefits to
> letting clients discover such repos even if they don't understand the
> new reference backend format:
>
>   - They know to stop walking up the parent-directory chain. Otherwise a
>     client might end up detecting a Git repository in the parent dir.
>
>   - The user gets a proper error message why the repository cannot be
>     accessed. Instead of failing to detect the repository altogether we
>     instead say that we don't understand the "extensions.refFormat"
>     extension.
>
> Maybe there are other cases I can't think of right now.
>
> [1]: <ZWcOvjGPVS_CMUAk@tanuki>

Thanks Patrick, this does indeed make a lot of sense now. +1 that this
would be super useful as a comment here.

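The discovery argument made in this thread can be illustrated with a small sketch. It assumes, as a simplification, that a directory is only recognized as a git directory when it contains `HEAD`, `objects` and `refs`; git's real check in setup.c does more than this (for example, it also validates the contents of HEAD). The function name `looks_like_git_dir()` is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical, simplified discovery check (not git's implementation):
 * treat `dir` as a git directory only if "HEAD", "objects" and "refs"
 * all exist inside it. Without the "refs" entry the check fails, and a
 * client walking up the directory tree would continue into the parent
 * directory.
 */
static int looks_like_git_dir(const char *dir)
{
	static const char *required[] = { "HEAD", "objects", "refs" };
	char path[4096];
	size_t i;

	for (i = 0; i < sizeof(required) / sizeof(required[0]); i++) {
		int n = snprintf(path, sizeof(path), "%s/%s", dir, required[i]);

		/* Reject paths that do not fit into the buffer. */
		if (n < 0 || (size_t)n >= sizeof(path))
			return 0;
		if (access(path, F_OK) < 0)
			return 0;
	}

	return 1;
}
```

Under this model, a repository that omits "refs" is simply invisible to such a client, whereas one that keeps the directory is found and can then be rejected with a precise error about the unsupported extension.
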
diff --git a/setup.c b/setup.c
index fc592dc6dd..9fcb64159f 100644
--- a/setup.c
+++ b/setup.c
@@ -1885,6 +1885,60 @@ void initialize_repository_version(int hash_algo, int reinit)
 		git_config_set_gently("extensions.objectformat", NULL);
 }
 
+static int is_reinit(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	char junk[2];
+	int ret;
+
+	git_path_buf(&buf, "HEAD");
+	ret = !access(buf.buf, R_OK) || readlink(buf.buf, junk, sizeof(junk) - 1) != -1;
+	strbuf_release(&buf);
+	return ret;
+}
+
+static void create_reference_database(const char *initial_branch, int quiet)
+{
+	struct strbuf err = STRBUF_INIT;
+	int reinit = is_reinit();
+
+	/*
+	 * We need to create a "refs" dir in any case so that older
+	 * versions of git can tell that this is a repository.
+	 */
+	safe_create_dir(git_path("refs"), 1);
+	adjust_shared_perm(git_path("refs"));
+
+	if (refs_init_db(&err))
+		die("failed to set up refs db: %s", err.buf);
+
+	/*
+	 * Point the HEAD symref to the initial branch with if HEAD does
+	 * not yet exist.
+	 */
+	if (!reinit) {
+		char *ref;
+
+		if (!initial_branch)
+			initial_branch = git_default_branch_name(quiet);
+
+		ref = xstrfmt("refs/heads/%s", initial_branch);
+		if (check_refname_format(ref, 0) < 0)
+			die(_("invalid initial branch name: '%s'"),
+			    initial_branch);
+
+		if (create_symref("HEAD", ref, NULL) < 0)
+			exit(1);
+		free(ref);
+	}
+
+	if (reinit && initial_branch)
+		warning(_("re-init: ignored --initial-branch=%s"),
+			initial_branch);
+
+	strbuf_release(&err);
+}
+
 static int create_default_files(const char *template_path,
 				const char *original_git_dir,
 				const char *initial_branch,
@@ -1896,10 +1950,8 @@ static int create_default_files(const char *template_path,
 	struct stat st1;
 	struct strbuf buf = STRBUF_INIT;
 	char *path;
-	char junk[2];
 	int reinit;
 	int filemode;
-	struct strbuf err = STRBUF_INIT;
 	const char *init_template_dir = NULL;
 	const char *work_tree = get_git_work_tree();
 
@@ -1919,6 +1971,8 @@ static int create_default_files(const char *template_path,
 	reset_shared_repository();
 	git_config(git_default_config, NULL);
 
+	reinit = is_reinit();
+
 	/*
 	 * We must make sure command-line options continue to override any
 	 * values we might have just re-read from the config.
@@ -1962,39 +2016,7 @@ static int create_default_files(const char *template_path,
 		adjust_shared_perm(get_git_dir());
 	}
 
-	/*
-	 * We need to create a "refs" dir in any case so that older
-	 * versions of git can tell that this is a repository.
-	 */
-	safe_create_dir(git_path("refs"), 1);
-	adjust_shared_perm(git_path("refs"));
-
-	if (refs_init_db(&err))
-		die("failed to set up refs db: %s", err.buf);
-
-	/*
-	 * Point the HEAD symref to the initial branch with if HEAD does
-	 * not yet exist.
-	 */
-	path = git_path_buf(&buf, "HEAD");
-	reinit = (!access(path, R_OK)
-		  || readlink(path, junk, sizeof(junk)-1) != -1);
-	if (!reinit) {
-		char *ref;
-
-		if (!initial_branch)
-			initial_branch = git_default_branch_name(quiet);
-
-		ref = xstrfmt("refs/heads/%s", initial_branch);
-		if (check_refname_format(ref, 0) < 0)
-			die(_("invalid initial branch name: '%s'"),
-			    initial_branch);
-
-		if (create_symref("HEAD", ref, NULL) < 0)
-			exit(1);
-		free(ref);
-	}
-
+	create_reference_database(initial_branch, quiet);
 	initialize_repository_version(fmt->hash_algo, 0);
 
 	/* Check filemode trustability */
@@ -2158,9 +2180,6 @@ int init_db(const char *git_dir, const char *real_git_dir,
 				      prev_bare_repository,
 				      init_shared_repository,
 				      flags & INIT_DB_QUIET);
-	if (reinit && initial_branch)
-		warning(_("re-init: ignored --initial-branch=%s"),
-			initial_branch);
 
 	create_object_directory();
 

We're about to let callers skip creation of the reference database when
calling `init_db()`. Extract the logic into a standalone function so
that it becomes easier to do this refactoring.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c | 95 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 57 insertions(+), 38 deletions(-)
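
The re-initialization check that the patch factors out into `is_reinit()` can be tried outside of git with a sketch like the one below. It mirrors the `access()` / `readlink()` logic from the diff, but the function name `head_exists()` and the explicit `git_dir` parameter are assumptions made for illustration; the patch itself resolves the path via `git_path_buf()`.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical standalone variant of the is_reinit() check: returns 1
 * if "<git_dir>/HEAD" already exists, 0 otherwise.
 *
 * access() covers a readable regular file (or a symlink whose target is
 * readable). readlink() additionally catches a HEAD that is a symbolic
 * link pointing at a branch that does not exist yet, for which access()
 * fails because it follows the link.
 */
static int head_exists(const char *git_dir)
{
	char path[4096];
	char junk[2];
	int n = snprintf(path, sizeof(path), "%s/HEAD", git_dir);

	/* Reject paths that do not fit into the buffer. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;

	return !access(path, R_OK) ||
	       readlink(path, junk, sizeof(junk) - 1) != -1;
}
```

The second condition is what makes a dangling symlink HEAD count as "already initialized", so a re-init does not overwrite it.
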