Message ID | 20231002024034.2611-4-ebiederm@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | initial support for multiple hash functions |

"Eric W. Biederman" <ebiederm@gmail.com> writes:

> From: "Eric W. Biederman" <ebiederm@xmission.com>
>
> We currently have support for using a full stage 4 SHA-256
> implementation. However, we'd like to support interoperability with
> SHA-1 repositories as well. The transition plan anticipates a
> compatibility hash algorithm configuration option that we can use to
> implement support for this.

Perhaps add

    See section "Object names on the command line" in
    git/Documentation/technical/hash-function-transition.txt .

? That section does not use the language "compatibility hash algorithm"
though, and I think "hash compatibility option" is easier to say.

Hmm, or are you talking about "compatObjectFormat" discussed in that doc?

> Let's add an element to the repository
> structure that indicates the compatibility hash algorithm so we can use
> it when we need to consider interoperability between algorithms.

How about just

    Add a hash compatibility option to the repository structure to
    consider interoperability between hash algorithms.

?

Aside: already we are seeing multiple keywords ("compatibility",
"transition", "interoperability") that all mean roughly similar things.
I hope we can settle on just one (ideally) in the codebase by the end of
this series.

> Add a helper function repo_set_compat_hash_algo that takes a
> compatibility hash algorithm and sets "repo->compat_hash_algo". If
> GIT_HASH_UNKNOWN is passed as the compatibility hash algorithm
> "repo->compat_hash_algo" is set to NULL.
>
> For now, the code results in "repo->compat_hash_algo" always being set
> to NULL, but that will change once a configuration option is added.

It's not clear to me whether you are talking about a config option to
describe the different stages of transition around algorithms, or a hash
algorithm itself (SHA1, SHA256, UNKNOWN).

> Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  repository.c | 8 ++++++++
>  repository.h | 4 ++++
>  setup.c      | 3 +++
>  3 files changed, 15 insertions(+)
>
> diff --git a/repository.c b/repository.c
> index a7679ceeaa45..80252b79e93e 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -104,6 +104,13 @@ void repo_set_hash_algo(struct repository *repo, int hash_algo)
>  	repo->hash_algo = &hash_algos[hash_algo];
>  }
>
> +void repo_set_compat_hash_algo(struct repository *repo, int algo)
> +{
> +	if (hash_algo_by_ptr(repo->hash_algo) == algo)
> +		BUG("hash_algo and compat_hash_algo match");
> +	repo->compat_hash_algo = algo ? &hash_algos[algo] : NULL;
> +}

Ah, OK. So we are talking about an algorithm itself.

Looking at this code, it seems like a compat_hash_algo is something like
"the hash algorithm I want my repository to start using but which it has
not started using yet". Such a description would have been useful in the
commit message.

Nit: I think

    BUG("compat_hash_algo may not be the same as hash_algo");

is more natural because the error message should explain the badness of
the behavior rather than merely reflect the triggering condition. And
the "star of the show" here is the new compat_hash_algo member, so it
makes sense to emphasize that more as the only subject of the sentence
instead of grouping it together with hash_algo (giving them equal
importance).

> +
>  /*
>   * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
>   * Return 0 upon success and a non-zero value upon failure.
> @@ -184,6 +191,7 @@ int repo_init(struct repository *repo,
>  		goto error;
>
>  	repo_set_hash_algo(repo, format.hash_algo);
> +	repo_set_compat_hash_algo(repo, GIT_HASH_UNKNOWN);
>  	repo->repository_format_worktree_config = format.worktree_config;
>
>  	/* take ownership of format.partial_clone */
> diff --git a/repository.h b/repository.h
> index 5f18486f6465..bf3fc601cc53 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -160,6 +160,9 @@ struct repository {
>  	/* Repository's current hash algorithm, as serialized on disk. */
>  	const struct git_hash_algo *hash_algo;
>
> +	/* Repository's compatibility hash algorithm. */

Perhaps add "May not be the same as hash_algo." ?

> +	const struct git_hash_algo *compat_hash_algo;
> +
>  	/* A unique-id for tracing purposes. */
>  	int trace2_repo_id;
>
> @@ -199,6 +202,7 @@ void repo_set_gitdir(struct repository *repo, const char *root,
>  		       const struct set_gitdir_args *extra_args);
>  void repo_set_worktree(struct repository *repo, const char *path);
>  void repo_set_hash_algo(struct repository *repo, int algo);
> +void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
>  void initialize_the_repository(void);
>  RESULT_MUST_BE_USED
>  int repo_init(struct repository *r, const char *gitdir, const char *worktree);
> diff --git a/setup.c b/setup.c
> index 18927a847b86..aa8bf5da5226 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1564,6 +1564,8 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  	}
>  	if (startup_info->have_repository) {
>  		repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
> +		repo_set_compat_hash_algo(the_repository,
> +					  GIT_HASH_UNKNOWN);
>  		the_repository->repository_format_worktree_config =
>  			repo_fmt.worktree_config;
>  		/* take ownership of repo_fmt.partial_clone */
> @@ -1657,6 +1659,7 @@ void check_repository_format(struct repository_format *fmt)
>  	check_repository_format_gently(get_git_dir(), fmt, NULL);
>  	startup_info->have_repository = 1;
>  	repo_set_hash_algo(the_repository, fmt->hash_algo);
> +	repo_set_compat_hash_algo(the_repository, GIT_HASH_UNKNOWN);
>  	the_repository->repository_format_worktree_config =
>  		fmt->worktree_config;
>  	the_repository->repository_format_partial_clone =
> --
> 2.41.0

On Sun, Oct 01, 2023 at 09:40:08PM -0500, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm@xmission.com>
>
> We currently have support for using a full stage 4 SHA-256
> implementation.

What is a "full stage 4 SHA-256 implementation"? I was assuming that you
referred to "Documentation/technical/hash-function-transition.txt", but
it does not mention stages either.

> However, we'd like to support interoperability with
> SHA-1 repositories as well. The transition plan anticipates a
> compatibility hash algorithm configuration option that we can use to
> implement support for this. Let's add an element to the repository
> structure that indicates the compatibility hash algorithm so we can use
> it when we need to consider interoperability between algorithms.
>
> Add a helper function repo_set_compat_hash_algo that takes a
> compatibility hash algorithm and sets "repo->compat_hash_algo". If
> GIT_HASH_UNKNOWN is passed as the compatibility hash algorithm
> "repo->compat_hash_algo" is set to NULL.
>
> For now, the code results in "repo->compat_hash_algo" always being set
> to NULL, but that will change once a configuration option is added.
>
> Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  repository.c | 8 ++++++++
>  repository.h | 4 ++++
>  setup.c      | 3 +++
>  3 files changed, 15 insertions(+)
>
> diff --git a/repository.c b/repository.c
> index a7679ceeaa45..80252b79e93e 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -104,6 +104,13 @@ void repo_set_hash_algo(struct repository *repo, int hash_algo)
>  	repo->hash_algo = &hash_algos[hash_algo];
>  }
>
> +void repo_set_compat_hash_algo(struct repository *repo, int algo)
> +{
> +	if (hash_algo_by_ptr(repo->hash_algo) == algo)
> +		BUG("hash_algo and compat_hash_algo match");
> +	repo->compat_hash_algo = algo ? &hash_algos[algo] : NULL;
> +}
> +
>  /*
>   * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
>   * Return 0 upon success and a non-zero value upon failure.
> @@ -184,6 +191,7 @@ int repo_init(struct repository *repo,
>  		goto error;
>
>  	repo_set_hash_algo(repo, format.hash_algo);
> +	repo_set_compat_hash_algo(repo, GIT_HASH_UNKNOWN);
>  	repo->repository_format_worktree_config = format.worktree_config;
>
>  	/* take ownership of format.partial_clone */
> diff --git a/repository.h b/repository.h
> index 5f18486f6465..bf3fc601cc53 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -160,6 +160,9 @@ struct repository {
>  	/* Repository's current hash algorithm, as serialized on disk. */
>  	const struct git_hash_algo *hash_algo;
>
> +	/* Repository's compatibility hash algorithm. */
> +	const struct git_hash_algo *compat_hash_algo;
> +
>  	/* A unique-id for tracing purposes. */
>  	int trace2_repo_id;
>
> @@ -199,6 +202,7 @@ void repo_set_gitdir(struct repository *repo, const char *root,
>  		       const struct set_gitdir_args *extra_args);
>  void repo_set_worktree(struct repository *repo, const char *path);
>  void repo_set_hash_algo(struct repository *repo, int algo);
> +void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
>  void initialize_the_repository(void);
>  RESULT_MUST_BE_USED
>  int repo_init(struct repository *r, const char *gitdir, const char *worktree);
> diff --git a/setup.c b/setup.c
> index 18927a847b86..aa8bf5da5226 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1564,6 +1564,8 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  	}
>  	if (startup_info->have_repository) {
>  		repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
> +		repo_set_compat_hash_algo(the_repository,
> +					  GIT_HASH_UNKNOWN);
>  		the_repository->repository_format_worktree_config =
>  			repo_fmt.worktree_config;
>  		/* take ownership of repo_fmt.partial_clone */
> @@ -1657,6 +1659,7 @@ void check_repository_format(struct repository_format *fmt)
>  	check_repository_format_gently(get_git_dir(), fmt, NULL);
>  	startup_info->have_repository = 1;
>  	repo_set_hash_algo(the_repository, fmt->hash_algo);
> +	repo_set_compat_hash_algo(the_repository, GIT_HASH_UNKNOWN);
>  	the_repository->repository_format_worktree_config =
>  		fmt->worktree_config;
>  	the_repository->repository_format_partial_clone =

There's also `init_db()`, where we call `repo_set_hash_algo()`. Would we
have to call `repo_set_compat_hash_algo()` there, too? There are some
other locations when handling remotes or clones, but I don't think those
are relevant right now.

Patrick

diff --git a/repository.c b/repository.c
index a7679ceeaa45..80252b79e93e 100644
--- a/repository.c
+++ b/repository.c
@@ -104,6 +104,13 @@ void repo_set_hash_algo(struct repository *repo, int hash_algo)
 	repo->hash_algo = &hash_algos[hash_algo];
 }

+void repo_set_compat_hash_algo(struct repository *repo, int algo)
+{
+	if (hash_algo_by_ptr(repo->hash_algo) == algo)
+		BUG("hash_algo and compat_hash_algo match");
+	repo->compat_hash_algo = algo ? &hash_algos[algo] : NULL;
+}
+
 /*
  * Attempt to resolve and set the provided 'gitdir' for repository 'repo'.
  * Return 0 upon success and a non-zero value upon failure.
@@ -184,6 +191,7 @@ int repo_init(struct repository *repo,
 		goto error;

 	repo_set_hash_algo(repo, format.hash_algo);
+	repo_set_compat_hash_algo(repo, GIT_HASH_UNKNOWN);
 	repo->repository_format_worktree_config = format.worktree_config;

 	/* take ownership of format.partial_clone */
diff --git a/repository.h b/repository.h
index 5f18486f6465..bf3fc601cc53 100644
--- a/repository.h
+++ b/repository.h
@@ -160,6 +160,9 @@ struct repository {
 	/* Repository's current hash algorithm, as serialized on disk. */
 	const struct git_hash_algo *hash_algo;

+	/* Repository's compatibility hash algorithm. */
+	const struct git_hash_algo *compat_hash_algo;
+
 	/* A unique-id for tracing purposes. */
 	int trace2_repo_id;

@@ -199,6 +202,7 @@ void repo_set_gitdir(struct repository *repo, const char *root,
 		       const struct set_gitdir_args *extra_args);
 void repo_set_worktree(struct repository *repo, const char *path);
 void repo_set_hash_algo(struct repository *repo, int algo);
+void repo_set_compat_hash_algo(struct repository *repo, int compat_algo);
 void initialize_the_repository(void);
 RESULT_MUST_BE_USED
 int repo_init(struct repository *r, const char *gitdir, const char *worktree);
diff --git a/setup.c b/setup.c
index 18927a847b86..aa8bf5da5226 100644
--- a/setup.c
+++ b/setup.c
@@ -1564,6 +1564,8 @@ const char *setup_git_directory_gently(int *nongit_ok)
 	}
 	if (startup_info->have_repository) {
 		repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
+		repo_set_compat_hash_algo(the_repository,
+					  GIT_HASH_UNKNOWN);
 		the_repository->repository_format_worktree_config =
 			repo_fmt.worktree_config;
 		/* take ownership of repo_fmt.partial_clone */
@@ -1657,6 +1659,7 @@ void check_repository_format(struct repository_format *fmt)
 	check_repository_format_gently(get_git_dir(), fmt, NULL);
 	startup_info->have_repository = 1;
 	repo_set_hash_algo(the_repository, fmt->hash_algo);
+	repo_set_compat_hash_algo(the_repository, GIT_HASH_UNKNOWN);
 	the_repository->repository_format_worktree_config =
 		fmt->worktree_config;
 	the_repository->repository_format_partial_clone =
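
For readers who want to poke at the semantics discussed in the thread, here
is a minimal standalone sketch of the setter. It is not Git code: struct
repo, struct hash_algo, the HASH_* constants, algo_by_ptr() and both
setters are hypothetical stand-ins for struct repository, struct
git_hash_algo, the GIT_HASH_* constants, hash_algo_by_ptr() and the
repo_set_*() helpers. The sketch assumes, as the "algo ? ... : NULL"
expression in the patch implies, that the "unknown" algorithm has the
value 0. It prints the BUG wording suggested in the first review and
aborts, where Git would call BUG().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GIT_HASH_*; "unknown" is assumed to be 0. */
enum { HASH_UNKNOWN = 0, HASH_SHA1 = 1, HASH_SHA256 = 2, HASH_NALGOS = 3 };

/* Hypothetical, trimmed-down stand-in for struct git_hash_algo. */
struct hash_algo {
	const char *name;
};

static const struct hash_algo hash_algos[HASH_NALGOS] = {
	{ NULL }, { "sha1" }, { "sha256" },
};

/* Hypothetical, trimmed-down stand-in for struct repository. */
struct repo {
	const struct hash_algo *hash_algo;        /* as serialized on disk */
	const struct hash_algo *compat_hash_algo; /* NULL when not configured */
};

/* Map a table entry back to its index. */
static int algo_by_ptr(const struct hash_algo *p)
{
	return (int)(p - hash_algos);
}

static void set_hash_algo(struct repo *r, int algo)
{
	assert(algo >= 0 && algo < HASH_NALGOS);
	r->hash_algo = &hash_algos[algo];
}

/*
 * Set the compatibility algorithm. HASH_UNKNOWN clears it to NULL;
 * passing the repository's own algorithm is a programming error.
 * The NULL guard on r->hash_algo is an addition of this sketch.
 */
static void set_compat_hash_algo(struct repo *r, int algo)
{
	assert(algo >= 0 && algo < HASH_NALGOS);
	if (r->hash_algo && algo_by_ptr(r->hash_algo) == algo) {
		fprintf(stderr,
			"BUG: compat_hash_algo may not be the same as hash_algo\n");
		abort();
	}
	r->compat_hash_algo = algo ? &hash_algos[algo] : NULL;
}
```

With a SHA-256 repository, passing HASH_UNKNOWN leaves compat_hash_algo
NULL, which is what every call site in the patch does for now; passing
HASH_SHA1 points it at the SHA-1 table entry; passing HASH_SHA256 aborts.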