Message ID | 20200716062818.GC3242764@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | extensions.* fixes for 2.28 (Re: [PATCH] setup: warn about un-enabled extensions) | expand |
Jonathan Nieder <jrnieder@gmail.com> writes: > Now that we officially permit repository extensions in repository > format v0, permit upgrading a repository with extensions from v0 to v1 > as well. > > For example, this means a repository where the user has set > "extensions.preciousObjects" can use "git fetch --filter=blob:none > origin" to upgrade the repository to use v1 and the partial clone > extension. > > To avoid mistakes, continue to forbid repository format upgrades in v0 > repositories with an unrecognized extension. This way, a v0 user > using a misspelled extension field gets a chance to correct the > mistake before updating to the less forgiving v1 format. This needs to be managed carefully. When the next extension is added to the codebase, that extension may be "known" to Git, but I do not think it is a good idea to honor it in v0 repository, or allow upgrading v0 repository to v1 with such an extension that weren't "known" to Git. For example, a topic in flight adds objectformat extension and I do not think it should be honored in v0 repository. Having said that, the approach is OK for now at the tip of tonight's master, but the point is "known" vs "unknown" must be fixed right with some means. E.g. tell people to throw the "new" extensions to the list of "unknown extensions" in check_repo_format() when they add new ones, or something. Thanks. > + if (!repo_fmt.version && repo_fmt.unknown_extensions.nr) > + return error("cannot upgrade repository format: " > + "unknown extension %s", > + repo_fmt.unknown_extensions.items[0].string); > > strbuf_addf(&repo_version, "%d", target_version); > git_config_set("core.repositoryformatversion", repo_version.buf); > diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh > index 51d1eba6050..6aa0f313bdd 100755 > --- a/t/t0410-partial-clone.sh > +++ b/t/t0410-partial-clone.sh > @@ -42,7 +42,7 @@ test_expect_success 'convert shallow clone to partial clone' ' > test_cmp_config -C client 1 core.repositoryformatversion > ' > > -test_expect_success 'converting to partial clone fails with noop extension' ' > +test_expect_success 'convert to partial clone with noop extension' ' > rm -fr server client && > test_create_repo server && > test_commit -C server my_commit 1 && > @@ -50,7 +50,7 @@ test_expect_success 'converting to partial clone fails with noop extension' ' > git clone --depth=1 "file://$(pwd)/server" client && > test_cmp_config -C client 0 core.repositoryformatversion && > git -C client config extensions.noop true && > - test_must_fail git -C client fetch --unshallow --filter="blob:none" > + git -C client fetch --unshallow --filter="blob:none" > ' > > test_expect_success 'converting to partial clone fails with unrecognized extension' '
On Thu, Jul 16, 2020 at 12:01:09AM -0700, Junio C Hamano wrote: > > To avoid mistakes, continue to forbid repository format upgrades in v0 > > repositories with an unrecognized extension. This way, a v0 user > > using a misspelled extension field gets a chance to correct the > > mistake before updating to the less forgiving v1 format. > > This needs to be managed carefully. When the next extension is > added to the codebase, that extension may be "known" to Git, but I > do not think it is a good idea to honor it in v0 repository, or > allow upgrading v0 repository to v1 with such an extension that > weren't "known" to Git. For example, a topic in flight adds > objectformat extension and I do not think it should be honored in v0 > repository. > > Having said that, the approach is OK for now at the tip of tonight's > master, but the point is "known" vs "unknown" must be fixed right > with some means. E.g. tell people to throw the "new" extensions to > the list of "unknown extensions" in check_repo_format() when they > add new ones, or something. Yeah, I agree with this line of reasoning. I'd prefer to see it addressed now, so that we don't have to remember to do anything later. I.e., for this patch to put the existing known extensions into the "good" list for v0, locking it into place forever, and leaving the objectformat topic with nothing particular it needs to do. But in the name of -rc1 expediency, I'm also OK moving forward with this for now. -Peff
On Thu, Jul 16, 2020 at 07:00:08AM -0400, Jeff King wrote: > > > To avoid mistakes, continue to forbid repository format upgrades in v0 > > > repositories with an unrecognized extension. This way, a v0 user > > > using a misspelled extension field gets a chance to correct the > > > mistake before updating to the less forgiving v1 format. > > > > This needs to be managed carefully. When the next extension is > > added to the codebase, that extension may be "known" to Git, but I > > do not think it is a good idea to honor it in v0 repository, or > > allow upgrading v0 repository to v1 with such an extension that > > weren't "known" to Git. For example, a topic in flight adds > > objectformat extension and I do not think it should be honored in v0 > > repository. > > > > Having said that, the approach is OK for now at the tip of tonight's > > master, but the point is "known" vs "unknown" must be fixed right > > with some means. E.g. tell people to throw the "new" extensions to > > the list of "unknown extensions" in check_repo_format() when they > > add new ones, or something. > > Yeah, I agree with this line of reasoning. I'd prefer to see it > addressed now, so that we don't have to remember to do anything later. > I.e., for this patch to put the existing known extensions into the > "good" list for v0, locking it into place forever, and leaving the > objectformat topic with nothing particular it needs to do. > > But in the name of -rc1 expediency, I'm also OK moving forward with this > for now. Hmm, this is actually a bit trickier than I expected because of the way the code is written. It's much easier to complain about extensions in a v0 repository than it is to ignore them. But I'm not sure if that isn't the right way to go anyway. The patch I came up with is below (and goes on top of Jonathan's). Even if we decide this is the right direction, it can definitely happen post-v2.28. -- >8 -- Subject: verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the istuation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> --- cache.h | 2 + setup.c | 96 ++++++++++++++++++++++++++++++++++------- t/t1302-repo-version.sh | 3 ++ 3 files changed, 85 insertions(+), 16 deletions(-) diff --git a/cache.h b/cache.h index 654426460c..0290849c19 100644 --- a/cache.h +++ b/cache.h @@ -1044,6 +1044,7 @@ struct repository_format { int hash_algo; char *work_tree; struct string_list unknown_extensions; + struct string_list v1_only_extensions; }; /* @@ -1057,6 +1058,7 @@ struct repository_format { .is_bare = -1, \ .hash_algo = GIT_HASH_SHA1, \ .unknown_extensions = STRING_LIST_INIT_DUP, \ + .v1_only_extensions = STRING_LIST_INIT_DUP, \ } /* diff --git a/setup.c b/setup.c index 3a81307602..c1480b2b60 100644 --- a/setup.c +++ b/setup.c @@ -447,6 +447,54 @@ static int read_worktree_config(const char *var, const char *value, void *vdata) return 0; } +enum extension_result { + EXTENSION_ERROR = -1, /* compatible with error(), etc */ + EXTENSION_UNKNOWN = 0, + EXTENSION_OK = 1 +}; + +/* + * Do not add new extensions to this function. It handles extensions which are + * respected even in v0-format repositories for historical compatibility. + */ +enum extension_result handle_extension_v0(const char *var, + const char *value, + const char *ext, + struct repository_format *data) +{ + if (!strcmp(ext, "noop")) { + return EXTENSION_OK; + } else if (!strcmp(ext, "preciousobjects")) { + data->precious_objects = git_config_bool(var, value); + return EXTENSION_OK; + } else if (!strcmp(ext, "partialclone")) { + if (!value) + return config_error_nonbool(var); + data->partial_clone = xstrdup(value); + return EXTENSION_OK; + } else if (!strcmp(ext, "worktreeconfig")) { + data->worktree_config = git_config_bool(var, value); + return EXTENSION_OK; + } + + return EXTENSION_UNKNOWN; +} + +/* + * Record any new extensions in this function. + */ +enum extension_result handle_extension(const char *var, + const char *value, + const char *ext, + struct repository_format *data) +{ + if (!strcmp(ext, "noop-v1")) { + return EXTENSION_OK; + } + + return EXTENSION_UNKNOWN; +} + static int check_repo_format(const char *var, const char *value, void *vdata) { struct repository_format *data = vdata; @@ -455,23 +503,25 @@ static int check_repo_format(const char *var, const char *value, void *vdata) if (strcmp(var, "core.repositoryformatversion") == 0) data->version = git_config_int(var, value); else if (skip_prefix(var, "extensions.", &ext)) { - /* - * record any known extensions here; otherwise, - * we fall through to recording it as unknown, and - * check_repository_format will complain - */ - if (!strcmp(ext, "noop")) - ; - else if (!strcmp(ext, "preciousobjects")) - data->precious_objects = git_config_bool(var, value); - else if (!strcmp(ext, "partialclone")) { - if (!value) - return config_error_nonbool(var); - data->partial_clone = xstrdup(value); - } else if (!strcmp(ext, "worktreeconfig")) - data->worktree_config = git_config_bool(var, value); - else + switch (handle_extension_v0(var, value, ext, data)) { + case EXTENSION_ERROR: + return -1; + case EXTENSION_OK: + return 0; + case EXTENSION_UNKNOWN: + break; + } + + switch (handle_extension(var, value, ext, data)) { + case EXTENSION_ERROR: + return -1; + case EXTENSION_OK: + string_list_append(&data->v1_only_extensions, ext); + return 0; + case EXTENSION_UNKNOWN: string_list_append(&data->unknown_extensions, ext); + return 0; + } } return read_worktree_config(var, value, vdata); @@ -510,6 +560,7 @@ static int check_repository_format_gently(const char *gitdir, struct repository_ set_repository_format_partial_clone(candidate->partial_clone); repository_format_worktree_config = candidate->worktree_config; string_list_clear(&candidate->unknown_extensions, 0); + string_list_clear(&candidate->v1_only_extensions, 0); if (repository_format_worktree_config) { /* @@ -588,6 +639,7 @@ int read_repository_format(struct repository_format *format, const char *path) void clear_repository_format(struct repository_format *format) { string_list_clear(&format->unknown_extensions, 0); + string_list_clear(&format->v1_only_extensions, 0); free(format->work_tree); free(format->partial_clone); init_repository_format(format); @@ -613,6 +665,18 @@ int verify_repository_format(const struct repository_format *format, return -1; } + if (format->version == 0 && format->v1_only_extensions.nr) { + int i; + + strbuf_addstr(err, + _("repo version is 0, but v1-only extensions found:")); + + for (i = 0; i < format->v1_only_extensions.nr; i++) + strbuf_addf(err, "\n\t%s", + format->v1_only_extensions.items[i].string); + return -1; + } + return 0; } diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh index d60c042ce8..0acabb6d11 100755 --- a/t/t1302-repo-version.sh +++ b/t/t1302-repo-version.sh @@ -87,6 +87,9 @@ allow 1 allow 1 noop abort 1 no-such-extension allow 0 no-such-extension +allow 0 noop +abort 0 noop-v1 +allow 1 noop-v1 EOF test_expect_success 'precious-objects allowed' '
On 7/16/2020 8:25 AM, Jeff King wrote: > On Thu, Jul 16, 2020 at 07:00:08AM -0400, Jeff King wrote: > >>>> To avoid mistakes, continue to forbid repository format upgrades in v0 >>>> repositories with an unrecognized extension. This way, a v0 user >>>> using a misspelled extension field gets a chance to correct the >>>> mistake before updating to the less forgiving v1 format. >>> >>> This needs to be managed carefully. When the next extension is >>> added to the codebase, that extension may be "known" to Git, but I >>> do not think it is a good idea to honor it in v0 repository, or >>> allow upgrading v0 repository to v1 with such an extension that >>> weren't "known" to Git. For example, a topic in flight adds >>> objectformat extension and I do not think it should be honored in v0 >>> repository. >>> >>> Having said that, the approach is OK for now at the tip of tonight's >>> master, but the point is "known" vs "unknown" must be fixed right >>> with some means. E.g. tell people to throw the "new" extensions to >>> the list of "unknown extensions" in check_repo_format() when they >>> add new ones, or something. >> >> Yeah, I agree with this line of reasoning. I'd prefer to see it >> addressed now, so that we don't have to remember to do anything later. >> I.e., for this patch to put the existing known extensions into the >> "good" list for v0, locking it into place forever, and leaving the >> objectformat topic with nothing particular it needs to do. >> >> But in the name of -rc1 expediency, I'm also OK moving forward with this >> for now. > > Hmm, this is actually a bit trickier than I expected because of the way > the code is written. It's much easier to complain about extensions in a > v0 repository than it is to ignore them. But I'm not sure if that isn't > the right way to go anyway. > > The patch I came up with is below (and goes on top of Jonathan's). Even > if we decide this is the right direction, it can definitely happen > post-v2.28. > > -- >8 -- > Subject: verify_repository_format(): complain about new extensions in v0 repo > > We made the mistake in the past of respecting extensions.* even when the > repository format version was set to 0. This is bad because forgetting > to bump the repository version means that older versions of Git (which > do not know about our extensions) won't complain. I.e., it's not a > problem in itself, but it means your repository is in a state which does > not give you the protection you think you're getting from older > versions. > > For compatibility reasons, we are stuck with that decision for existing > extensions. However, we'd prefer not to extend the damage further. We > can do that by catching any newly-added extensions and complaining about > the repository format. > > Note that this is a pretty heavy hammer: we'll refuse to work with the > repository at all. A lesser option would be to ignore (possibly with a > warning) any new extensions. But because of the way the extensions are > handled, that puts the burden on each new extension that is added to > remember to "undo" itself (because they are handled before we know > for sure whether we are in a v1 repo or not, since we don't insist on a > particular ordering of config entries). > > So one option would be to rewrite that handling to record any new > extensions (and their values) during the config parse, and then only > after proceed to handle new ones only if we're in a v1 repository. But > I'm not sure if it's worth the trouble: > > - ignoring extensions is likely to end up with broken results anyway > (e.g., ignoring a proposed objectformat extension means parsing any > object data is likely to encounter errors) > > - this is a sign that whatever tool wrote the extension field is > broken. We may be better off notifying immediately and forcefully so > that such tools don't even appear to work accidentally. > > The only downside is that fixing the istuation is a little tricky, s/istuation/situation > because programs like "git config" won't want to work with the > repository. But: > > git config --file=.git/config core.repositoryformatversion 1 > > should still suffice. > > Signed-off-by: Jeff King <peff@peff.net> > --- > cache.h | 2 + > setup.c | 96 ++++++++++++++++++++++++++++++++++------- > t/t1302-repo-version.sh | 3 ++ > 3 files changed, 85 insertions(+), 16 deletions(-) > > diff --git a/cache.h b/cache.h > index 654426460c..0290849c19 100644 > --- a/cache.h > +++ b/cache.h > @@ -1044,6 +1044,7 @@ struct repository_format { > int hash_algo; > char *work_tree; > struct string_list unknown_extensions; > + struct string_list v1_only_extensions; > }; > > /* > @@ -1057,6 +1058,7 @@ struct repository_format { > .is_bare = -1, \ > .hash_algo = GIT_HASH_SHA1, \ > .unknown_extensions = STRING_LIST_INIT_DUP, \ > + .v1_only_extensions = STRING_LIST_INIT_DUP, \ > } > > /* > diff --git a/setup.c b/setup.c > index 3a81307602..c1480b2b60 100644 > --- a/setup.c > +++ b/setup.c > @@ -447,6 +447,54 @@ static int read_worktree_config(const char *var, const char *value, void *vdata) > return 0; > } > > +enum extension_result { > + EXTENSION_ERROR = -1, /* compatible with error(), etc */ > + EXTENSION_UNKNOWN = 0, > + EXTENSION_OK = 1 > +}; > + > +/* > + * Do not add new extensions to this function. It handles extensions which are > + * respected even in v0-format repositories for historical compatibility. > + */ > +enum extension_result handle_extension_v0(const char *var, > + const char *value, > + const char *ext, > + struct repository_format *data) ... > +/* > + * Record any new extensions in this function. > + */ > +enum extension_result handle_extension(const char *var, > + const char *value, > + const char *ext, > + struct repository_format *data) I like the split between these two methods to make it really clear the difference between "v0" and "v1". > struct repository_format *data = vdata; > @@ -455,23 +503,25 @@ static int check_repo_format(const char *var, const char *value, void *vdata) > if (strcmp(var, "core.repositoryformatversion") == 0) > data->version = git_config_int(var, value); > else if (skip_prefix(var, "extensions.", &ext)) { ... > + switch (handle_extension_v0(var, value, ext, data)) { > + case EXTENSION_ERROR: > + return -1; > + case EXTENSION_OK: > + return 0; > + case EXTENSION_UNKNOWN: > + break; > + } > + > + switch (handle_extension(var, value, ext, data)) { > + case EXTENSION_ERROR: > + return -1; > + case EXTENSION_OK: > + string_list_append(&data->v1_only_extensions, ext); > + return 0; > + case EXTENSION_UNKNOWN: > string_list_append(&data->unknown_extensions, ext); > + return 0; > + } > } And it makes this loop much cleaner. > @@ -613,6 +665,18 @@ int verify_repository_format(const struct repository_format *format, > return -1; > } > > + if (format->version == 0 && format->v1_only_extensions.nr) { > + int i; > + > + strbuf_addstr(err, > + _("repo version is 0, but v1-only extensions found:")); > + > + for (i = 0; i < format->v1_only_extensions.nr; i++) > + strbuf_addf(err, "\n\t%s", > + format->v1_only_extensions.items[i].string); > + return -1; > + } > + > return 0; > } > > diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh > index d60c042ce8..0acabb6d11 100755 > --- a/t/t1302-repo-version.sh > +++ b/t/t1302-repo-version.sh > @@ -87,6 +87,9 @@ allow 1 > allow 1 noop > abort 1 no-such-extension > allow 0 no-such-extension > +allow 0 noop > +abort 0 noop-v1 > +allow 1 noop-v1 LGTM. Thanks, -Stolee
Jeff King <peff@peff.net> writes: > Yeah, I agree with this line of reasoning. I'd prefer to see it > addressed now, so that we don't have to remember to do anything later. Very true. Also the documentation may need some updating, e.g. "These 4 extensions are honored without adding repositoryFormatVersion to your repository (as special cases)" to avoid further confusion e.g. "why doesn't my objectFormat=SHA-3 does not take effect by itself?". > I.e., for this patch to put the existing known extensions into the > "good" list for v0, locking it into place forever, and leaving the > objectformat topic with nothing particular it needs to do. > > But in the name of -rc1 expediency, I'm also OK moving forward with this > for now. I'm OK, too, as I said. I'd need to kick out bc/sha-2-part-3 topic out of my tree while that infrastructure is in place on top of these two patches, though. Thanks.
Jeff King <peff@peff.net> writes: > Hmm, this is actually a bit trickier than I expected because of the way > the code is written. It's much easier to complain about extensions in a > v0 repository than it is to ignore them. But I'm not sure if that isn't > the right way to go anyway. > > The patch I came up with is below (and goes on top of Jonathan's). Even > if we decide this is the right direction, it can definitely happen > post-v2.28. It must happen already in 'seen' if we want to keep bc/sha-2-part-3 with us, though ;-) > So one option would be to rewrite that handling to record any new > extensions (and their values) during the config parse, and then only > after proceed to handle new ones only if we're in a v1 repository. I do not think it would be too bad for read_repository_format() to call git_config_from_file() to collect extensions.* in a string list while looking for core.repositoryformatversion. Then the function can iterate over the string list to call check_repo_format() itself. > I'm not sure if it's worth the trouble: > > - ignoring extensions is likely to end up with broken results anyway > (e.g., ignoring a proposed objectformat extension means parsing any > object data is likely to encounter errors) The primary reason why the safety calls for ignore/reject is the namespace collision. We may decide to use extensions.objectformat to record what hash algorithms are used for objects in the repository, while the end user and their tools may use it for totally different purpose (perhaps they have a custom script around "git repack" that reads the variable to learn what command line options e.g. --window=800 to pass). A new version of Git that supports SHA-2 will suddenly break their configuration, when the users are 100% happy with the current SHA-1 system, with the way their tool uses extensions.objectformat configuration variable and a newer version of Git that happens to know how to also work with SHA-2, using their v0 repository. And the reasoning 'ignoring would make the problem get noticed anyway' does not apply to such users at all, does it? We need to declare that any names under "extensions.*" is off limits by end users regardless and write it in big flasing red letters if we haven't already done so. It is enforced in v1 repositories by dying upon seeing an unrecognised extension, but not entirely. When the users are lucky, a known-but-name-collided extension may stop with a type error (e.g. "extensions.objectformat=frotz" may say "frotz is not among the accepted hash algorithms") but that is not guaranteed. In v0 repositories, enforcing it after the fact would cause the same trouble as the tightening caused.
Jeff King <peff@peff.net> writes: > Subject: verify_repository_format(): complain about new extensions in v0 repo > > We made the mistake in the past of respecting extensions.* even when the > repository format version was set to 0. This is bad because forgetting > to bump the repository version means that older versions of Git (which > do not know about our extensions) won't complain. I.e., it's not a > problem in itself, but it means your repository is in a state which does > not give you the protection you think you're getting from older > versions. > > For compatibility reasons, we are stuck with that decision for existing > extensions. However, we'd prefer not to extend the damage further. We > can do that by catching any newly-added extensions and complaining about > the repository format. Looking good overall, but I needed this to build from the source. Thanks. setup.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/setup.c b/setup.c index e29659b7b9..e69bd28ed6 100644 --- a/setup.c +++ b/setup.c @@ -457,10 +457,10 @@ enum extension_result { * Do not add new extensions to this function. It handles extensions which are * respected even in v0-format repositories for historical compatibility. */ -enum extension_result handle_extension_v0(const char *var, - const char *value, - const char *ext, - struct repository_format *data) +static enum extension_result handle_extension_v0(const char *var, + const char *value, + const char *ext, + struct repository_format *data) { if (!strcmp(ext, "noop")) { return EXTENSION_OK; @@ -483,10 +483,10 @@ enum extension_result handle_extension_v0(const char *var, /* * Record any new extensions in this function. */ -enum extension_result handle_extension(const char *var, - const char *value, - const char *ext, - struct repository_format *data) +static enum extension_result handle_extension(const char *var, + const char *value, + const char *ext, + struct repository_format *data) { if (!strcmp(ext, "noop-v1")) { return EXTENSION_OK;
On Thu, Jul 16, 2020 at 09:32:35AM -0700, Junio C Hamano wrote: > We need to declare that any names under "extensions.*" is off limits > by end users regardless and write it in big flasing red letters if > we haven't already done so. I thought this was already well-understood, and it was definitely part of the plan since 2015. Are other tools really sticking stuff in extensions.* that we don't know about? -Peff
On Thu, Jul 16, 2020 at 09:49:14AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > Subject: verify_repository_format(): complain about new extensions in v0 repo > > > > We made the mistake in the past of respecting extensions.* even when the > > repository format version was set to 0. This is bad because forgetting > > to bump the repository version means that older versions of Git (which > > do not know about our extensions) won't complain. I.e., it's not a > > problem in itself, but it means your repository is in a state which does > > not give you the protection you think you're getting from older > > versions. > > > > For compatibility reasons, we are stuck with that decision for existing > > extensions. However, we'd prefer not to extend the damage further. We > > can do that by catching any newly-added extensions and complaining about > > the repository format. > > Looking good overall, but I needed this to build from the source. Oof, thanks. I did this as a one-off not even on a branch, and my config.mak magic loosens -Werror in that case (because usually a detached HEAD means I'm investigating some old commit, and quite a few of them don't build without warnings these days). Thankfully it seems I only managed a minor error without the compiler there to help me. :) -Peff
Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > >> Hmm, this is actually a bit trickier than I expected because of the way >> the code is written. It's much easier to complain about extensions in a >> v0 repository than it is to ignore them. But I'm not sure if that isn't >> the right way to go anyway. >> >> The patch I came up with is below (and goes on top of Jonathan's). Even >> if we decide this is the right direction, it can definitely happen >> post-v2.28. > > It must happen already in 'seen' if we want to keep bc/sha-2-part-3 > with us, though ;-) FWIW, I needed to adjust t0001 while merging the SHA-2 topic. The internal use of "git config" via test_config triggers the "this is not a Git repository as the value of repositoryformatversion and the defined set of extensions are incompatible". diff --cc t/t0001-init.sh index 6d2467995e,34d2064660..ff538c0eed --- a/t/t0001-init.sh +++ b/t/t0001-init.sh ... -test_expect_success 'extensions.objectFormat is not honored with repo version 0' ' ++test_expect_success 'extensions.objectFormat would cause an error in repo version 0' ' + git init --object-format=sha256 explicit-v0 && - test_config -C explicit-v0 core.repositoryformatversion 0 && - git -C explicit-v0 rev-parse --show-object-format >actual && - echo sha1 >expected && - test_cmp expected actual ++ v=$(git config --file=explicit-v0/.git/config core.repositoryformatversion 0) && ++ test_when_finished " ++ git config --file=explicit-v0/.git/config core.repositoryformatversion $v ++ " && ++ git config --file=explicit-v0/.git/config core.repositoryformatversion 0 && ++ test_must_fail git -C explicit-v0 rev-parse --show-object-format >actual + '
(replying from vacation; back tomorrow) Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: >> Yeah, I agree with this line of reasoning. I'd prefer to see it >> addressed now, so that we don't have to remember to do anything later. > > Very true. Also the documentation may need some updating, > e.g. "These 4 extensions are honored without adding > repositoryFormatVersion to your repository (as special cases)" to > avoid further confusion e.g. "why doesn't my objectFormat=SHA-3 does > not take effect by itself?". Yes, I agree, especially about documentation. For 2.29, I would like to do or see the following: - defining the list of repository format v0 supported extensions as "these and no more", futureproofing along the lines suggested in Peff's patch. I like the general approach taken there since it allows parsing the relevant config in a single pass, so I think it basically takes the right approach. (That said, it might be possible to simplify a bit with further changes, e.g. by using the configset API.) When doing this for real, we'd want to document the set of supported extensions. That is especially useful to independent implementers wanting to support Git's formats, since it tells them "this is the minimum set of extensions that you must either handle or error out cleanly on to maintain compatibility with Git's repository format v0". - improving the behavior when an extension not supported in v0 is encountered in a v0 repository. For extensions that are supported in v1 and not v0, we should presumably error out so the user can repair the repository, and we can put the "noop" extension in that category for the sake of easy testing. We can also include a check in "git fsck" for repositories that request the undefined behavior of v0 repositories with non-v0 extensions, for faster diagnosis. What about unrecognized extensions that are potentially extensions yet to be defined? Should these be silently ignored to match the historical behavior, or should we error out even in repository format v0? I lean toward the latter; we'll need to be cautious, though, e.g. by making this a separate patch so we can easily tweak it if this ends up being disruptive in some unanticipated way. - making "git init" use repository format v1 by default. It's been long enough that users can count on Git implementations supporting it. This way, users are less likely to run into v0+extensions confusion, just because users are less likely to be using v0. Does that sound like a good plan to others? If so, are there any steps beyond the two first patches in jn/v0-with-extensions-fix that we would want in order to prepare for it in 2.28? My preference would be to move forward in 2.28 with the first two patches in that topic branch (i.e., *not* the third yet), since they don't produce any user facing behavior that would create danger for users or clash with this plan. Today, the only extensions we recognize are in that set of extensions that we'll want to continue to recognize in v0 (except possibly the for-testing extension "noop"). The steps to take with additional extensions are more subtle so it seems reasonable for them to bake in "next" and then "master" for a 2.29 release. Thanks, Jonathan
Jonathan Nieder <jrnieder@gmail.com> writes: > - defining the list of repository format v0 supported extensions as > "these and no more", futureproofing along the lines suggested in > Peff's patch. I like the general approach taken there since it > allows parsing the relevant config in a single pass, so I think > it basically takes the right approach. (That said, it might be > possible to simplify a bit with further changes, e.g. by using the > configset API.) > > When doing this for real, we'd want to document the set of > supported extensions. That is especially useful to independent > implementers wanting to support Git's formats, since it tells > them "this is the minimum set of extensions that you must > either handle or error out cleanly on to maintain compatibility > with Git's repository format v0". Good. > - improving the behavior when an extension not supported in v0 is > encountered in a v0 repository. For extensions that are supported > in v1 and not v0, we should presumably error out so the user can > repair the repository, and we can put the "noop" extension in that > category for the sake of easy testing. We can also include a check > in "git fsck" for repositories that request the undefined behavior > of v0 repositories with non-v0 extensions, for faster diagnosis. > > What about unrecognized extensions that are potentially extensions > yet to be defined? Should these be silently ignored to match the > historical behavior, or should we error out even in repository > format v0? I lean toward the latter; we'll need to be cautious, > though, e.g. by making this a separate patch so we can easily tweak > it if this ends up being disruptive in some unanticipated way. I disagree with your first paragraph. Those that weren't honored by mistake back in v0 days, in addition to those that aren't known to us even now, should just be silently ignored, not causing an error. > - making "git init" use repository format v1 by default. It's been > long enough that users can count on Git implementations supporting > it. This way, users are less likely to run into v0+extensions > confusion, just because users are less likely to be using v0. Absolutely. I would think this is a very good move. > Does that sound like a good plan to others? If so, are there any > steps beyond the two first patches in jn/v0-with-extensions-fix that > we would want in order to prepare for it in 2.28? > > My preference would be to move forward in 2.28 with the first two > patches in that topic branch (i.e., *not* the third yet), since they > don't produce any user facing behavior that would create danger for > users or clash with this plan. Yup, I agree. I'd give another name to the third commit and then rewind jn/v0-with-extensions-fix by one to prevent mistakes from happening. Thanks.
On Thu, Jul 16, 2020 at 03:37:19PM -0700, Jonathan Nieder wrote: > For 2.29, I would like to do or see the following: > > - defining the list of repository format v0 supported extensions as > "these and no more", futureproofing along the lines suggested in > Peff's patch. I like the general approach taken there since it > allows parsing the relevant config in a single pass, so I think > it basically takes the right approach. (That said, it might be > possible to simplify a bit with further changes, e.g. by using the > configset API.) > > When doing this for real, we'd want to document the set of > supported extensions. That is especially useful to independent > implementers wanting to support Git's formats, since it tells > them "this is the minimum set of extensions that you must > either handle or error out cleanly on to maintain compatibility > with Git's repository format v0". I think we should still consider people setting v0 along with extensions to be a bug. It was never documented to work that way and we are being nice by keeping the existing behavior, but it is still wrong (and pre-extension versions of Git will silently ignore them). I don't mind making other implementers aware of the special status, but we should be careful not to endorse the broken setup. > - making "git init" use repository format v1 by default. It's been > long enough that users can count on Git implementations supporting > it. This way, users are less likely to run into v0+extensions > confusion, just because users are less likely to be using v0. That's probably reasonable. It will be mildly annoying for people like me who are often testing old versions of Git, but I'm sure I will survive. We should make sure that all major implementations handle v1 reasonably first, though (and that they did so long enough ago that it's not likely to cause problems). > My preference would be to move forward in 2.28 with the first two > patches in that topic branch (i.e., *not* the third yet), since they > don't produce any user facing behavior that would create danger for > users or clash with this plan. Today, the only extensions we > recognize are in that set of extensions that we'll want to continue to > recognize in v0 (except possibly the for-testing extension "noop"). > The steps to take with additional extensions are more subtle so it > seems reasonable for them to bake in "next" and then "master" for a > 2.29 release. I'm OK with the plan to ship the first two patches for 2.28, and leave my patch for later (or even soften it from "die" to "ignore with a warning"). I think leaving "noop" in that set of special extensions makes sense, since it lets us test that case easily (and I added a "noop-v1" in my patch to test the other one; clearly we could also flip it and have noop-v0). -Peff
On Thu, Jul 16, 2020 at 04:50:34PM -0700, Junio C Hamano wrote: > > - improving the behavior when an extension not supported in v0 is > > encountered in a v0 repository. For extensions that are supported > > in v1 and not v0, we should presumably error out so the user can > > repair the repository, and we can put the "noop" extension in that > > category for the sake of easy testing. We can also include a check > > in "git fsck" for repositories that request the undefined behavior > > of v0 repositories with non-v0 extensions, for faster diagnosis. > > > > What about unrecognized extensions that are potentially extensions > > yet to be defined? Should these be silently ignored to match the > > historical behavior, or should we error out even in repository > > format v0? I lean toward the latter; we'll need to be cautious, > > though, e.g. by making this a separate patch so we can easily tweak > > it if this ends up being disruptive in some unanticipated way. > > I disagree with your first paragraph. Those that weren't honored by > mistake back in v0 days, in addition to those that aren't known to us > even now, should just be silently ignored, not causing an error. That's very much the opposite of my patch. As we add new extensions, those "unknowns" will start to die(). I remain unconvinced that there are a bunch of unknown extension.* config options hanging around in the wild, but maybe I'm being naive. It seems to me more likely that users will be helped by warning about extensions that _should_ have had v1 set than that they will be harmed because they put random crap in their extensions.* config. But maybe you know of a specific example? Anyway, if we move to "v1" as the default for "git init" anyway, then the number of people being helped would become much smaller. > > My preference would be to move forward in 2.28 with the first two > > patches in that topic branch (i.e., *not* the third yet), since they > > don't produce any user facing behavior that would create danger for > > users or clash with this plan. > > Yup, I agree. I'd give another name to the third commit and then > rewind jn/v0-with-extensions-fix by one to prevent mistakes from > happening. Thanks. OK. I was confused to see it still at the tip in the latest What's Cooking, but I think we're just crossing emails. :) -Peff
Jeff King <peff@peff.net> writes: > Anyway, if we move to "v1" as the default for "git init" anyway, then > the number of people being helped would become much smaller. Yup. So in that sense I do not think I care too deeply either way. >> > My preference would be to move forward in 2.28 with the first two >> > patches in that topic branch (i.e., *not* the third yet), since they >> > don't produce any user facing behavior that would create danger for >> > users or clash with this plan. >> >> Yup, I agree. I'd give another name to the third commit and then >> rewind jn/v0-with-extensions-fix by one to prevent mistakes from >> happening. Thanks. > > OK. I was confused to see it still at the tip in the latest What's > Cooking, but I think we're just crossing emails. :) Yes.
diff --git a/cache.h b/cache.h index 126ec56c7f3..654426460cc 100644 --- a/cache.h +++ b/cache.h @@ -1042,7 +1042,6 @@ struct repository_format { int worktree_config; int is_bare; int hash_algo; - int has_extensions; char *work_tree; struct string_list unknown_extensions; }; diff --git a/setup.c b/setup.c index 87bf0112cf3..3a81307602e 100644 --- a/setup.c +++ b/setup.c @@ -455,7 +455,6 @@ static int check_repo_format(const char *var, const char *value, void *vdata) if (strcmp(var, "core.repositoryformatversion") == 0) data->version = git_config_int(var, value); else if (skip_prefix(var, "extensions.", &ext)) { - data->has_extensions = 1; /* * record any known extensions here; otherwise, * we fall through to recording it as unknown, and @@ -553,13 +552,16 @@ int upgrade_repository_format(int target_version) if (repo_fmt.version >= target_version) return 0; - if (verify_repository_format(&repo_fmt, &err) < 0 || - (!repo_fmt.version && repo_fmt.has_extensions)) { - warning("unable to upgrade repository format from %d to %d: %s", - repo_fmt.version, target_version, err.buf); + if (verify_repository_format(&repo_fmt, &err) < 0) { + error("cannot upgrade repository format from %d to %d: %s", + repo_fmt.version, target_version, err.buf); strbuf_release(&err); return -1; } + if (!repo_fmt.version && repo_fmt.unknown_extensions.nr) + return error("cannot upgrade repository format: " + "unknown extension %s", + repo_fmt.unknown_extensions.items[0].string); strbuf_addf(&repo_version, "%d", target_version); git_config_set("core.repositoryformatversion", repo_version.buf); diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 51d1eba6050..6aa0f313bdd 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -42,7 +42,7 @@ test_expect_success 'convert shallow clone to partial clone' ' test_cmp_config -C client 1 core.repositoryformatversion ' -test_expect_success 'converting to partial clone fails with noop extension' ' +test_expect_success 'convert to partial clone with noop extension' ' rm -fr server client && test_create_repo server && test_commit -C server my_commit 1 && @@ -50,7 +50,7 @@ test_expect_success 'converting to partial clone fails with noop extension' ' git clone --depth=1 "file://$(pwd)/server" client && test_cmp_config -C client 0 core.repositoryformatversion && git -C client config extensions.noop true && - test_must_fail git -C client fetch --unshallow --filter="blob:none" + git -C client fetch --unshallow --filter="blob:none" ' test_expect_success 'converting to partial clone fails with unrecognized extension' '
Now that we officially permit repository extensions in repository format v0, permit upgrading a repository with extensions from v0 to v1 as well. For example, this means a repository where the user has set "extensions.preciousObjects" can use "git fetch --filter=blob:none origin" to upgrade the repository to use v1 and the partial clone extension. To avoid mistakes, continue to forbid repository format upgrades in v0 repositories with an unrecognized extension. This way, a v0 user using a misspelled extension field gets a chance to correct the mistake before updating to the less forgiving v1 format. While we're here, make the error message for failure to upgrade the repository format a bit shorter, and present it as an error, not a warning. Reported-by: Huan Huan Chen <huanhuanchen@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Apologies again for the trouble, and thanks for your patient help. cache.h | 1 - setup.c | 12 +++++++----- t/t0410-partial-clone.sh | 4 ++-- 3 files changed, 9 insertions(+), 8 deletions(-)