Message ID | 014a408ea5d9894197c60f8d712749ea3cc39c9d.1633641339.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Sparse Index: integrate with reset | expand |
On 08/10/21 04.15, Victoria Dye via GitGitGadget wrote: > From: Victoria Dye <vdye@github.com> > > Add a new `--force-full-index` option to `git update-index`, which skips > explicitly setting `command_requires_full_index`. This option, intended for > use in internal testing purposes only, lets `git update-index` run as a > command without sparse index compatibility implemented, even after it > receives updates to otherwise use the sparse index. > > The specific test `--force-full-index` is intended for - `t1092 - > sparse-index is expanded and converted back` - verifies index compatibility > in commands that do not change the default (enabled) > `command_requires_full_index` repo setting. In the past, the test used `git > reset`. However, as `reset` and other commands are integrated with the > sparse index, the command used in the test would need to keep changing. > Conversely, the `--force-full-index` option makes `git update-index` behave > like a not-yet-sparse-aware command, and can be used in the test > indefinitely without interfering with future sparse index integrations. > > Helped-by: Junio C Hamano <gitster@pobox.com> > Signed-off-by: Victoria Dye <vdye@github.com> Grammar looks OK. Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > + /* > + * If --force-full-index is set, the command should skip manually > + * setting `command_requires_full_index`. > + */ Hmph, doesn't that feel unnaturally backwards, though? The settings.command_requires_full_index bit forces read-cache to call ensure_full_index() immediately after the in-core index is read from the disk. If we are forcing operating on the full index, I'd imagine that we'd be making sure that ensure_full_index() to be called. I do not see anything in the code that ensures active_cache_changed to be flipped on. So the new test that says git -C sparse-index -c core.fsmonitor="" update-index --force-full-index may not call ensure_full_index(), but because nothing marks the_index as changed, I think we won't call write_locked_index() at the end of cmd_update_index(). IOW, what we have in the test patch may be an expensive noop, no? Or perhaps I am reading the patch completely incorrectly. I dunno. > + prepare_repo_settings(r); > + if (!use_default_full_index) > + r->settings.command_requires_full_index = 1; > + > entries = read_cache(); > if (entries < 0) > die("cache corrupted"); > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index 889079f55b8..4aa4fef7b4f 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -635,7 +635,7 @@ test_expect_success 'sparse-index is expanded and converted back' ' > init_repos && > > GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ > - git -C sparse-index -c core.fsmonitor="" reset --hard && > + git -C sparse-index -c core.fsmonitor="" update-index --force-full-index && > test_region index convert_to_sparse trace2.txt && > test_region index ensure_full_index trace2.txt > '
Junio C Hamano wrote: > "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> + /* >> + * If --force-full-index is set, the command should skip manually >> + * setting `command_requires_full_index`. >> + */ > > Hmph, doesn't that feel unnaturally backwards, though? > > The settings.command_requires_full_index bit forces read-cache to > call ensure_full_index() immediately after the in-core index is read > from the disk. If we are forcing operating on the full index, I'd > imagine that we'd be making sure that ensure_full_index() to be > called. > I tried coming up with a user-facing name that wasn't too focused on the internal implementation, but it ends up being misleading. The intention was to have this be a variation of `git update-index` that uses the default setting for `command_requires_full_index` but then proceeds to read and write the index as `update-index` normally would. Something like `--use-default-index-sparsity` might have been more accurate? > I do not see anything in the code that ensures active_cache_changed > to be flipped on. So the new test that says > > git -C sparse-index -c core.fsmonitor="" update-index --force-full-index > > may not call ensure_full_index(), but because nothing marks > the_index as changed, I think we won't call write_locked_index() at > the end of cmd_update_index(). IOW, what we have in the test patch > may be an expensive noop, no? > In the test's use-case, `active_cache_changed` ends up set to `CACHE_TREE_CHANGED`, which forces writing the index. It is still effectively a no-op, but it serves the needs of the test. In any case, Elijah suggested using a `test-tool` subcommand for this purpose [1], which I think is more appropriate overall. Something like `test-tool read-write-cache` can be implemented to make no mention of `command_requires_full_index` (therefore using its default value) and force a basic read & write of the index. It also eliminates the issue of having a user-facing name at all, and can easily be removed once all sparse index integrations are done. [1] https://lore.kernel.org/git/CABPp-BF+bEUcyE0N79uRCkpCayJx_NMqOpnMSHHrpJM5a9hAWw@mail.gmail.com/
Victoria Dye <vdye@github.com> writes: > I tried coming up with a user-facing name that wasn't too focused on the > internal implementation, but it ends up being misleading. The intention was > to have this be a variation of `git update-index` that uses the default > setting for `command_requires_full_index` but then proceeds to read and > write the index as `update-index` normally would. Something like > `--use-default-index-sparsity` might have been more accurate? The option name in the reviewed patch does imply "we force expanding to full" and not "use the default", so it probably needs renaming, if we want the "use the default" semantics. But is that useful in the context of the test you are using it in place of "reset" or "mv"? Even if the default is somehow flipped to use sparse always, wouldn't the particular test want the index expanded? I dunno. > In the test's use-case, `active_cache_changed` ends up set to > `CACHE_TREE_CHANGED`, which forces writing the index. It is still > effectively a no-op, but it serves the needs of the test. Ah, cache-tree is updated, then it's OK. As to test-tool vs end-user-accessible-command, I do not have a strong opinion, but use your imagination and ask Derrick or somebody else for their imagination to see if such a "force expand" feature may be something the end-users might need an access to in order to dig themselves out of a hole (in which case, it may be better to make it end-user-accessible) or not (in which case, test-tool is more appropriate). Thanks.
On 10/8/21 1:19 PM, Junio C Hamano wrote: > Victoria Dye <vdye@github.com> writes: > >> I tried coming up with a user-facing name that wasn't too focused on the >> internal implementation, but it ends up being misleading. The intention was >> to have this be a variation of `git update-index` that uses the default >> setting for `command_requires_full_index` but then proceeds to read and >> write the index as `update-index` normally would. Something like >> `--use-default-index-sparsity` might have been more accurate? > > The option name in the reviewed patch does imply "we force expanding > to full" and not "use the default", so it probably needs renaming, > if we want the "use the default" semantics. But is that useful in > the context of the test you are using it in place of "reset" or "mv"? > Even if the default is somehow flipped to use sparse always, wouldn't > the particular test want the index expanded? I dunno. > >> In the test's use-case, `active_cache_changed` ends up set to >> `CACHE_TREE_CHANGED`, which forces writing the index. It is still >> effectively a no-op, but it serves the needs of the test. > > Ah, cache-tree is updated, then it's OK. > > As to test-tool vs end-user-accessible-command, I do not have a > strong opinion, but use your imagination and ask Derrick or somebody > else for their imagination to see if such a "force expand" feature > may be something the end-users might need an access to in order to > dig themselves out of a hole (in which case, it may be better to > make it end-user-accessible) or not (in which case, test-tool is > more appropriate). I think there is something to be said about the name being confusing, because the current implementation focuses on "expand a sparse index upon read" but it also allows the index to be written as sparse. Conversely, if the user runs git -c index.sparse=false update-index ... then the index.sparse config setting forbids conversion from full to sparse, but does not say anything about expanding to full. Perhaps this should be corrected: the index.sparse=false setting should expand a sparse index to a full one, then prevent it from being converted to a sparse one on write. This diff should do it: diff --git a/read-cache.c b/read-cache.c index 564283c7e7e..04df1051e18 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2376,7 +2376,8 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) if (!istate->repo) istate->repo = the_repository; prepare_repo_settings(istate->repo); - if (istate->repo->settings.command_requires_full_index) + if (!istate->repo->settings.sparse_index || + istate->repo->settings.command_requires_full_index) ensure_full_index(istate); return istate->cache_nr; Victoria, what are your thoughts about including such a change? Junio, would it be better to change the config setting, and then update this test to use the config setting over a command-line flag? This would allow us to punt on the --force-full-index flag until we have time to focus on the 'git update-index' command and interactions like this. Thanks, -Stolee
Derrick Stolee wrote: > On 10/8/21 1:19 PM, Junio C Hamano wrote: >> Victoria Dye <vdye@github.com> writes: >> >>> I tried coming up with a user-facing name that wasn't too focused on the >>> internal implementation, but it ends up being misleading. The intention was >>> to have this be a variation of `git update-index` that uses the default >>> setting for `command_requires_full_index` but then proceeds to read and >>> write the index as `update-index` normally would. Something like >>> `--use-default-index-sparsity` might have been more accurate? >> >> The option name in the reviewed patch does imply "we force expanding >> to full" and not "use the default", so it probably needs renaming, >> if we want the "use the default" semantics. But is that useful in >> the context of the test you are using it in place of "reset" or "mv"? >> Even if the default is somehow flipped to use sparse always, wouldn't >> the particular test want the index expanded? I dunno. >> >>> In the test's use-case, `active_cache_changed` ends up set to >>> `CACHE_TREE_CHANGED`, which forces writing the index. It is still >>> effectively a no-op, but it serves the needs of the test. >> >> Ah, cache-tree is updated, then it's OK. >> >> As to test-tool vs end-user-accessible-command, I do not have a >> strong opinion, but use your imagination and ask Derrick or somebody >> else for their imagination to see if such a "force expand" feature >> may be something the end-users might need an access to in order to >> dig themselves out of a hole (in which case, it may be better to >> make it end-user-accessible) or not (in which case, test-tool is >> more appropriate). > > I think there is something to be said about the name being confusing, > because the current implementation focuses on "expand a sparse index > upon read" but it also allows the index to be written as sparse. > This helps clarify what I was misinterpreting in the test. It isn't looking for "default" behavior, it's verifying whether trace2 logs capture index expansion and collapse when those operations are expected to happen, regardless of whether that's because `command_requires_full_index` is 1 or because the command needs to use entries inside of sparse directories. With that interpretation, I can replace the command with `git reset update-folder1 -- folder1/a` and get the same result (without needing to change the test in the future *or* add a new `git` command option / `test-tool` subcommand). > Conversely, if the user runs > > git -c index.sparse=false update-index ... > > then the index.sparse config setting forbids conversion from full to > sparse, but does not say anything about expanding to full. > > Perhaps this should be corrected: the index.sparse=false setting > should expand a sparse index to a full one, then prevent it from > being converted to a sparse one on write. > > This diff should do it: > > diff --git a/read-cache.c b/read-cache.c > index 564283c7e7e..04df1051e18 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -2376,7 +2376,8 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) > if (!istate->repo) > istate->repo = the_repository; > prepare_repo_settings(istate->repo); > - if (istate->repo->settings.command_requires_full_index) > + if (!istate->repo->settings.sparse_index || > + istate->repo->settings.command_requires_full_index) > ensure_full_index(istate); > > return istate->cache_nr; > > Victoria, what are your thoughts about including such a change? > I think this is a worthwhile change, but I'd prefer submitting it separately (either in an upcoming sparse index integration or on its own). It's not directly needed by anything in this series, and I'd like to avoid adding features to the scope if possible.
Derrick Stolee <stolee@gmail.com> writes: > Junio, would it be better to change the config setting, and then > update this test to use the config setting over a command-line flag? > This would allow us to punt on the --force-full-index flag until we > have time to focus on the 'git update-index' command and interactions > like this. I do not have a strong opinion on where we add the feature; as long as we have a way to let us avoid having to unnecessarily change this particular test, that's perfectly fine, and if we can reuse it as a way for end-users to help those who are debugging their issues, that would be an added bonus. Thanks.
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt index 2853f168d97..06255e321a3 100644 --- a/Documentation/git-update-index.txt +++ b/Documentation/git-update-index.txt @@ -24,6 +24,7 @@ SYNOPSIS [--[no-]fsmonitor] [--really-refresh] [--unresolve] [--again | -g] [--info-only] [--index-info] + [--force-full-index] [-z] [--stdin] [--index-version <n>] [--verbose] [--] [<file>...] @@ -170,6 +171,10 @@ time. Version 4 is relatively young (first released in 1.8.0 in October 2012). Other Git implementations such as JGit and libgit2 may not support it yet. +--force-full-index:: + Force the command to operate on a full index, expanding a sparse + index if necessary. + -z:: Only meaningful with `--stdin` or `--index-info`; paths are separated with NUL character instead of LF. diff --git a/builtin/update-index.c b/builtin/update-index.c index 187203e8bb5..32ada3ead77 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -964,6 +964,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) int split_index = -1; int force_write = 0; int fsmonitor = -1; + int use_default_full_index = 0; struct lock_file lock_file = LOCK_INIT; struct parse_opt_ctx_t ctx; strbuf_getline_fn getline_fn; @@ -1069,6 +1070,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) {OPTION_SET_INT, 0, "no-fsmonitor-valid", &mark_fsmonitor_only, NULL, N_("clear fsmonitor valid bit"), PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG}, + OPT_SET_INT(0, "force-full-index", &use_default_full_index, + N_("run with full index explicitly required"), 1), OPT_END() }; @@ -1082,6 +1085,14 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) if (newfd < 0) lock_error = errno; + /* + * If --force-full-index is set, the command should skip manually + * setting `command_requires_full_index`. + */ + prepare_repo_settings(r); + if (!use_default_full_index) + r->settings.command_requires_full_index = 1; + entries = read_cache(); if (entries < 0) die("cache corrupted"); diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 889079f55b8..4aa4fef7b4f 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -635,7 +635,7 @@ test_expect_success 'sparse-index is expanded and converted back' ' init_repos && GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index -c core.fsmonitor="" reset --hard && + git -C sparse-index -c core.fsmonitor="" update-index --force-full-index && test_region index convert_to_sparse trace2.txt && test_region index ensure_full_index trace2.txt '