Message ID | 49813c8d9ed94fd56f30eb204d346eb5a30473ca.1633440057.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Sparse Index: integrate with reset | expand |
On Tue, Oct 5, 2021 at 6:21 AM Victoria Dye via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: Victoria Dye <vdye@github.com> > > `reset --soft` does not modify the index, so no compatibility changes are > needed for it to function without expanding the index. For all other reset > modes (`--mixed`, `--hard`, `--keep`, `--merge`), the full index is > explicitly expanded with `ensure_full_index` to maintain current behavior. "to maintain current behavior"? You are changing code here, which suggests some kind of behavior is changing, but that description seems to be claiming the opposite. Is it some kind of preventative change to add ensure_full_index calls in an additional place, with a later patch in the series intending to remove the other one(s), so you're making sure that later changes won't cause unwanted behavioral changes? Or was something else meant here? If the above wasn't what you meant, but you're adding ensure_full_index calls, does that suggest that we had some important code paths that were not protected by such calls? I thought Stolee said we had them all covered (at least to the best of our knowledge), so I'm curious if we just discovered we missed some. If so, are there other codepaths like this one where we missed protective ensure_full_index calls? > Additionally, the `read_cache()` check verifying an uncorrupted index is > moved after argument parsing and preparing the repo settings. The index is > not used by the preceding argument handling, but `read_cache()` does need to > be run after enabling sparse index for the command and before resetting. This seems to be discussing what code changes are being made, but not why. I'm guessing at the reasoning, but is it something along the lines of: """ Also, make sure to read_cache() after setting command_requires_full_index = 0, so that we don't unnecessarily expand the index as part of our early index-corruption check. """ ? > > Signed-off-by: Victoria Dye <vdye@github.com> > --- > builtin/reset.c | 10 +++++++--- > cache-tree.c | 1 + > 2 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/builtin/reset.c b/builtin/reset.c > index 3b75d3b2f20..e1f2a2bb2c4 100644 > --- a/builtin/reset.c > +++ b/builtin/reset.c > @@ -184,6 +184,7 @@ static int read_from_tree(const struct pathspec *pathspec, > opt.flags.override_submodule_config = 1; > opt.repo = the_repository; > > + ensure_full_index(&the_index); > if (do_diff_cache(tree_oid, &opt)) > return 1; > diffcore_std(&opt); > @@ -261,9 +262,6 @@ static void parse_args(struct pathspec *pathspec, > } > *rev_ret = rev; > > - if (read_cache() < 0) > - die(_("index file corrupt")); > - > parse_pathspec(pathspec, 0, > PATHSPEC_PREFER_FULL | > (patch_mode ? PATHSPEC_PREFIX_ORIGIN : 0), > @@ -409,6 +407,12 @@ int cmd_reset(int argc, const char **argv, const char *prefix) > if (intent_to_add && reset_type != MIXED) > die(_("-N can only be used with --mixed")); > > + prepare_repo_settings(the_repository); > + the_repository->settings.command_requires_full_index = 0; > + > + if (read_cache() < 0) > + die(_("index file corrupt")); > + > /* Soft reset does not touch the index file nor the working tree > * at all, but requires them in a good order. Other resets reset > * the index file to the tree object we are switching to. */ > diff --git a/cache-tree.c b/cache-tree.c > index 90919f9e345..9be19c85b66 100644 > --- a/cache-tree.c > +++ b/cache-tree.c > @@ -776,6 +776,7 @@ void prime_cache_tree(struct repository *r, > cache_tree_free(&istate->cache_tree); > istate->cache_tree = cache_tree(); > > + ensure_full_index(istate); > prime_cache_tree_rec(r, istate->cache_tree, tree); > istate->cache_changed |= CACHE_TREE_CHANGED; > trace2_region_leave("cache-tree", "prime_cache_tree", the_repository); > -- > gitgitgadget >
Elijah Newren <newren@gmail.com> writes: > On Tue, Oct 5, 2021 at 6:21 AM Victoria Dye via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> From: Victoria Dye <vdye@github.com> >> >> `reset --soft` does not modify the index, so no compatibility changes are >> needed for it to function without expanding the index. For all other reset >> modes (`--mixed`, `--hard`, `--keep`, `--merge`), the full index is >> explicitly expanded with `ensure_full_index` to maintain current behavior. > > "to maintain current behavior"? You are changing code here, which > suggests some kind of behavior is changing, but that description seems > to be claiming the opposite. Is it some kind of preventative change > to add ensure_full_index calls in an additional place, with a later > patch in the series intending to remove the other one(s), so you're > making sure that later changes won't cause unwanted behavioral > changes? Or was something else meant here? > > If the above wasn't what you meant, but you're adding > ensure_full_index calls, does that suggest that we had some important > code paths that were not protected by such calls? The original called read_cache() before we know which mode we operate in, near the end of parse_args(), which resulted in an unconditional call to ensure_full_index() in repo_read_index(). This patch delays the call to read_cache(). If parse_pathspec() and everything the original called after the point where it called read_cache() needed to have a populated in-core index, the change can break things---I didn't check thoroughly, but I am guessing it is OK. >> Additionally, the `read_cache()` check verifying an uncorrupted index is >> moved after argument parsing and preparing the repo settings. The index is >> not used by the preceding argument handling, but `read_cache()` does need to >> be run after enabling sparse index for the command and before resetting. > > This seems to be discussing what code changes are being made, but not > why. I'm guessing at the reasoning, but is it something along the > lines of: > > """ > Also, make sure to read_cache() after setting > command_requires_full_index = 0, so that we don't unnecessarily expand > the index as part of our early index-corruption check. > """ I think it is more like "we used to expand very early for all modes, but with this change we move the read_cache() call to much later, and force it not to expand. The modes that call read_from_tree() needs in-core index fully expanded, so we do so there, but the soft reset does not call it and would stop expanding."
diff --git a/builtin/reset.c b/builtin/reset.c index 3b75d3b2f20..e1f2a2bb2c4 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -184,6 +184,7 @@ static int read_from_tree(const struct pathspec *pathspec, opt.flags.override_submodule_config = 1; opt.repo = the_repository; + ensure_full_index(&the_index); if (do_diff_cache(tree_oid, &opt)) return 1; diffcore_std(&opt); @@ -261,9 +262,6 @@ static void parse_args(struct pathspec *pathspec, } *rev_ret = rev; - if (read_cache() < 0) - die(_("index file corrupt")); - parse_pathspec(pathspec, 0, PATHSPEC_PREFER_FULL | (patch_mode ? PATHSPEC_PREFIX_ORIGIN : 0), @@ -409,6 +407,12 @@ int cmd_reset(int argc, const char **argv, const char *prefix) if (intent_to_add && reset_type != MIXED) die(_("-N can only be used with --mixed")); + prepare_repo_settings(the_repository); + the_repository->settings.command_requires_full_index = 0; + + if (read_cache() < 0) + die(_("index file corrupt")); + /* Soft reset does not touch the index file nor the working tree * at all, but requires them in a good order. Other resets reset * the index file to the tree object we are switching to. */ diff --git a/cache-tree.c b/cache-tree.c index 90919f9e345..9be19c85b66 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -776,6 +776,7 @@ void prime_cache_tree(struct repository *r, cache_tree_free(&istate->cache_tree); istate->cache_tree = cache_tree(); + ensure_full_index(istate); prime_cache_tree_rec(r, istate->cache_tree, tree); istate->cache_changed |= CACHE_TREE_CHANGED; trace2_region_leave("cache-tree", "prime_cache_tree", the_repository);