Message ID | b67cd9d07f8a3802e5b50d58965f283620cd3876.1635515487.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | sparse-index: expand/collapse based on 'index.sparse' | expand |
On Fri, Oct 29, 2021 at 6:56 AM Victoria Dye via GitGitGadget <gitgitgadget@gmail.com> wrote: > > From: Victoria Dye <vdye@github.com> > > If `command_requires_full_index` is false, ensure correct in-core index > sparsity on read by calling `ensure_correct_sparsity`. This change is meant > to update the how the index is read in a command after sparse index-related s/update the how/update how/ ? > repository settings are modified. Previously, for example, if `index.sparse` > were changed from `true` to `false`, the in-core index on the next command > would be sparse. The index would only be expanded to full when it was next > written to disk. > > By adding a call to `ensure_correct_sparsity`, the in-core index now matches > the sparsity dictated by the relevant repository settings as soon as it is > read into memory, rather than when it is later written to disk. I split up reading this series across different days, and when I got here, my first reaction was "Okay, but why would you want that? Sounds like extra work for no gain." I went and looked up the cover letter and saw you mentioned that this "introduces the ability to enable/disable the sparse index on a command-by-command basis (e.g., allowing a user to troubleshoot a sparse-aware command with '-c index.sparse=false' [1]). That seems like a good reason to me, and sounds like it belongs in this commit message. But it sounds like you had other reasons in mind. If so, could you share them; I'm having difficulty understanding how this would make a difference other than in the troubleshooting scenario. > > Helped-by: Junio C Hamano <gitster@pobox.com> > Co-authored-by: Derrick Stolee <dstolee@microsoft.com> > Signed-off-by: Victoria Dye <vdye@github.com> > --- > read-cache.c | 8 ++++++ > t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++ > 2 files changed, 39 insertions(+) > > diff --git a/read-cache.c b/read-cache.c > index a78b88a41bf..b3772ba70a1 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) > > if (!istate->repo) > istate->repo = the_repository; > + > + /* > + * If the command explicitly requires a full index, force it > + * to be full. Otherwise, correct the sparsity based on repository > + * settings and other properties of the index (if necessary). > + */ > prepare_repo_settings(istate->repo); > if (istate->repo->settings.command_requires_full_index) > ensure_full_index(istate); > + else > + ensure_correct_sparsity(istate); > > return istate->cache_nr; > > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index ca91c6a67f8..59accde1fa3 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' ' > test_region index ensure_full_index trace2.txt > ' > > +test_expect_success 'index.sparse disabled inline uses full index' ' > + init_repos && > + > + # When index.sparse is disabled inline with `git status`, the > + # index is expanded at the beginning of the execution then never > + # converted back to sparse. It is then written to disk as a full index. > + rm -f trace2.txt && > + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ > + git -C sparse-index -c index.sparse=false status && > + ! test_region index convert_to_sparse trace2.txt && > + test_region index ensure_full_index trace2.txt && > + > + # Since index.sparse is set to true at a repo level, the index > + # is converted from full to sparse when read, then never expanded > + # over the course of `git status`. It is written to disk as a sparse > + # index. > + rm -f trace2.txt && > + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ > + git -C sparse-index status && > + test_region index convert_to_sparse trace2.txt && > + ! test_region index ensure_full_index trace2.txt && > + > + # Now that the index has been written to disk as sparse, it is not > + # converted to sparse (or expanded to full) when read by `git status`. > + rm -f trace2.txt && > + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ > + git -C sparse-index status && > + ! test_region index convert_to_sparse trace2.txt && > + ! test_region index ensure_full_index trace2.txt > +' > + > ensure_not_expanded () { > rm -f trace2.txt && > echo >>sparse-index/untracked.txt && > -- > gitgitgadget The rest of the patch looks fine.
Elijah Newren wrote: > On Fri, Oct 29, 2021 at 6:56 AM Victoria Dye via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> >> From: Victoria Dye <vdye@github.com> >> >> If `command_requires_full_index` is false, ensure correct in-core index >> sparsity on read by calling `ensure_correct_sparsity`. This change is meant >> to update the how the index is read in a command after sparse index-related > > s/update the how/update how/ ? > This was probably the result of mixing up "update the way" and "update how". I'll go with the latter. >> repository settings are modified. Previously, for example, if `index.sparse` >> were changed from `true` to `false`, the in-core index on the next command >> would be sparse. The index would only be expanded to full when it was next >> written to disk. >> >> By adding a call to `ensure_correct_sparsity`, the in-core index now matches >> the sparsity dictated by the relevant repository settings as soon as it is >> read into memory, rather than when it is later written to disk. > > I split up reading this series across different days, and when I got > here, my first reaction was "Okay, but why would you want that? > Sounds like extra work for no gain." I went and looked up the cover > letter and saw you mentioned that this "introduces the ability to > enable/disable the sparse index on a command-by-command basis (e.g., > allowing a user to troubleshoot a sparse-aware command with '-c > index.sparse=false' [1]). That seems like a good reason to me, and > sounds like it belongs in this commit message. But it sounds like you > had other reasons in mind. If so, could you share them; I'm having > difficulty understanding how this would make a difference other than > in the troubleshooting scenario. > The troubleshooting scenario was what inspired this change in the first place, but it applies any time someone changes repository settings outside of `git sparse-checkout init`. One relevant scenario I can imagine is if a user enables `index.sparse, commands would not use the sparse index until a command that *writes* the index is executed: $ git config index.sparse true $ git diff # read full index, full in-core, no write $ git add . # read full index, full in-core, write sparse $ git diff # read sparse index, sparse in-core, no write Another scenario might be enabling `index.sparse`, then running a command that turns out to be surprisingly slow because it's operating on a full index in-core (i.e., much slower than the convert_to_sparse + command with sparse index would be). These are definitely edge cases - they rely on manual config changes, and they're only really noticeable 1) if the config change is immediately followed by index read-only commands and 2) if the first index-writing command after the config change is slow. That said, on the off chance that a user (or future developer) encounter one of these scenarios, I think it'll be helpful to have `index.sparse` take effect "immediately" after the config change. I'll include these (and the troubleshooting) examples in the updated commit message. >> >> Helped-by: Junio C Hamano <gitster@pobox.com> >> Co-authored-by: Derrick Stolee <dstolee@microsoft.com> >> Signed-off-by: Victoria Dye <vdye@github.com> >> --- >> read-cache.c | 8 ++++++ >> t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++ >> 2 files changed, 39 insertions(+) >> >> diff --git a/read-cache.c b/read-cache.c >> index a78b88a41bf..b3772ba70a1 100644 >> --- a/read-cache.c >> +++ b/read-cache.c >> @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) >> >> if (!istate->repo) >> istate->repo = the_repository; >> + >> + /* >> + * If the command explicitly requires a full index, force it >> + * to be full. Otherwise, correct the sparsity based on repository >> + * settings and other properties of the index (if necessary). >> + */ >> prepare_repo_settings(istate->repo); >> if (istate->repo->settings.command_requires_full_index) >> ensure_full_index(istate); >> + else >> + ensure_correct_sparsity(istate); >> >> return istate->cache_nr; >> >> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh >> index ca91c6a67f8..59accde1fa3 100755 >> --- a/t/t1092-sparse-checkout-compatibility.sh >> +++ b/t/t1092-sparse-checkout-compatibility.sh >> @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' ' >> test_region index ensure_full_index trace2.txt >> ' >> >> +test_expect_success 'index.sparse disabled inline uses full index' ' >> + init_repos && >> + >> + # When index.sparse is disabled inline with `git status`, the >> + # index is expanded at the beginning of the execution then never >> + # converted back to sparse. It is then written to disk as a full index. >> + rm -f trace2.txt && >> + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ >> + git -C sparse-index -c index.sparse=false status && >> + ! test_region index convert_to_sparse trace2.txt && >> + test_region index ensure_full_index trace2.txt && >> + >> + # Since index.sparse is set to true at a repo level, the index >> + # is converted from full to sparse when read, then never expanded >> + # over the course of `git status`. It is written to disk as a sparse >> + # index. >> + rm -f trace2.txt && >> + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ >> + git -C sparse-index status && >> + test_region index convert_to_sparse trace2.txt && >> + ! test_region index ensure_full_index trace2.txt && >> + >> + # Now that the index has been written to disk as sparse, it is not >> + # converted to sparse (or expanded to full) when read by `git status`. >> + rm -f trace2.txt && >> + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ >> + git -C sparse-index status && >> + ! test_region index convert_to_sparse trace2.txt && >> + ! test_region index ensure_full_index trace2.txt >> +' >> + >> ensure_not_expanded () { >> rm -f trace2.txt && >> echo >>sparse-index/untracked.txt && >> -- >> gitgitgadget > > The rest of the patch looks fine. >
diff --git a/read-cache.c b/read-cache.c index a78b88a41bf..b3772ba70a1 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) if (!istate->repo) istate->repo = the_repository; + + /* + * If the command explicitly requires a full index, force it + * to be full. Otherwise, correct the sparsity based on repository + * settings and other properties of the index (if necessary). + */ prepare_repo_settings(istate->repo); if (istate->repo->settings.command_requires_full_index) ensure_full_index(istate); + else + ensure_correct_sparsity(istate); return istate->cache_nr; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index ca91c6a67f8..59accde1fa3 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' ' test_region index ensure_full_index trace2.txt ' +test_expect_success 'index.sparse disabled inline uses full index' ' + init_repos && + + # When index.sparse is disabled inline with `git status`, the + # index is expanded at the beginning of the execution then never + # converted back to sparse. It is then written to disk as a full index. + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c index.sparse=false status && + ! test_region index convert_to_sparse trace2.txt && + test_region index ensure_full_index trace2.txt && + + # Since index.sparse is set to true at a repo level, the index + # is converted from full to sparse when read, then never expanded + # over the course of `git status`. It is written to disk as a sparse + # index. + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index status && + test_region index convert_to_sparse trace2.txt && + ! test_region index ensure_full_index trace2.txt && + + # Now that the index has been written to disk as sparse, it is not + # converted to sparse (or expanded to full) when read by `git status`. + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index status && + ! test_region index convert_to_sparse trace2.txt && + ! test_region index ensure_full_index trace2.txt +' + ensure_not_expanded () { rm -f trace2.txt && echo >>sparse-index/untracked.txt &&