diff mbox series

[v3,01/26] sparse-index: API protection strategy

Message ID 4731f610ba6e70a271126e513f8df46d4e0cc5c6.1618261697.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 839a66349e094a28f29577a8b311491b9d6b907b
Headers show
Series Sparse Index: API protections | expand

Commit Message

Derrick Stolee April 12, 2021, 9:07 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

Edit and expand the sparse-index design document with the plan for
guarding index operations with ensure_full_index().

Notably, the plan has changed to not have an expand_to_path() method in
favor of checking for a sparse-directory hit inside of the
index_path_pos() API.

The changes that follow this one will incrementally add
ensure_full_index() guards to iterations over all cache entries. Some
iterations over the cache entries are not protected due to a few
categories listed in the document. Since these are not being modified,
here is a short list of the files and methods that will not receive
these guards:

Looking for non-zero stage:
* builtin/add.c:chmod_pathspec()
* builtin/merge.c:count_unmerged_entries()
* merge-ort.c:record_conflicted_index_entries()
* read-cache.c:unmerged_index()
* rerere.c:check_one_conflict(), find_conflict(), rerere_remaining()
* revision.c:prepare_show_merge()
* sequencer.c:append_conflicts_hint()
* wt-status.c:wt_status_collect_changes_initial()

Looking for submodules:
* builtin/submodule--helper.c:module_list_compute()
* submodule.c: several methods
* worktree.c:validate_no_submodules()

Part of the index API:
* name-hash.c: lazy init methods
* preload-index.c:preload_thread(), preload_index()
* read-cache.c: file format methods

Checking for correct order of cache entries:
* read-cache.c:check_ce_order()

Ignores SKIP_WORKTREE entries or already aware:
* unpack-trees.c:mark_new_skip_worktree()
* wt-status.c:wt_status_check_sparse_checkout()

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/technical/sparse-index.txt | 37 ++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/Documentation/technical/sparse-index.txt b/Documentation/technical/sparse-index.txt
index 8d3d80804604..3b24c1a219f8 100644
--- a/Documentation/technical/sparse-index.txt
+++ b/Documentation/technical/sparse-index.txt
@@ -85,8 +85,41 @@  index, as well.
 
 Next, consumers of the index will be guarded against operating on a
 sparse-index by inserting calls to `ensure_full_index()` or
-`expand_index_to_path()`. After these guards are in place, we can begin
-leaving sparse-directory entries in the in-memory index structure.
+`expand_index_to_path()`. If a specific path is requested, then those will
+be protected from within the `index_file_exists()` and `index_name_pos()`
+API calls: they will call `ensure_full_index()` if necessary. The
+intention here is to preserve existing behavior when interacting with a
+sparse-checkout. We don't want a change to happen by accident, without
+tests. Many of these locations may not need any change before removing the
+guards, but we should not do so without tests to ensure the expected
+behavior happens.
+
+It may be desirable to _change_ the behavior of some commands in the
+presence of a sparse index or more generally in any sparse-checkout
+scenario. In such cases, these should be carefully communicated and
+tested. No such behavior changes are intended during this phase.
+
+During a scan of the codebase, not every iteration of the cache entries
+needs an `ensure_full_index()` check. The basic reasons include:
+
+1. The loop is scanning for entries with non-zero stage. These entries
+   are not collapsed into a sparse-directory entry.
+
+2. The loop is scanning for submodules. These entries are not collapsed
+   into a sparse-directory entry.
+
+3. The loop is part of the index API, especially around reading or
+   writing the format.
+
+4. The loop is checking for correct order of cache entries and that is
+   correct if and only if the sparse-directory entries are in the correct
+   location.
+
+5. The loop ignores entries with the `SKIP_WORKTREE` bit set, or is
+   otherwise already aware of sparse directory entries.
+
+6. The sparse-index is disabled at this point when using the split-index
+   feature, so no effort is made to protect the split-index API.
 
 Even after inserting these guards, we will keep expanding sparse-indexes
 for most Git commands using the `command_requires_full_index` repository