mbox series

[v4,00/12] Sparse-index: integrate with status

Message ID pull.932.v4.git.1621598382.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Sparse-index: integrate with status | expand

Message

Derrick Stolee via GitGitGadget May 21, 2021, 11:59 a.m. UTC
This is the first "payoff" series in the sparse-index work. It makes 'git
status' very fast when a sparse-index is enabled on a repository with
cone-mode sparse-checkout (and a small populated set).

This is based on ds/sparse-index-protections AND mt/add-rm-sparse-checkout.
The latter branch is needed because it changes the behavior of 'git add'
around sparse entries, which changes the expectations of a test added in
patch 1.

The approach here is to audit the places where ensure_full_index() pops up
while doing normal commands with pathspecs within the sparse-checkout
definition. Each of these are checked and tested. In the end, the
sparse-index is integrated with these features:

 * git status
 * FS Monitor index extension.

The performance tests in p2000-sparse-operations.sh improve by 95% or more,
even when compared with the full-index cases, not just the sparse-index
cases that previously had extra overhead.

Hopefully this is the first example of how ds/sparse-index-protections has
done the basic work to do these conversions safely, making them look easier
than they seemed when starting this adventure.

Thanks, -Stolee


Updates in V4
=============

 * The previous patch "unpack-trees: stop recursing into sparse directories"
   was confusing, and actually a bit sloppy.
 * It has been replaced with "unpack-trees: be careful around sparse
   directory entries" which takes the sparse-directory checks and raises
   them higher up into unpack_trees.c instead of in diff-lib.c.


Updates in V3
=============

Sorry that this was a long time coming. I got a little side-tracked on other
projects, but I also worked to get the sparse-index feature working against
the Scalar functional tests, which contain many special cases around the
sparse-checkout feature as they were inherited from special cases that arose
in the virtualized environment of VFS for Git. This version contains my
fixes based on that investigation. Most of these were easy to identify and
fix, but I was blocked for a long time struggling with a bug when combining
the sparse-index with the builtin FS Monitor feature, but I've reported my
findings already [1].

[1]
https://lore.kernel.org/git/0b9e54ba-ac27-e537-7bef-1b4448f92352@gmail.com/

 * Updated comments and tests based on the v2 feedback.
 * Expanded the test repository data shape based on the special cases found
   during my investigation.
 * Added several commits that either fix errors in the status code, or fix
   errors in the previous sparse-index series, specifically:
   * When in a conflict state, the cache-tree fails to update. For now, skip
     writing a sparse-index until this can be resolved more carefully.
   * When expanding a sparse-directory entry, we set the CE_SKIP_WORKTREE
     bit but forgot the CE_EXTENDED bit.
   * git status had failures if there was a sparse-directory entry as the
     first entry within a directory.
   * When expanding a directory to report its status, such as when a
     sparse-directory is staged but doesn't exist at HEAD (such as in an
     orphaned commit) we did not previously recurse correctly into
     subdirectories.
   * Be extra careful with the FS Monitor data when expanding or contracting
     an index. This version now abandons all FS Monitor data at these
     conversion points with the expectation that in the future these
     conversions will be rare so the FS Monitor feature can work
     efficiently. Updates in V2

----------------------------------------------------------------------------

 * Based on the feedback, it is clear that 'git add' will require much more
   careful testing and thought. I'm splitting it out of this series and it
   will return with a follow-up.
 * Test cases are improved, both in coverage and organization.
 * The previous "unpack-trees: make sparse aware" patch is split into three
   now.
 * Stale messages based on an old implementation of the "protections" topic
   are now fixed.
 * Performance tests were re-run.

Derrick Stolee (12):
  sparse-index: skip indexes with unmerged entries
  sparse-index: include EXTENDED flag when expanding
  t1092: expand repository data shape
  t1092: add tests for status/add and sparse files
  unpack-trees: preserve cache_bottom
  unpack-trees: compare sparse directories correctly
  unpack-trees: be careful around sparse directory entries
  dir.c: accept a directory as part of cone-mode patterns
  status: skip sparse-checkout percentage with sparse-index
  status: use sparse-index throughout
  wt-status: expand added sparse directory entries
  fsmonitor: integrate with sparse index

 builtin/commit.c                         |   3 +
 dir.c                                    |  11 +++
 read-cache.c                             |  10 +-
 sparse-index.c                           |  27 +++++-
 t/t1092-sparse-checkout-compatibility.sh | 117 ++++++++++++++++++++++-
 t/t7519-status-fsmonitor.sh              |  48 ++++++++++
 unpack-trees.c                           |  26 ++++-
 wt-status.c                              |  64 ++++++++++++-
 wt-status.h                              |   1 +
 9 files changed, 295 insertions(+), 12 deletions(-)


base-commit: f723f370c89ad61f4f40aabfd3540b1ce19c00e5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-932%2Fderrickstolee%2Fsparse-index%2Fstatus-and-add-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-932/derrickstolee/sparse-index/status-and-add-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/932

Range-diff vs v3:

  1:  5a2ed3d1d701 =  1:  5a2ed3d1d701 sparse-index: skip indexes with unmerged entries
  2:  8aa41e749471 =  2:  8aa41e749471 sparse-index: include EXTENDED flag when expanding
  3:  70971b1f9261 =  3:  70971b1f9261 t1092: expand repository data shape
  4:  a80b5a41153f =  4:  a80b5a41153f t1092: add tests for status/add and sparse files
  5:  07a45b661c4a =  5:  07a45b661c4a unpack-trees: preserve cache_bottom
  6:  cc4a526e7947 =  6:  cc4a526e7947 unpack-trees: compare sparse directories correctly
  7:  598375d3531f <  -:  ------------ unpack-trees: stop recursing into sparse directories
  -:  ------------ >  7:  e28df7f9395d unpack-trees: be careful around sparse directory entries
  8:  47da2b317237 =  8:  2cc3a93d4434 dir.c: accept a directory as part of cone-mode patterns
  9:  bc1512981493 =  9:  5011feb1aa04 status: skip sparse-checkout percentage with sparse-index
 10:  5b1ae369a7cd = 10:  9f2ce5301dc9 status: use sparse-index throughout
 11:  3b42783d4a86 = 11:  24417e095243 wt-status: expand added sparse directory entries
 12:  b72507f514d1 = 12:  584d4b559a91 fsmonitor: integrate with sparse index