[v2,0/5] Sparse index: fetch, pull, ls-files

Message ID: pull.1080.v2.git.1638992395.gitgitgadget@gmail.com

Message

Derrick Stolee via GitGitGadget Dec. 8, 2021, 7:39 p.m. UTC
This is based on ld/sparse-index-blame (merged with 'master' due to an
unrelated build issue).

Here are two relatively-simple patches that further the sparse index
integrations.

Did you know that 'fetch' and 'pull' read the index? I didn't, or this would
have been an integration much earlier in the cycle. They read the index to
look for the .gitmodules file in case there are submodules that need to be
fetched. Since looking for a file by name is already protected, we only need
to disable 'command_requires_full_index' and we are done.

The 'ls-files' builtin is useful when debugging the index, and some scripts
use it, too. We are not changing the default behavior which expands a sparse
index in order to show all of the cached blobs. Instead, we add a '--sparse'
option that allows us to see the sparse directory entries upon request.
Combined with --debug, we can see a lot of index details, such as:

$ git ls-files --debug --sparse
LICENSE
  ctime: 1634910503:287405820
  mtime: 1634910503:287405820
  dev: 16777220 ino: 119325319
  uid: 501  gid: 20
  size: 1098    flags: 200000
README.md
  ctime: 1634910503:288090279
  mtime: 1634910503:288090279
  dev: 16777220 ino: 119325320
  uid: 501  gid: 20
  size: 934 flags: 200000
bin/index.js
  ctime: 1634910767:828434033
  mtime: 1634910767:828434033
  dev: 16777220 ino: 119325520
  uid: 501  gid: 20
  size: 7292    flags: 200000
examples/
  ctime: 0:0
  mtime: 0:0
  dev: 0    ino: 0
  uid: 0    gid: 0
  size: 0   flags: 40004000
package.json
  ctime: 1634910503:288676330
  mtime: 1634910503:288676330
  dev: 16777220 ino: 119325321
  uid: 501  gid: 20
  size: 680 flags: 200000


(In this example, the 'examples/' directory is sparse.)
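
As a rough illustration of reading this output, the sketch below picks out the sparse directory entries from a saved copy of the listing. It assumes that path lines are unindented, that per-entry detail lines are indented, and that a sparse directory entry is the one whose path ends in '/'. The file name 'debug-out' and the variable 'sparse_dirs' are made up for the example, and the sample is an abbreviated copy of the output above, not fresh output from git.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current repo is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Abbreviated copy of the 'git ls-files --debug --sparse' output shown above
# (hypothetical file name: debug-out).
cat >debug-out <<\EOF
LICENSE
  ctime: 1634910503:287405820
  mtime: 1634910503:287405820
  dev: 16777220 ino: 119325319
  uid: 501  gid: 20
  size: 1098    flags: 200000
examples/
  ctime: 0:0
  mtime: 0:0
  dev: 0    ino: 0
  uid: 0    gid: 0
  size: 0   flags: 40004000
EOF

# Drop the indented detail lines, then keep only paths ending in '/'.
sparse_dirs=$(grep -v '^[[:space:]]' debug-out | grep '/$')

echo "$sparse_dirs"
```

With the sample above this prints 'examples/'. Against a real repository, the same filter would presumably be applied to the live command output instead of a saved file.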

Thanks!


Updates in v2
=============

 * Rebased onto latest ld/sparse-index-blame without issue.
 * Updated the test to use diff-of-diffs instead of a sequence of greps.
 * Added patches that remove the use of 'test-tool read-cache --table' and
   its implementation.
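
The diff-based check mentioned in the list above can be tried outside the test suite. The sketch below is only an approximation of what the updated test does: the two listings are hand-written stand-ins for 'git ls-files' and 'git ls-files --sparse' output, plain 'cmp' stands in for the test suite's 'test_cmp' helper, and the scratch directory is an assumption of the example. The point is that 'tail -n +3' drops the two 'diff -u' header lines, which carry file names and timestamps, so only the stable hunk is compared.

```shell
#!/bin/sh
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hand-written stand-ins for the expanded and the sparse listings.
printf '%s\n' a folder1/0/1 folder1/a z >dense
printf '%s\n' a folder1/ z >sparse

# The hunk we expect once the '---' and '+++' header lines are stripped.
cat >expect <<\EOF
@@ -1,4 +1,3 @@
 a
-folder1/0/1
-folder1/a
+folder1/
 z
EOF

# diff exits non-zero because the files differ; in a plain sh pipeline
# the exit status is that of tail, so this step still succeeds.
diff -u dense sparse | tail -n +3 >actual

# cmp stands in for test_cmp here.
cmp expect actual && echo "listings differ only as expected"
```

Compared with a sequence of greps, this form also fails if an unexpected entry appears or an expected one goes missing anywhere in the compared region.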

Derrick Stolee (5):
  fetch/pull: use the sparse index
  ls-files: add --sparse option
  t1092: replace 'read-cache --table' with 'ls-files --sparse'
  t1091/t3705: remove 'test-tool read-cache --table'
  test-read-cache: remove --table, --expand options

 Documentation/git-ls-files.txt           |   4 +
 builtin/fetch.c                          |   2 +
 builtin/ls-files.c                       |  12 ++-
 builtin/pull.c                           |   2 +
 t/helper/test-read-cache.c               |  64 ++---------
 t/t1091-sparse-checkout-builtin.sh       |  25 ++++-
 t/t1092-sparse-checkout-compatibility.sh | 129 ++++++++++++++++++++---
 t/t3705-add-sparse-checkout.sh           |   8 +-
 8 files changed, 165 insertions(+), 81 deletions(-)


base-commit: 3fffe69d24e4ecc95246766f5396303a953695ff
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1080%2Fderrickstolee%2Fsparse-index%2Ffetch-pull-ls-files-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1080/derrickstolee/sparse-index/fetch-pull-ls-files-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1080

Range-diff vs v1:

 1:  451056e1a77 ! 1:  f72001638d1 fetch/pull: use the sparse index
     @@ builtin/pull.c: int cmd_pull(int argc, const char **argv, const char *prefix)
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: blame' '
     - 	ensure_not_expanded blame deep/deeper1/deepest/a
     + 	done
       '
       
      +test_expect_success 'sparse index is not expanded: fetch/pull' '
 2:  e42c0feec94 ! 2:  58b5eca4835 ls-files: add --sparse option
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n
      +	test_all_match git ls-files &&
      +
      +	# With --sparse, the sparse index data changes behavior.
     -+	git -C sparse-index ls-files --sparse >sparse-index-out &&
     -+	grep "^folder1/\$" sparse-index-out &&
     -+	grep "^folder2/\$" sparse-index-out &&
     ++	git -C sparse-index ls-files >dense &&
     ++	git -C sparse-index ls-files --sparse >sparse &&
     ++
     ++	cat >expect <<-\EOF &&
     ++	@@ -13,13 +13,9 @@
     ++	 e
     ++	 folder1-
     ++	 folder1.x
     ++	-folder1/0/0/0
     ++	-folder1/0/1
     ++	-folder1/a
     ++	+folder1/
     ++	 folder10
     ++	-folder2/0/0/0
     ++	-folder2/0/1
     ++	-folder2/a
     ++	+folder2/
     ++	 g
     ++	-x/a
     ++	+x/
     ++	 z
     ++	EOF
     ++
     ++	diff -u dense sparse | tail -n +3 >actual &&
     ++	test_cmp expect actual &&
      +
      +	# With --sparse and no sparse index, nothing changes.
     -+	git -C sparse-checkout ls-files --sparse >sparse-checkout-out &&
     -+	grep "^folder1/0/0/0\$" sparse-checkout-out &&
     -+	! grep "/\$" sparse-checkout-out &&
     ++	git -C sparse-checkout ls-files >dense &&
     ++	git -C sparse-checkout ls-files --sparse >sparse &&
     ++	test_cmp dense sparse &&
      +
      +	write_script edit-content <<-\EOF &&
      +	mkdir folder1 &&
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n
      +	git -C sparse-index ls-files --sparse --modified >sparse-index-out &&
      +	test_must_be_empty sparse-index-out &&
      +
     -+	run_on_sparse git sparse-checkout add folder1 &&
     ++	# Add folder1 to the sparse-checkout cone and
     ++	# check that ls-files shows the expanded files.
     ++	test_sparse_match git sparse-checkout add folder1 &&
      +	test_sparse_match git ls-files --modified &&
     -+	grep "^folder1/a\$" sparse-checkout-out &&
     -+	grep "^folder1/a\$" sparse-index-out &&
      +
     -+	# Double-check index expansion
     ++	git -C sparse-index ls-files >dense &&
     ++	git -C sparse-index ls-files --sparse >sparse &&
     ++
     ++	cat >expect <<-\EOF &&
     ++	@@ -17,9 +17,7 @@
     ++	 folder1/0/1
     ++	 folder1/a
     ++	 folder10
     ++	-folder2/0/0/0
     ++	-folder2/0/1
     ++	-folder2/a
     ++	+folder2/
     ++	 g
     ++	-x/a
     ++	+x/
     ++	 z
     ++	EOF
     ++
     ++	diff -u dense sparse | tail -n +3 >actual &&
     ++	test_cmp expect actual &&
     ++
     ++	# Double-check index expansion is avoided
      +	ensure_not_expanded ls-files --sparse
      +'
      +
 -:  ----------- > 3:  5ffae2a03ae t1092: replace 'read-cache --table' with 'ls-files --sparse'
 -:  ----------- > 4:  b98e5e6d2bc t1091/t3705: remove 'test-tool read-cache --table'
 -:  ----------- > 5:  f31a24eeb9b test-read-cache: remove --table, --expand options

Comments

Elijah Newren Dec. 9, 2021, 5:23 a.m. UTC | #1
On Wed, Dec 8, 2021 at 11:39 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This is based on ld/sparse-index-blame (merged with 'master' due to an
> unrelated build issue).
>
> Here are two relatively-simple patches that further the sparse index
> integrations.
>
> Did you know that 'fetch' and 'pull' read the index? I didn't, or this would
> have been an integration much earlier in the cycle. They read the index to
> look for the .gitmodules file in case there are submodules that need to be
> fetched. Since looking for a file by name is already protected, we only need
> to disable 'command_requires_full_index' and we are done.
>
> The 'ls-files' builtin is useful when debugging the index, and some scripts
> use it, too. We are not changing the default behavior which expands a sparse
> index in order to show all of the cached blobs. Instead, we add a '--sparse'
> option that allows us to see the sparse directory entries upon request.
> Combined with --debug, we can see a lot of index details, such as:
>
> $ git ls-files --debug --sparse
> LICENSE
>   ctime: 1634910503:287405820
>   mtime: 1634910503:287405820
>   dev: 16777220 ino: 119325319
>   uid: 501  gid: 20
>   size: 1098    flags: 200000
> README.md
>   ctime: 1634910503:288090279
>   mtime: 1634910503:288090279
>   dev: 16777220 ino: 119325320
>   uid: 501  gid: 20
>   size: 934 flags: 200000
> bin/index.js
>   ctime: 1634910767:828434033
>   mtime: 1634910767:828434033
>   dev: 16777220 ino: 119325520
>   uid: 501  gid: 20
>   size: 7292    flags: 200000
> examples/
>   ctime: 0:0
>   mtime: 0:0
>   dev: 0    ino: 0
>   uid: 0    gid: 0
>   size: 0   flags: 40004000
> package.json
>   ctime: 1634910503:288676330
>   mtime: 1634910503:288676330
>   dev: 16777220 ino: 119325321
>   uid: 501  gid: 20
>   size: 680 flags: 200000
>
>
> (In this example, the 'examples/' directory is sparse.)
>
> Thanks!
>
>
> Updates in v2
> =============
>
>  * Rebased onto latest ld/sparse-index-blame without issue.
>  * Updated the test to use diff-of-diffs instead of a sequence of greps.
>  * Added patches that remove the use of 'test-tool read-cache --table' and
>    its implementation.

I still think a couple things in patch 2 deserve some comments about
the expectations.  Other than that, though, the series reads nicely
and I was only able to spot a few other very minor items.

> Derrick Stolee (5):
>   fetch/pull: use the sparse index
>   ls-files: add --sparse option
>   t1092: replace 'read-cache --table' with 'ls-files --sparse'
>   t1091/t3705: remove 'test-tool read-cache --table'
>   test-read-cache: remove --table, --expand options
>
>  Documentation/git-ls-files.txt           |   4 +
>  builtin/fetch.c                          |   2 +
>  builtin/ls-files.c                       |  12 ++-
>  builtin/pull.c                           |   2 +
>  t/helper/test-read-cache.c               |  64 ++---------
>  t/t1091-sparse-checkout-builtin.sh       |  25 ++++-
>  t/t1092-sparse-checkout-compatibility.sh | 129 ++++++++++++++++++++---
>  t/t3705-add-sparse-checkout.sh           |   8 +-
>  8 files changed, 165 insertions(+), 81 deletions(-)
>
>
> base-commit: 3fffe69d24e4ecc95246766f5396303a953695ff
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1080%2Fderrickstolee%2Fsparse-index%2Ffetch-pull-ls-files-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1080/derrickstolee/sparse-index/fetch-pull-ls-files-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1080
>
> Range-diff vs v1:
>
>  1:  451056e1a77 ! 1:  f72001638d1 fetch/pull: use the sparse index
>      @@ builtin/pull.c: int cmd_pull(int argc, const char **argv, const char *prefix)
>
>        ## t/t1092-sparse-checkout-compatibility.sh ##
>       @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is not expanded: blame' '
>      -  ensure_not_expanded blame deep/deeper1/deepest/a
>      +  done
>        '
>
>       +test_expect_success 'sparse index is not expanded: fetch/pull' '
>  2:  e42c0feec94 ! 2:  58b5eca4835 ls-files: add --sparse option
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n
>       + test_all_match git ls-files &&
>       +
>       + # With --sparse, the sparse index data changes behavior.
>      -+ git -C sparse-index ls-files --sparse >sparse-index-out &&
>      -+ grep "^folder1/\$" sparse-index-out &&
>      -+ grep "^folder2/\$" sparse-index-out &&
>      ++ git -C sparse-index ls-files >dense &&
>      ++ git -C sparse-index ls-files --sparse >sparse &&
>      ++
>      ++ cat >expect <<-\EOF &&
>      ++ @@ -13,13 +13,9 @@
>      ++  e
>      ++  folder1-
>      ++  folder1.x
>      ++ -folder1/0/0/0
>      ++ -folder1/0/1
>      ++ -folder1/a
>      ++ +folder1/
>      ++  folder10
>      ++ -folder2/0/0/0
>      ++ -folder2/0/1
>      ++ -folder2/a
>      ++ +folder2/
>      ++  g
>      ++ -x/a
>      ++ +x/
>      ++  z
>      ++ EOF
>      ++
>      ++ diff -u dense sparse | tail -n +3 >actual &&
>      ++ test_cmp expect actual &&
>       +
>       + # With --sparse and no sparse index, nothing changes.
>      -+ git -C sparse-checkout ls-files --sparse >sparse-checkout-out &&
>      -+ grep "^folder1/0/0/0\$" sparse-checkout-out &&
>      -+ ! grep "/\$" sparse-checkout-out &&
>      ++ git -C sparse-checkout ls-files >dense &&
>      ++ git -C sparse-checkout ls-files --sparse >sparse &&
>      ++ test_cmp dense sparse &&
>       +
>       + write_script edit-content <<-\EOF &&
>       + mkdir folder1 &&
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse index is n
>       + git -C sparse-index ls-files --sparse --modified >sparse-index-out &&
>       + test_must_be_empty sparse-index-out &&
>       +
>      -+ run_on_sparse git sparse-checkout add folder1 &&
>      ++ # Add folder1 to the sparse-checkout cone and
>      ++ # check that ls-files shows the expanded files.
>      ++ test_sparse_match git sparse-checkout add folder1 &&
>       + test_sparse_match git ls-files --modified &&
>      -+ grep "^folder1/a\$" sparse-checkout-out &&
>      -+ grep "^folder1/a\$" sparse-index-out &&
>       +
>      -+ # Double-check index expansion
>      ++ git -C sparse-index ls-files >dense &&
>      ++ git -C sparse-index ls-files --sparse >sparse &&
>      ++
>      ++ cat >expect <<-\EOF &&
>      ++ @@ -17,9 +17,7 @@
>      ++  folder1/0/1
>      ++  folder1/a
>      ++  folder10
>      ++ -folder2/0/0/0
>      ++ -folder2/0/1
>      ++ -folder2/a
>      ++ +folder2/
>      ++  g
>      ++ -x/a
>      ++ +x/
>      ++  z
>      ++ EOF
>      ++
>      ++ diff -u dense sparse | tail -n +3 >actual &&
>      ++ test_cmp expect actual &&
>      ++
>      ++ # Double-check index expansion is avoided
>       + ensure_not_expanded ls-files --sparse
>       +'
>       +
>  -:  ----------- > 3:  5ffae2a03ae t1092: replace 'read-cache --table' with 'ls-files --sparse'
>  -:  ----------- > 4:  b98e5e6d2bc t1091/t3705: remove 'test-tool read-cache --table'
>  -:  ----------- > 5:  f31a24eeb9b test-read-cache: remove --table, --expand options
>
> --
> gitgitgadget