mbox series

[v2,0/2] Mark cat-file sparse-index compatible

Message ID pull.1770.v2.git.git.1725052243.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Mark cat-file sparse-index compatible | expand

Message

John Cai via GitGitGadget Aug. 30, 2024, 9:10 p.m. UTC
Please note that this is my first contribution to git. I've tried to follow
the instructions about how to correctly submit a patch (I'm using
GitGitGadget as getting Outlook to do plain text e-mail correctly seems
impossible), but please let me know if I've missed something.

My motivation for making this change is purely performance. We have a large
repository that we enable the sparse index for, and I am developing a
pre-commit hook that (among other things) uses git cat-file to get the
staged contents of certain files. Without this change, getting the contents
of a single small file from the index can take upwards of 10 seconds due to
the index expansion. After this change, it only takes ~0.3 seconds unless
the file is outside of the sparse index.

Kevin Lyles (2):
  Allow using stdin in run_on_* functions
  Mark 'git cat-file' sparse-index compatible

 builtin/cat-file.c                       |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 48 +++++++++++++++++++++---
 2 files changed, 46 insertions(+), 5 deletions(-)


base-commit: 80ccd8a2602820fdf896a8e8894305225f86f61d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1770%2Fklylesatepic%2Fkl%2Fmark-cat-file-sparse-index-compatible-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1770/klylesatepic/kl/mark-cat-file-sparse-index-compatible-v2
Pull-Request: https://github.com/git/git/pull/1770

Range-diff vs v1:

 -:  ----------- > 1:  7067a4c5da2 Allow using stdin in run_on_* functions
 1:  60c15dd246a ! 2:  a92825e502f Mark `cat-file` sparse-index compatible
     @@ Metadata
      Author: Kevin Lyles <klyles+github@epic.com>
      
       ## Commit message ##
     -    Mark `cat-file` sparse-index compatible
     +    Mark 'git cat-file' sparse-index compatible
      
     -    `cat-file` will expand a sparse index to a full index when needed, but
     -    is currently marked as needing a full index (or rather, not marked as
     -    not needing a full index). This results in much slower `cat-file`
     +    This change affects how 'git cat-file' works with the index when
     +    specifying an object with the ":<path>" syntax (which will give file
     +    contents from the index).
     +
     +    'git cat-file' will expand a sparse index to a full index when needed,
     +    but is currently marked as needing a full index (or rather, not marked
     +    as not needing a full index). This results in much slower 'git cat-file'
          operations when working within the sparse index, since we expand the
          index whether it's needed or not.
      
     -    Mark `cat-file` as not needing a full index, so that you only pay the
     -    cost of expanding the sparse index to a full index when you request a
     -    file outside of the sparse index.
     +    Mark 'git cat-file' as not needing a full index, so that you only pay
     +    the cost of expanding the sparse index to a full index when you request
     +    a file outside of the sparse index.
      
          Add tests to ensure both that:
     -    - `cat-file` returns the correct file contents whether or not the file
     -      is in the sparse index
     -    - `cat-file` warns about expanding to the full index any time you
     -      request something outside of the sparse index
     +    - 'git cat-file' returns the correct file contents whether or not the
     +      file is in the sparse index
     +    - 'git cat-file' expands to the full index any time you request
     +      something outside of the sparse index
      
          Signed-off-by: Kevin Lyles <klyles+github@epic.com>
      
     @@ builtin/cat-file.c: int cmd_cat_file(int argc, const char **argv, const char *pr
       		if (opt_cw)
      
       ## t/t1092-sparse-checkout-compatibility.sh ##
     -@@ t/t1092-sparse-checkout-compatibility.sh: init_repos_as_submodules () {
     - }
     - 
     - run_on_sparse () {
     -+	tee run_on_sparse-checkout run_on_sparse-index &&
     -+
     - 	(
     - 		cd sparse-checkout &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
     --	) &&
     -+	) <run_on_sparse-checkout &&
     - 	(
     - 		cd sparse-index &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../sparse-index-out 2>../sparse-index-err
     --	)
     -+	) <run_on_sparse-index
     - }
     - 
     - run_on_all () {
     -+	tee run_on_all-full run_on_all-sparse &&
     -+
     - 	(
     - 		cd full-checkout &&
     - 		GIT_PROGRESS_DELAY=100000 "$@" >../full-checkout-out 2>../full-checkout-err
     --	) &&
     --	run_on_sparse "$@"
     -+	) <run_on_all-full &&
     -+	run_on_sparse "$@" <run_on_all-sparse
     - }
     - 
     - test_all_match () {
     -@@ t/t1092-sparse-checkout-compatibility.sh: test_sparse_unstaged () {
     - 	done
     - }
     - 
     --# Usage: test_sprase_checkout_set "<c1> ... <cN>" "<s1> ... <sM>"
     -+# Usage: test_sparse_checkout_set "<c1> ... <cN>" "<s1> ... <sM>"
     - # Verifies that "git sparse-checkout set <c1> ... <cN>" succeeds and
     - # leaves the sparse index in a state where <s1> ... <sM> are sparse
     - # directories (and <c1> ... <cN> are not).
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseIndexExpanded' '
       	grep "The sparse index is expanding to a full index" err
       '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseInde
      +	echo "new content" >>sparse-index/deep/a &&
      +	run_on_all git add deep/a &&
      +
     -+	test_all_match git cat-file -p HEAD:deep/a &&
     -+	ensure_not_expanded cat-file -p HEAD:deep/a &&
     -+	test_all_match git cat-file -p HEAD:folder1/a &&
     -+	ensure_not_expanded cat-file -p HEAD:folder1/a &&
     -+
      +	test_all_match git cat-file -p :deep/a &&
      +	ensure_not_expanded cat-file -p :deep/a &&
     -+	run_on_all git cat-file -p :folder1/a &&
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     ++	test_all_match git cat-file -p :folder1/a &&
      +	ensure_expanded cat-file -p :folder1/a'
      +
      +test_expect_success 'cat-file --batch' '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'advice.sparseInde
      +	echo "new content" >>sparse-index/deep/a &&
      +	run_on_all git add deep/a &&
      +
     -+	cat <<-\EOF | test_all_match git cat-file --batch &&
     -+	HEAD:deep/a
     -+	:deep/a
     -+	EOF
     -+	cat <<-\EOF | ensure_not_expanded cat-file --batch &&
     -+	HEAD:deep/a
     -+	:deep/a
     -+	EOF
     ++	echo ":deep/a">in &&
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_not_expanded cat-file --batch <in &&
      +
     -+	echo "HEAD:folder1/a" | test_all_match git cat-file --batch &&
     -+	echo "HEAD:folder1/a" | ensure_not_expanded cat-file --batch &&
     ++	echo ":folder1/a">in &&
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_expanded cat-file --batch <in &&
      +
     -+	echo ":folder1/a" | run_on_all git cat-file --batch &&
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     -+	echo ":folder1/a" | ensure_expanded cat-file --batch &&
     -+
     -+	cat <<-\EOF | run_on_all git cat-file --batch &&
     ++	cat <<-\EOF >in &&
      +	:deep/a
      +	:folder1/a
      +	EOF
     -+	test_cmp full-checkout-out sparse-checkout-out &&
     -+	test_cmp full-checkout-out sparse-index-out &&
     -+	test_cmp full-checkout-err sparse-checkout-err &&
     -+	cat <<-\EOF | ensure_expanded cat-file --batch
     -+	:deep/a
     -+	:folder1/a
     -+	EOF'
     ++	test_all_match git cat-file --batch <in &&
     ++	ensure_expanded cat-file --batch <in'
      +
       test_done