diff mbox series

[v1] worktree: integrate with sparse-index

Message ID 20230605161644.491424-1-cheskaqiqi@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v1] worktree: integrate with sparse-index | expand

Commit Message

Shuqi Liang June 5, 2023, 4:16 p.m. UTC
The index is read in 'worktree.c' at two points:

1.The 'validate_no_submodules' function, which checks if there are any
submodules present in the worktree.

2.The 'check_clean_worktree' function, which verifies if a worktree is
'clean', i.e., there are no untracked or modified but uncommitted files.
This is done by running the 'git status' command, and an error message
is thrown if the worktree is not clean. Given that 'git status' is
already sparse-aware, the function is also sparse-aware.

Hence we can just set the requires-full-index to false for
"git worktree".

Add tests that verify that 'git worktree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.

The `p2000` tests demonstrate a ~20% execution time reduction for
'git worktree' using a sparse index:

(Note:the p2000 test results did't reflect the huge speedup because of
the index reading time is minuscule comparing to the filesystem
operations.)

Test                                       before  after
-----------------------------------------------------------------------
2000.102: git worktree add....(full-v3)    3.15    2.82  -10.5%
2000.103: git worktree add....(full-v4)    3.14    2.84  -9.6%
2000.104: git worktree add....(sparse-v3)  2.59    2.14  -16.4%
2000.105: git worktree add....(sparse-v4)  2.10    1.57  -25.2%

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
---
 builtin/worktree.c                       |  4 ++++
 t/perf/p2000-sparse-operations.sh        |  1 +
 t/t1092-sparse-checkout-compatibility.sh | 23 +++++++++++++++++++++++
 3 files changed, 28 insertions(+)

Comments

Victoria Dye June 5, 2023, 7:16 p.m. UTC | #1
Shuqi Liang wrote:
> The index is read in 'worktree.c' at two points:
> 
> 1.The 'validate_no_submodules' function, which checks if there are any
> submodules present in the worktree.
> 
> 2.The 'check_clean_worktree' function, which verifies if a worktree is
> 'clean', i.e., there are no untracked or modified but uncommitted files.
> This is done by running the 'git status' command, and an error message
> is thrown if the worktree is not clean. Given that 'git status' is
> already sparse-aware, the function is also sparse-aware.
> 
> Hence we can just set the requires-full-index to false for
> "git worktree".

Thanks for the detailed analysis! This lines up with my understanding of the
command as well; I'm glad the sparse index integration is so straightforward
here!

> 
> Add tests that verify that 'git worktree' behaves correctly when the
> sparse index is enabled and test to ensure the index is not expanded.
> 
> The `p2000` tests demonstrate a ~20% execution time reduction for
> 'git worktree' using a sparse index:
> 
> (Note:the p2000 test results did't reflect the huge speedup because of

s/did't/didn't

(not worth fixing if you don't end up re-rolling, though!)

> the index reading time is minuscule comparing to the filesystem
> operations.)
> 
> Test                                       before  after
> -----------------------------------------------------------------------
> 2000.102: git worktree add....(full-v3)    3.15    2.82  -10.5%
> 2000.103: git worktree add....(full-v4)    3.14    2.84  -9.6%
> 2000.104: git worktree add....(sparse-v3)  2.59    2.14  -16.4%
> 2000.105: git worktree add....(sparse-v4)  2.10    1.57  -25.2%
> 
> Helped-by: Victoria Dye <vdye@github.com>
> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
> ---
>  builtin/worktree.c                       |  4 ++++
>  t/perf/p2000-sparse-operations.sh        |  1 +
>  t/t1092-sparse-checkout-compatibility.sh | 23 +++++++++++++++++++++++
>  3 files changed, 28 insertions(+)
> 
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index f3180463be..db14bff1a3 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -1200,5 +1200,9 @@ int cmd_worktree(int ac, const char **av, const char *prefix)
>  		prefix = "";
>  
>  	ac = parse_options(ac, av, prefix, options, git_worktree_usage, 0);
> +
> +	prepare_repo_settings(the_repository);
> +	the_repository->settings.command_requires_full_index = 0;
> +
>  	return fn(ac, av, prefix);
>  }
> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
> index 901cc493ef..1422136c73 100755
> --- a/t/perf/p2000-sparse-operations.sh
> +++ b/t/perf/p2000-sparse-operations.sh
> @@ -131,5 +131,6 @@ test_perf_on_all git describe --dirty
>  test_perf_on_all 'echo >>new && git describe --dirty'
>  test_perf_on_all git diff-files
>  test_perf_on_all git diff-files -- $SPARSE_CONE/a
> +test_perf_on_all "git worktree add ../temp && git worktree remove ../temp"

This, like the 'git stash' performance tests, involves multiple steps to
ensure we return to a clean state after the test is executed. Makes sense.

>  
>  test_done
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index a63d0cc222..6ed691d338 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -2180,4 +2180,27 @@ test_expect_success 'sparse index is not expanded: diff-files' '
>  	ensure_not_expanded diff-files -- "deep/*"
>  '
>  
> +test_expect_success 'worktree' '
> +	init_repos &&
> +
> +	write_script edit-contents <<-\EOF &&
> +	echo text >>"$1"
> +	EOF
> +
> +	test_all_match git worktree add .worktrees/hotfix &&
> +	test_sparse_match ls .worktrees/hotfix &&

I see why you're comparing 'sparse-checkout' to 'sparse-index' here (their
worktrees should both contain only the files matched by the sparse-checkout
patterns, unlike 'full-checkout' which will contain all files), but this
won't catch bugs that apply to both sparse-checkout and sparse-index (e.g.,
if the sparse checkout patterns weren't applied and the full worktrees were
checked out). 

To make sure that doesn't happen, you could add a section that compares each 
test repo's default worktree to a new worktree, e.g.:

	for repo in full-checkout sparse-checkout sparse-index
	do
		worktree=${repo}-wt &&
		git -C $repo worktree add $worktree &&
		
		# Compare worktree content with 'ls'

		# Compare index content with 'ls-files --sparse'

		# Any other comparisons that are useful
		
		git worktree remove $worktree || return 1
	done

> +	test_all_match git worktree remove .worktrees/hotfix &&
> +
> +	test_all_match git worktree add .worktrees/hotfix &&
> +	run_on_all ../edit-contents .worktrees/hotfix/deep/a &&
> +	test_all_match test_must_fail git worktree remove .worktrees/hotfix
> +'
> +
> +test_expect_success 'worktree is not expanded' '
> +	init_repos &&
> +
> +	test_all_match git worktree add .worktrees/hotfix &&

Shouldn't 'git worktree add' not expand the index? Why use 'test_all_match'
instead of 'ensure_not_expanded'?

> +	ensure_not_expanded worktree remove .worktrees/hotfix> +'
> +
>  test_done
Shuqi Liang June 5, 2023, 8:16 p.m. UTC | #2
On Mon, Jun 5, 2023 at 3:16 PM Victoria Dye <vdye@github.com> wrote:

> >
> > The `p2000` tests demonstrate a ~20% execution time reduction for
> > 'git worktree' using a sparse index:
> >
> > (Note:the p2000 test results did't reflect the huge speedup because of
>
> s/did't/didn't
>
> (not worth fixing if you don't end up re-rolling, though!)
>
> > the index reading time is minuscule comparing to the filesystem
> > operations.)

Will fix it!

> >
> > +test_expect_success 'worktree' '
> > +     init_repos &&
> > +
> > +     write_script edit-contents <<-\EOF &&
> > +     echo text >>"$1"
> > +     EOF
> > +
> > +     test_all_match git worktree add .worktrees/hotfix &&
> > +     test_sparse_match ls .worktrees/hotfix &&
>
> I see why you're comparing 'sparse-checkout' to 'sparse-index' here (their
> worktrees should both contain only the files matched by the sparse-checkout
> patterns, unlike 'full-checkout' which will contain all files), but this
> won't catch bugs that apply to both sparse-checkout and sparse-index (e.g.,
> if the sparse checkout patterns weren't applied and the full worktrees were
> checked out).
>
> To make sure that doesn't happen, you could add a section that compares each
> test repo's default worktree to a new worktree, e.g.:
>
>         for repo in full-checkout sparse-checkout sparse-index
>         do
>                 worktree=${repo}-wt &&
>                 git -C $repo worktree add $worktree &&
>
>                 # Compare worktree content with 'ls'
>
>                 # Compare index content with 'ls-files --sparse'
>
>                 # Any other comparisons that are useful
>
>                 git worktree remove $worktree || return 1
>         done
>

Will do !

> > +test_expect_success 'worktree is not expanded' '
> > +     init_repos &&
> > +
> > +     test_all_match git worktree add .worktrees/hotfix &&
>
> Shouldn't 'git worktree add' not expand the index? Why use 'test_all_match'
> instead of 'ensure_not_expanded'?

Here's my perspective on why my use of "test_all_match" instead of
"ensure_not_expanded" in "git worktree add":

The functions "validate_no_submodules" and "check_clean_worktree" are
specifically related to the "git worktree remove" command, and "git
worktree add" doesn't require index reading, so with or without the
"ensure_full_index" wouldn't affect the "git worktree add" command.
I look forward to hearing your thoughts regarding whether my
understanding is correct or not.

Thanks for your valuable feedback!
Victoria Dye June 6, 2023, 4:22 a.m. UTC | #3
Shuqi Liang wrote:
>>> +test_expect_success 'worktree is not expanded' '
>>> +     init_repos &&
>>> +
>>> +     test_all_match git worktree add .worktrees/hotfix &&
>>
>> Shouldn't 'git worktree add' not expand the index? Why use 'test_all_match'
>> instead of 'ensure_not_expanded'?
> 
> Here's my perspective on why my use of "test_all_match" instead of
> "ensure_not_expanded" in "git worktree add":
> 
> The functions "validate_no_submodules" and "check_clean_worktree" are
> specifically related to the "git worktree remove" command, and "git
> worktree add" doesn't require index reading, so with or without the
> "ensure_full_index" wouldn't affect the "git worktree add" command.
> I look forward to hearing your thoughts regarding whether my
> understanding is correct or not.

I see, thanks for the explanation. I could understand it both ways: on one
hand, you don't want redundant/unnecessary tests; on the other hand, that
test design decision relies pretty heavily on knowing the internal
implementation details, which the tests conceptually shouldn't have
visibility to. 

I'd still lean towards using 'ensure_not_expanded' (it protects us from
future changes causing index expansion, although that seems fairly
unlikely). However, if you do choose to stick with not using
'ensure_not_expanded', I'd recommend using 'git -C sparse-index worktree add
.worktrees/hotfix' instead of 'test_all_match'. The 'worktree' test already
compares behavior across the three test repositories; to keep things focused
on index expansion, only the 'sparse-index' repo should be set up & tested.

> 
> Thanks for your valuable feedback!
diff mbox series

Patch

diff --git a/builtin/worktree.c b/builtin/worktree.c
index f3180463be..db14bff1a3 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -1200,5 +1200,9 @@  int cmd_worktree(int ac, const char **av, const char *prefix)
 		prefix = "";
 
 	ac = parse_options(ac, av, prefix, options, git_worktree_usage, 0);
+
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	return fn(ac, av, prefix);
 }
diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 901cc493ef..1422136c73 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -131,5 +131,6 @@  test_perf_on_all git describe --dirty
 test_perf_on_all 'echo >>new && git describe --dirty'
 test_perf_on_all git diff-files
 test_perf_on_all git diff-files -- $SPARSE_CONE/a
+test_perf_on_all "git worktree add ../temp && git worktree remove ../temp"
 
 test_done
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index a63d0cc222..6ed691d338 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2180,4 +2180,27 @@  test_expect_success 'sparse index is not expanded: diff-files' '
 	ensure_not_expanded diff-files -- "deep/*"
 '
 
+test_expect_success 'worktree' '
+	init_repos &&
+
+	write_script edit-contents <<-\EOF &&
+	echo text >>"$1"
+	EOF
+
+	test_all_match git worktree add .worktrees/hotfix &&
+	test_sparse_match ls .worktrees/hotfix &&
+	test_all_match git worktree remove .worktrees/hotfix &&
+
+	test_all_match git worktree add .worktrees/hotfix &&
+	run_on_all ../edit-contents .worktrees/hotfix/deep/a &&
+	test_all_match test_must_fail git worktree remove .worktrees/hotfix
+'
+
+test_expect_success 'worktree is not expanded' '
+	init_repos &&
+
+	test_all_match git worktree add .worktrees/hotfix &&
+	ensure_not_expanded worktree remove .worktrees/hotfix
+'
+
 test_done