diff mbox series

[v4] write-tree: integrate with sparse index

Message ID 20230421004108.32554-1-cheskaqiqi@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v4] write-tree: integrate with sparse index | expand

Commit Message

Shuqi Liang April 21, 2023, 12:41 a.m. UTC
Update 'git write-tree' to allow using the sparse-index in memory
without expanding to a full one.

The recursive algorithm for update_one() was already updated in 2de37c5
(cache-tree: integrate with sparse directory entries, 2021-03-03) to
handle sparse directory entries in the index. Hence we can just set the
requires-full-index to false for "write-tree".

The `p2000` tests demonstrate a ~96% execution time reduction for 'git
write-tree' using a sparse index:

Test                                           before  after
-----------------------------------------------------------------
2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%

Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
---

* Modified the code to ensure prepare_repo_settings() is called only 
when inside a repository.

* Change 'write-tree on all' to just 'write-tree'.

* Have a baseline 'test_all_match git write-tree' before making any 
changes to the index.

* Add 'git status --porcelain=v2'.

* Ensuring that SKIP_WORKTREE files weren't materialized on disk by
using "test_path_is_missing".

* Use 'test_all_match' on the 'git update-index'.


 builtin/write-tree.c                     |  9 ++++++---
 t/t1092-sparse-checkout-compatibility.sh | 20 +++++++++++++++-----
 2 files changed, 21 insertions(+), 8 deletions(-)

Comments

Victoria Dye April 21, 2023, 9:42 p.m. UTC | #1
Shuqi Liang wrote:
> Update 'git write-tree' to allow using the sparse-index in memory
> without expanding to a full one.
> 
> The recursive algorithm for update_one() was already updated in 2de37c5
> (cache-tree: integrate with sparse directory entries, 2021-03-03) to
> handle sparse directory entries in the index. Hence we can just set the
> requires-full-index to false for "write-tree".
> 
> The `p2000` tests demonstrate a ~96% execution time reduction for 'git
> write-tree' using a sparse index:
> 
> Test                                           before  after
> -----------------------------------------------------------------
> 2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
> 2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
> 2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
> 2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%

Please update your commit message to explain only the incremental updates on
top of 1a65b41b38a (write-tree: integrate with sparse index, 2023-04-03);
that patch's message (what you have here) does not accurately describe what
_this_ patch is doing.

> diff --git a/builtin/write-tree.c b/builtin/write-tree.c
> index 32e302a813..a9d5c20cde 100644
> --- a/builtin/write-tree.c
> +++ b/builtin/write-tree.c
> @@ -38,12 +38,15 @@ int cmd_write_tree(int argc, const char **argv, const char *cmd_prefix)
>  	};
>  
>  	git_config(git_default_config, NULL);
> +	
> +	if (the_repository->gitdir) {
> +		prepare_repo_settings(the_repository);
> +		the_repository->settings.command_requires_full_index = 0;
> +	}
> +
>  	argc = parse_options(argc, argv, cmd_prefix, write_tree_options,
>  			     write_tree_usage, 0);
>  
> -	prepare_repo_settings(the_repository);
> -	the_repository->settings.command_requires_full_index = 0;
> -

What is the functional benefit of this change? AFAICT, we don't need
'command_requires_full_index' to be set before 'parse_options' in this case,
so this won't have any effect on the behavior of 'write-tree'.

> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 9bbc0d646b..d3eb31326b 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -2055,22 +2055,32 @@ test_expect_success 'grep sparse directory within submodules' '
>  	test_cmp actual expect
>  '
>  
> -test_expect_success 'write-tree on all' '
> +test_expect_success 'write-tree' '
>  	init_repos &&
>  
> +	test_all_match git write-tree &&
> +
>  	write_script edit-contents <<-\EOF &&
>  	echo text >>"$1"
>  	EOF
>  
> +	# make a change inside the sparse cone
>  	run_on_all ../edit-contents deep/a &&
> -	run_on_all git update-index deep/a &&
> +	test_all_match git update-index deep/a &&
>  	test_all_match git write-tree &&
> +	test_all_match git status --porcelain=v2 &&
>  
> +	# make a change outside the sparse cone
>  	run_on_all mkdir -p folder1 &&
>  	run_on_all cp a folder1/a &&
>  	run_on_all ../edit-contents folder1/a &&
> -	run_on_all git update-index folder1/a &&
> -	test_all_match git write-tree
> +	test_all_match git update-index folder1/a &&
> +	test_all_match git write-tree &&
> +	test_all_match git status --porcelain=v2 &&
> +	
> +	# check that SKIP_WORKTREE files are not materialized
> +	test_path_is_missing sparse-checkout/folder2/a &&
> +	test_path_is_missing sparse-index/folder2/a

Test updates look good!

>  '
>  
>  test_expect_success 'sparse-index is not expanded: write-tree' '
> @@ -2080,7 +2090,7 @@ test_expect_success 'sparse-index is not expanded: write-tree' '
>  
>  	echo "test1" >>sparse-index/a &&
>  	git -C sparse-index update-index a &&
> -	ensure_not_expanded write-tree
> +	ensure_not_expanded write-tree 

nit: trailing whitespace should be removed

>  '
>  
>  test_done
Junio C Hamano April 24, 2023, 3:14 p.m. UTC | #2
Victoria Dye <vdye@github.com> writes:

> Shuqi Liang wrote:
>> Update 'git write-tree' to allow using the sparse-index in memory
>> without expanding to a full one.
>> 
>> The recursive algorithm for update_one() was already updated in 2de37c5
>> (cache-tree: integrate with sparse directory entries, 2021-03-03) to
>> handle sparse directory entries in the index. Hence we can just set the
>> requires-full-index to false for "write-tree".
>> 
>> The `p2000` tests demonstrate a ~96% execution time reduction for 'git
>> write-tree' using a sparse index:
>> 
>> Test                                           before  after
>> -----------------------------------------------------------------
>> 2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
>> 2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
>> 2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
>> 2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%
>
> Please update your commit message to explain only the incremental updates on
> top of 1a65b41b38a (write-tree: integrate with sparse index, 2023-04-03);
> that patch's message (what you have here) does not accurately describe what
> _this_ patch is doing.

Good point.

In addition, as this is the first iteration of a follow-up topic,
"v4" on the subject line is a bit misleading.  Let's treat it as a
new and separate topic that build on top of the previous achievement.

Thanks for working on this topic and mentoring a new contributor.
diff mbox series

Patch

diff --git a/builtin/write-tree.c b/builtin/write-tree.c
index 32e302a813..a9d5c20cde 100644
--- a/builtin/write-tree.c
+++ b/builtin/write-tree.c
@@ -38,12 +38,15 @@  int cmd_write_tree(int argc, const char **argv, const char *cmd_prefix)
 	};
 
 	git_config(git_default_config, NULL);
+	
+	if (the_repository->gitdir) {
+		prepare_repo_settings(the_repository);
+		the_repository->settings.command_requires_full_index = 0;
+	}
+
 	argc = parse_options(argc, argv, cmd_prefix, write_tree_options,
 			     write_tree_usage, 0);
 
-	prepare_repo_settings(the_repository);
-	the_repository->settings.command_requires_full_index = 0;
-
 	ret = write_index_as_tree(&oid, &the_index, get_index_file(), flags,
 				  tree_prefix);
 	switch (ret) {
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 9bbc0d646b..d3eb31326b 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2055,22 +2055,32 @@  test_expect_success 'grep sparse directory within submodules' '
 	test_cmp actual expect
 '
 
-test_expect_success 'write-tree on all' '
+test_expect_success 'write-tree' '
 	init_repos &&
 
+	test_all_match git write-tree &&
+
 	write_script edit-contents <<-\EOF &&
 	echo text >>"$1"
 	EOF
 
+	# make a change inside the sparse cone
 	run_on_all ../edit-contents deep/a &&
-	run_on_all git update-index deep/a &&
+	test_all_match git update-index deep/a &&
 	test_all_match git write-tree &&
+	test_all_match git status --porcelain=v2 &&
 
+	# make a change outside the sparse cone
 	run_on_all mkdir -p folder1 &&
 	run_on_all cp a folder1/a &&
 	run_on_all ../edit-contents folder1/a &&
-	run_on_all git update-index folder1/a &&
-	test_all_match git write-tree
+	test_all_match git update-index folder1/a &&
+	test_all_match git write-tree &&
+	test_all_match git status --porcelain=v2 &&
+	
+	# check that SKIP_WORKTREE files are not materialized
+	test_path_is_missing sparse-checkout/folder2/a &&
+	test_path_is_missing sparse-index/folder2/a
 '
 
 test_expect_success 'sparse-index is not expanded: write-tree' '
@@ -2080,7 +2090,7 @@  test_expect_success 'sparse-index is not expanded: write-tree' '
 
 	echo "test1" >>sparse-index/a &&
 	git -C sparse-index update-index a &&
-	ensure_not_expanded write-tree
+	ensure_not_expanded write-tree 
 '
 
 test_done