diff mbox series

[v4,3/4] diff: enable and test the sparse index

Message ID 273ee16b74ed4b36ffba5762fa892410317ea02b.1637620958.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse Index: diff and blame builtins | expand

Commit Message

Lessley Dennington Nov. 22, 2021, 10:42 p.m. UTC
From: Lessley Dennington <lessleydennington@gmail.com>

Enable the sparse index within the 'git diff' command. Its implementation
already safely integrates with the sparse index because it shares code
with the 'git status' and 'git checkout' commands that were already
integrated.  For more details see:

d76723e (status: use sparse-index throughout, 2021-07-14)
1ba5f45 (checkout: stop expanding sparse indexes, 2021-06-29)

The most interesting thing to do is to add tests that verify that 'git
diff' behaves correctly when the sparse index is enabled. These cases are:

1. The index is not expanded for 'diff' and 'diff --staged'
2. 'diff' and 'diff --staged' behave the same in full checkout, sparse
checkout, and sparse index repositories in the following partially-staged
scenarios (i.e. the index, HEAD, and working directory differ at a given
path):
    1. Path is within sparse-checkout cone
    2. Path is outside sparse-checkout cone
    3. A merge conflict exists for paths outside sparse-checkout cone

The `p2000` tests demonstrate a ~44% execution time reduction for 'git
diff' and a ~86% execution time reduction for 'git diff --staged' using a
sparse index:

Test                                      before  after
-------------------------------------------------------------
2000.30: git diff (full-v3)               0.33    0.34 +3.0%
2000.31: git diff (full-v4)               0.33    0.35 +6.1%
2000.32: git diff (sparse-v3)             0.53    0.31 -41.5%
2000.33: git diff (sparse-v4)             0.54    0.29 -46.3%
2000.34: git diff --cached (full-v3)      0.07    0.07 +0.0%
2000.35: git diff --cached (full-v4)      0.07    0.08 +14.3%
2000.36: git diff --cached (sparse-v3)    0.28    0.04 -85.7%
2000.37: git diff --cached (sparse-v4)    0.23    0.03 -87.0%

Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
---
 builtin/diff.c                           |  5 +++
 t/perf/p2000-sparse-operations.sh        |  2 ++
 t/t1092-sparse-checkout-compatibility.sh | 46 ++++++++++++++++++++++++
 3 files changed, 53 insertions(+)

Comments

Elijah Newren Nov. 23, 2021, 7:47 a.m. UTC | #1
On Mon, Nov 22, 2021 at 2:42 PM Lessley Dennington via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Lessley Dennington <lessleydennington@gmail.com>
>
> Enable the sparse index within the 'git diff' command. Its implementation
> already safely integrates with the sparse index because it shares code
> with the 'git status' and 'git checkout' commands that were already
> integrated.  For more details see:
>
> d76723e (status: use sparse-index throughout, 2021-07-14)
> 1ba5f45 (checkout: stop expanding sparse indexes, 2021-06-29)

I preferred the references in your v3:

d76723ee53 (status: use sparse-index throughout, 2021-07-14)
1ba5f45132 (checkout: stop expanding sparse indexes, 2021-06-29)

because 7-character abbreviations aren't very future proof;
10-character seems better to me.

(Very micro nit.)

>
> The most interesting thing to do is to add tests that verify that 'git
> diff' behaves correctly when the sparse index is enabled. These cases are:
>
> 1. The index is not expanded for 'diff' and 'diff --staged'
> 2. 'diff' and 'diff --staged' behave the same in full checkout, sparse
> checkout, and sparse index repositories in the following partially-staged
> scenarios (i.e. the index, HEAD, and working directory differ at a given
> path):
>     1. Path is within sparse-checkout cone
>     2. Path is outside sparse-checkout cone
>     3. A merge conflict exists for paths outside sparse-checkout cone
>
> The `p2000` tests demonstrate a ~44% execution time reduction for 'git
> diff' and a ~86% execution time reduction for 'git diff --staged' using a
> sparse index:
>
> Test                                      before  after
> -------------------------------------------------------------
> 2000.30: git diff (full-v3)               0.33    0.34 +3.0%
> 2000.31: git diff (full-v4)               0.33    0.35 +6.1%
> 2000.32: git diff (sparse-v3)             0.53    0.31 -41.5%
> 2000.33: git diff (sparse-v4)             0.54    0.29 -46.3%
> 2000.34: git diff --cached (full-v3)      0.07    0.07 +0.0%
> 2000.35: git diff --cached (full-v4)      0.07    0.08 +14.3%
> 2000.36: git diff --cached (sparse-v3)    0.28    0.04 -85.7%
> 2000.37: git diff --cached (sparse-v4)    0.23    0.03 -87.0%
>
> Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
> ---
>  builtin/diff.c                           |  5 +++
>  t/perf/p2000-sparse-operations.sh        |  2 ++
>  t/t1092-sparse-checkout-compatibility.sh | 46 ++++++++++++++++++++++++
>  3 files changed, 53 insertions(+)
>
> diff --git a/builtin/diff.c b/builtin/diff.c
> index dd8ce688ba7..fa4683377eb 100644
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
> @@ -437,6 +437,11 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>
>         prefix = setup_git_directory_gently(&nongit);
>
> +       if (!nongit) {
> +               prepare_repo_settings(the_repository);
> +               the_repository->settings.command_requires_full_index = 0;
> +       }
> +
>         if (!no_index) {
>                 /*
>                  * Treat git diff with at least one path outside of the
> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
> index bfd332120c8..5cf94627383 100755
> --- a/t/perf/p2000-sparse-operations.sh
> +++ b/t/perf/p2000-sparse-operations.sh
> @@ -113,5 +113,7 @@ test_perf_on_all git checkout -f -
>  test_perf_on_all git reset
>  test_perf_on_all git reset --hard
>  test_perf_on_all git reset -- does-not-exist
> +test_perf_on_all git diff
> +test_perf_on_all git diff --cached
>
>  test_done
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 44d5e11c762..53524660759 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -832,6 +832,52 @@ test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
>         )
>  '
>
> +test_expect_success 'sparse index is not expanded: diff' '
> +       init_repos &&
> +
> +       write_script edit-contents <<-\EOF &&
> +       echo text >>$1
> +       EOF
> +
> +       # Add file within cone
> +       test_sparse_match git sparse-checkout set deep &&
> +       run_on_all ../edit-contents deep/testfile &&
> +       test_all_match git add deep/testfile &&
> +       run_on_all ../edit-contents deep/testfile &&
> +
> +       test_all_match git diff &&
> +       test_all_match git diff --staged &&
> +       ensure_not_expanded diff &&
> +       ensure_not_expanded diff --staged &&
> +
> +       # Add file outside cone
> +       test_all_match git reset --hard &&
> +       run_on_all mkdir newdirectory &&
> +       run_on_all ../edit-contents newdirectory/testfile &&
> +       test_sparse_match git sparse-checkout set newdirectory &&
> +       test_all_match git add newdirectory/testfile &&
> +       run_on_all ../edit-contents newdirectory/testfile &&
> +       test_sparse_match git sparse-checkout set &&
> +
> +       test_all_match git diff &&
> +       test_all_match git diff --staged &&
> +       ensure_not_expanded diff &&
> +       ensure_not_expanded diff --staged &&
> +
> +       # Merge conflict outside cone
> +       # The sparse checkout will report a warning that is not in the
> +       # full checkout, so we use `run_on_all` instead of
> +       # `test_all_match`
> +       run_on_all git reset --hard &&
> +       test_all_match git checkout merge-left &&
> +       test_all_match test_must_fail git merge merge-right &&
> +
> +       test_all_match git diff &&
> +       test_all_match git diff --staged &&
> +       ensure_not_expanded diff &&
> +       ensure_not_expanded diff --staged

You've changed some of the --staged to --cached, but based on Junio's
comments on the previous round, you probably want to convert the
others too.

> +'
> +
>  # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>  # in this scenario, but it shouldn't.
>  test_expect_success 'reset mixed and checkout orphan' '
> --
> gitgitgadget
>
Lessley Dennington Nov. 23, 2021, 2:53 p.m. UTC | #2
On 11/22/21 11:47 PM, Elijah Newren wrote:
> On Mon, Nov 22, 2021 at 2:42 PM Lessley Dennington via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Lessley Dennington <lessleydennington@gmail.com>
>>
>> Enable the sparse index within the 'git diff' command. Its implementation
>> already safely integrates with the sparse index because it shares code
>> with the 'git status' and 'git checkout' commands that were already
>> integrated.  For more details see:
>>
>> d76723e (status: use sparse-index throughout, 2021-07-14)
>> 1ba5f45 (checkout: stop expanding sparse indexes, 2021-06-29)
> 
> I preferred the references in your v3:
> 
> d76723ee53 (status: use sparse-index throughout, 2021-07-14)
> 1ba5f45132 (checkout: stop expanding sparse indexes, 2021-06-29)
> 
> because 7-character abbreviations aren't very future proof;
> 10-character seems better to me.
> 
> (Very micro nit.)
> 
Appreciate the pointer - I'll return to the old formatting for v5.
>>
>> The most interesting thing to do is to add tests that verify that 'git
>> diff' behaves correctly when the sparse index is enabled. These cases are:
>>
>> 1. The index is not expanded for 'diff' and 'diff --staged'
>> 2. 'diff' and 'diff --staged' behave the same in full checkout, sparse
>> checkout, and sparse index repositories in the following partially-staged
>> scenarios (i.e. the index, HEAD, and working directory differ at a given
>> path):
>>      1. Path is within sparse-checkout cone
>>      2. Path is outside sparse-checkout cone
>>      3. A merge conflict exists for paths outside sparse-checkout cone
>>
>> The `p2000` tests demonstrate a ~44% execution time reduction for 'git
>> diff' and a ~86% execution time reduction for 'git diff --staged' using a
>> sparse index:
>>
>> Test                                      before  after
>> -------------------------------------------------------------
>> 2000.30: git diff (full-v3)               0.33    0.34 +3.0%
>> 2000.31: git diff (full-v4)               0.33    0.35 +6.1%
>> 2000.32: git diff (sparse-v3)             0.53    0.31 -41.5%
>> 2000.33: git diff (sparse-v4)             0.54    0.29 -46.3%
>> 2000.34: git diff --cached (full-v3)      0.07    0.07 +0.0%
>> 2000.35: git diff --cached (full-v4)      0.07    0.08 +14.3%
>> 2000.36: git diff --cached (sparse-v3)    0.28    0.04 -85.7%
>> 2000.37: git diff --cached (sparse-v4)    0.23    0.03 -87.0%
>>
>> Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
>> ---
>>   builtin/diff.c                           |  5 +++
>>   t/perf/p2000-sparse-operations.sh        |  2 ++
>>   t/t1092-sparse-checkout-compatibility.sh | 46 ++++++++++++++++++++++++
>>   3 files changed, 53 insertions(+)
>>
>> diff --git a/builtin/diff.c b/builtin/diff.c
>> index dd8ce688ba7..fa4683377eb 100644
>> --- a/builtin/diff.c
>> +++ b/builtin/diff.c
>> @@ -437,6 +437,11 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>>
>>          prefix = setup_git_directory_gently(&nongit);
>>
>> +       if (!nongit) {
>> +               prepare_repo_settings(the_repository);
>> +               the_repository->settings.command_requires_full_index = 0;
>> +       }
>> +
>>          if (!no_index) {
>>                  /*
>>                   * Treat git diff with at least one path outside of the
>> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
>> index bfd332120c8..5cf94627383 100755
>> --- a/t/perf/p2000-sparse-operations.sh
>> +++ b/t/perf/p2000-sparse-operations.sh
>> @@ -113,5 +113,7 @@ test_perf_on_all git checkout -f -
>>   test_perf_on_all git reset
>>   test_perf_on_all git reset --hard
>>   test_perf_on_all git reset -- does-not-exist
>> +test_perf_on_all git diff
>> +test_perf_on_all git diff --cached
>>
>>   test_done
>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>> index 44d5e11c762..53524660759 100755
>> --- a/t/t1092-sparse-checkout-compatibility.sh
>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>> @@ -832,6 +832,52 @@ test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
>>          )
>>   '
>>
>> +test_expect_success 'sparse index is not expanded: diff' '
>> +       init_repos &&
>> +
>> +       write_script edit-contents <<-\EOF &&
>> +       echo text >>$1
>> +       EOF
>> +
>> +       # Add file within cone
>> +       test_sparse_match git sparse-checkout set deep &&
>> +       run_on_all ../edit-contents deep/testfile &&
>> +       test_all_match git add deep/testfile &&
>> +       run_on_all ../edit-contents deep/testfile &&
>> +
>> +       test_all_match git diff &&
>> +       test_all_match git diff --staged &&
>> +       ensure_not_expanded diff &&
>> +       ensure_not_expanded diff --staged &&
>> +
>> +       # Add file outside cone
>> +       test_all_match git reset --hard &&
>> +       run_on_all mkdir newdirectory &&
>> +       run_on_all ../edit-contents newdirectory/testfile &&
>> +       test_sparse_match git sparse-checkout set newdirectory &&
>> +       test_all_match git add newdirectory/testfile &&
>> +       run_on_all ../edit-contents newdirectory/testfile &&
>> +       test_sparse_match git sparse-checkout set &&
>> +
>> +       test_all_match git diff &&
>> +       test_all_match git diff --staged &&
>> +       ensure_not_expanded diff &&
>> +       ensure_not_expanded diff --staged &&
>> +
>> +       # Merge conflict outside cone
>> +       # The sparse checkout will report a warning that is not in the
>> +       # full checkout, so we use `run_on_all` instead of
>> +       # `test_all_match`
>> +       run_on_all git reset --hard &&
>> +       test_all_match git checkout merge-left &&
>> +       test_all_match test_must_fail git merge merge-right &&
>> +
>> +       test_all_match git diff &&
>> +       test_all_match git diff --staged &&
>> +       ensure_not_expanded diff &&
>> +       ensure_not_expanded diff --staged
> 
> You've changed some of the --staged to --cached, but based on Junio's
> comments on the previous round, you probably want to convert the
> others too.
> 
Will do for v5.
>> +'
>> +
>>   # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>>   # in this scenario, but it shouldn't.
>>   test_expect_success 'reset mixed and checkout orphan' '
>> --
>> gitgitgadget
>>
Junio C Hamano Nov. 23, 2021, 11:48 p.m. UTC | #3
"Lessley Dennington via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/builtin/diff.c b/builtin/diff.c
> index dd8ce688ba7..fa4683377eb 100644
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
> @@ -437,6 +437,11 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  
>  	prefix = setup_git_directory_gently(&nongit);
>  
> +	if (!nongit) {
> +		prepare_repo_settings(the_repository);
> +		the_repository->settings.command_requires_full_index = 0;
> +	}
> +

It is very pleasing to see that with all the preparations, it is
jsut a very simple matter of adding four lines to enable the big
feature.

At this point immediately after asking setup_git_directory_gently(),
we positively know if we are in a repository, so the placement of
the new code makes perfect sense.

Nice.
diff mbox series

Patch

diff --git a/builtin/diff.c b/builtin/diff.c
index dd8ce688ba7..fa4683377eb 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -437,6 +437,11 @@  int cmd_diff(int argc, const char **argv, const char *prefix)
 
 	prefix = setup_git_directory_gently(&nongit);
 
+	if (!nongit) {
+		prepare_repo_settings(the_repository);
+		the_repository->settings.command_requires_full_index = 0;
+	}
+
 	if (!no_index) {
 		/*
 		 * Treat git diff with at least one path outside of the
diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index bfd332120c8..5cf94627383 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -113,5 +113,7 @@  test_perf_on_all git checkout -f -
 test_perf_on_all git reset
 test_perf_on_all git reset --hard
 test_perf_on_all git reset -- does-not-exist
+test_perf_on_all git diff
+test_perf_on_all git diff --cached
 
 test_done
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 44d5e11c762..53524660759 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -832,6 +832,52 @@  test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
 	)
 '
 
+test_expect_success 'sparse index is not expanded: diff' '
+	init_repos &&
+
+	write_script edit-contents <<-\EOF &&
+	echo text >>$1
+	EOF
+
+	# Add file within cone
+	test_sparse_match git sparse-checkout set deep &&
+	run_on_all ../edit-contents deep/testfile &&
+	test_all_match git add deep/testfile &&
+	run_on_all ../edit-contents deep/testfile &&
+
+	test_all_match git diff &&
+	test_all_match git diff --staged &&
+	ensure_not_expanded diff &&
+	ensure_not_expanded diff --staged &&
+
+	# Add file outside cone
+	test_all_match git reset --hard &&
+	run_on_all mkdir newdirectory &&
+	run_on_all ../edit-contents newdirectory/testfile &&
+	test_sparse_match git sparse-checkout set newdirectory &&
+	test_all_match git add newdirectory/testfile &&
+	run_on_all ../edit-contents newdirectory/testfile &&
+	test_sparse_match git sparse-checkout set &&
+
+	test_all_match git diff &&
+	test_all_match git diff --staged &&
+	ensure_not_expanded diff &&
+	ensure_not_expanded diff --staged &&
+
+	# Merge conflict outside cone
+	# The sparse checkout will report a warning that is not in the
+	# full checkout, so we use `run_on_all` instead of
+	# `test_all_match`
+	run_on_all git reset --hard &&
+	test_all_match git checkout merge-left &&
+	test_all_match test_must_fail git merge merge-right &&
+
+	test_all_match git diff &&
+	test_all_match git diff --staged &&
+	ensure_not_expanded diff &&
+	ensure_not_expanded diff --staged
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '