[GSOC,v1] diff-index: enable diff-index

Message ID 20230403190538.361840-1-nanth.raghul@gmail.com (mailing list archive)
State New, archived

Commit Message

Raghul Nanth A April 3, 2023, 7:05 p.m. UTC
Uses the run_diff_index() function to generate its diff. This function
has been made sparse-index aware in the series that led to 8d2c3732
(Merge branch 'ld/sparse-diff-blame', 2021-12-21). Hence we can just
set the requires-full-index to false for "diff-index".

Performance metrics

  Test                                        HEAD~1            HEAD
  ------------------------------------------------------------------------------------
  2000.2: git diff-index HEAD (full-v3)       0.09(0.05+0.05)   0.09(0.06+0.04) +0.0%
  2000.3: git diff-index HEAD (full-v4)       0.09(0.05+0.05)   0.09(0.06+0.03) +0.0%
  2000.4: git diff-index HEAD (sparse-v3)     0.32(0.28+0.05)   0.01(0.01+0.04) -96.9%
  2000.5: git diff-index HEAD (sparse-v4)     0.34(0.29+0.06)   0.01(0.02+0.03) -97.1%
  2000.6: git diff-index HEAD~1 (full-v3)     3.77(3.62+0.14)   3.37(3.27+0.09) -10.6%
  2000.7: git diff-index HEAD~1 (full-v4)     3.18(3.07+0.11)   3.20(3.10+0.09) +0.6%
  2000.8: git diff-index HEAD~1 (sparse-v3)   3.78(3.65+0.12)   0.22(0.20+0.06) -94.2%
  2000.9: git diff-index HEAD~1 (sparse-v4)   3.86(3.74+0.12)   0.28(0.28+0.04) -92.7%

Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com>
---
 builtin/diff-index.c                     |  4 ++++
 t/perf/p2000-sparse-operations.sh        |  2 ++
 t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++
 3 files changed, 24 insertions(+)

Comments

Junio C Hamano April 4, 2023, 12:16 a.m. UTC | #1
Raghul Nanth A <nanth.raghul@gmail.com> writes:

> Uses the run_diff_index() function to generate its diff.

The sentence lacks a subject.

> +	test_all_match git diff-index HEAD --cached

See "git help cli".  Do not write a dashed option after a rev.
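
For reference, a minimal sketch of the ordering that "git help cli" recommends, with the dashed option first and the revision after it. This is not from the patch: the throwaway repository, the identity settings, and the file name "g" are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch only: scratch repository with hypothetical contents.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "author@example.com"
git config user.name "Example Author"

echo one >g
git add g
git commit -q -m "initial"

# Stage a change so that the index differs from HEAD.
echo two >>g
git add g

# Options come first, then the revision.
cached=$(git diff-index --cached --name-only HEAD)
echo "$cached"
```

In the test from the patch, the same ordering would read "git diff-index --cached HEAD".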

Thanks.
Victoria Dye April 5, 2023, 5:53 p.m. UTC | #2
Raghul Nanth A wrote:
> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
> index 3242cfe91a..9e74cb22b9 100755
> --- a/t/perf/p2000-sparse-operations.sh
> +++ b/t/perf/p2000-sparse-operations.sh
> @@ -125,5 +125,7 @@ test_perf_on_all git checkout-index -f --all
>  test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
>  test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
>  test_perf_on_all git grep --cached --sparse bogus -- "f2/f1/f1/*"
> +test_perf_on_all git diff-index HEAD
> +test_perf_on_all git diff-index HEAD~1

What is the benefit of testing 'diff-index' with 'HEAD' *and* 'HEAD~1'? I
wouldn't expect internal behavior in the command to change based on the
revision, so the performance should be nearly identical. I'd much rather see
'diff-index --cached' and/or other options & pathspecs exercised.

>  
>  test_done
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 801919009e..13801f327d 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1996,6 +1996,24 @@ test_expect_success 'sparse index is not expanded: rm' '
>  	ensure_not_expanded rm -r deep
>  '
>  
> +test_expect_success 'sparse index is not expanded: diff-index' '
> +	init_repos &&
> +
> +	echo "new" >>sparse-index/g &&
> +	git -C sparse-index add g &&
> +	git -C sparse-index commit -m "dummy" &&
> +	ensure_not_expanded diff-index HEAD~1

As with the other tests, please exercise different options and pathspecs
with 'diff-index' to improve coverage.

> +'
> +
> +test_expect_success 'match all: diff-index' '
> +	init_repos &&
> +
> +	test_all_match git diff-index HEAD &&
> +	run_on_all rm g &&
> +	test_all_match git diff-index HEAD &&
> +	test_all_match git diff-index HEAD --cached
> +'

In addition to the '--cached' option, please test different pathspecs
(especially different wildcard variations; see the 'git grep' [1] and 'git
diff-files' [2] integrations for examples you could build off of).

Seeing that 'diff-files' needed 'pathspec_needs_expanded_index', it's
possible that this command needs similar treatment. I'm curious as to
whether 'diff' needs it as well - the tests in 't1092' don't cover 'diff'
with pathspecs, so it might be behaving incorrectly. If that's the case, it
would be nice to see pathspecs handled all in one place
('run_diff_index()'?), if possible.

[1] https://lore.kernel.org/git/20220923041842.27817-1-shaoxuan.yuan02@gmail.com/
[2] https://lore.kernel.org/git/20230322161820.3609-1-cheskaqiqi@gmail.com/

> +
>  test_expect_success 'grep with and --cached' '
>  	init_repos &&
>
Junio C Hamano April 5, 2023, 7:28 p.m. UTC | #3
Victoria Dye <vdye@github.com> writes:

> Raghul Nanth A wrote:
>> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
>> index 3242cfe91a..9e74cb22b9 100755
>> --- a/t/perf/p2000-sparse-operations.sh
>> +++ b/t/perf/p2000-sparse-operations.sh
>> @@ -125,5 +125,7 @@ test_perf_on_all git checkout-index -f --all
>>  test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
>>  test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
>>  test_perf_on_all git grep --cached --sparse bogus -- "f2/f1/f1/*"
>> +test_perf_on_all git diff-index HEAD
>> +test_perf_on_all git diff-index HEAD~1
>
> What is the benefit of testing 'diff-index' with 'HEAD' *and* 'HEAD~1'? I
> wouldn't expect internal behavior in the command to change based on the
> revision, so the performance should be nearly identical. I'd much rather see
> 'diff-index --cached' and/or other options & pathspecs exercised.

Good point.  Comparing with HEAD~1 has a chance to compare _more_
paths (i.e. paths changed in the working tree plus paths changed
between the two commits), though it feels a bit too subtle if that
is what these two tests meant.

Testing with pathspec limited comparison, limiting within the cone
of interest or extending to outside the cone, does sound like a good
idea.  "diff-index --cached" to ignore working tree changes is also
an obvious thing we want to see working well.
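
The distinctions above can be sketched with a small throwaway repository. This is an illustration under stated assumptions, not part of the patch: the directory names "in" and "out" are hypothetical stand-ins for paths inside and outside a cone, and no sparse-checkout is actually configured here.

```shell
#!/bin/sh
# Sketch only: plain (non-sparse) scratch repository with hypothetical paths.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "author@example.com"
git config user.name "Example Author"

mkdir in out
echo base >in/a
echo base >out/b
git add in out
git commit -q -m "first"

# in/a changes between the two commits.
echo changed >>in/a
git commit -q -a -m "second"

# out/b changes only in the working tree.
echo dirty >>out/b

# Plumbing does not refresh the cached stat information by itself.
git update-index -q --refresh || true

# Against HEAD, only the working tree change should show up.
vs_head=$(git diff-index --name-only HEAD)

# Against HEAD~1, paths changed between the commits are reported as well.
vs_parent=$(git diff-index --name-only HEAD~1)

# --cached compares the tree with the index and ignores the working tree.
cached_parent=$(git diff-index --cached --name-only HEAD~1)

# A pathspec limits the comparison to one directory.
limited=$(git diff-index --name-only HEAD~1 -- out)

printf 'HEAD: %s\n' "$vs_head"
printf 'HEAD~1: %s\n' "$vs_parent"
printf 'HEAD~1 --cached: %s\n' "$cached_parent"
printf 'HEAD~1 -- out: %s\n' "$limited"
```

A performance test that wants to exercise different code paths would vary the options and pathspecs, as in the last two invocations, rather than only the revision.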

> Seeing that 'diff-files' needed 'pathspec_needs_expanded_index', it's
> possible that this command needs similar treatment. I'm curious as to
> whether 'diff' needs it as well - the tests in 't1092' don't cover 'diff'
> with pathspecs, so it might be behaving incorrectly. If that's the case, it
> would be nice to see pathspecs handled all in one place
> ('run_diff_index()'?), if possible.

Thanks for a careful review and comment.

Patch

diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index 35dc9b23ee..8b9871d611 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -24,6 +24,10 @@  int cmd_diff_index(int argc, const char **argv, const char *prefix)
 		usage(diff_cache_usage);
 
 	git_config(git_diff_basic_config, NULL); /* no "diff" UI options */
+
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	repo_init_revisions(the_repository, &rev, prefix);
 	rev.abbrev = 0;
 	prefix = precompose_argv_prefix(argc, argv, prefix);
diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 3242cfe91a..9e74cb22b9 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -125,5 +125,7 @@  test_perf_on_all git checkout-index -f --all
 test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
 test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
 test_perf_on_all git grep --cached --sparse bogus -- "f2/f1/f1/*"
+test_perf_on_all git diff-index HEAD
+test_perf_on_all git diff-index HEAD~1
 
 test_done
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 801919009e..13801f327d 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1996,6 +1996,24 @@  test_expect_success 'sparse index is not expanded: rm' '
 	ensure_not_expanded rm -r deep
 '
 
+test_expect_success 'sparse index is not expanded: diff-index' '
+	init_repos &&
+
+	echo "new" >>sparse-index/g &&
+	git -C sparse-index add g &&
+	git -C sparse-index commit -m "dummy" &&
+	ensure_not_expanded diff-index HEAD~1
+'
+
+test_expect_success 'match all: diff-index' '
+	init_repos &&
+
+	test_all_match git diff-index HEAD &&
+	run_on_all rm g &&
+	test_all_match git diff-index HEAD &&
+	test_all_match git diff-index HEAD --cached
+'
+
 test_expect_success 'grep with and --cached' '
 	init_repos &&