diff mbox series

[5/6] tests: disable GIT_TEST_SPLIT_INDEX for sparse index tests

Message ID 20210817174938.3009923-6-szeder.dev@gmail.com (mailing list archive)
State Superseded
Headers show
Series Fix GIT_TEST_SPLIT_INDEX | expand

Commit Message

SZEDER Gábor Aug. 17, 2021, 5:49 p.m. UTC
The sparse index and split index features are said to be currently
incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
interfere with the test cases exercising the sparse index feature.
Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
whole of 't1092-sparse-checkout-compatibility.sh'.  There are,
however, two other test cases exercising sparse index, namely
'sparse-index enabled and disabled' in
't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
index' in 't7519-status-fsmonitor.sh', and these two could fail with
GIT_TEST_SPLIT_INDEX=1 as well [2].

Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
test cases to avoid such interference.

Note that this is the minimal change to merely avoid failures when
these test cases are run with GIT_TEST_SPLIT_INDEX=1.  Interestingly,
though, without these changes the 'git sparse-checkout init --cone
--sparse-index' commands still succeed even with split index, and set
all the necessary configuration variables and create the initial
'$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
by later sanity checks finding that the index is not in fact a sparse
index.  This indicates that 'git sparse-checkout init --sparse-index'
lacks some error checking and its tests lack coverage related to split
index, but fixing those issues (let alone making sparse index
comparible with split index) is beyond the scope of this patch series.

[1] https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/

[2] Neither of these test cases fail at the moment, because
    GIT_TEST_SPLIT_INDEX=1 is broken and never splits the index, and
    it broke long before the sparse index feature was added.
    This patch series is about to fix GIT_TEST_SPLIT_INDEX, and then
    both test cases mentioned above would fail.

(The diff is best viewed with '--ignore-all-space')

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/t1091-sparse-checkout-builtin.sh | 25 +++++++++------
 t/t7519-status-fsmonitor.sh        | 51 ++++++++++++++++--------------
 2 files changed, 43 insertions(+), 33 deletions(-)

Comments

Derrick Stolee Aug. 17, 2021, 6:26 p.m. UTC | #1
On 8/17/2021 1:49 PM, SZEDER Gábor wrote:
> The sparse index and split index features are said to be currently
> incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
> interfere with the test cases exercising the sparse index feature.
> Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
> whole of 't1092-sparse-checkout-compatibility.sh'.  There are,
> however, two other test cases exercising sparse index, namely
> 'sparse-index enabled and disabled' in
> 't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
> index' in 't7519-status-fsmonitor.sh', and these two could fail with
> GIT_TEST_SPLIT_INDEX=1 as well [2].
> 
> Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
> test cases to avoid such interference.
> 
> Note that this is the minimal change to merely avoid failures when
> these test cases are run with GIT_TEST_SPLIT_INDEX=1.  Interestingly,
> though, without these changes the 'git sparse-checkout init --cone
> --sparse-index' commands still succeed even with split index, and set
> all the necessary configuration variables and create the initial
> '$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
> by later sanity checks finding that the index is not in fact a sparse
> index.  This indicates that 'git sparse-checkout init --sparse-index'
> lacks some error checking and its tests lack coverage related to split
> index, but fixing those issues (let alone making sparse index
> comparible with split index) is beyond the scope of this patch series.

s/comparible/compatible.

I agree that making these two things compatible is not something to
solve today. I'm not sure they should _ever_ be solved because of
the complexity involved (what if the base index is not sparse but
the tip wants to be, or vice-versa?, or if a directory must be
expanded because of a conflict?). They use very different approaches
to solve a similar problem: how to deal with large index files.

* The split index reduces index _write_ time by only editing a diff
  of the base index.

* The sparse index reduces index _read and write_ time by writing a
  smaller index, but only if the user is using cone mode sparse-
  checkout.

>  test_expect_success 'sparse-index enabled and disabled' '
> -	git -C repo sparse-checkout init --cone --sparse-index &&
> -	test_cmp_config -C repo true index.sparse &&
> -	test-tool -C repo read-cache --table >cache &&
> -	grep " tree " cache &&
> -
> -	git -C repo sparse-checkout disable &&
> -	test-tool -C repo read-cache --table >cache &&
> -	! grep " tree " cache &&
> -	git -C repo config --list >config &&
> -	! grep index.sparse config
> +	(
> +		sane_unset GIT_TEST_SPLIT_INDEX &&
> +		git -C repo update-index --no-split-index &&
> +
> +		git -C repo sparse-checkout init --cone --sparse-index &&
> +		test_cmp_config -C repo true index.sparse &&
> +		test-tool -C repo read-cache --table >cache &&
> +		grep " tree " cache &&
> +
> +		git -C repo sparse-checkout disable &&
> +		test-tool -C repo read-cache --table >cache &&
> +		! grep " tree " cache &&
> +		git -C repo config --list >config &&
> +		! grep index.sparse config
> +	)
>  '

This test is safe for now.

>  test_expect_success 'status succeeds with sparse index' '

This test is being edited in ds/sparse-index-ignored-files. v3
of the relevant patch was just sent today [1].

[1] https://lore.kernel.org/git/e66106f7a99d94145eec983ea5e72b7cf8a8a479.1629206603.git.gitgitgadget@gmail.com/

You might want to rebase on top of that topic. The edits to
the test are likely stable now.

Thanks,
-Stolee
SZEDER Gábor Aug. 17, 2021, 9:32 p.m. UTC | #2
On Tue, Aug 17, 2021 at 02:26:23PM -0400, Derrick Stolee wrote:
> On 8/17/2021 1:49 PM, SZEDER Gábor wrote:
> > The sparse index and split index features are said to be currently
> > incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
> > interfere with the test cases exercising the sparse index feature.
> > Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
> > whole of 't1092-sparse-checkout-compatibility.sh'.  There are,
> > however, two other test cases exercising sparse index, namely
> > 'sparse-index enabled and disabled' in
> > 't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
> > index' in 't7519-status-fsmonitor.sh', and these two could fail with
> > GIT_TEST_SPLIT_INDEX=1 as well [2].
> > 
> > Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
> > test cases to avoid such interference.
> > 
> > Note that this is the minimal change to merely avoid failures when
> > these test cases are run with GIT_TEST_SPLIT_INDEX=1.  Interestingly,
> > though, without these changes the 'git sparse-checkout init --cone
> > --sparse-index' commands still succeed even with split index, and set
> > all the necessary configuration variables and create the initial
> > '$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
> > by later sanity checks finding that the index is not in fact a sparse
> > index.  This indicates that 'git sparse-checkout init --sparse-index'
> > lacks some error checking and its tests lack coverage related to split
> > index, but fixing those issues (let alone making sparse index
> > comparible with split index) is beyond the scope of this patch series.
> 
> s/comparible/compatible.
> 
> I agree that making these two things compatible is not something to
> solve today. I'm not sure they should _ever_ be solved because of
> the complexity involved (what if the base index is not sparse but
> the tip wants to be, or vice-versa?,

I think that this's not an issue, because a shared index file is not
for forever, but it's expected that a new shared index is written
every once in a while anyway, see splitIndex.maxPercentChange.  So
whenever sparse index gets enabled/disabled we could just write a new
shared index accordingly.  Maybe even when the sparse patterns change,
if necessary.

> or if a directory must be expanded because of a conflict?).

I'm not quite up-to-date in sparse index terminology, so I'm not sure
that this means...  but as mentioned above we can just write a new
shared index when deemed necessary.

> They use very different approaches
> to solve a similar problem: how to deal with large index files.
> 
> * The split index reduces index _write_ time by only editing a diff
>   of the base index.
> 
> * The sparse index reduces index _read and write_ time by writing a
>   smaller index, but only if the user is using cone mode sparse-
>   checkout.

Yeah, I think that in general there is more to be gained with sparse
index than with split index, though the split index might further
reduce the write time of a sparse index, because the amount of data
written is proportional with the nr of changed files instead of the nr
of all files in the sparse index.  I'm not sure that it's worth it,
either.

My remarks about compatibility primarily stem from your remarks about
compatibility in response to my sparse vs. split test failure report
in:

  https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/

  "the sparse-index is (currently) incompatible with the split-index"

I assumed that that "(currently)" implies that eventually the two will
be made compatible.

Anyway, I'm fine with leaving them incompatible, but 'git
sparse-checkout' should still learn about split index, so it won't
pretend to succeed after it couldn't do what it was asked to when
there is a split index.  Maybe vice-versa as well.

> >  test_expect_success 'sparse-index enabled and disabled' '
> > -	git -C repo sparse-checkout init --cone --sparse-index &&
> > -	test_cmp_config -C repo true index.sparse &&
> > -	test-tool -C repo read-cache --table >cache &&
> > -	grep " tree " cache &&
> > -
> > -	git -C repo sparse-checkout disable &&
> > -	test-tool -C repo read-cache --table >cache &&
> > -	! grep " tree " cache &&
> > -	git -C repo config --list >config &&
> > -	! grep index.sparse config
> > +	(
> > +		sane_unset GIT_TEST_SPLIT_INDEX &&
> > +		git -C repo update-index --no-split-index &&
> > +
> > +		git -C repo sparse-checkout init --cone --sparse-index &&
> > +		test_cmp_config -C repo true index.sparse &&
> > +		test-tool -C repo read-cache --table >cache &&
> > +		grep " tree " cache &&
> > +
> > +		git -C repo sparse-checkout disable &&
> > +		test-tool -C repo read-cache --table >cache &&
> > +		! grep " tree " cache &&
> > +		git -C repo config --list >config &&
> > +		! grep index.sparse config
> > +	)
> >  '
> 
> This test is safe for now.
> 
> >  test_expect_success 'status succeeds with sparse index' '
> 
> This test is being edited in ds/sparse-index-ignored-files. v3
> of the relevant patch was just sent today [1].
> 
> [1] https://lore.kernel.org/git/e66106f7a99d94145eec983ea5e72b7cf8a8a479.1629206603.git.gitgitgadget@gmail.com/
> 
> You might want to rebase on top of that topic. The edits to
> the test are likely stable now.

Oh, no :)  The test 'cone mode clears ignored subdirectories' added in
that patch series (as in 'seen', so not the newes version) fails with
the working GIT_TEST_SPLIT_INDEX=1 as well, and I don't immediately
see why, as it doesn't seem to use sparse index.
Derrick Stolee Aug. 18, 2021, 6:51 p.m. UTC | #3
On 8/17/2021 5:32 PM, SZEDER Gábor wrote:
> On Tue, Aug 17, 2021 at 02:26:23PM -0400, Derrick Stolee wrote:
>> On 8/17/2021 1:49 PM, SZEDER Gábor wrote:
>>> The sparse index and split index features are said to be currently
>>> incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might
>>> interfere with the test cases exercising the sparse index feature.
>>> Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the
>>> whole of 't1092-sparse-checkout-compatibility.sh'.  There are,
>>> however, two other test cases exercising sparse index, namely
>>> 'sparse-index enabled and disabled' in
>>> 't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse
>>> index' in 't7519-status-fsmonitor.sh', and these two could fail with
>>> GIT_TEST_SPLIT_INDEX=1 as well [2].
>>>
>>> Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two
>>> test cases to avoid such interference.
>>>
>>> Note that this is the minimal change to merely avoid failures when
>>> these test cases are run with GIT_TEST_SPLIT_INDEX=1.  Interestingly,
>>> though, without these changes the 'git sparse-checkout init --cone
>>> --sparse-index' commands still succeed even with split index, and set
>>> all the necessary configuration variables and create the initial
>>> '$GIT_DIR/info/sparse-checkout' file, but the test failures are caused
>>> by later sanity checks finding that the index is not in fact a sparse
>>> index.  This indicates that 'git sparse-checkout init --sparse-index'
>>> lacks some error checking and its tests lack coverage related to split
>>> index, but fixing those issues (let alone making sparse index
>>> comparible with split index) is beyond the scope of this patch series.
>>
>> s/comparible/compatible.
>>
>> I agree that making these two things compatible is not something to
>> solve today. I'm not sure they should _ever_ be solved because of
>> the complexity involved (what if the base index is not sparse but
>> the tip wants to be, or vice-versa?,
> 
> I think that this's not an issue, because a shared index file is not
> for forever, but it's expected that a new shared index is written
> every once in a while anyway, see splitIndex.maxPercentChange.  So
> whenever sparse index gets enabled/disabled we could just write a new
> shared index accordingly.  Maybe even when the sparse patterns change,
> if necessary.
> 
>> or if a directory must be expanded because of a conflict?).
> 
> I'm not quite up-to-date in sparse index terminology, so I'm not sure
> that this means...  but as mentioned above we can just write a new
> shared index when deemed necessary.
> 
>> They use very different approaches
>> to solve a similar problem: how to deal with large index files.
>>
>> * The split index reduces index _write_ time by only editing a diff
>>   of the base index.
>>
>> * The sparse index reduces index _read and write_ time by writing a
>>   smaller index, but only if the user is using cone mode sparse-
>>   checkout.
> 
> Yeah, I think that in general there is more to be gained with sparse
> index than with split index, though the split index might further
> reduce the write time of a sparse index, because the amount of data
> written is proportional with the nr of changed files instead of the nr
> of all files in the sparse index.  I'm not sure that it's worth it,
> either.
> 
> My remarks about compatibility primarily stem from your remarks about
> compatibility in response to my sparse vs. split test failure report
> in:
> 
>   https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/
> 
>   "the sparse-index is (currently) incompatible with the split-index"
> 
> I assumed that that "(currently)" implies that eventually the two will
> be made compatible.

I say "(currently)" because there is nothing stopping someone from
integrating them. I'm not sure anyone ever will.

> Anyway, I'm fine with leaving them incompatible, but 'git
> sparse-checkout' should still learn about split index, so it won't
> pretend to succeed after it couldn't do what it was asked to when
> there is a split index.  Maybe vice-versa as well.

You're saying that when 'git sparse-checkout init --cone --sparse-index'
doesn't write a sparse index because the index is already split, we
should actually die() saying "split index and sparse index are
incompatible" so the user knows why the sparse index was not enabled?
That seems reasonable for now, although it is successful in setting
the config to enable the sparse index opportunistically in the future.

>>>  test_expect_success 'sparse-index enabled and disabled' '
>>> -	git -C repo sparse-checkout init --cone --sparse-index &&
>>> -	test_cmp_config -C repo true index.sparse &&
>>> -	test-tool -C repo read-cache --table >cache &&
>>> -	grep " tree " cache &&
>>> -
>>> -	git -C repo sparse-checkout disable &&
>>> -	test-tool -C repo read-cache --table >cache &&
>>> -	! grep " tree " cache &&
>>> -	git -C repo config --list >config &&
>>> -	! grep index.sparse config
>>> +	(
>>> +		sane_unset GIT_TEST_SPLIT_INDEX &&
>>> +		git -C repo update-index --no-split-index &&
>>> +
>>> +		git -C repo sparse-checkout init --cone --sparse-index &&
>>> +		test_cmp_config -C repo true index.sparse &&
>>> +		test-tool -C repo read-cache --table >cache &&
>>> +		grep " tree " cache &&
>>> +
>>> +		git -C repo sparse-checkout disable &&
>>> +		test-tool -C repo read-cache --table >cache &&
>>> +		! grep " tree " cache &&
>>> +		git -C repo config --list >config &&
>>> +		! grep index.sparse config
>>> +	)
>>>  '
>>
>> This test is safe for now.
>>
>>>  test_expect_success 'status succeeds with sparse index' '
>>
>> This test is being edited in ds/sparse-index-ignored-files. v3
>> of the relevant patch was just sent today [1].
>>
>> [1] https://lore.kernel.org/git/e66106f7a99d94145eec983ea5e72b7cf8a8a479.1629206603.git.gitgitgadget@gmail.com/
>>
>> You might want to rebase on top of that topic. The edits to
>> the test are likely stable now.
> 
> Oh, no :)  The test 'cone mode clears ignored subdirectories' added in
> that patch series (as in 'seen', so not the newes version) fails with
> the working GIT_TEST_SPLIT_INDEX=1 as well, and I don't immediately
> see why, as it doesn't seem to use sparse index.
 
Oh interesting! This is because the index converts to a sparse index
in-memory solely so it can use the sparse directory entries to pick
which directories need to be deleted. In the presence of a split
index, convert_to_sparse() fails due to these lines:

	if (istate->split_index || istate->sparse_index || !istate->cache_nr ||
	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
		return 0;

	if (!istate->repo)
		istate->repo = the_repository;

	if (!(flags & SPARSE_INDEX_IGNORE_CONFIG)) {
		...

While this if here is checking the condition for creating it in-memory.

The fix is likely to do the following instead:

	if (istate->sparse_index || !istate->cache_nr ||
	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
		return 0;

	if (!istate->repo)
		istate->repo = the_repository;

	if (!(flags & SPARSE_INDEX_IGNORE_CONFIG)) {
		if (istate->split_index)
			return 0;
		...

and maybe even rename the flag to SPARSE_INDEX_MEMORY_ONLY or something.

Thanks for alerting me to this issue!

-Stolee
diff mbox series

Patch

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 38fc8340f5..3f94c41241 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -206,16 +206,21 @@  test_expect_success 'sparse-checkout disable' '
 '
 
 test_expect_success 'sparse-index enabled and disabled' '
-	git -C repo sparse-checkout init --cone --sparse-index &&
-	test_cmp_config -C repo true index.sparse &&
-	test-tool -C repo read-cache --table >cache &&
-	grep " tree " cache &&
-
-	git -C repo sparse-checkout disable &&
-	test-tool -C repo read-cache --table >cache &&
-	! grep " tree " cache &&
-	git -C repo config --list >config &&
-	! grep index.sparse config
+	(
+		sane_unset GIT_TEST_SPLIT_INDEX &&
+		git -C repo update-index --no-split-index &&
+
+		git -C repo sparse-checkout init --cone --sparse-index &&
+		test_cmp_config -C repo true index.sparse &&
+		test-tool -C repo read-cache --table >cache &&
+		grep " tree " cache &&
+
+		git -C repo sparse-checkout disable &&
+		test-tool -C repo read-cache --table >cache &&
+		! grep " tree " cache &&
+		git -C repo config --list >config &&
+		! grep index.sparse config
+	)
 '
 
 test_expect_success 'cone mode: init and set' '
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index deea88d443..513e17065f 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -402,34 +402,39 @@  check_sparse_index_behavior () {
 }
 
 test_expect_success 'status succeeds with sparse index' '
-	git reset --hard &&
+	(
+		sane_unset GIT_TEST_SPLIT_INDEX &&
+		git update-index --no-split-index &&
 
-	test_config core.fsmonitor "$TEST_DIRECTORY/t7519/fsmonitor-all" &&
-	check_sparse_index_behavior ! &&
+		git reset --hard &&
 
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-		printf "last_update_token\0"
-	EOF
-	git config core.fsmonitor .git/hooks/fsmonitor-test &&
-	check_sparse_index_behavior ! &&
+		test_config core.fsmonitor "$TEST_DIRECTORY/t7519/fsmonitor-all" &&
+		check_sparse_index_behavior ! &&
 
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-		printf "last_update_token\0"
-		printf "dir1/modified\0"
-	EOF
-	check_sparse_index_behavior ! &&
+		write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			printf "last_update_token\0"
+		EOF
+		git config core.fsmonitor .git/hooks/fsmonitor-test &&
+		check_sparse_index_behavior ! &&
 
-	cp -r dir1 dir1a &&
-	git add dir1a &&
-	git commit -m "add dir1a" &&
+		write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			printf "last_update_token\0"
+			printf "dir1/modified\0"
+		EOF
+		check_sparse_index_behavior ! &&
 
-	# This one modifies outside the sparse-checkout definition
-	# and hence we expect to expand the sparse-index.
-	write_script .git/hooks/fsmonitor-test<<-\EOF &&
-		printf "last_update_token\0"
-		printf "dir1a/modified\0"
-	EOF
-	check_sparse_index_behavior
+		cp -r dir1 dir1a &&
+		git add dir1a &&
+		git commit -m "add dir1a" &&
+
+		# This one modifies outside the sparse-checkout definition
+		# and hence we expect to expand the sparse-index.
+		write_script .git/hooks/fsmonitor-test<<-\EOF &&
+			printf "last_update_token\0"
+			printf "dir1a/modified\0"
+		EOF
+		check_sparse_index_behavior
+	)
 '
 
 test_done