diff mbox series

[1/6] t1092: use ORT merge strategy

Message ID 7cad9eee90bcee3cb98be5c7a2edaca5e855c157.1629220124.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse Index: Integrate with merge, cherry-pick, rebase, and revert | expand

Commit Message

Derrick Stolee Aug. 17, 2021, 5:08 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The sparse index will be compatible with the ORT merge strategy, so
let's use it explicitly in our tests.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Taylor Blau Aug. 18, 2021, 5:16 p.m. UTC | #1
On Tue, Aug 17, 2021 at 05:08:39PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The sparse index will be compatible with the ORT merge strategy, so
> let's use it explicitly in our tests.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ddc86bb4152..3e01e70fa0b 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -7,6 +7,11 @@ GIT_TEST_SPARSE_INDEX=
>
>  . ./test-lib.sh
>
> +# Force the use of the ORT merge algorithm until testing with the
> +# recursive strategy. We expect ORT to be used with sparse-index.
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM
> +

Looks good, but are the lower hunks which set `-s ort` necessary, too? I
applied this series into my tree and t1092 still passes after reverting
the next two hunks.

Not worth a reroll on its own, just curious.

Thanks,
Taylor
Junio C Hamano Aug. 18, 2021, 6:10 p.m. UTC | #2
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> The sparse index will be compatible with the ORT merge strategy, so
> let's use it explicitly in our tests.

Unless you mean that the sparse index will only be compatible with
ort, but will never be with recursive, I do not quite see why this
is taking us into a good direction.  Is this because we want to gain
test coverage for ort early, before we flip the default to ort [*1*]?



[Footnote]

*1* If the answer is "no, it is because sparse index will not work
    with recursive", the please disregard the rest, but just in
    case it is not...

    It seems to me that it would let us live in the future in a more
    comprehensive way if we tweaked merge_recursive() and/or
    merge_recursive_generic() so that all internal callers, not just
    builtin/merge.c, would redirect to the ort machinery when say
    GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
    doing it that way we do not need to sprinkle "-srecursive" and
    "-sort" everywhere in our tests at randomly chosen places to
    give test coverage to both strategies.


> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ddc86bb4152..3e01e70fa0b 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -7,6 +7,11 @@ GIT_TEST_SPARSE_INDEX=
>  
>  . ./test-lib.sh
>  
> +# Force the use of the ORT merge algorithm until testing with the
> +# recursive strategy. We expect ORT to be used with sparse-index.
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM
> +
>  test_expect_success 'setup' '
>  	git init initial-repo &&
>  	(
> @@ -501,7 +506,7 @@ test_expect_success 'merge with conflict outside cone' '
>  
>  	test_all_match git checkout -b merge-tip merge-left &&
>  	test_all_match git status --porcelain=v2 &&
> -	test_all_match test_must_fail git merge -m merge merge-right &&
> +	test_all_match test_must_fail git merge -sort -m merge merge-right &&
>  	test_all_match git status --porcelain=v2 &&
>  
>  	# Resolve the conflict in different ways:
> @@ -531,7 +536,7 @@ test_expect_success 'merge with outside renames' '
>  	do
>  		test_all_match git reset --hard &&
>  		test_all_match git checkout -f -b merge-$type update-deep &&
> -		test_all_match git merge -m "$type" rename-$type &&
> +		test_all_match git merge -sort -m "$type" rename-$type &&
>  		test_all_match git rev-parse HEAD^{tree} || return 1
>  	done
>  '
Derrick Stolee Aug. 18, 2021, 6:42 p.m. UTC | #3
On 8/18/2021 2:10 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The sparse index will be compatible with the ORT merge strategy, so
>> let's use it explicitly in our tests.
> 
> Unless you mean that the sparse index will only be compatible with
> ort, but will never be with recursive, I do not quite see why this
> is taking us into a good direction.  Is this because we want to gain
> test coverage for ort early, before we flip the default to ort [*1*]?

The sparse index will _work_ with the recursive merge strategy, it will
just continue to be slow, and likely slower than if we had a full index.
This is because the recursive merge algorithm will expand a sparse index
into a full one before doing any of its logic (hence my confidence that
it will work).

The main point why ORT is the focus is that the ORT strategy is required
so the sparse index can get the intended performance gains (i.e. it does
not expand in most cases). The ORT algorithm can resolve conflicts
outside the sparse-checkout cone without needing the index as a data
structure and instead the resulting tree is recorded in the correct
sparse directory entry.

> [Footnote]
> 
> *1* If the answer is "no, it is because sparse index will not work
>     with recursive", the please disregard the rest, but just in
>     case it is not...
> 
>     It seems to me that it would let us live in the future in a more
>     comprehensive way if we tweaked merge_recursive() and/or
>     merge_recursive_generic() so that all internal callers, not just
>     builtin/merge.c, would redirect to the ort machinery when say
>     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
>     doing it that way we do not need to sprinkle "-srecursive" and
>     "-sort" everywhere in our tests at randomly chosen places to
>     give test coverage to both strategies.

I could also change this patch to stop using ORT _all the time_ and
instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.

That is, except for the final tests that check that the index is not
expanded. Those tests must specify the ORT strategy explicitly.

I think I started playing with the GIT_TEST_MERGE_ALGORITHM because
it appears to override the command-line option, but I will need to
double-check that.

Thanks,
-Stolee
Junio C Hamano Aug. 18, 2021, 10:22 p.m. UTC | #4
Derrick Stolee <stolee@gmail.com> writes:

> On 8/18/2021 2:10 PM, Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> 
>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>
>>> The sparse index will be compatible with the ORT merge strategy, so
>>> let's use it explicitly in our tests.
>> 
>> Unless you mean that the sparse index will only be compatible with
>> ort, but will never be with recursive, I do not quite see why this
>> is taking us into a good direction.  Is this because we want to gain
>> test coverage for ort early, before we flip the default to ort [*1*]?
>
> The sparse index will _work_ with the recursive merge strategy, it will
> just continue to be slow, and likely slower than if we had a full index.
> This is because the recursive merge algorithm will expand a sparse index
> into a full one before doing any of its logic (hence my confidence that
> it will work).

Ah, thanks for explanation.  The tests being touched are not about
correctness of the merge results but the damage the operation would
make to the sparseness of the index, and because in the longer term
the recursive backend is on its way out, we do want to focus on
testing how ORT performs.

> I could also change this patch to stop using ORT _all the time_ and
> instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.

No, that's OK.

It was unclear from the proposed log message (hence my questions),
but if it is made clear that we have a good reason why we want to
see how the sparse-index works with ORT, explicitly testing with ORT
like your patch does is the right thing to do, I would think.

With GIT_TEST_MERGE_ALGORITHM set to ort and exported, it is
puzzling why some "git merge" invocations gets "-s ort" in the same
patch, though.

Thanks.
Elijah Newren Aug. 20, 2021, 9:23 p.m. UTC | #5
On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> >     It seems to me that it would let us live in the future in a more
> >     comprehensive way if we tweaked merge_recursive() and/or
> >     merge_recursive_generic() so that all internal callers, not just
> >     builtin/merge.c, would redirect to the ort machinery when say
> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
> >     doing it that way we do not need to sprinkle "-srecursive" and
> >     "-sort" everywhere in our tests at randomly chosen places to
> >     give test coverage to both strategies.

GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
`-s recursive` sprinkled everywhere (due to contrast with `-s
resolve`), but since I wanted to use all existing recursive tests as
ort tests, then rather than tweaking all the test files and copying
tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.

> I could also change this patch to stop using ORT _all the time_ and
> instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.
>
> That is, except for the final tests that check that the index is not
> expanded. Those tests must specify the ORT strategy explicitly.
>
> I think I started playing with the GIT_TEST_MERGE_ALGORITHM because
> it appears to override the command-line option, but I will need to
> double-check that.

Yes, GIT_TEST_MERGE_ALGORITHM=ort reinterprets "recursive" to mean "ort".
Junio C Hamano Aug. 20, 2021, 11:32 p.m. UTC | #6
Elijah Newren <newren@gmail.com> writes:

> On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> >     It seems to me that it would let us live in the future in a more
>> >     comprehensive way if we tweaked merge_recursive() and/or
>> >     merge_recursive_generic() so that all internal callers, not just
>> >     builtin/merge.c, would redirect to the ort machinery when say
>> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
>> >     doing it that way we do not need to sprinkle "-srecursive" and
>> >     "-sort" everywhere in our tests at randomly chosen places to
>> >     give test coverage to both strategies.
>
> GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
> `-s recursive` sprinkled everywhere (due to contrast with `-s
> resolve`), but since I wanted to use all existing recursive tests as
> ort tests, then rather than tweaking all the test files and copying
> tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
> reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.

I somehow thought that direct calls to merge_recursive() and
merge_recursive_generic() are not affected with that environment
variable, so you cannot influence what happens during "git am -3"
and "git stash apply" with that, but perhaps I was not reading the
code correctly.

It seems that merge_recursive() and merge_ort_recursive() are
interface compatible and the latter can serve as a drop-in
replacement for the former?
Elijah Newren Aug. 21, 2021, 12:20 a.m. UTC | #7
On Fri, Aug 20, 2021 at 4:32 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
> >>
> >> >     It seems to me that it would let us live in the future in a more
> >> >     comprehensive way if we tweaked merge_recursive() and/or
> >> >     merge_recursive_generic() so that all internal callers, not just
> >> >     builtin/merge.c, would redirect to the ort machinery when say
> >> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
> >> >     doing it that way we do not need to sprinkle "-srecursive" and
> >> >     "-sort" everywhere in our tests at randomly chosen places to
> >> >     give test coverage to both strategies.
> >
> > GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
> > `-s recursive` sprinkled everywhere (due to contrast with `-s
> > resolve`), but since I wanted to use all existing recursive tests as
> > ort tests, then rather than tweaking all the test files and copying
> > tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
> > reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.
>
> I somehow thought that direct calls to merge_recursive() and
> merge_recursive_generic() are not affected with that environment
> variable, so you cannot influence what happens during "git am -3"
> and "git stash apply" with that, but perhaps I was not reading the
> code correctly.

Sorry for being unclear.  I was responding to the "sprinkling" portion
of the quote; GIT_TEST_MERGE_ALGORITHM allows us to avoid sprinkling
-srecursive and -sort in various places.

You are correct that merge_recursive() and merge_recursive_generic()
are unaffected by the environment variable; the environment variable
operates at a higher level in the code to choose whether to call e.g.
merge_recursive() vs. merge_incore_recursive().

> It seems that merge_recursive() and merge_ort_recursive() are
> interface compatible and the latter can serve as a drop-in
> replacement for the former?

Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
interface compatible drop-in replacements for merge_recursive() and
merge_trees(), to make it easy to switch callers over.

There is no such replacement for merge_recursive_generic(), though,
and builtin/{am, merge-recursive, stash}.c will all need to be
modified to work with merge-ort.  IIRC, when last we discussed that
interface, we realized that the three were using it a bit differently
and it had some hardcoded am-specific assumptions that were not
appropriate for the other two, so it's not clear to me we should port
that interface.
Junio C Hamano Aug. 21, 2021, 4:20 a.m. UTC | #8
Elijah Newren <newren@gmail.com> writes:

>> It seems that merge_recursive() and merge_ort_recursive() are
>> interface compatible and the latter can serve as a drop-in
>> replacement for the former?
>
> Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
> interface compatible drop-in replacements for merge_recursive() and
> merge_trees(), to make it easy to switch callers over.
>
> There is no such replacement for merge_recursive_generic(), though,
> and builtin/{am, merge-recursive, stash}.c will all need to be
> modified to work with merge-ort.

But merge_recursive_generic() eveantually calls into merge_recursive();
as long as you hook into the latter, you're covered, no?
Elijah Newren Aug. 21, 2021, 11:48 p.m. UTC | #9
On Fri, Aug 20, 2021 at 9:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> It seems that merge_recursive() and merge_ort_recursive() are
> >> interface compatible and the latter can serve as a drop-in
> >> replacement for the former?
> >
> > Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
> > interface compatible drop-in replacements for merge_recursive() and
> > merge_trees(), to make it easy to switch callers over.
> >
> > There is no such replacement for merge_recursive_generic(), though,
> > and builtin/{am, merge-recursive, stash}.c will all need to be
> > modified to work with merge-ort.
>
> But merge_recursive_generic() eventually calls into merge_recursive();
> as long as you hook into the latter, you're covered, no?

Okay, you caught me.  merge_ort_recursive() has one API difference
from merge_recursive(), namely, it does not allow opt->ancestor to be
anything other than NULL, whereas merge_recursive() does.  The only
caller where that distinction matters is
merge_recursive_generic()...but that does prevent
merge_recursive_generic() from just calling merge_ort_recursive().

The original patches I sent in for merge_incore_recursive() would have
provided for this same ability (and made merge_ort_recursive()
actually be a drop in replacement), but you rightfully pointed out
that the relevant opt->ancestor portion of the patches made no sense.
At the time, I responded (at
https://lore.kernel.org/git/CABPp-BFr=1iVY739cfh-1Hp82x-Mny-k4y0f3zZ_YuP3PxiGfQ@mail.gmail.com/):

"""
Yeah, you're making me lean towards thinking that
merge_recursive_generic() is just a broken API that I shouldn't port
over (even as a wrapper), and that I further shouldn't support using
the merge_ort_recursive() API in a fashion wanted by that function.
"""

The problem with opt->ancestor in merge_recursive_generic() is as follows:

  * When there is only one merge base (and opt->ancestor is not set),
merge_ort_recursive() and merge_recursive() will both set
opt->ancestor to the hash of the merge base commit.

  * The hash of the merge base is great when the merge base is a real
commit, less so when fake commits are generated (as
merge_recursive_generic() does) to pass along.

  * Because of the above, merge_recursive_generic() overrides
opt->ancestor with the value "constructed merge base" when there is
exactly one merge base tree.  That was done with git-am in mind, and
makes sense for am because it does create a fake or constructed merge
base.  It does not make sense to me for stash, which has a real
commit.  It also seems suboptimal or wrong for
builtin/merge-recursive.c as well -- it's hard to be sure since
builtin/merge-recursive simply takes an OID rather than actual branch
names and thus those OIDs could correspond to fake or constructed
commits, but builtin/merge-recursive does have the
better_branch_name() function it uses for o.branch1 and o.branch2 and
it seems like it should do the same on the merge base when it's
unique.


Rather than porting this bug (these bugs?) over from merge-recursive,
I'll just convert the merge_recursive_generic() callers over to the
new API.
diff mbox series

Patch

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ddc86bb4152..3e01e70fa0b 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -7,6 +7,11 @@  GIT_TEST_SPARSE_INDEX=
 
 . ./test-lib.sh
 
+# Force the use of the ORT merge algorithm until testing with the
+# recursive strategy. We expect ORT to be used with sparse-index.
+GIT_TEST_MERGE_ALGORITHM=ort
+export GIT_TEST_MERGE_ALGORITHM
+
 test_expect_success 'setup' '
 	git init initial-repo &&
 	(
@@ -501,7 +506,7 @@  test_expect_success 'merge with conflict outside cone' '
 
 	test_all_match git checkout -b merge-tip merge-left &&
 	test_all_match git status --porcelain=v2 &&
-	test_all_match test_must_fail git merge -m merge merge-right &&
+	test_all_match test_must_fail git merge -sort -m merge merge-right &&
 	test_all_match git status --porcelain=v2 &&
 
 	# Resolve the conflict in different ways:
@@ -531,7 +536,7 @@  test_expect_success 'merge with outside renames' '
 	do
 		test_all_match git reset --hard &&
 		test_all_match git checkout -f -b merge-$type update-deep &&
-		test_all_match git merge -m "$type" rename-$type &&
+		test_all_match git merge -sort -m "$type" rename-$type &&
 		test_all_match git rev-parse HEAD^{tree} || return 1
 	done
 '