diff mbox series

[3/6] merge: make sparse-aware with ORT

Message ID 4c1104a0dd3af4a895df42f43306c24965a0323c.1629220124.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Series Sparse Index: Integrate with merge, cherry-pick, rebase, and revert

Commit Message

Derrick Stolee Aug. 17, 2021, 5:08 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:

1. If the merge strategy is 'recursive', then we enable
   command_requires_full_index at the start of the merge_recursive()
   method. We expect sparse-index users to also have the 'ort' strategy
   enabled.

2. If the merge results in a conflicted file, then we expand the index
   before updating the working tree. The loop that iterates over the
   worktree replaces index entries and tracks 'original_cache_nr' which
   can become completely wrong if the index expands in the middle of the
   operation. This safety valve is important before that loop starts. A
   later change will focus this to only expand if we indeed have a
   conflict outside of the sparse-checkout cone.
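
The hazard in item 2 can be sketched in standalone C. This is only an illustration, not Git code: `struct toy_index`, `toy_expand()` and `toy_stale_by()` are hypothetical names, and the toy index is a plain array of path strings rather than Git's `struct index_state`. It shows how a count cached before a loop stops matching the real entry count once a sparse-directory entry is expanded into its files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy index; this is NOT git's struct index_state. */
struct toy_index {
	const char **names; /* entry paths; a trailing '/' marks a sparse directory */
	size_t nr;          /* number of entries currently in the index */
};

/*
 * Replace the sparse-directory entry at 'pos' with 'count' file entries.
 * Returns 0 on success, -1 on allocation failure.
 */
static int toy_expand(struct toy_index *idx, size_t pos,
		      const char **files, size_t count)
{
	size_t new_nr;
	const char **grown;

	assert(pos < idx->nr && count > 0);
	new_nr = idx->nr - 1 + count;
	grown = malloc(new_nr * sizeof(*grown));
	if (!grown)
		return -1;

	/* Entries before 'pos', then the expanded files, then the rest. */
	memcpy(grown, idx->names, pos * sizeof(*grown));
	memcpy(grown + pos, files, count * sizeof(*grown));
	memcpy(grown + pos + count, idx->names + pos + 1,
	       (idx->nr - pos - 1) * sizeof(*grown));

	free(idx->names);
	idx->names = grown;
	idx->nr = new_nr;
	return 0;
}

/*
 * Cache the entry count, expand one sparse directory, and report how far
 * the cached count lags behind the real one (0 if allocation fails).
 */
static size_t toy_stale_by(void)
{
	const char *files[] = { "folder1/a", "folder1/b", "folder1/c" };
	struct toy_index idx;
	size_t original_nr, stale_by;

	idx.nr = 3;
	idx.names = malloc(idx.nr * sizeof(*idx.names));
	if (!idx.names)
		return 0;
	idx.names[0] = "a";
	idx.names[1] = "folder1/"; /* sparse-directory entry */
	idx.names[2] = "z";

	original_nr = idx.nr; /* count cached before the "loop" */
	if (toy_expand(&idx, 1, files, 3)) {
		free(idx.names);
		return 0;
	}

	stale_by = idx.nr - original_nr;
	free(idx.names);
	return stale_by;
}
```

In this toy model, expanding up front (as the patch does with ensure_full_index()) means the count is cached only after the array has reached its final size.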

Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/merge.c                          | 3 +++
 merge-ort.c                              | 8 ++++++++
 merge-recursive.c                        | 3 +++
 t/t1092-sparse-checkout-compatibility.sh | 8 ++++++--
 4 files changed, 20 insertions(+), 2 deletions(-)

Comments

Elijah Newren Aug. 20, 2021, 9:40 p.m. UTC | #1
On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Allow 'git merge' to operate without expanding a sparse index, at least
> not immediately. The index still will be expanded in a few cases:
>
> 1. If the merge strategy is 'recursive', then we enable
>    command_requires_full_index at the start of the merge_recursive()
>    method. We expect sparse-index users to also have the 'ort' strategy
>    enabled.

What about `resolve`, `octopus`, `subtree` (which technically could be
implemented via either recursive or ort, such fun...) or a
user-defined strategy?

`resolve` and `octopus` would absolutely need a full index.  `subtree`
would if implemented via merge-recursive, and not if implemented via
merge-ort.

I'm not sure what to assume about user-defined strategies; I guess for
safety reasons and backward compatibility, we should always expand?
Or maybe there are no backward compatibility concerns since no one who
uses a sparse-index will attempt to use any pre-existing external
merge strategies (are there even any known ones in the wild or is this
still a theoretical capability?), and we can assume they will only use
ones written in the future?  Hmm...

> 2. If the merge results in a conflicted file, then we expand the index
>    before updating the working tree. The loop that iterates over the
>    worktree replaces index entries and tracks 'original_cache_nr' which
>    can become completely wrong if the index expands in the middle of the
>    operation. This safety valve is important before that loop starts. A
>    later change will focus this to only expand if we indeed have a
>    conflict outside of the sparse-checkout cone.

From reading the patch below, this is specific to ort, but that wasn't
clear on reading the commit message.

> Some test updates are required, including a mistaken 'git checkout -b'
> that did not specify the base branch, causing merges to be fast-forward
> merges.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/merge.c                          | 3 +++
>  merge-ort.c                              | 8 ++++++++
>  merge-recursive.c                        | 3 +++
>  t/t1092-sparse-checkout-compatibility.sh | 8 ++++++--
>  4 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 22f23990b37..926de328fbb 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>         if (argc == 2 && !strcmp(argv[1], "-h"))
>                 usage_with_options(builtin_merge_usage, builtin_merge_options);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         /*
>          * Check if we are _not_ on a detached HEAD, i.e. if there is a
>          * current branch.
> diff --git a/merge-ort.c b/merge-ort.c
> index 6eb910d6f0c..8e754b769e1 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
>         if (strmap_empty(&opt->priv->conflicted))
>                 return 0;
>
> +       /*
> +        * We are in a conflicted state. These conflicts might be inside
> +        * sparse-directory entries, so expand the index preemtively.

s/preemtively/preemptively/

> +        * Also, we set original_cache_nr below, but that might change if
> +        * index_name_pos() calls ask for paths within sparse directories.
> +        */
> +       ensure_full_index(index);
> +

This seems somewhat pessimistic; what if all the conflicts are within
the sparse-checkout?  Having conflicts contained within the
sparse-checkout seems likely, since we'd only get conflicts for files
modified by both sides of history, and sparse-checkouts are used when
users aren't going to modify files outside the sparse-checkout.
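
As a rough illustration of the narrower check suggested here, the following standalone C sketch decides whether any conflicted path falls outside the sparse-checkout cone. It is not Git code and uses none of Git's APIs: `toy_path_in_dir()` and `toy_any_conflict_outside_cone()` are hypothetical names, the cone is assumed to be a list of directory prefixes ending in '/', and only root-level files are treated as always included (files sitting directly in parent directories of a cone directory are ignored for brevity).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if 'path' lies under directory prefix 'dir' (which ends in '/'). */
static int toy_path_in_dir(const char *path, const char *dir)
{
	return !strncmp(path, dir, strlen(dir));
}

/*
 * Returns 1 if at least one conflicted path is outside every cone
 * directory, 0 if all conflicts are inside the cone.
 *
 * Simplifying assumption: a path with no '/' is a root-level file and
 * is always considered inside the cone.
 */
static int toy_any_conflict_outside_cone(const char **conflicted,
					 size_t nr_conflicted,
					 const char **cone_dirs,
					 size_t nr_dirs)
{
	size_t i, j;

	assert(conflicted || !nr_conflicted);
	assert(cone_dirs || !nr_dirs);

	for (i = 0; i < nr_conflicted; i++) {
		int inside = !strchr(conflicted[i], '/');

		for (j = 0; !inside && j < nr_dirs; j++)
			inside = toy_path_in_dir(conflicted[i], cone_dirs[j]);
		if (!inside)
			return 1; /* this conflict would need the full index */
	}
	return 0;
}
```

Under these assumptions, a caller would expand the index only when the function returns 1, and keep it sparse when every conflict is inside the cone.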

>         /* If any entries have skip_worktree set, we'll have to check 'em out */
>         state.force = 1;
>         state.quiet = 1;
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 3355d50e8ad..1f563cd6874 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
>         assert(opt->ancestor == NULL ||
>                !strcmp(opt->ancestor, "constructed merge base"));
>
> +       prepare_repo_settings(opt->repo);
> +       opt->repo->settings.command_requires_full_index = 1;
> +
>         if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
>                 return -1;
>         clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 3e01e70fa0b..781ebd9a656 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -52,7 +52,7 @@ test_expect_success 'setup' '
>                 git checkout -b base &&
>                 for dir in folder1 folder2 deep
>                 do
> -                       git checkout -b update-$dir &&
> +                       git checkout -b update-$dir base &&
>                         echo "updated $dir" >$dir/a &&
>                         git commit -a -m "update $dir" || return 1
>                 done &&
> @@ -652,7 +652,11 @@ test_expect_success 'sparse-index is not expanded' '
>         echo >>sparse-index/extra.txt &&
>         ensure_not_expanded add extra.txt &&
>         echo >>sparse-index/untracked.txt &&
> -       ensure_not_expanded add .
> +       ensure_not_expanded add . &&
> +
> +       ensure_not_expanded checkout -f update-deep &&
> +       ensure_not_expanded merge -s ort -m merge update-folder1 &&
> +       ensure_not_expanded merge -s ort -m merge update-folder2

Can we just set GIT_TEST_MERGE_ALGORITHM=ort at the beginning of the
test file and then avoid repeating `-s ort`?


>  '
>
>  # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
> --
> gitgitgadget

Patch

diff --git a/builtin/merge.c b/builtin/merge.c
index 22f23990b37..926de328fbb 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1276,6 +1276,9 @@  int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_merge_usage, builtin_merge_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/*
 	 * Check if we are _not_ on a detached HEAD, i.e. if there is a
 	 * current branch.
diff --git a/merge-ort.c b/merge-ort.c
index 6eb910d6f0c..8e754b769e1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4058,6 +4058,14 @@  static int record_conflicted_index_entries(struct merge_options *opt)
 	if (strmap_empty(&opt->priv->conflicted))
 		return 0;
 
+	/*
+	 * We are in a conflicted state. These conflicts might be inside
+	 * sparse-directory entries, so expand the index preemtively.
+	 * Also, we set original_cache_nr below, but that might change if
+	 * index_name_pos() calls ask for paths within sparse directories.
+	 */
+	ensure_full_index(index);
+
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
 	state.quiet = 1;
diff --git a/merge-recursive.c b/merge-recursive.c
index 3355d50e8ad..1f563cd6874 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3750,6 +3750,9 @@  int merge_recursive(struct merge_options *opt,
 	assert(opt->ancestor == NULL ||
 	       !strcmp(opt->ancestor, "constructed merge base"));
 
+	prepare_repo_settings(opt->repo);
+	opt->repo->settings.command_requires_full_index = 1;
+
 	if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
 		return -1;
 	clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 3e01e70fa0b..781ebd9a656 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -52,7 +52,7 @@  test_expect_success 'setup' '
 		git checkout -b base &&
 		for dir in folder1 folder2 deep
 		do
-			git checkout -b update-$dir &&
+			git checkout -b update-$dir base &&
 			echo "updated $dir" >$dir/a &&
 			git commit -a -m "update $dir" || return 1
 		done &&
@@ -652,7 +652,11 @@  test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/extra.txt &&
 	ensure_not_expanded add extra.txt &&
 	echo >>sparse-index/untracked.txt &&
-	ensure_not_expanded add .
+	ensure_not_expanded add . &&
+
+	ensure_not_expanded checkout -f update-deep &&
+	ensure_not_expanded merge -s ort -m merge update-folder1 &&
+	ensure_not_expanded merge -s ort -m merge update-folder2
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout