diff mbox series

[v2,5/6] merge-ort: fix merge.directoryRenames=false

Message ID b25225c3cabf7aafc659489e302193fd57829d5e.1741834001.git.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series Small new merge-ort features, prepping for deletion of merge-recursive.[ch] | expand

Commit Message

Elijah Newren March 13, 2025, 2:46 a.m. UTC
From: Elijah Newren <newren@gmail.com>

There are two issues here.

First, when merge.directoryRenames is set to false, there are a few code
paths that should be turned off.  I missed one; collect_renames() was
still doing some directory rename detection logic unconditionally.  It
ended up not having much effect because
get_provisional_directory_renames() was skipped earlier and not setting
up renames->dir_renames, but the code should still be skipped.

Second, the larger issue is that sometimes we get a cached_pair rename
from a previous commit being replayed mapping A->B, but in a subsequent
commit but collect_merge_info() doesn't even recurse into the
directory containing B because there are no source pairings for that
rename that are relevant; we can merge that commit fine without knowing
the rename.  But since the cached renames are added to the normal
renames, when we go to process it and find that B is not part of
opt->priv->paths, we hit the assertion error
  process_renames: Assertion `newinfo && ~newinfo->merged.clean` failed.
I think we could fix this at the beginning of detect_regular_renames() by
pruning from cached_pairs any entry whose destination isn't in
opt->priv->paths, but it's suboptimal in that we'd kind of like the
cached_pair to be restored afterwards so that it can help the subsequent
commit, but more importantly since it sits at the intersection of
the caching renames optimization and the relevant renames optimization,
and the trivial directory resolution optimization, and I don't currently
have Documentation/technical/remembering-renames.txt fully paged in, I'm
not sure if that's a full solution or a bandaid for the current
testcase.  However, since the remembering renames optimization was the
weakest of the set, and the optimization is far less important when
directory rename detection is off (as that implies far fewer potential
renames), let's just use a bigger hammer to ensure this special case is
fixed: turn off the rename caching.  We do the same thing already when
we encounter rename/rename(1to1) cases (as per `git grep -3
disabling.the.optimization`, though it uses a slightly different
triggering mechanism since it's trying to affect the next time that
merge_check_renames_reusable() is called), and I think it makes sense
to do the same here.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c              | 31 +++++++++++++++++++++++++++++--
 t/t3650-replay-basics.sh |  2 +-
 2 files changed, 30 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/merge-ort.c b/merge-ort.c
index 1d3b690224e..785e5c6f24a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -3404,6 +3404,11 @@  static int collect_renames(struct merge_options *opt,
 			pool_diff_free_filepair(&opt->priv->pool, p);
 			continue;
 		}
+		if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE &&
+		    p->status == 'R' && 1) {
+			possibly_cache_new_pair(renames, p, side_index, NULL);
+			goto skip_directory_renames;
+		}
 
 		new_path = check_for_directory_rename(opt, p->two->path,
 						      side_index,
@@ -3421,6 +3426,7 @@  static int collect_renames(struct merge_options *opt,
 		if (new_path)
 			apply_directory_rename_modifications(opt, p, new_path);
 
+skip_directory_renames:
 		/*
 		 * p->score comes back from diffcore_rename_extended() with
 		 * the similarity of the renamed file.  The similarity is
@@ -5025,7 +5031,8 @@  static void merge_start(struct merge_options *opt, struct merge_result *result)
 	trace2_region_leave("merge", "allocate/init", opt->repo);
 }
 
-static void merge_check_renames_reusable(struct merge_result *result,
+static void merge_check_renames_reusable(struct merge_options *opt,
+					 struct merge_result *result,
 					 struct tree *merge_base,
 					 struct tree *side1,
 					 struct tree *side2)
@@ -5050,6 +5057,26 @@  static void merge_check_renames_reusable(struct merge_result *result,
 		return;
 	}
 
+	/*
+	 * Avoid using cached renames when directory rename detection is
+	 * turned off.  Cached renames are far less important in that case,
+	 * and they lead to testcases with an interesting intersection of
+	 * effects from relevant renames optimization, trivial directory
+	 * resolution optimization, and cached renames all converging when
+	 * the target of a cached rename is in a directory that
+	 * collect_merge_info() does not recurse into.  To avoid such
+	 * problems, simply disable cached renames for this case (similar
+	 * to the rename/rename(1to1) case; see the "disabling the
+	 * optimization" comment near that case).
+	 *
+	 * This could be revisited in the future; see the commit message
+	 * where this comment was added for some possible pointers.
+	 */
+	if (opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_NONE) {
+		renames->cached_pairs_valid_side = 0; /* neither side valid */
+		return;
+	}
+
 	/*
 	 * Handle other cases; note that merge_trees[0..2] will only
 	 * be NULL if opti is, or if all three were manually set to
@@ -5258,7 +5285,7 @@  void merge_incore_nonrecursive(struct merge_options *opt,
 
 	trace2_region_enter("merge", "merge_start", opt->repo);
 	assert(opt->ancestor != NULL);
-	merge_check_renames_reusable(result, merge_base, side1, side2);
+	merge_check_renames_reusable(opt, result, merge_base, side1, side2);
 	merge_start(opt, result);
 	/*
 	 * Record the trees used in this merge, so if there's a next merge in
diff --git a/t/t3650-replay-basics.sh b/t/t3650-replay-basics.sh
index cade7930765..58b37599357 100755
--- a/t/t3650-replay-basics.sh
+++ b/t/t3650-replay-basics.sh
@@ -195,7 +195,7 @@  test_expect_success 'using replay on bare repo to rebase multiple divergent bran
 	done
 '
 
-test_expect_failure 'merge.directoryRenames=false' '
+test_expect_success 'merge.directoryRenames=false' '
 	# create a test case that stress-tests the rename caching
 	git switch -c rename-onto &&