[v4,1/6] t4001: add a test comparing basename similarity and content similarity

Message ID 3e6af929d135ef2dc239e2f47f92a7e2e91cbd17.1613031350.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Series Optimization batch 7: use file basenames to guide rename detection

Commit Message

Elijah Newren Feb. 11, 2021, 8:15 a.m. UTC
From: Elijah Newren <newren@gmail.com>

Add a simple test where a removed file is similar to two different added
files; one of them has the same basename, and the other has a slightly
higher content similarity.  Without break detection, filename similarity
of 100% trumps content similarity for pairing up related files.  For
any filename similarity less than 100%, the opposite is true -- content
similarity is all that matters.  Add a testcase that documents this.

Subsequent commits will add a new rule that includes an in-between state,
where a mixture of filename similarity and content similarity is
weighed, and which will change the outcome of this testcase.

Signed-off-by: Elijah Newren <newren@gmail.com>
 t/t4001-diff-rename.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index c16486a9d41a..797343b38106 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -262,4 +262,28 @@  test_expect_success 'diff-tree -l0 defaults to a big rename limit, not zero' '
 	grep "myotherfile.*myfile" actual
 '
 
+test_expect_success 'basename similarity vs best similarity' '
+	mkdir subdir &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			 line6 line7 line8 line9 line10 >subdir/file.txt &&
+	git add subdir/file.txt &&
+	git commit -m "base txt" &&
+	git rm subdir/file.txt &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			  line6 line7 line8 >file.txt &&
+	test_write_lines line1 line2 line3 line4 line5 \
+			  line6 line7 line8 line9 >file.md &&
+	git add file.txt file.md &&
+	git commit -a -m "rename" &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
+	# subdir/file.txt is 88% similar to file.md, 78% similar to file.txt,
+	# but since same basenames are checked first...
+	cat >expected <<-\EOF &&
+	R088	subdir/file.txt	file.md
+	A	file.txt
+	EOF
+	test_cmp expected actual
+'
+
 test_done
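
The similarity figures quoted in the test comment can be checked by hand. The sketch below is not part of the patch. It assumes that the rename score is roughly the number of source bytes also found in the destination, divided by the size of the larger file and truncated to a whole percent. Both new files are strict prefixes of subdir/file.txt, so under that assumption every byte of each destination counts as shared. The gen_lines helper is hypothetical and only mimics what test_write_lines produces here.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print "line1" .. "lineN",
# one per line, like the test_write_lines calls in the test.
gen_lines () {
	i=1
	while [ "$i" -le "$1" ]
	do
		echo "line$i"
		i=$((i + 1))
	done
}

# Byte sizes of the three files; $(( )) strips any padding from wc.
src_size=$(( $(gen_lines 10 | wc -c) ))	# subdir/file.txt
md_size=$(( $(gen_lines 9 | wc -c) ))	# file.md
txt_size=$(( $(gen_lines 8 | wc -c) ))	# file.txt

# Assumption: score = shared bytes / larger file size, truncated.
# Each destination is a prefix of the source, so shared bytes equal
# the destination size and the larger file is always the source.
md_pct=$((md_size * 100 / src_size))
txt_pct=$((txt_size * 100 / src_size))

echo "file.md:  $md_size/$src_size bytes -> $md_pct%"
echo "file.txt: $txt_size/$src_size bytes -> $txt_pct%"
```

Under these assumptions the script yields 54/61 bytes (88%) for file.md and 48/61 bytes (78%) for file.txt, which agrees with the R088 score in the expected output.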