diff mbox series

[04/20] merge-ort: use histogram diff

Message ID 564b072ac105ee9c3ccb30c6046ce66270fbbf53.1606635803.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Series fundamentals of merge-ort implementation

Commit Message

Elijah Newren Nov. 29, 2020, 7:43 a.m. UTC
From: Elijah Newren <newren@gmail.com>

In my cursory investigation, histogram diffs are about 2% slower than
Myers diffs.  Others have probably done more detailed benchmarks.  But,
in short, histogram diffs have been around for years and in a number of
cases provide obviously better-looking diffs where Myers diffs are
unintelligible, but the performance hit has kept them from becoming the
default.

However, there are real merge bugs we know about that have triggered on
git.git and linux.git, which I don't have a clue how to address without
the additional information that I believe is provided by histogram
diffs.  See the following:

https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/
https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/
https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/

I don't like mismerges.  I really don't like silent mismerges.  While I
am sometimes willing to make performance and correctness tradeoffs, I'm
much more interested in correctness in general.  I want to fix the above
bugs.  I have not yet started doing so, but I believe histogram diff at
least gives me an angle.  Unfortunately, I can't rely on using the
information from histogram diff unless it's in use.  And it hasn't been
used because of a performance hit of a few percent.

In testcases I have looked at, merge-ort is _much_ faster than
merge-recursive for non-trivial merges/rebases/cherry-picks.  As such,
this is a golden opportunity to switch out the underlying diff algorithm
(at least the one used by the merge machinery; git-diff and git-log are
separate questions); doing so will allow me to get additional data and
improved diffs, and I believe it will help me fix the above bugs at some
point in the future.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 ++++
 1 file changed, 4 insertions(+)
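
For readers unfamiliar with how the algorithm is selected: as far as I
understand it, DIFF_WITH_ALG() clears the algorithm-selection bits in
xdl_opts and then sets the bit for the requested algorithm, leaving all
other flags untouched.  The standalone sketch below illustrates that
bit manipulation only.  Every name in it (the SKETCH_* constants,
struct sketch_options, sketch_with_alg()) is hypothetical and the bit
positions are assumptions chosen for illustration; the real
definitions live in git's xdiff and diff headers.

```c
/*
 * Illustrative stand-ins only; these are NOT git's definitions.
 * Bit positions are assumptions chosen for the sketch.
 */
#define SKETCH_IGNORE_WHITESPACE (1UL << 1)
#define SKETCH_PATIENCE_DIFF     (1UL << 14)
#define SKETCH_HISTOGRAM_DIFF    (1UL << 15)
#define SKETCH_ALGORITHM_MASK    (SKETCH_PATIENCE_DIFF | SKETCH_HISTOGRAM_DIFF)

/* Hypothetical miniature of the options struct: just the flag word. */
struct sketch_options {
	unsigned long xdl_opts;
};

/*
 * Return the flag word with any previously selected algorithm bit
 * cleared and the requested one set.  Non-algorithm flags (such as
 * whitespace handling) pass through unchanged.
 */
static unsigned long sketch_with_alg(const struct sketch_options *opt,
				     unsigned long alg)
{
	return (opt->xdl_opts & ~SKETCH_ALGORITHM_MASK) | alg;
}
```

With this sketch, starting from SKETCH_IGNORE_WHITESPACE |
SKETCH_PATIENCE_DIFF and calling sketch_with_alg(&opt,
SKETCH_HISTOGRAM_DIFF) would yield SKETCH_IGNORE_WHITESPACE |
SKETCH_HISTOGRAM_DIFF, which is why hardcoding the algorithm in
merge_start() should not disturb other xdl_opts settings the caller
has made.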
Patch

diff --git a/merge-ort.c b/merge-ort.c
index 3581a7d278..d737762700 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -21,6 +21,7 @@ 
 #include "diffcore.h"
 #include "strmap.h"
 #include "tree.h"
+#include "xdiff-interface.h"
 
 struct merge_options_internal {
 	/*
@@ -234,6 +235,9 @@  static void merge_start(struct merge_options *opt, struct merge_result *result)
 
 	assert(opt->priv == NULL);
 
+	/* Default to histogram diff.  Actually, just hardcode it...for now. */
+	opt->xdl_opts = DIFF_WITH_ALG(opt, HISTOGRAM_DIFF);
+
 	/* Initialization of opt->priv, our internal merge data */
 	opt->priv = xcalloc(1, sizeof(*opt->priv));