diff mbox series

[RFC,2/2] merge-recursive: optimize time complexity for get_unmerged

Message ID 20250213090040.16133-3-meetsoni3017@gmail.com (mailing list archive)
State New
Headers show
Series merge-recursive: optimize time complexity | expand

Commit Message

Meet Soni Feb. 13, 2025, 9 a.m. UTC
Previously, `get_unmerged()` used `string_list_insert()`, which has an
O(n^2) complexity due to shifting elements on each insertion. It also
called `string_list_lookup()` before insertion, which performs a binary
search in O(log n). This combination made insertion costly, especially
for large index states, as each new entry required both a search and
potentially shifting many elements.

Replace `string_list_insert()` with `string_list_append()` to achieve
O(n) insertion. After all entries are added, sort the list in O(n log n)
and remove duplicates in O(n), reducing the overall complexity to
O(n log n). This improves performance significantly for large datasets
while maintaining correctness.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
---
 merge-recursive.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Elijah Newren Feb. 13, 2025, 5:11 p.m. UTC | #1
On Thu, Feb 13, 2025 at 1:01 AM Meet Soni <meetsoni3017@gmail.com> wrote:
>
> Previously, `get_unmerged()` used `string_list_insert()`, which has an
> O(n^2) complexity due to shifting elements on each insertion. It also
> called `string_list_lookup()` before insertion, which performs a binary
> search in O(log n).

Okay.

> This combination made insertion costly, especially
> for large index states, as each new entry required both a search and
> potentially shifting many elements.

Why does the combination make it costly?  O(log n) + O(n^2) is still
O(n^2), so I don't see why it matters to mention the combination.
Could you clarify?

Also, does it actually make it costly, or do you only suspect that it
does?  O(n^2) worst case sometimes behaves O(n) or O(n log n) in some
cases.  Since your commit message says "made insertion costly" instead
of "might make insertion costly", I think that would suggest you have
some performance numbers to back this up on some interesting real
world repository.  Do you?  Can you share them?

> Replace `string_list_insert()` with `string_list_append()` to achieve
> O(n) insertion. After all entries are added, sort the list in O(n log n)
> and remove duplicates in O(n), reducing the overall complexity to
> O(n log n).

Okay.

> This improves performance significantly for large datasets

That's a big claim; it may be true, but without evidence I don't
believe it for three reasons : (1) n here is the number of conflicts,
not the number of files in the repo or the number of lines being
merged.  Thus, n is typically small.  (2) Other O(n^2) behavior in
merge-recursive likely drowns this particular codepath out, so any
gains here just aren't going to be noticed, (3) After looking at the
code and knowing the specialized structure of the index, I think that
while string_list_insert() for n items in general is going to be
O(n^2), it will likely functionally be O(n log n) for this particular
code path, meaning you haven't actually improved the performance.

> while maintaining correctness.

More on that below.


> Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
> ---
>  merge-recursive.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 884ccf99a5..6165993429 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -547,15 +547,15 @@ static struct string_list *get_unmerged(struct index_state *istate)
>                 if (!ce_stage(ce))
>                         continue;
>
> -               item = string_list_lookup(unmerged, ce->name);
> -               if (!item) {
> -                       item = string_list_insert(unmerged, ce->name);
> -                       item->util = xcalloc(1, sizeof(struct stage_data));
> -               }
> +               item = string_list_append(unmerged, ce->name);
> +               item->util = xcalloc(1, sizeof(struct stage_data));
> +
>                 e = item->util;
>                 e->stages[ce_stage(ce)].mode = ce->ce_mode;
>                 oidcpy(&e->stages[ce_stage(ce)].oid, &ce->oid);

Did you run any tests?  I'm not sure you maintained correctness here.

>         }
> +       string_list_sort(unmerged);
> +       string_list_remove_duplicates(unmerged, 1);
>
>         return unmerged;
>  }
> --
> 2.34.1

(As a side note, due to the specialized structure of the input, I
suspect this code could be modified to run in O(n), i.e. we could skip
the string_list_lookup and the string_list_sort and the
string_list_remove_duplicates...  But, it'd make the code trickier, so
it'd need to be carefully commented, the change would need to be
justified, and it'd need to be carefully tested.  Even if we weren't
planning to delete this entire file, I suspect it's not possible to
find a case justifying such a change without optimizing several other
things in merge-recursive first, but optimizing those things probably
results in a significant rewrite...which we've already done with
merge-ort.)
Junio C Hamano Feb. 13, 2025, 6:30 p.m. UTC | #2
Elijah Newren <newren@gmail.com> writes:

> (As a side note, due to the specialized structure of the input, I
> suspect this code could be modified to run in O(n), i.e. we could skip
> the string_list_lookup and the string_list_sort and the
> string_list_remove_duplicates...

Are you talking about the input being already sorted so we can just
walk the multiple input and merge them into a single stream?  In the
cost analysis you did earlier in the message I am responding to,
being able to go down to O(n) sounds really like a great thing ;-)

> But, it'd make the code trickier, so
> it'd need to be carefully commented, the change would need to be
> justified, and it'd need to be carefully tested.  

... and measured.

> Even if we weren't
> planning to delete this entire file, I suspect it's not possible to
> find a case justifying such a change without optimizing several other
> things in merge-recursive first, but optimizing those things probably
> results in a significant rewrite...which we've already done with
> merge-ort.)

Sounds like unless the performance issues are shared between the
two, it may not be worth to spend too much brain cycles only on the
"recursive" one?

Thanks.
Elijah Newren Feb. 13, 2025, 6:45 p.m. UTC | #3
On Thu, Feb 13, 2025 at 10:30 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > (As a side note, due to the specialized structure of the input, I
> > suspect this code could be modified to run in O(n), i.e. we could skip
> > the string_list_lookup and the string_list_sort and the
> > string_list_remove_duplicates...
>
> Are you talking about the input being already sorted so we can just
> walk the multiple input and merge them into a single stream?  In the

I'm not sure what you mean by "merge them into a single stream".  I
think you have the right idea that we are creating a string list of
information about unmerged entries, and since we're taking information
from the index which is already sorted, we can just either modify the
last entry in the list if it matches or append a new entry to it; no
need to walk, insert, or binary search the list at all.

> cost analysis you did earlier in the message I am responding to,
> being able to go down to O(n) sounds really like a great thing ;-)

Note first that we aren't going from O(n^2) -> O(n), we're only going
from O(n log n) -> O(n).  That's still great, but:

  * n is typically pretty small (number of unmerged files)
  * there's things in merge-recursive that are O(m^2), where typically
m >> n (number of files in repo, or number of lines in big files in
the repo)
  * merge-recursive is used by almost no one
  * we are planning to delete merge-recursive

So, although O(n) is great....

> > But, it'd make the code trickier, so
> > it'd need to be carefully commented, the change would need to be
> > justified, and it'd need to be carefully tested.
>
> ... and measured.

+1

> > Even if we weren't
> > planning to delete this entire file, I suspect it's not possible to
> > find a case justifying such a change without optimizing several other
> > things in merge-recursive first, but optimizing those things probably
> > results in a significant rewrite...which we've already done with
> > merge-ort.)
>
> Sounds like unless the performance issues are shared between the
> two, it may not be worth to spend too much brain cycles only on the
> "recursive" one?

...yep, exactly, and this is not a performance issue shared with the
ort backend; it's unique to the recursive one.
Meet Soni Feb. 14, 2025, 4:28 a.m. UTC | #4
On Thu, 13 Feb 2025 at 22:41, Elijah Newren <newren@gmail.com> wrote:
>
> On Thu, Feb 13, 2025 at 1:01 AM Meet Soni <meetsoni3017@gmail.com> wrote:
> >
> > Previously, `get_unmerged()` used `string_list_insert()`, which has an
> > O(n^2) complexity due to shifting elements on each insertion. It also
> > called `string_list_lookup()` before insertion, which performs a binary
> > search in O(log n).
>
> Okay.
>
> > This combination made insertion costly, especially
> > for large index states, as each new entry required both a search and
> > potentially shifting many elements.
>
> Why does the combination make it costly?  O(log n) + O(n^2) is still
> O(n^2), so I don't see why it matters to mention the combination.
> Could you clarify?
>
> Also, does it actually make it costly, or do you only suspect that it
> does?  O(n^2) worst case sometimes behaves O(n) or O(n log n) in some
> cases.  Since your commit message says "made insertion costly" instead
> of "might make insertion costly", I think that would suggest you have
> some performance numbers to back this up on some interesting real
> world repository.  Do you?  Can you share them?
>
Sorry, I should've specified, this patch is purely theoretical, I was
aiming for a trial
and error kind of approach.

> > Replace `string_list_insert()` with `string_list_append()` to achieve
> > O(n) insertion. After all entries are added, sort the list in O(n log n)
> > and remove duplicates in O(n), reducing the overall complexity to
> > O(n log n).
>
> Okay.
>
> > This improves performance significantly for large datasets
>
> That's a big claim; it may be true, but without evidence I don't
> believe it for three reasons : (1) n here is the number of conflicts,
> not the number of files in the repo or the number of lines being
> merged.  Thus, n is typically small.  (2) Other O(n^2) behavior in
> merge-recursive likely drowns this particular codepath out, so any
> gains here just aren't going to be noticed, (3) After looking at the
> code and knowing the specialized structure of the index, I think that
> while string_list_insert() for n items in general is going to be
> O(n^2), it will likely functionally be O(n log n) for this particular
> code path, meaning you haven't actually improved the performance.
>
> > while maintaining correctness.
>
> More on that below.
>
>
> > Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
> > ---
> >  merge-recursive.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > index 884ccf99a5..6165993429 100644
> > --- a/merge-recursive.c
> > +++ b/merge-recursive.c
> > @@ -547,15 +547,15 @@ static struct string_list *get_unmerged(struct index_state *istate)
> >                 if (!ce_stage(ce))
> >                         continue;
> >
> > -               item = string_list_lookup(unmerged, ce->name);
> > -               if (!item) {
> > -                       item = string_list_insert(unmerged, ce->name);
> > -                       item->util = xcalloc(1, sizeof(struct stage_data));
> > -               }
> > +               item = string_list_append(unmerged, ce->name);
> > +               item->util = xcalloc(1, sizeof(struct stage_data));
> > +
> >                 e = item->util;
> >                 e->stages[ce_stage(ce)].mode = ce->ce_mode;
> >                 oidcpy(&e->stages[ce_stage(ce)].oid, &ce->oid);
>
> Did you run any tests?  I'm not sure you maintained correctness here.

I didn't run any tests -- I wanted to, but I wasn’t sure how to do it
for this change. Since you suggested dropping this patch from the
series, I’ll do that. But for similar changes in the future, how should I go
about testing them?
>
> >         }
> > +       string_list_sort(unmerged);
> > +       string_list_remove_duplicates(unmerged, 1);
> >
> >         return unmerged;
> >  }
> > --
> > 2.34.1
>
> (As a side note, due to the specialized structure of the input, I
> suspect this code could be modified to run in O(n), i.e. we could skip
> the string_list_lookup and the string_list_sort and the
> string_list_remove_duplicates...  But, it'd make the code trickier, so
> it'd need to be carefully commented, the change would need to be
> justified, and it'd need to be carefully tested.  Even if we weren't
> planning to delete this entire file, I suspect it's not possible to
> find a case justifying such a change without optimizing several other
> things in merge-recursive first, but optimizing those things probably
> results in a significant rewrite...which we've already done with
> merge-ort.)
Makes sense.

Thankyou for reviewing,
Meet
Elijah Newren Feb. 14, 2025, 6:04 a.m. UTC | #5
On Thu, Feb 13, 2025 at 8:28 PM Meet Soni <meetsoni3017@gmail.com> wrote:
>
> On Thu, 13 Feb 2025 at 22:41, Elijah Newren <newren@gmail.com> wrote:
> >
> > On Thu, Feb 13, 2025 at 1:01 AM Meet Soni <meetsoni3017@gmail.com> wrote:
...
> > > diff --git a/merge-recursive.c b/merge-recursive.c
> > > index 884ccf99a5..6165993429 100644
> > > --- a/merge-recursive.c
> > > +++ b/merge-recursive.c
> > > @@ -547,15 +547,15 @@ static struct string_list *get_unmerged(struct index_state *istate)
> > >                 if (!ce_stage(ce))
> > >                         continue;
> > >
> > > -               item = string_list_lookup(unmerged, ce->name);
> > > -               if (!item) {
> > > -                       item = string_list_insert(unmerged, ce->name);
> > > -                       item->util = xcalloc(1, sizeof(struct stage_data));
> > > -               }
> > > +               item = string_list_append(unmerged, ce->name);
> > > +               item->util = xcalloc(1, sizeof(struct stage_data));
> > > +
> > >                 e = item->util;
> > >                 e->stages[ce_stage(ce)].mode = ce->ce_mode;
> > >                 oidcpy(&e->stages[ce_stage(ce)].oid, &ce->oid);
> >
> > Did you run any tests?  I'm not sure you maintained correctness here.
>
> I didn't run any tests -- I wanted to, but I wasn’t sure how to do it
> for this change. Since you suggested dropping this patch from the
> series, I’ll do that. But for similar changes in the future, how should I go
> about testing them?

As per Documentation/CodingGuidelines: "After any code change, make
sure that the entire test suite passes."  You can do that by running:
    cd t && make
(You probably want to also run that before making any changes, just to
verify that they all pass for you.  Then, if any test fails after you
make changes, you know it's because of your changes rather than
because you missed something in building or setting up the tests.)


And although it doesn't matter since we're dropping this patch, the
issue I noticed was that if there were, say, three unmerged entries
with the same path, the original code would create one entry in the
string list and modify it 3 times (each with a different ce_stage(ce).
Your modification would create three different entries (each with only
information from one stage) and drop two of them, meaning we no longer
have a single string_list_item that contains information from all 3
unmerged entries for the same path.  I'm pretty sure running the
existing tests would catch that kind of bug, which is what raised the
question.
Meet Soni Feb. 14, 2025, 8:24 a.m. UTC | #6
On Fri, 14 Feb 2025 at 11:35, Elijah Newren <newren@gmail.com> wrote:
>
> > > Did you run any tests?  I'm not sure you maintained correctness here.
> >
> > I didn't run any tests -- I wanted to, but I wasn’t sure how to do it
> > for this change. Since you suggested dropping this patch from the
> > series, I’ll do that. But for similar changes in the future, how should I go
> > about testing them?
>
> As per Documentation/CodingGuidelines: "After any code change, make
> sure that the entire test suite passes."  You can do that by running:
>     cd t && make
> (You probably want to also run that before making any changes, just to
> verify that they all pass for you.  Then, if any test fails after you
> make changes, you know it's because of your changes rather than
> because you missed something in building or setting up the tests.)
>
>
> And although it doesn't matter since we're dropping this patch, the
> issue I noticed was that if there were, say, three unmerged entries
> with the same path, the original code would create one entry in the
> string list and modify it 3 times (each with a different ce_stage(ce).
> Your modification would create three different entries (each with only
> information from one stage) and drop two of them, meaning we no longer
> have a single string_list_item that contains information from all 3
> unmerged entries for the same path.  I'm pretty sure running the
> existing tests would catch that kind of bug, which is what raised the
> question.

That's the thing -- I did run make in the t/ directory, and it passed. I was
just wondering if there's any other way to test this in isolation, in case
I want to verify such changes more directly in the future.

Thanks for the clarification!
Meet
Elijah Newren Feb. 14, 2025, 7 p.m. UTC | #7
On Fri, Feb 14, 2025 at 12:24 AM Meet Soni <meetsoni3017@gmail.com> wrote:
>
> On Fri, 14 Feb 2025 at 11:35, Elijah Newren <newren@gmail.com> wrote:
> >
> > > > Did you run any tests?  I'm not sure you maintained correctness here.
> > >
> > > I didn't run any tests -- I wanted to, but I wasn’t sure how to do it
> > > for this change. Since you suggested dropping this patch from the
> > > series, I’ll do that. But for similar changes in the future, how should I go
> > > about testing them?
> >
> > As per Documentation/CodingGuidelines: "After any code change, make
> > sure that the entire test suite passes."  You can do that by running:
> >     cd t && make
> > (You probably want to also run that before making any changes, just to
> > verify that they all pass for you.  Then, if any test fails after you
> > make changes, you know it's because of your changes rather than
> > because you missed something in building or setting up the tests.)
> >
> >
> > And although it doesn't matter since we're dropping this patch, the
> > issue I noticed was that if there were, say, three unmerged entries
> > with the same path, the original code would create one entry in the
> > string list and modify it 3 times (each with a different ce_stage(ce).
> > Your modification would create three different entries (each with only
> > information from one stage) and drop two of them, meaning we no longer
> > have a single string_list_item that contains information from all 3
> > unmerged entries for the same path.  I'm pretty sure running the
> > existing tests would catch that kind of bug, which is what raised the
> > question.
>
> That's the thing -- I did run make in the t/ directory, and it passed. I was
> just wondering if there's any other way to test this in isolation, in case
> I want to verify such changes more directly in the future.

Really?  Did you rebuild the code, after making your changes?  You may
have been running with a pre-changes version of the code.

I just applied your changes and ran the tests.  I see it fail as soon
as it gets to t1004.

$ cd t && make test
[... lots of output snipped ...]
*** t1004-read-tree-m-u-wf.sh ***
ok 1 - two-way setup
ok 2 - two-way not clobbering
ok 3 - two-way with incorrect --exclude-per-directory (1)
ok 4 - two-way with incorrect --exclude-per-directory (2)
ok 5 - two-way clobbering a ignored file
ok 6 - three-way not complaining on an untracked path in both
ok 7 - three-way not clobbering a working tree file
ok 8 - three-way not complaining on an untracked file
ok 9 - 3-way not overwriting local changes (setup)
ok 10 - 3-way not overwriting local changes (our side)
ok 11 - 3-way not overwriting local changes (their side)
ok 12 - funny symlink in work tree
ok 13 - funny symlink in work tree, un-unlink-able
ok 14 - D/F setup
ok 15 - D/F
ok 16 - D/F resolve
not ok 17 - D/F recursive
#
#
#        git reset --hard &&
#        git checkout side-b &&
#        git merge-recursive branch-point -- side-b side-a
#
#
# failed 1 among 17 test(s)
1..17
make[1]: *** [Makefile:77: t1004-read-tree-m-u-wf.sh] Error 1
make[1]: Leaving directory '/home/newren/floss/git/t'
make: *** [Makefile:63: test] Error 2


...and if go to the toplevel directory and run under prove so I can
see all the failures (and run the test suites in parallel), I see:

$ cd .. && make DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS='--timer
--state failed,slow,save --jobs 12' test
[... lots of output snipped ...]
Test Summary Report
-------------------
t3424-rebase-empty.sh                            (Wstat: 256 Tests: 20
Failed: 18)
  Failed tests:  3-20
  Non-zero exit status: 1
t3436-rebase-more-options.sh                     (Wstat: 256 Tests: 19
Failed: 17)
  Failed tests:  2-18
  Non-zero exit status: 1
t4151-am-abort.sh                                (Wstat: 256 Tests: 20
Failed: 12)
  Failed tests:  5-9, 12-16, 19-20
  Non-zero exit status: 1
t3407-rebase-abort.sh                            (Wstat: 256 Tests: 17
Failed: 8)
  Failed tests:  2-9
  Non-zero exit status: 1
t3428-rebase-signoff.sh                          (Wstat: 256 Tests: 7 Failed: 5)
  Failed tests:  2, 4-7
  Non-zero exit status: 1
t6409-merge-subtree.sh                           (Wstat: 256 Tests: 12
Failed: 5)
  Failed tests:  2-6
  Non-zero exit status: 1
t7102-reset.sh                                   (Wstat: 256 Tests: 38
Failed: 7)
  Failed tests:  14-20
  Non-zero exit status: 1
t6432-merge-recursive-space-options.sh           (Wstat: 256 Tests: 11
Failed: 4)
  Failed tests:  2, 7-8, 11
  Non-zero exit status: 1
t6430-merge-recursive.sh                         (Wstat: 256 Tests: 37
Failed: 15)
  Failed tests:  10-11, 13-20, 22-24, 28-29
  Non-zero exit status: 1
t3406-rebase-message.sh                          (Wstat: 256 Tests: 32
Failed: 8)
  Failed tests:  22, 24-27, 29-31
  Non-zero exit status: 1
t4200-rerere.sh                                  (Wstat: 256 Tests: 36
Failed: 5)
  Failed tests:  24-28
  Non-zero exit status: 1
t7201-co.sh                                      (Wstat: 256 Tests: 46
Failed: 5)
  Failed tests:  5-9
  Non-zero exit status: 1
t3418-rebase-continue.sh                         (Wstat: 256 Tests: 29
Failed: 7)
  Failed tests:  4, 6, 10-12, 26-27
  Non-zero exit status: 1
t3403-rebase-skip.sh                             (Wstat: 256 Tests: 20
Failed: 3)
  Failed tests:  2, 4, 9
  Non-zero exit status: 1
t4253-am-keep-cr-dos.sh                          (Wstat: 256 Tests: 7 Failed: 2)
  Failed tests:  6-7
  Non-zero exit status: 1
t9903-bash-prompt.sh                             (Wstat: 256 Tests: 67
Failed: 39)
  Failed tests:  16-31, 33-35, 37, 40-44, 46-52, 55-58, 60
                62, 67
  Non-zero exit status: 1
t3503-cherry-pick-root.sh                        (Wstat: 256 Tests: 6 Failed: 2)
  Failed tests:  5-6
  Non-zero exit status: 1
t3401-rebase-and-am-rename.sh                    (Wstat: 256 Tests: 10
Failed: 2)
  Failed tests:  4, 10
  Non-zero exit status: 1
t2407-worktree-heads.sh                          (Wstat: 256 Tests: 12
Failed: 2)
  Failed tests:  4-5
  Non-zero exit status: 1
t5407-post-rewrite-hook.sh                       (Wstat: 256 Tests: 17
Failed: 3)
  Failed tests:  4-6
  Non-zero exit status: 1
t2500-untracked-overwriting.sh                   (Wstat: 256 Tests: 10
Failed: 2)
  Failed tests:  9-10
  Non-zero exit status: 1
t4153-am-resume-override-opts.sh                 (Wstat: 256 Tests: 6 Failed: 1)
  Failed test:  3
  Non-zero exit status: 1
t1015-read-index-unmerged.sh                     (Wstat: 256 Tests: 6 Failed: 1)
  Failed test:  6
  Non-zero exit status: 1
t3509-cherry-pick-merge-df.sh                    (Wstat: 256 Tests: 9 Failed: 1)
  Failed test:  9
  Non-zero exit status: 1
t2023-checkout-m.sh                              (Wstat: 256 Tests: 5 Failed: 1)
  Failed test:  5
  Non-zero exit status: 1
t7615-diff-algo-with-mergy-operations.sh         (Wstat: 256 Tests: 7 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
t6427-diff3-conflict-markers.sh                  (Wstat: 256 Tests: 9 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
t1004-read-tree-m-u-wf.sh                        (Wstat: 256 Tests: 17
Failed: 1)
  Failed test:  17
  Non-zero exit status: 1
t3420-rebase-autostash.sh                        (Wstat: 256 Tests: 52
Failed: 10)
  Failed tests:  11-17, 21-23
  Non-zero exit status: 1
t4150-am.sh                                      (Wstat: 256 Tests: 87
Failed: 33)
  Failed tests:  34-40, 42-46, 48, 50-54, 57-62, 64-65, 67-71
                75, 87
  Non-zero exit status: 1
t7512-status-help.sh                             (Wstat: 256 Tests: 46
Failed: 3)
  Failed tests:  5-6, 29
  Non-zero exit status: 1
t3400-rebase.sh                                  (Wstat: 256 Tests: 39
Failed: 1)
  Failed test:  30
  Non-zero exit status: 1
t3404-rebase-interactive.sh                      (Wstat: 256 Tests:
131 Failed: 1)
  Failed test:  80
  Non-zero exit status: 1
Files=1031, Tests=30662, 70 wallclock secs ( 8.33 usr  2.13 sys +
248.60 cusr 516.60 csys = 775.66 CPU)
Result: FAIL
make[1]: *** [Makefile:73: prove] Error 1
make[1]: Leaving directory '/home/newren/floss/git/t'
make: *** [Makefile:3237: test] Error 2

I suspect this is a case where it was testing a version of git that
you built before making the changes.
Meet Soni Feb. 15, 2025, 8:42 a.m. UTC | #8
On Sat, 15 Feb 2025 at 00:30, Elijah Newren <newren@gmail.com> wrote:
>
> On Fri, Feb 14, 2025 at 12:24 AM Meet Soni <meetsoni3017@gmail.com> wrote:
> > That's the thing -- I did run make in the t/ directory, and it passed. I was
> > just wondering if there's any other way to test this in isolation, in case
> > I want to verify such changes more directly in the future.
>
> Really?  Did you rebuild the code, after making your changes?  You may
> have been running with a pre-changes version of the code.
>
> I just applied your changes and ran the tests.  I see it fail as soon
> as it gets to t1004.
>
> $ cd t && make test
> [... lots of output snipped ...]
> *** t1004-read-tree-m-u-wf.sh ***
> ok 1 - two-way setup
> ok 2 - two-way not clobbering
> ok 3 - two-way with incorrect --exclude-per-directory (1)
> ok 4 - two-way with incorrect --exclude-per-directory (2)
> ok 5 - two-way clobbering a ignored file
> ok 6 - three-way not complaining on an untracked path in both
> ok 7 - three-way not clobbering a working tree file
> ok 8 - three-way not complaining on an untracked file
> ok 9 - 3-way not overwriting local changes (setup)
> ok 10 - 3-way not overwriting local changes (our side)
> ok 11 - 3-way not overwriting local changes (their side)
> ok 12 - funny symlink in work tree
> ok 13 - funny symlink in work tree, un-unlink-able
> ok 14 - D/F setup
> ok 15 - D/F
> ok 16 - D/F resolve
> not ok 17 - D/F recursive
> #
> #
> #        git reset --hard &&
> #        git checkout side-b &&
> #        git merge-recursive branch-point -- side-b side-a
> #
> #
> # failed 1 among 17 test(s)
> 1..17
> make[1]: *** [Makefile:77: t1004-read-tree-m-u-wf.sh] Error 1
> make[1]: Leaving directory '/home/newren/floss/git/t'
> make: *** [Makefile:63: test] Error 2
>
>
> ...and if go to the toplevel directory and run under prove so I can
> see all the failures (and run the test suites in parallel), I see:
>
> $ cd .. && make DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS='--timer
> --state failed,slow,save --jobs 12' test
> [... lots of output snipped ...]
> Test Summary Report
> -------------------
> t3424-rebase-empty.sh                            (Wstat: 256 Tests: 20
> Failed: 18)
>   Failed tests:  3-20
>   Non-zero exit status: 1
> t3436-rebase-more-options.sh                     (Wstat: 256 Tests: 19
> Failed: 17)
>   Failed tests:  2-18
>   Non-zero exit status: 1
> t4151-am-abort.sh                                (Wstat: 256 Tests: 20
> Failed: 12)
>   Failed tests:  5-9, 12-16, 19-20
>   Non-zero exit status: 1
> t3407-rebase-abort.sh                            (Wstat: 256 Tests: 17
> Failed: 8)
>   Failed tests:  2-9
>   Non-zero exit status: 1
> t3428-rebase-signoff.sh                          (Wstat: 256 Tests: 7 Failed: 5)
>   Failed tests:  2, 4-7
>   Non-zero exit status: 1
> t6409-merge-subtree.sh                           (Wstat: 256 Tests: 12
> Failed: 5)
>   Failed tests:  2-6
>   Non-zero exit status: 1
> t7102-reset.sh                                   (Wstat: 256 Tests: 38
> Failed: 7)
>   Failed tests:  14-20
>   Non-zero exit status: 1
> t6432-merge-recursive-space-options.sh           (Wstat: 256 Tests: 11
> Failed: 4)
>   Failed tests:  2, 7-8, 11
>   Non-zero exit status: 1
> t6430-merge-recursive.sh                         (Wstat: 256 Tests: 37
> Failed: 15)
>   Failed tests:  10-11, 13-20, 22-24, 28-29
>   Non-zero exit status: 1
> t3406-rebase-message.sh                          (Wstat: 256 Tests: 32
> Failed: 8)
>   Failed tests:  22, 24-27, 29-31
>   Non-zero exit status: 1
> t4200-rerere.sh                                  (Wstat: 256 Tests: 36
> Failed: 5)
>   Failed tests:  24-28
>   Non-zero exit status: 1
> t7201-co.sh                                      (Wstat: 256 Tests: 46
> Failed: 5)
>   Failed tests:  5-9
>   Non-zero exit status: 1
> t3418-rebase-continue.sh                         (Wstat: 256 Tests: 29
> Failed: 7)
>   Failed tests:  4, 6, 10-12, 26-27
>   Non-zero exit status: 1
> t3403-rebase-skip.sh                             (Wstat: 256 Tests: 20
> Failed: 3)
>   Failed tests:  2, 4, 9
>   Non-zero exit status: 1
> t4253-am-keep-cr-dos.sh                          (Wstat: 256 Tests: 7 Failed: 2)
>   Failed tests:  6-7
>   Non-zero exit status: 1
> t9903-bash-prompt.sh                             (Wstat: 256 Tests: 67
> Failed: 39)
>   Failed tests:  16-31, 33-35, 37, 40-44, 46-52, 55-58, 60
>                 62, 67
>   Non-zero exit status: 1
> t3503-cherry-pick-root.sh                        (Wstat: 256 Tests: 6 Failed: 2)
>   Failed tests:  5-6
>   Non-zero exit status: 1
> t3401-rebase-and-am-rename.sh                    (Wstat: 256 Tests: 10
> Failed: 2)
>   Failed tests:  4, 10
>   Non-zero exit status: 1
> t2407-worktree-heads.sh                          (Wstat: 256 Tests: 12
> Failed: 2)
>   Failed tests:  4-5
>   Non-zero exit status: 1
> t5407-post-rewrite-hook.sh                       (Wstat: 256 Tests: 17
> Failed: 3)
>   Failed tests:  4-6
>   Non-zero exit status: 1
> t2500-untracked-overwriting.sh                   (Wstat: 256 Tests: 10
> Failed: 2)
>   Failed tests:  9-10
>   Non-zero exit status: 1
> t4153-am-resume-override-opts.sh                 (Wstat: 256 Tests: 6 Failed: 1)
>   Failed test:  3
>   Non-zero exit status: 1
> t1015-read-index-unmerged.sh                     (Wstat: 256 Tests: 6 Failed: 1)
>   Failed test:  6
>   Non-zero exit status: 1
> t3509-cherry-pick-merge-df.sh                    (Wstat: 256 Tests: 9 Failed: 1)
>   Failed test:  9
>   Non-zero exit status: 1
> t2023-checkout-m.sh                              (Wstat: 256 Tests: 5 Failed: 1)
>   Failed test:  5
>   Non-zero exit status: 1
> t7615-diff-algo-with-mergy-operations.sh         (Wstat: 256 Tests: 7 Failed: 1)
>   Failed test:  2
>   Non-zero exit status: 1
> t6427-diff3-conflict-markers.sh                  (Wstat: 256 Tests: 9 Failed: 1)
>   Failed test:  8
>   Non-zero exit status: 1
> t1004-read-tree-m-u-wf.sh                        (Wstat: 256 Tests: 17
> Failed: 1)
>   Failed test:  17
>   Non-zero exit status: 1
> t3420-rebase-autostash.sh                        (Wstat: 256 Tests: 52
> Failed: 10)
>   Failed tests:  11-17, 21-23
>   Non-zero exit status: 1
> t4150-am.sh                                      (Wstat: 256 Tests: 87
> Failed: 33)
>   Failed tests:  34-40, 42-46, 48, 50-54, 57-62, 64-65, 67-71
>                 75, 87
>   Non-zero exit status: 1
> t7512-status-help.sh                             (Wstat: 256 Tests: 46
> Failed: 3)
>   Failed tests:  5-6, 29
>   Non-zero exit status: 1
> t3400-rebase.sh                                  (Wstat: 256 Tests: 39
> Failed: 1)
>   Failed test:  30
>   Non-zero exit status: 1
> t3404-rebase-interactive.sh                      (Wstat: 256 Tests:
> 131 Failed: 1)
>   Failed test:  80
>   Non-zero exit status: 1
> Files=1031, Tests=30662, 70 wallclock secs ( 8.33 usr  2.13 sys +
> 248.60 cusr 516.60 csys = 775.66 CPU)
> Result: FAIL
> make[1]: *** [Makefile:73: prove] Error 1
> make[1]: Leaving directory '/home/newren/floss/git/t'
> make: *** [Makefile:3237: test] Error 2
>
> I suspect this is a case where it was testing a version of git that
> you built before making the changes.

Thanks! You're right. I ran the tests before running make. After
running make and testing again, it failed.

Meet
diff mbox series

Patch

diff --git a/merge-recursive.c b/merge-recursive.c
index 884ccf99a5..6165993429 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -547,15 +547,15 @@  static struct string_list *get_unmerged(struct index_state *istate)
 		if (!ce_stage(ce))
 			continue;
 
-		item = string_list_lookup(unmerged, ce->name);
-		if (!item) {
-			item = string_list_insert(unmerged, ce->name);
-			item->util = xcalloc(1, sizeof(struct stage_data));
-		}
+		item = string_list_append(unmerged, ce->name);
+		item->util = xcalloc(1, sizeof(struct stage_data));
+
 		e = item->util;
 		e->stages[ce_stage(ce)].mode = ce->ce_mode;
 		oidcpy(&e->stages[ce_stage(ce)].oid, &ce->oid);
 	}
+	string_list_sort(unmerged);
+	string_list_remove_duplicates(unmerged, 1);
 
 	return unmerged;
 }