diff mbox series

[1/3] merge-ort: copy a few small helper functions from merge-recursive.c

Message ID 0b455bd6fe7dff72c1849eb8466b97b96b2b90a9.1608054807.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series merge-ort: implement recursive merges | expand

Commit Message

Elijah Newren Dec. 15, 2020, 5:53 p.m. UTC
From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Comments

Junio C Hamano Dec. 16, 2020, 1:16 a.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> In a subsequent commit, we will implement the traditional recursiveness
> that gave merge-recursive its name, namely merging non-unique
> merge-bases to come up with a single virtual merge base.  Copy a few
> helper functions from merge-recursive.c that we will use in the
> implementation.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ...
> @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
>  			    struct commit *side2,
>  			    struct merge_result *result)
>  {
> +	(void)reverse_commit_list;
> +	(void)make_virtual_commit;

To keep these symbols referenced?  Cute ;-)

>  	die("Not yet implemented");
>  }
Derrick Stolee Dec. 16, 2020, 1:30 p.m. UTC | #2
On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> In a subsequent commit, we will implement the traditional recursiveness
> that gave merge-recursive its name, namely merging non-unique
> merge-bases to come up with a single virtual merge base.  Copy a few
> helper functions from merge-recursive.c that we will use in the
> implementation.

I'm sure these are copies, but...

> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> +{
> +	struct commit_list *next = NULL, *current, *backup;
> +	for (current = list; current; current = backup) {
> +		backup = current->next;
> +		current->next = next;
> +		next = current;
> +	}

The naming of 'next' seems backwards to me, since it is really
the "previous" node. Using something like 'previous' makes it
clear that you are reversing when you say

	current->next = previous;

Thanks,
-Stolee
Johannes Schindelin Dec. 16, 2020, 4:12 p.m. UTC | #3
Hi Junio,

On Tue, 15 Dec 2020, Junio C Hamano wrote:

> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > In a subsequent commit, we will implement the traditional recursiveness
> > that gave merge-recursive its name, namely merging non-unique
> > merge-bases to come up with a single virtual merge base.  Copy a few
> > helper functions from merge-recursive.c that we will use in the
> > implementation.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ...
> > @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
> >  			    struct commit *side2,
> >  			    struct merge_result *result)
> >  {
> > +	(void)reverse_commit_list;
> > +	(void)make_virtual_commit;
>
> To keep these symbols referenced?  Cute ;-)

I saw this technique used extensively in cURL. Note that we ourselves
introduced the first such thing in 2fb330ca723 (packed_delete_refs():
implement method, 2017-09-08).

However, we seem to have the `MAYBE_UNUSED` macro specifically for such
use cases (and use it in four instances as of time of writing). I wonder
whether we want to settle one one strategy instead of keeping both?

Ciao,
Dscho

>
> >  	die("Not yet implemented");
> >  }
>
Elijah Newren Dec. 16, 2020, 4:24 p.m. UTC | #4
On Wed, Dec 16, 2020 at 8:12 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Junio,
>
> On Tue, 15 Dec 2020, Junio C Hamano wrote:
>
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > From: Elijah Newren <newren@gmail.com>
> > >
> > > In a subsequent commit, we will implement the traditional recursiveness
> > > that gave merge-recursive its name, namely merging non-unique
> > > merge-bases to come up with a single virtual merge base.  Copy a few
> > > helper functions from merge-recursive.c that we will use in the
> > > implementation.
> > >
> > > Signed-off-by: Elijah Newren <newren@gmail.com>
> > > ...
> > > @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
> > >                         struct commit *side2,
> > >                         struct merge_result *result)
> > >  {
> > > +   (void)reverse_commit_list;
> > > +   (void)make_virtual_commit;
> >
> > To keep these symbols referenced?  Cute ;-)
>
> I saw this technique used extensively in cURL. Note that we ourselves
> introduced the first such thing in 2fb330ca723 (packed_delete_refs():
> implement method, 2017-09-08).
>
> However, we seem to have the `MAYBE_UNUSED` macro specifically for such
> use cases (and use it in four instances as of time of writing). I wonder
> whether we want to settle one one strategy instead of keeping both?

Ooh, MAYBE_UNUSED, I like it.  Much more self-documenting.  I'm
sending another round to address Derrick's comment, so I'll include
this change while at it.
Junio C Hamano Dec. 16, 2020, 5:43 p.m. UTC | #5
Derrick Stolee <stolee@gmail.com> writes:

> On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
>> From: Elijah Newren <newren@gmail.com>
>> 
>> In a subsequent commit, we will implement the traditional recursiveness
>> that gave merge-recursive its name, namely merging non-unique
>> merge-bases to come up with a single virtual merge base.  Copy a few
>> helper functions from merge-recursive.c that we will use in the
>> implementation.
>
> I'm sure these are copies, but...
>
>> +static struct commit_list *reverse_commit_list(struct commit_list *list)
>> +{
>> +	struct commit_list *next = NULL, *current, *backup;
>> +	for (current = list; current; current = backup) {
>> +		backup = current->next;
>> +		current->next = next;
>> +		next = current;
>> +	}
>
> The naming of 'next' seems backwards to me, since it is really
> the "previous" node. Using something like 'previous' makes it
> clear that you are reversing when you say
>
> 	current->next = previous;

Hmph.  I took "next" commit_list as "list is the original one, and
next is the reversed list, the next generation of what we received".

Calling it "previous" feels even more backwards when you view among
three "struct commit_list *" pointers, one (the one that holds the
eventual return value) is primarily used to denote the resulting
list itself, and the other two are used to point individual elements
on the original list.

I wonder if a slightly different codeflow may be easier to follow

	struct commit_list *result = NULL;
	while (list) {
        	struct commit_list *next = list->next;
		list->next = result;
		result = list;
		list = next;
	}
	return result;

if we were to try improving this for readability?  

I dunno if it matters too much, though.
Felipe Contreras Dec. 16, 2020, 6:54 p.m. UTC | #6
Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> >> From: Elijah Newren <newren@gmail.com>
> >> 
> >> In a subsequent commit, we will implement the traditional recursiveness
> >> that gave merge-recursive its name, namely merging non-unique
> >> merge-bases to come up with a single virtual merge base.  Copy a few
> >> helper functions from merge-recursive.c that we will use in the
> >> implementation.
> >
> > I'm sure these are copies, but...
> >
> >> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> >> +{
> >> +	struct commit_list *next = NULL, *current, *backup;
> >> +	for (current = list; current; current = backup) {
> >> +		backup = current->next;
> >> +		current->next = next;
> >> +		next = current;
> >> +	}
> >
> > The naming of 'next' seems backwards to me, since it is really
> > the "previous" node. Using something like 'previous' makes it
> > clear that you are reversing when you say
> >
> > 	current->next = previous;
> 
> Hmph.  I took "next" commit_list as "list is the original one, and
> next is the reversed list, the next generation of what we received".
> 
> Calling it "previous" feels even more backwards when you view among
> three "struct commit_list *" pointers, one (the one that holds the
> eventual return value) is primarily used to denote the resulting
> list itself, and the other two are used to point individual elements
> on the original list.

Both feel awkward to me because to me previous/next are actually current
in my mind, and we should return current, not previous/next. Plus
there's no need for current, we can use list, as your example.

This is what I came up with:

  struct commit_list *current = NULL;
  while (list) {
          struct commit_list *tmp = list->next;
          list->next = current;
          current = list;
          list = tmp;
  }
  return current;

For completeness I checked how Linux does it, and surprisingly
llist_reverse_order() is very close to what I came up with:

  struct commit_list *new_list = NULL;
  while (list) {
          struct commit_list *tmp = list;
          list = list->next;
          tmp->next = new_list;
          new_list = tmp;
  }
  return new_list;

(obviously translated)

Maybe it's just me, but I find my version easier to read.

Cheers.
Elijah Newren Dec. 16, 2020, 7:20 p.m. UTC | #7
On Wed, Dec 16, 2020 at 9:43 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <stolee@gmail.com> writes:
>
> > On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> >> From: Elijah Newren <newren@gmail.com>
> >>
> >> In a subsequent commit, we will implement the traditional recursiveness
> >> that gave merge-recursive its name, namely merging non-unique
> >> merge-bases to come up with a single virtual merge base.  Copy a few
> >> helper functions from merge-recursive.c that we will use in the
> >> implementation.
> >
> > I'm sure these are copies, but...
> >
> >> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> >> +{
> >> +    struct commit_list *next = NULL, *current, *backup;
> >> +    for (current = list; current; current = backup) {
> >> +            backup = current->next;
> >> +            current->next = next;
> >> +            next = current;
> >> +    }
> >
> > The naming of 'next' seems backwards to me, since it is really
> > the "previous" node. Using something like 'previous' makes it
> > clear that you are reversing when you say
> >
> >       current->next = previous;
>
> Hmph.  I took "next" commit_list as "list is the original one, and
> next is the reversed list, the next generation of what we received".
>
> Calling it "previous" feels even more backwards when you view among
> three "struct commit_list *" pointers, one (the one that holds the
> eventual return value) is primarily used to denote the resulting
> list itself, and the other two are used to point individual elements
> on the original list.
>
> I wonder if a slightly different codeflow may be easier to follow
>
>         struct commit_list *result = NULL;
>         while (list) {
>                 struct commit_list *next = list->next;
>                 list->next = result;
>                 result = list;
>                 list = next;
>         }
>         return result;
>
> if we were to try improving this for readability?

Looks like Felipe also came up with the same version you did (modulo
the temporary variable name).

> I dunno if it matters too much, though.

Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
since its introduction in August of 2006.  The function's purpose
seems obvious from its name, so no one ever needs to look at or modify
the implementation.  I'm certain I'll never touch it again.  So, I
personally don't care what the particular implementation is, and I'm
happy to take whatever reviewers prefer here.

Since we have three people weighing in with opinions though -- and on
a point that's irrelevant to me -- do you want to make the call here
Junio?

Otherwise I'll just leave it as-is and not update further.
Junio C Hamano Dec. 16, 2020, 8:41 p.m. UTC | #8
Elijah Newren <newren@gmail.com> writes:

> I wonder if a slightly different codeflow may be easier to follow
>>
>>         struct commit_list *result = NULL;
>>         while (list) {
>>                 struct commit_list *next = list->next;
>>                 list->next = result;
>>                 result = list;
>>                 list = next;
>>         }
>>         return result;
>>
>> if we were to try improving this for readability?
>
> Looks like Felipe also came up with the same version you did (modulo
> the temporary variable name).

Funny if it was sent as a response to the message that already had
the same answer...

>> I dunno if it matters too much, though.
>
> Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
> since its introduction in August of 2006.  The function's purpose
> seems obvious from its name, so no one ever needs to look at or modify
> the implementation.  I'm certain I'll never touch it again.  So, I
> personally don't care what the particular implementation is, and I'm
> happy to take whatever reviewers prefer here.
>
> Since we have three people weighing in with opinions though -- and on
> a point that's irrelevant to me -- do you want to make the call here
> Junio?

If I were pressed to give a suggestion, I have two ;-)

I would prefer to see us first find out if all other callers of
get_merge_bases() _care_ about a particular order of the resulting
list.  If they do not care [*1*] and if it seems feasible to teach
get_merge_bases() build its return value reversed already without
too much extra effort, then the commit list reverser can
disappear and get_merge_bases() can be fixed to return its commit
list in older-to-newer order.

If the above does not happen, then I'd prefer to see a single commit
list reverser in commit.c and have it *shared* between the two merge
backends, instead of adding another one in merge-ort.c while leaving
the one in merge-recursive.c behind.  And the single implementation
can be either "copied from merge-recursive as that may be
unintuitive or harder to follow but at least we know it is battle
tested", or the one we see above.  If we were to take the latter, we
really need to avoid making stupid mistakes while attempting to
replace an existing awkward one with "simplified" version, though.

Sorry for listing even more work, but since you asked ... ;-)


[Footnote]

*1* if those who care actually would benefit if we used
older-to-newer order, it would be even better.
Felipe Contreras Dec. 16, 2020, 9:25 p.m. UTC | #9
Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
> > I wonder if a slightly different codeflow may be easier to follow
> >>
> >>         struct commit_list *result = NULL;
> >>         while (list) {
> >>                 struct commit_list *next = list->next;
> >>                 list->next = result;
> >>                 result = list;
> >>                 list = next;
> >>         }
> >>         return result;
> >>
> >> if we were to try improving this for readability?
> >
> > Looks like Felipe also came up with the same version you did (modulo
> > the temporary variable name).
> 
> Funny if it was sent as a response to the message that already had
> the same answer...

It's two changes, actually: s/result/current/ and s/next/tmp/.

But yeah, it's interesting that I used Elijahs's version as a basis and
arrived at virtually the same thing.

Cheers.
Elijah Newren Dec. 16, 2020, 9:34 p.m. UTC | #10
On Wed, Dec 16, 2020 at 12:42 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I wonder if a slightly different codeflow may be easier to follow
> >>
> >>         struct commit_list *result = NULL;
> >>         while (list) {
> >>                 struct commit_list *next = list->next;
> >>                 list->next = result;
> >>                 result = list;
> >>                 list = next;
> >>         }
> >>         return result;
> >>
> >> if we were to try improving this for readability?
> >
> > Looks like Felipe also came up with the same version you did (modulo
> > the temporary variable name).
>
> Funny if it was sent as a response to the message that already had
> the same answer...


>
> >> I dunno if it matters too much, though.
> >
> > Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
> > since its introduction in August of 2006.  The function's purpose
> > seems obvious from its name, so no one ever needs to look at or modify
> > the implementation.  I'm certain I'll never touch it again.  So, I
> > personally don't care what the particular implementation is, and I'm
> > happy to take whatever reviewers prefer here.
> >
> > Since we have three people weighing in with opinions though -- and on
> > a point that's irrelevant to me -- do you want to make the call here
> > Junio?
>
> If I were pressed to give a suggestion, I have two ;-)
>
> I would prefer to see us first find out if all other callers of
> get_merge_bases() _care_ about a particular order of the resulting
> list.  If they do not care [*1*] and if it seems feasible to teach
> get_merge_bases() build its return value reversed already without
> too much extra effort, then the commit list reverser can
> disappear and get_merge_bases() can be fixed to return its commit
> list in older-to-newer order.

The ones in sequencer.c and builtin/merge.c care about the order, but
they manually reverse it (because they are going to call the merge
machinery).  So, these two would benefit from it being reversed.
However...

notes-merge.c and submodule.c both have subtle dependencies on the
order (in ways that might be buggy).  They could perhaps be taught to
depend on the reversed order, but I'm leery of touching something that
looks possibly buggy (one even has a TODO highlighting it) and
becoming responsible.

get_octopus_merge_bases() depends on the order from get_merge_bases(),
and builtin/merge-base.c depends on that function.  So, the output
order of a command depends on it, which might thus affect user
scripts.

builtin/pull.c also depends on the order returned by get_octopus_merge_bases().

That's enough dependencies that I'm inclined to just leave this side
of things as they are.


> If the above does not happen, then I'd prefer to see a single commit
> list reverser in commit.c and have it *shared* between the two merge
> backends, instead of adding another one in merge-ort.c while leaving
> the one in merge-recursive.c behind.  And the single implementation
> can be either "copied from merge-recursive as that may be
> unintuitive or harder to follow but at least we know it is battle
> tested", or the one we see above.  If we were to take the latter, we
> really need to avoid making stupid mistakes while attempting to
> replace an existing awkward one with "simplified" version, though.

commit.c seems like a natural location.  I'll insert a patch at the
beginning of the series to move the function there, and just use
Johannes' original version from 2006 as-is -- that'll make it easier
to verify that my patch is simply moving things, anyway.

> Sorry for listing even more work, but since you asked ... ;-)
>
>
> [Footnote]
>
> *1* if those who care actually would benefit if we used
> older-to-newer order, it would be even better.
diff mbox series

Patch

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..05ba92c91a6 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@ 
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,34 @@  void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
+static struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *next = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = next;
+		next = current;
+	}
+	return next;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
@@ -1462,5 +1492,7 @@  void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
+	(void)reverse_commit_list;
+	(void)make_virtual_commit;
 	die("Not yet implemented");
 }