Message ID | pull.1565.v6.git.1695522222723.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 6a4c9e7b32c4154345785bd7b8d4afee5fddcead |
Headers | show |
Series | [v6] merge-tree: add -X strategy option | expand |
Thanks for your advice! I've uploaded a new patch to support more strategy options and the parse option issue.
On 24/09/2023 03:23, Izzy via GitGitGadget wrote: Thanks for updating the patch, sorry it has taken me a while to look at this. > From: Tang Yuyi <winglovet@gmail.com> > > Add merge strategy option to produce more customizable merge result such > as automatically resolving conflicts. > > Signed-off-by: Tang Yuyi <winglovet@gmail.com> > --- > static int real_merge(struct merge_tree_options *o, > @@ -423,7 +425,7 @@ static int real_merge(struct merge_tree_options *o, > { > struct commit *parent1, *parent2; > struct commit_list *merge_bases = NULL; > - struct merge_options opt; > + struct merge_options opt = o->merge_options; Copying struct merge_options by value here is unusual in our code base. I don't think it introduces a bug (there is no function to free the resources allocated in struct merge_options so we do not end up freeing them twice for example) but it would be clearer that it was safe if you did struct merge_options *opt = &o->merge_options; and updated the code to reflect that opt is now a pointer or just replaced all uses of "opt" with "o->merge_options" > + if (xopts.nr && o.mode == MODE_TRIVIAL) > + die(_("--trivial-merge is incompatible with all other options")); This is good, thanks for adding it. > + for (int x = 0; x < xopts.nr; x++) This is not worth a re-roll on its own but as xopts.nr is a size_t we should declare x to be the same type rather than an int. The tests are pretty minimal, ideally we would check that "-X" works with "--stdin", that it is rejected by a trivial merge and that one of the other strategy options works. Best Wishes Phillip
On Mon, Oct 09, 2023 at 10:58:07AM +0100, Phillip Wood wrote: > > @@ -423,7 +425,7 @@ static int real_merge(struct merge_tree_options *o, > > { > > struct commit *parent1, *parent2; > > struct commit_list *merge_bases = NULL; > > - struct merge_options opt; > > + struct merge_options opt = o->merge_options; > > Copying struct merge_options by value here is unusual in our code base. I > don't think it introduces a bug (there is no function to free the resources > allocated in struct merge_options so we do not end up freeing them twice for > example) but it would be clearer that it was safe if you did > > struct merge_options *opt = &o->merge_options; > > and updated the code to reflect that opt is now a pointer or just replaced > all uses of "opt" with "o->merge_options" I agree that struct-copying is an unusual pattern, and we'd potentially run into problems with duplication. But I think it is even trickier than that here. We also go on to actually _modify_ opt in this function, assigning to various members (both directly, and I think the merge code itself will write to opt->priv). So if we use a pointer (rather than struct assignment), those changes will persist in the merge_options struct that was passed in. Which is also weird. Between the two, I think using a pointer is probably the least-weird. This real_merge() function is only called once, and is a static-local helper for cmd_merge_tree(). So the two functions work as a single unit, and munging "opt" is not a big deal. -Peff
Jeff King <peff@peff.net> writes: > I agree that struct-copying is an unusual pattern, and we'd potentially > run into problems with duplication. But I think it is even trickier than > that here. We also go on to actually _modify_ opt in this function, > assigning to various members (both directly, and I think the merge code > itself will write to opt->priv). > > So if we use a pointer (rather than struct assignment), those changes > will persist in the merge_options struct that was passed in. Which is > also weird. > > Between the two, I think using a pointer is probably the least-weird. > This real_merge() function is only called once, and is a static-local > helper for cmd_merge_tree(). So the two functions work as a single unit, > and munging "opt" is not a big deal. It is called once per --stdin input to perform many merges in a row. The most obvious "structure to pointer to structure" conversion below seems to break an assertion (which is not very surprising, as it happens inside that --stdin loop), so I am tempted to revert the whole thing for now. Thanks. git: merge-ort.c:5110: merge_incore_recursive: Assertion `opt->ancestor == NULL' failed. ./test-lib.sh: line 1067: 738791 Done printf "c1 c3\nc2 -- c1 c3\nc2 c3" 738792 Aborted | git -C repo merge-tree --stdin > actual builtin/merge-tree.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git c/builtin/merge-tree.c w/builtin/merge-tree.c index 7024b5ce2e..1cb1fba2de 100644 --- c/builtin/merge-tree.c +++ w/builtin/merge-tree.c @@ -425,7 +425,7 @@ static int real_merge(struct merge_tree_options *o, { struct commit *parent1, *parent2; struct commit_list *merge_bases = NULL; - struct merge_options opt = o->merge_options; + struct merge_options *opt = &o->merge_options; struct merge_result result = { 0 }; int show_messages = o->show_messages; @@ -439,10 +439,10 @@ static int real_merge(struct merge_tree_options *o, help_unknown_ref(branch2, "merge-tree", _("not something we can merge")); - opt.show_rename_progress = 0; + opt->show_rename_progress = 0; - opt.branch1 = branch1; - opt.branch2 = branch2; + opt->branch1 = branch1; + opt->branch2 = branch2; if (merge_base) { struct commit *base_commit; @@ -452,11 +452,11 @@ static int real_merge(struct merge_tree_options *o, if (!base_commit) die(_("could not lookup commit '%s'"), merge_base); - opt.ancestor = merge_base; + opt->ancestor = merge_base; base_tree = repo_get_commit_tree(the_repository, base_commit); parent1_tree = repo_get_commit_tree(the_repository, parent1); parent2_tree = repo_get_commit_tree(the_repository, parent2); - merge_incore_nonrecursive(&opt, base_tree, parent1_tree, parent2_tree, &result); + merge_incore_nonrecursive(opt, base_tree, parent1_tree, parent2_tree, &result); } else { /* * Get the merge bases, in reverse order; see comment above @@ -467,7 +467,7 @@ static int real_merge(struct merge_tree_options *o, if (!merge_bases && !o->allow_unrelated_histories) die(_("refusing to merge unrelated histories")); merge_bases = reverse_commit_list(merge_bases); - merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result); + merge_incore_recursive(opt, merge_bases, parent1, parent2, &result); } if (result.clean < 0) @@ -501,12 +501,12 @@ static int real_merge(struct merge_tree_options *o, } if (show_messages) { putchar(line_termination); - merge_display_update_messages(&opt, line_termination == '\0', + merge_display_update_messages(opt, line_termination == '\0', &result); } if (o->use_stdin) putchar(line_termination); - merge_finalize(&opt, &result); + merge_finalize(opt, &result); return !result.clean; /* result.clean < 0 handled above */ }
On Mon, Oct 09, 2023 at 10:10:28AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > I agree that struct-copying is an unusual pattern, and we'd potentially > > run into problems with duplication. But I think it is even trickier than > > that here. We also go on to actually _modify_ opt in this function, > > assigning to various members (both directly, and I think the merge code > > itself will write to opt->priv). > > > > So if we use a pointer (rather than struct assignment), those changes > > will persist in the merge_options struct that was passed in. Which is > > also weird. > > > > Between the two, I think using a pointer is probably the least-weird. > > This real_merge() function is only called once, and is a static-local > > helper for cmd_merge_tree(). So the two functions work as a single unit, > > and munging "opt" is not a big deal. > > It is called once per --stdin input to perform many merges in a row. > The most obvious "structure to pointer to structure" conversion > below seems to break an assertion (which is not very surprising, as > it happens inside that --stdin loop), so I am tempted to revert the > whole thing for now. Oops, I totally missed the loop around the call to real_merge(). So yeah, I think this is rather tricky. Before Izzy's patch, real_merge() always makes its own fresh merge_options. After, we have a template merge_options that we copy, but we are assuming that a shallow struct copy is OK (probably true, but an anti-pattern that may bite us later). If we add Phillip's suggestion on top, then we do not copy at all, and end up reusing the same options struct (which is definitely wrong). I don't think there are any bugs with the state at the current tip of ty/merge-tree-strategy-options, but if we want to make it safer, I think we have two options: - delay the conversion of the "xopts" list into merge_options until we initialize it in real_merge(). This avoids breaking abstraction boundaries, but does mean that the sanity-checking of the options happens a little later (but not much in practice). - provide a copy_merge_options() function, which makes this kind of "set up a template and then copy it" pattern official. It can be a struct assignment for now, but it at least alerts anybody adding new options to the notion that a deep copy might be required. Option 1 looks something like this (a lot of the hunks are just reverting the tip of that branch, so squashed in it would be even smaller): diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c index 7024b5ce2e..f9dbbdb867 100644 --- a/builtin/merge-tree.c +++ b/builtin/merge-tree.c @@ -415,7 +415,7 @@ struct merge_tree_options { int show_messages; int name_only; int use_stdin; - struct merge_options merge_options; + struct strvec xopts; }; static int real_merge(struct merge_tree_options *o, @@ -425,7 +425,7 @@ static int real_merge(struct merge_tree_options *o, { struct commit *parent1, *parent2; struct commit_list *merge_bases = NULL; - struct merge_options opt = o->merge_options; + struct merge_options opt; struct merge_result result = { 0 }; int show_messages = o->show_messages; @@ -439,11 +439,17 @@ static int real_merge(struct merge_tree_options *o, help_unknown_ref(branch2, "merge-tree", _("not something we can merge")); + init_merge_options(&opt, the_repository); + opt.show_rename_progress = 0; opt.branch1 = branch1; opt.branch2 = branch2; + for (size_t x = 0; x < o->xopts.nr; x++) + if (parse_merge_opt(&opt, o->xopts.v[x])) + die(_("unknown strategy option: -X%s"), o->xopts.v[x]); + if (merge_base) { struct commit *base_commit; struct tree *base_tree, *parent1_tree, *parent2_tree; @@ -512,8 +518,7 @@ static int real_merge(struct merge_tree_options *o, int cmd_merge_tree(int argc, const char **argv, const char *prefix) { - struct merge_tree_options o = { .show_messages = -1 }; - struct strvec xopts = STRVEC_INIT; + struct merge_tree_options o = { .show_messages = -1,.xopts = STRVEC_INIT }; int expected_remaining_argc; int original_argc; const char *merge_base = NULL; @@ -549,24 +554,18 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix) &merge_base, N_("commit"), N_("specify a merge-base for the merge")), - OPT_STRVEC('X', "strategy-option", &xopts, N_("option=value"), + OPT_STRVEC('X', "strategy-option", &o.xopts, N_("option=value"), N_("option for selected merge strategy")), OPT_END() }; - /* Init merge options */ - init_merge_options(&o.merge_options, the_repository); - /* Parse arguments */ original_argc = argc - 1; /* ignoring argv[0] */ argc = parse_options(argc, argv, prefix, mt_options, merge_tree_usage, PARSE_OPT_STOP_AT_NON_OPTION); - if (xopts.nr && o.mode == MODE_TRIVIAL) + if (o.xopts.nr && o.mode == MODE_TRIVIAL) die(_("--trivial-merge is incompatible with all other options")); - for (int x = 0; x < xopts.nr; x++) - if (parse_merge_opt(&o.merge_options, xopts.v[x])) - die(_("unknown strategy option: -X%s"), xopts.v[x]); /* Handle --stdin */ if (o.use_stdin) {
Jeff King <peff@peff.net> writes: > Oops, I totally missed the loop around the call to real_merge(). So > yeah, I think this is rather tricky. > ... > I don't think there are any bugs with the state at the current tip of > ty/merge-tree-strategy-options, but if we want to make it safer, I think > we have two options: > > - delay the conversion of the "xopts" list into merge_options until we > initialize it in real_merge(). This avoids breaking abstraction > boundaries, but does mean that the sanity-checking of the options > happens a little later (but not much in practice). > > - provide a copy_merge_options() function, which makes this kind of > "set up a template and then copy it" pattern official. It can be a > struct assignment for now, but it at least alerts anybody adding new > options to the notion that a deep copy might be required. > > Option 1 looks something like this (a lot of the hunks are just > reverting the tip of that branch, so squashed in it would be even > smaller): If we have no plan and intention to extend "merge-tree" even more in the future, then option 1 would be the approach with least patch noise, and as your "something like this" shows, it is a nice and clean solution. I very much like it. But as the renovated "merge-tree" is a relatively young thing in our toolbox, I suspect that more and more work may want to go into it. And the other "official copy_merge_options()" approach would be a more healthy solution in the longer run, I would think. If we were to go that route, we should also give an interface to free the resources held by the copy. It is not that much code on top of the commit that is already queued in 'next', I suspect. Perhaps something like this? builtin/merge-tree.c | 4 +++- merge-recursive.c | 16 ++++++++++++++++ merge-recursive.h | 3 +++ 3 files changed, 22 insertions(+), 1 deletion(-) diff --git c/builtin/merge-tree.c w/builtin/merge-tree.c index 7024b5ce2e..a35e0452d6 100644 --- c/builtin/merge-tree.c +++ w/builtin/merge-tree.c @@ -425,10 +425,11 @@ static int real_merge(struct merge_tree_options *o, { struct commit *parent1, *parent2; struct commit_list *merge_bases = NULL; - struct merge_options opt = o->merge_options; struct merge_result result = { 0 }; int show_messages = o->show_messages; + struct merge_options opt; + copy_merge_options(&opt, &o->merge_options); parent1 = get_merge_parent(branch1); if (!parent1) help_unknown_ref(branch1, "merge-tree", @@ -507,6 +508,7 @@ static int real_merge(struct merge_tree_options *o, if (o->use_stdin) putchar(line_termination); merge_finalize(&opt, &result); + clear_merge_options(&opt); return !result.clean; /* result.clean < 0 handled above */ } diff --git c/merge-recursive.c w/merge-recursive.c index 0d7e57e2df..e3beb0801b 100644 --- c/merge-recursive.c +++ w/merge-recursive.c @@ -3912,6 +3912,22 @@ void init_merge_options(struct merge_options *opt, opt->buffer_output = 0; } +/* + * For now, members of merge_options do not need deep copying, but + * it may change in the future, in which case we would need to update + * this, and also make a matching change to clear_merge_options() to + * release the resources held by a copied instance. + */ +void copy_merge_options(struct merge_options *dst, struct merge_options *src) +{ + *dst = *src; +} + +void clear_merge_options(struct merge_options *opt UNUSED) +{ + ; /* no-op as our copy is shallow right now */ +} + int parse_merge_opt(struct merge_options *opt, const char *s) { const char *arg; diff --git c/merge-recursive.h w/merge-recursive.h index b88000e3c2..3d3b3e3c29 100644 --- c/merge-recursive.h +++ w/merge-recursive.h @@ -55,6 +55,9 @@ struct merge_options { void init_merge_options(struct merge_options *opt, struct repository *repo); +void copy_merge_options(struct merge_options *dst, struct merge_options *src); +void clear_merge_options(struct merge_options *opt); + /* parse the option in s and update the relevant field of opt */ int parse_merge_opt(struct merge_options *opt, const char *s);
On Wed, Oct 11, 2023 at 12:38:03PM -0700, Junio C Hamano wrote: > If we have no plan and intention to extend "merge-tree" even more in > the future, then option 1 would be the approach with least patch > noise, and as your "something like this" shows, it is a nice and > clean solution. I very much like it. > > But as the renovated "merge-tree" is a relatively young thing in our > toolbox, I suspect that more and more work may want to go into it. > And the other "official copy_merge_options()" approach would be a > more healthy solution in the longer run, I would think. If we were > to go that route, we should also give an interface to free the > resources held by the copy. I am happy with either, as they both resolve the "merge-tree knows intimate details about merge_options" issue. The patch I showed would require manually passing more details down to real_merge(), which is I guess what you are getting at with the "more work may want to go into it". > It is not that much code on top of the commit that is already queued > in 'next', I suspect. Perhaps something like this? This looks OK, though... > +void clear_merge_options(struct merge_options *opt UNUSED) > +{ > + ; /* no-op as our copy is shallow right now */ > +} Clearing is generally not just about copies, but any use of the struct. so this invites the question of whether the original non-copy struct should have a call to clear_merge_options() in cmd_merge_tree(). And ditto for every other user. I do not mind adding such calls, but of course we know that they are currently noops (and we don't have any particular plan to change that). -Peff
Jeff King <peff@peff.net> writes: > I am happy with either, as they both resolve the "merge-tree knows > intimate details about merge_options" issue. The patch I showed would > require manually passing more details down to real_merge(), which is I > guess what you are getting at with the "more work may want to go into > it". I was alluding more about teaching "merge-tree" various optional behaviour merge_options represents. In today's patch it may be -X<options>, who knows what tomorrow's patch wants to bring "merge-tree" to feature-parity with "merge". And the first approach would mean we would add xopts today to the struct, but we will be required passing more details as we add other things. >> It is not that much code on top of the commit that is already queued >> in 'next', I suspect. Perhaps something like this? > > This looks OK, though... > >> +void clear_merge_options(struct merge_options *opt UNUSED) >> +{ >> + ; /* no-op as our copy is shallow right now */ >> +} > > Clearing is generally not just about copies, but any use of the struct. > so this invites the question of whether the original non-copy struct > should have a call to clear_merge_options() in cmd_merge_tree(). And > ditto for every other user. Yes, once we start leaking, somebody hopefully notice the lack of a call to this on the original/template copy and add one. Until then...
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c index 0de42aecf4b..7024b5ce2e4 100644 --- a/builtin/merge-tree.c +++ b/builtin/merge-tree.c @@ -18,6 +18,7 @@ #include "quote.h" #include "tree.h" #include "config.h" +#include "strvec.h" static int line_termination = '\n'; @@ -414,6 +415,7 @@ struct merge_tree_options { int show_messages; int name_only; int use_stdin; + struct merge_options merge_options; }; static int real_merge(struct merge_tree_options *o, @@ -423,7 +425,7 @@ static int real_merge(struct merge_tree_options *o, { struct commit *parent1, *parent2; struct commit_list *merge_bases = NULL; - struct merge_options opt; + struct merge_options opt = o->merge_options; struct merge_result result = { 0 }; int show_messages = o->show_messages; @@ -437,8 +439,6 @@ static int real_merge(struct merge_tree_options *o, help_unknown_ref(branch2, "merge-tree", _("not something we can merge")); - init_merge_options(&opt, the_repository); - opt.show_rename_progress = 0; opt.branch1 = branch1; @@ -513,6 +513,7 @@ static int real_merge(struct merge_tree_options *o, int cmd_merge_tree(int argc, const char **argv, const char *prefix) { struct merge_tree_options o = { .show_messages = -1 }; + struct strvec xopts = STRVEC_INIT; int expected_remaining_argc; int original_argc; const char *merge_base = NULL; @@ -548,14 +549,25 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix) &merge_base, N_("commit"), N_("specify a merge-base for the merge")), + OPT_STRVEC('X', "strategy-option", &xopts, N_("option=value"), + N_("option for selected merge strategy")), OPT_END() }; + /* Init merge options */ + init_merge_options(&o.merge_options, the_repository); + /* Parse arguments */ original_argc = argc - 1; /* ignoring argv[0] */ argc = parse_options(argc, argv, prefix, mt_options, merge_tree_usage, PARSE_OPT_STOP_AT_NON_OPTION); + if (xopts.nr && o.mode == MODE_TRIVIAL) + die(_("--trivial-merge is incompatible with all other options")); + for (int x = 0; x < xopts.nr; x++) + if (parse_merge_opt(&o.merge_options, xopts.v[x])) + die(_("unknown strategy option: -X%s"), xopts.v[x]); + /* Handle --stdin */ if (o.use_stdin) { struct strbuf buf = STRBUF_INIT; diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh index 250f721795b..b2c8a43fce3 100755 --- a/t/t4301-merge-tree-write-tree.sh +++ b/t/t4301-merge-tree-write-tree.sh @@ -22,6 +22,7 @@ test_expect_success setup ' git branch side1 && git branch side2 && git branch side3 && + git branch side4 && git checkout side1 && test_write_lines 1 2 3 4 5 6 >numbers && @@ -46,6 +47,13 @@ test_expect_success setup ' test_tick && git commit -m rename-numbers && + git checkout side4 && + test_write_lines 0 1 2 3 4 5 >numbers && + echo yo >greeting && + git add numbers greeting && + test_tick && + git commit -m other-content-modifications && + git switch --orphan unrelated && >something-else && git add something-else && @@ -97,6 +105,21 @@ test_expect_success 'Content merge and a few conflicts' ' test_cmp expect actual ' +test_expect_success 'Auto resolve conflicts by "ours" strategy option' ' + git checkout side1^0 && + + # make sure merge conflict exists + test_must_fail git merge side4 && + git merge --abort && + + git merge -X ours side4 && + git rev-parse HEAD^{tree} >expected && + + git merge-tree -X ours side1 side4 >actual && + + test_cmp expected actual +' + test_expect_success 'Barf on misspelled option, with exit code other than 0 or 1' ' # Mis-spell with single "s" instead of double "s" test_expect_code 129 git merge-tree --write-tree --mesages FOOBAR side1 side2 2>expect &&