Message ID | b79f44e54b9611fa2b10a4e1cb666992d006951c.1658466942.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix merge restore state | expand |
On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > As noted in commit 9822175d2b ("Ensure index matches head before > invoking merge machinery, round N", 2019-08-17), we have had a very > long history of problems with failing to enforce the requirement that > index matches HEAD when starting a merge. One of the commits > referenced in the long tale of issues arising from lax enforcement of > this requirement was commit 55f39cf755 ("merge: fix misleading > pre-merge check documentation", 2018-06-30), which tried to document > the requirement and noted there were some exceptions. As mentioned in > that commit message, the `resolve` strategy was the one strategy that > did not have an explicit index matching HEAD check, and the reason it > didn't was that I wasn't able to discover any cases where the > implementation would fail to catch the problem and abort, and didn't > want to introduce unnecessary performance overhead of adding another > check. > > Well, today I discovered a testcase where the implementation does not > catch the problem and so an explicit check is needed. Add a testcase > that previously would have failed, and update git-merge-resolve.sh to > have an explicit check. Note that the code is copied from 3ec62ad9ff > ("merge-octopus: abort if index does not match HEAD", 2016-04-09), so > that we reuse the same message and avoid making translators need to > translate some new message. > > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > builtin/merge.c | 20 ++++++++++++++++++ > git-merge-resolve.sh | 10 +++++++++ > t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++ > 3 files changed, 56 insertions(+) > > diff --git a/builtin/merge.c b/builtin/merge.c > index 23170f2d2a6..13884b8e836 100644 > --- a/builtin/merge.c > +++ b/builtin/merge.c > @@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > */ > refresh_cache(REFRESH_QUIET); > if (allow_trivial && fast_forward != FF_ONLY) { > + /* > + * Must first ensure that index matches HEAD before > + * attempting a trivial merge. > + */ > + struct tree *head_tree = get_commit_tree(head_commit); > + struct strbuf sb = STRBUF_INIT; > + > + if (repo_index_has_changes(the_repository, head_tree, > + &sb)) { > + struct strbuf err = STRBUF_INIT; > + strbuf_addstr(&err, "error: "); > + strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n %s"), > + sb.buf); > + strbuf_addch(&err, '\n'); At first glance I was expecting this to construct an error message to emit it somewhere else that stderr, so I wondered if you couldn't use the "error_routine" facility to avoid re-inventing "error: " etc., but... > + fputs(err.buf, stderr); ...we emit it to stderr anyway...? > + strbuf_release(&err); > + strbuf_release(&sb); > + return -1; > + } > + > /* See if it is really trivial. */ > git_committer_info(IDENT_STRICT); > printf(_("Trying really trivial in-index merge...\n")); > diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh > index 343fe7bccd0..77e93121bf8 100755 > --- a/git-merge-resolve.sh > +++ b/git-merge-resolve.sh > @@ -5,6 +5,16 @@ > # > # Resolve two trees, using enhanced multi-base read-tree. > > +. git-sh-setup > + > +# Abort if index does not match HEAD > +if ! git diff-index --quiet --cached HEAD -- > +then > + gettextln "Error: Your local changes to the following files would be overwritten by merge" > + git diff-index --cached --name-only HEAD -- | sed -e 's/^/ /' > + exit 2 > +fi (The "..." continued below) Just in trying to poke holes in this I made this an "exit 0", and neither of the tests you added failed, but the last one ("resolve && recursive && ort") in the t6424*.sh will fail, is that intentional? I don't know enough about the context here, but given our *.sh->C migration elsewhere it's a bit unfortunate to see more *.sh code added back. We have "git merge" driving this, isn't it OK to have it make this check before invoking "resolve" (may be a stupid question). For this code in particular it: * Uses spaces, not tabs * We lose the diff-index .. --name-only exit code (segfault), but so did the older version * I wonder if bending over backwards to emit the exact message we emitted before is worth it If you just make this something like (untested): { gettext "error: " && gettextln "Your local..." } You could re-use the translation from the *.c one (and the "error: " one we'll get from usage.c). That leaves "\n %s" as the difference, but we could just remove that from the _() and emit it unconditionally, no? > # The first parameters up to -- are merge bases; the rest are heads. > bases= head= remotes= sep_seen= > for arg > diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh > index b6e424a427b..f35d3182b86 100755 > --- a/t/t6424-merge-unrelated-index-changes.sh > +++ b/t/t6424-merge-unrelated-index-changes.sh > @@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' ' > test_path_is_missing .git/MERGE_HEAD > ' > > +test_expect_success 'resolve, trivial, related file removed' ' > + git reset --hard && > + git checkout B^0 && > + > + git rm a && > + test_path_is_missing a && > + > + test_must_fail git merge -s resolve C^0 && > + > + test_path_is_missing a && > + test_path_is_missing .git/MERGE_HEAD > +' > + > +test_expect_success 'resolve, non-trivial, related file removed' ' > + git reset --hard && > + git checkout B^0 && > + > + git rm a && > + test_path_is_missing a && > + > + test_must_fail git merge -s resolve D^0 && > + > + test_path_is_missing a && > + test_path_is_missing .git/MERGE_HEAD > +' > + > test_expect_success 'recursive' ' > git reset --hard && > git checkout B^0 && ...I tried with this change on top, it seems to me like you'd want this in any case, it passes the tests both with & without the C code change, so can't we just use error() here? diff --git a/builtin/merge.c b/builtin/merge.c index 7fb4414ebb7..64def49734a 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1621,13 +1621,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) if (repo_index_has_changes(the_repository, head_tree, &sb)) { - struct strbuf err = STRBUF_INIT; - strbuf_addstr(&err, "error: "); - strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n %s"), - sb.buf); - strbuf_addch(&err, '\n'); - fputs(err.buf, stderr); - strbuf_release(&err); + error(_("Your local changes to the following files would be overwritten by merge:\n %s"), + sb.buf); strbuf_release(&sb); return -1; } diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh index c96649448fa..1df130b9ee6 100755 --- a/t/t6424-merge-unrelated-index-changes.sh +++ b/t/t6424-merge-unrelated-index-changes.sh @@ -96,7 +96,12 @@ test_expect_success 'resolve, trivial' ' touch random_file && git add random_file && - test_must_fail git merge -s resolve C^0 && + sed -e "s/^> //g" >expect <<-\EOF && + > error: Your local changes to the following files would be overwritten by merge: + > random_file + EOF + test_must_fail git merge -s resolve C^0 2>actual && + test_cmp expect actual && test_path_is_file random_file && git rev-parse --verify :random_file && test_path_is_missing .git/MERGE_HEAD
On Fri, Jul 22, 2022 at 3:46 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Fri, Jul 22 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@gmail.com> > > > > As noted in commit 9822175d2b ("Ensure index matches head before > > invoking merge machinery, round N", 2019-08-17), we have had a very > > long history of problems with failing to enforce the requirement that > > index matches HEAD when starting a merge. One of the commits > > referenced in the long tale of issues arising from lax enforcement of > > this requirement was commit 55f39cf755 ("merge: fix misleading > > pre-merge check documentation", 2018-06-30), which tried to document > > the requirement and noted there were some exceptions. As mentioned in > > that commit message, the `resolve` strategy was the one strategy that > > did not have an explicit index matching HEAD check, and the reason it > > didn't was that I wasn't able to discover any cases where the > > implementation would fail to catch the problem and abort, and didn't > > want to introduce unnecessary performance overhead of adding another > > check. > > > > Well, today I discovered a testcase where the implementation does not > > catch the problem and so an explicit check is needed. Add a testcase > > that previously would have failed, and update git-merge-resolve.sh to > > have an explicit check. Note that the code is copied from 3ec62ad9ff > > ("merge-octopus: abort if index does not match HEAD", 2016-04-09), so > > that we reuse the same message and avoid making translators need to > > translate some new message. > > > > Signed-off-by: Elijah Newren <newren@gmail.com> > > --- > > builtin/merge.c | 20 ++++++++++++++++++ > > git-merge-resolve.sh | 10 +++++++++ > > t/t6424-merge-unrelated-index-changes.sh | 26 ++++++++++++++++++++++++ > > 3 files changed, 56 insertions(+) > > > > diff --git a/builtin/merge.c b/builtin/merge.c > > index 23170f2d2a6..13884b8e836 100644 > > --- a/builtin/merge.c > > +++ b/builtin/merge.c > > @@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > > */ > > refresh_cache(REFRESH_QUIET); > > if (allow_trivial && fast_forward != FF_ONLY) { > > + /* > > + * Must first ensure that index matches HEAD before > > + * attempting a trivial merge. > > + */ > > + struct tree *head_tree = get_commit_tree(head_commit); > > + struct strbuf sb = STRBUF_INIT; > > + > > + if (repo_index_has_changes(the_repository, head_tree, > > + &sb)) { > > + struct strbuf err = STRBUF_INIT; > > + strbuf_addstr(&err, "error: "); > > + strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n %s"), > > + sb.buf); > > + strbuf_addch(&err, '\n'); > > At first glance I was expecting this to construct an error message to > emit it somewhere else that stderr, so I wondered if you couldn't use > the "error_routine" facility to avoid re-inventing "error: " etc., > but... > > > + fputs(err.buf, stderr); > > ...we emit it to stderr anyway...? > > > + strbuf_release(&err); > > + strbuf_release(&sb); > > + return -1; > > + } > > + UGH! I fixed the other one of these in my reroll yesterday[1]. I _knew_ I had copied that code somewhere else, but for some reason I thought it was in a different series and went searching for it. Don't know why I couldn't remember that it was in the same series, and I'm not sure how I missed it when I went looking. I mean, I know I was tired yesterday, but that's still kinda bad. Anyway, thanks for catching; I'll fix this one too. [1] https://lore.kernel.org/git/xmqqsfmulb6w.fsf@gitster.g/ > > /* See if it is really trivial. */ > > git_committer_info(IDENT_STRICT); > > printf(_("Trying really trivial in-index merge...\n")); > > diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh > > index 343fe7bccd0..77e93121bf8 100755 > > --- a/git-merge-resolve.sh > > +++ b/git-merge-resolve.sh > > @@ -5,6 +5,16 @@ > > # > > # Resolve two trees, using enhanced multi-base read-tree. > > > > +. git-sh-setup > > + > > +# Abort if index does not match HEAD > > +if ! git diff-index --quiet --cached HEAD -- > > +then > > + gettextln "Error: Your local changes to the following files would be overwritten by merge" > > + git diff-index --cached --name-only HEAD -- | sed -e 's/^/ /' > > + exit 2 > > +fi > > (The "..." continued below) > > Just in trying to poke holes in this I made this an "exit 0", and > neither of the tests you added failed, but the last one ("resolve && > recursive && ort") in the t6424*.sh will fail, is that intentional? Nope it's not intentional. I had tested one fix (for these git-merge-resolve.sh changes) and verified they were good (and necessary), then found another bug (the one fixed by the builtin/merge.c changes) and added a test for it, and just decided to amend it into the same commit. Turns out the builtin/merge.c changes mask the fix for git-merge-resolve.sh here and makes that code go unexercised. I'll split these two bugfixes into separate patches, and tweak one of the two testcases to make sure it continues to exercise the new codepath added to git-merge-resolve.sh. > I don't know enough about the context here, but given our *.sh->C > migration elsewhere it's a bit unfortunate to see more *.sh code added > back. This seems like a curious objection. "We are trying to get rid of shell scripts, so don't even fix bugs in any of the existing ones." ? > We have "git merge" driving this, isn't it OK to have it make this > check before invoking "resolve" (may be a stupid question). Ah, I can kind of see where you're coming from now, but that seems to me to be bending over backwards in attempting to fix a component written in shell without actually modifying the shell. builtin/merge.c is some glue code that can call multiple different strategies, but isn't the place for the implementation of the strategies themselves, and I'd hate to see us put half the implementation in one place and half in another. In addition, besides the separation of concerns issue:: * We document that users can add their own merge strategies (a shell or executable named git-merge-$USERNAME and "git merge -s $USERNAME" will call them) * git-merge-resolve and git-merge-octopus serve as examples * Our examples should demonstrate correct behavior and perform documented, required steps. This particular check is important: /* * At this point, we need a real merge. No matter what strategy * we use, it would operate on the index, possibly affecting the * working tree, and when resolved cleanly, have the desired * tree in the index -- this means that the index must be in * sync with the head commit. The strategies are responsible * to ensure this. */ So, even if someone were to reimplement git-merge-resolve.sh in C, and start the deprecation process with some merge.useBuiltinResolve config setting (similar to rebase.useBuiltin), I'd still want this shell fix added to git-merge-resolve.sh in the meantime, both as an important bugfix, and so that people looking for merge strategy examples who find this script hopefully find a new enough version with this important check included. In general, if merge strategies do not perform this check, we have observed that they often will either (a) discard users' staged changes (best case) or (b) smash staged changes into the created commit and thus create some kind of evil merge (making it look like they created a merge normally, and then amended the merge with additional changes). We're lucky that the way resolve was implemented, other git calls would usually incidentally catch such issues for us without an explicit check. We were also lucky that the observed behavior was '(a)' rather than '(b)' for resolve. But the issue should still be fixed. > For this code in particular it: > > * Uses spaces, not tabs Yes, that's fair, but as I mentioned in the commit message, it was copied from git-merge-octopus.sh. So, as you say below, "so did the older version". > * We lose the diff-index .. --name-only exit code (segfault), but so > did the older version Um, I don't understand this objection. I think you are referring to the pipe to sed, but if so...who cares? The exit code would be lost anyway because we aren't running under errexit, and the next line of code ignores any and all previous exit codes when it runs "exit 2". And if you're not referring to the pipe to sed but the fact that it unconditionally returns an exit code of 2 on the next line, then yes that is the expected return code. Whatever the diff-index segfault returns would be the wrong exit status and could fool the builtin/merge.c into doing the wrong thing. It expects merge strategies to return one of three exit codes: 0, 1, or 2: /* * The backend exits with 1 when conflicts are * left to be resolved, with 2 when it does not * handle the given merge at all. */ So, ignoring the return code from diff-index is correct behavior here. Were you thinking this was a test script or something? > * I wonder if bending over backwards to emit the exact message we > emitted before is worth it > > If you just make this something like (untested): > > { > gettext "error: " && > gettextln "Your local..." > } > > You could re-use the translation from the *.c one (and the "error: " one > we'll get from usage.c). > > That leaves "\n %s" as the difference, but we could just remove that > from the _() and emit it unconditionally, no? ?? Copying a few lines from git-merge-octopus.sh to get the same fix it has is "bending over backwards"? That's what I call "doing the easiest thing possible" (and which _also_ has the benefit of being battle tested code), and then you describe a bunch of gymnastics as an alternative? I see your suggestion as running afoul of the objection you are raising, and the code I'm adding as being a solution to that particular objection. So this particular flag you are raising is confusing to me. > > # The first parameters up to -- are merge bases; the rest are heads. > > bases= head= remotes= sep_seen= > > for arg > > diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh > > index b6e424a427b..f35d3182b86 100755 > > --- a/t/t6424-merge-unrelated-index-changes.sh > > +++ b/t/t6424-merge-unrelated-index-changes.sh > > @@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' ' > > test_path_is_missing .git/MERGE_HEAD > > ' > > > > +test_expect_success 'resolve, trivial, related file removed' ' > > + git reset --hard && > > + git checkout B^0 && > > + > > + git rm a && > > + test_path_is_missing a && > > + > > + test_must_fail git merge -s resolve C^0 && > > + > > + test_path_is_missing a && > > + test_path_is_missing .git/MERGE_HEAD > > +' > > + > > +test_expect_success 'resolve, non-trivial, related file removed' ' > > + git reset --hard && > > + git checkout B^0 && > > + > > + git rm a && > > + test_path_is_missing a && > > + > > + test_must_fail git merge -s resolve D^0 && > > + > > + test_path_is_missing a && > > + test_path_is_missing .git/MERGE_HEAD > > +' > > + > > test_expect_success 'recursive' ' > > git reset --hard && > > git checkout B^0 && > > ...I tried with this change on top, it seems to me like you'd want this > in any case, it passes the tests both with & without the C code change, > so can't we just use error() here? > > diff --git a/builtin/merge.c b/builtin/merge.c > index 7fb4414ebb7..64def49734a 100644 > --- a/builtin/merge.c > +++ b/builtin/merge.c > @@ -1621,13 +1621,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix) > > if (repo_index_has_changes(the_repository, head_tree, > &sb)) { > - struct strbuf err = STRBUF_INIT; > - strbuf_addstr(&err, "error: "); > - strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n %s"), > - sb.buf); > - strbuf_addch(&err, '\n'); > - fputs(err.buf, stderr); > - strbuf_release(&err); > + error(_("Your local changes to the following files would be overwritten by merge:\n %s"), > + sb.buf); > strbuf_release(&sb); > return -1; Yes, this is the same change suggested by Junio for patch 1 which I should have also applied here. Thanks for catching it!
On Fri, Jul 22 2022, Elijah Newren wrote: > On Fri, Jul 22, 2022 at 3:46 AM Ævar Arnfjörð Bjarmason > [...] >> I don't know enough about the context here, but given our *.sh->C >> migration elsewhere it's a bit unfortunate to see more *.sh code added >> back. > > This seems like a curious objection. "We are trying to get rid of > shell scripts, so don't even fix bugs in any of the existing ones." ? > >> We have "git merge" driving this, isn't it OK to have it make this >> check before invoking "resolve" (may be a stupid question). > > Ah, I can kind of see where you're coming from now, but that seems to > me to be bending over backwards in attempting to fix a component > written in shell without actually modifying the shell. > builtin/merge.c is some glue code that can call multiple different > strategies, but isn't the place for the implementation of the > strategies themselves, and I'd hate to see us put half the > implementation in one place and half in another. In addition, besides > the separation of concerns issue:: > > * We document that users can add their own merge strategies (a > shell or executable named git-merge-$USERNAME and "git merge -s > $USERNAME" will call them) > * git-merge-resolve and git-merge-octopus serve as examples > * Our examples should demonstrate correct behavior and perform > documented, required steps. This particular check is important: > > /* > * At this point, we need a real merge. No matter what strategy > * we use, it would operate on the index, possibly affecting the > * working tree, and when resolved cleanly, have the desired > * tree in the index -- this means that the index must be in > * sync with the head commit. The strategies are responsible > * to ensure this. > */ > > So, even if someone were to reimplement git-merge-resolve.sh in C, and > start the deprecation process with some merge.useBuiltinResolve config > setting (similar to rebase.useBuiltin), I'd still want this shell fix > added to git-merge-resolve.sh in the meantime, both as an important > bugfix, and so that people looking for merge strategy examples who > find this script hopefully find a new enough version with this > important check included. > > In general, if merge strategies do not perform this check, we have > observed that they often will either (a) discard users' staged changes > (best case) or (b) smash staged changes into the created commit and > thus create some kind of evil merge (making it look like they created > a merge normally, and then amended the merge with additional changes). > > We're lucky that the way resolve was implemented, other git calls > would usually incidentally catch such issues for us without an > explicit check. We were also lucky that the observed behavior was > '(a)' rather than '(b)' for resolve. But the issue should still be > fixed. Makes sense I guess, yeah I was wondering if we could just assume that "git merge" would always invoke those, and therefore could offload more of the pre-flight checks over there. >> For this code in particular it: >> >> * Uses spaces, not tabs > > Yes, that's fair, but as I mentioned in the commit message, it was > copied from git-merge-octopus.sh. So, as you say below, "so did the > older version". *nod* >> * We lose the diff-index .. --name-only exit code (segfault), but so >> did the older version > > Um, I don't understand this objection. I think you are referring to > the pipe to sed, but if so...who cares? The exit code would be lost > anyway because we aren't running under errexit, and the next line of > code ignores any and all previous exit codes when it runs "exit 2". > And if you're not referring to the pipe to sed but the fact that it > unconditionally returns an exit code of 2 on the next line, then yes > that is the expected return code. Whatever the diff-index segfault > returns would be the wrong exit status and could fool the > builtin/merge.c into doing the wrong thing. It expects merge > strategies to return one of three exit codes: 0, 1, or 2: > > /* > * The backend exits with 1 when conflicts are > * left to be resolved, with 2 when it does not > * handle the given merge at all. > */ > > So, ignoring the return code from diff-index is correct behavior here. > > Were you thinking this was a test script or something? We can leave this for now. But no. Whatever the merge driver is documenting as its normal return values we really should be ferrying up abort() and segfault, per the "why do we miss..." in: https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/ I.e. this is one of the cases in the test suite where we haven't closed that gap, and could hide segfaults as a "normal" exit 2. So I think your v5 is fine as-is, but in general I'd be really interested if you want to double-down on this view for the merge drivers for some reason, because my current plan for addressing these blindspots outlined in the above wouldn't work then... >> * I wonder if bending over backwards to emit the exact message we >> emitted before is worth it >> >> If you just make this something like (untested): >> >> { >> gettext "error: " && >> gettextln "Your local..." >> } >> >> You could re-use the translation from the *.c one (and the "error: " one >> we'll get from usage.c). >> >> That leaves "\n %s" as the difference, but we could just remove that >> from the _() and emit it unconditionally, no? > > ?? > > Copying a few lines from git-merge-octopus.sh to get the same fix it > has is "bending over backwards"? That's what I call "doing the > easiest thing possible" (and which _also_ has the benefit of being > battle tested code), and then you describe a bunch of gymnastics as an > alternative? I see your suggestion as running afoul of the objection > you are raising, and the code I'm adding as being a solution to that > particular objection. So this particular flag you are raising is > confusing to me. I wasn't aware of some greater context vis-as-vis octopus, but it just seemed to me that you were trying to maintain the "Error" v.s. "error" distinction, i.e. the C code you're adding uses lower case, the *.sh upper-case. Which I see is for consistency with some existing message we have a translation for, so if that was the main goal (and not bug-for-bug message compatibility) the above gettext/gettextln use would allow you to re-use the C i18n.
On Fri, Jul 22, 2022 at 10:53 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Fri, Jul 22 2022, Elijah Newren wrote: > [...] > > So, ignoring the return code from diff-index is correct behavior here. > > > > Were you thinking this was a test script or something? > > We can leave this for now. > > But no. Whatever the merge driver is documenting as its normal return > values we really should be ferrying up abort() and segfault, per the > "why do we miss..." in: > https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/ > > I.e. this is one of the cases in the test suite where we haven't closed > that gap, and could hide segfaults as a "normal" exit 2. > > So I think your v5 is fine as-is, but in general I'd be really > interested if you want to double-down on this view for the merge drivers > for some reason, because my current plan for addressing these blindspots > outlined in the above wouldn't work then... Quoting from there: > * We have in-tree shellscripts like "git-merge-one-file.sh" invoking > git commands, they'll usually return their own exit codes on "git" > failure, rather then ferrying up segfault or abort() exit code. > > E.g. these invocations in git-merge-one-file.sh leak, but aren't > reflected in the "git merge" exit code: > >src1=$(git unpack-file $2) >src2=$(git unpack-file $3) > > That case would be easily "fixed" by adding a line like this after > each assignment: > >test $? -ne 0 && exit $? > > But we'd then in e.g. "t6407-merge-binary.sh" run into > write_tree_trivial() in "builtin/merge.c" calling die() instead of > ferrying up the relevant exit code. Sidenote, but I don't think t6407-merge-binary.sh calls into write_tree_trivial(). Doesn't in my testing, anyway. Are you really planning on auditing every line of git-bisect.sh, git-merge*.sh, git-sh-setup.sh, git-submodule.sh, git-web--browse.sh, and others, and munging every area that invokes git to check the exit status? Yuck. A few points: * Will this really buy you anything? Don't we have other regression tests of all these commands (e.g. "git unpack-file") which ought to show the same memory leaks? This seems like high lift, low value to me, and just fixing direct invocations in the regression tests is where the value comes. (If direct coverage is lacking in the regression tests, shouldn't the solution be to add coverage?) * Won't this be a huge review and support burden to maintain the extra checking? * Some of these scripts, such as git-merge-resolve.sh and git-merge-octopus.sh are used as examples of e.g. merge drivers, and invasive checks whose sole purpose is memory leak checking seems to run counter to the purpose of being a simple example for users * Wouldn't using errexit and pipefail be an easier way to accomplish checking the exit status (avoiding the problems from the last few bullets)? You'd still have to audit the code and write e.g. shutupgrep wrappers (since grep reports whether it found certain patterns in the input, rather than whether it was able to perform the search on the input, and we often only care about the latter), but it at least would automatically check future git invocations. * Are we running the risk of overloading special return codes (e.g. 125 in git-bisect) I do still think that "2" is the correct return code for the shell-script merge strategies here, though I think it's feasible in their cases to change the documentation to relax the return code requirements in such a way to allow those scripts to utilize errexit and pipefail. > >> * I wonder if bending over backwards to emit the exact message we > >> emitted before is worth it > >> > >> If you just make this something like (untested): > >> > >> { > >> gettext "error: " && > >> gettextln "Your local..." > >> } > >> > >> You could re-use the translation from the *.c one (and the "error: " one > >> we'll get from usage.c). > >> > >> That leaves "\n %s" as the difference, but we could just remove that > >> from the _() and emit it unconditionally, no? > > > > ?? > > > > Copying a few lines from git-merge-octopus.sh to get the same fix it > > has is "bending over backwards"? That's what I call "doing the > > easiest thing possible" (and which _also_ has the benefit of being > > battle tested code), and then you describe a bunch of gymnastics as an > > alternative? I see your suggestion as running afoul of the objection > > you are raising, and the code I'm adding as being a solution to that > > particular objection. So this particular flag you are raising is > > confusing to me. > > I wasn't aware of some greater context vis-as-vis octopus, I didn't expect everyone to be, but that's why I put it in the commit message. ;-)
On Mon, Jul 25 2022, Elijah Newren wrote: > On Fri, Jul 22, 2022 at 10:53 PM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: >> >> On Fri, Jul 22 2022, Elijah Newren wrote: >> > [...] >> > So, ignoring the return code from diff-index is correct behavior here. >> > >> > Were you thinking this was a test script or something? >> >> We can leave this for now. >> >> But no. Whatever the merge driver is documenting as its normal return >> values we really should be ferrying up abort() and segfault, per the >> "why do we miss..." in: >> https://lore.kernel.org/git/patch-v2-11.14-8cc6ab390db-20220720T211221Z-avarab@gmail.com/ >> >> I.e. this is one of the cases in the test suite where we haven't closed >> that gap, and could hide segfaults as a "normal" exit 2. >> >> So I think your v5 is fine as-is, but in general I'd be really >> interested if you want to double-down on this view for the merge drivers >> for some reason, because my current plan for addressing these blindspots >> outlined in the above wouldn't work then... > > Quoting from there: > >> * We have in-tree shellscripts like "git-merge-one-file.sh" invoking >> git commands, they'll usually return their own exit codes on "git" >> failure, rather then ferrying up segfault or abort() exit code. >> >> E.g. these invocations in git-merge-one-file.sh leak, but aren't >> reflected in the "git merge" exit code: >> >>src1=$(git unpack-file $2) >>src2=$(git unpack-file $3) >> >> That case would be easily "fixed" by adding a line like this after >> each assignment: >> >>test $? -ne 0 && exit $? >> >> But we'd then in e.g. "t6407-merge-binary.sh" run into >> write_tree_trivial() in "builtin/merge.c" calling die() instead of >> ferrying up the relevant exit code. > > Sidenote, but I don't think t6407-merge-binary.sh calls into > write_tree_trivial(). Doesn't in my testing, anyway. I haven't gone back & looked at that, but I vaguely recall that if you "get past" that one it was returning 128 somewhere, either directly or indirectly. In any case, for the purposes of that summary it's a common pattern elsewhere... > Are you really planning on auditing every line of git-bisect.sh, > git-merge*.sh, git-sh-setup.sh, git-submodule.sh, git-web--browse.sh, > and others, and munging every area that invokes git to check the exit > status? Not really, but just for that change explaining why it's required to have this log-on-the-side to munge the exit code. Although I think you might have not kept up with just how close we are to "git rm"-ing most of our non-trivial amounts of shellscript. git-{submodule,bisect}.sh is going away, hopefully in this release. *If* we go for that approach I'd think it would be relatively easy to add a helper to git-sh-setup.sh to wrap the various "git" callse. > Yuck. A few points: > * Will this really buy you anything? Don't we have other regression > tests of all these commands (e.g. "git unpack-file") which ought to > show the same memory leaks? This seems like high lift, low value to > me, and just fixing direct invocations in the regression tests is > where the value comes. It's a long-tail problem, and we don't need to fix it all now. I'm just commenting on it here because there's an *addition* of a hidden exit code, and more importantly I wanted to clear up if you thought ferrying up abort() or segfaults wouldn't be desired in those cases. > (If direct coverage is lacking in the > regression tests, shouldn't the solution be to add coverage?) But how do we find out what we're covering? Yes "make coverage", but that's just going to give you all C lines we "visit", but you don't know if those lines were visited by parts of our test suite where we're checking the exit code. > * Won't this be a huge review and support burden to maintain the > extra checking? I think the end result mostly makes things easier to deal with & maintaine, e.g. consistently using &&-chaining, or test_cmp instead of "test "$(...)" etc. > * Some of these scripts, such as git-merge-resolve.sh and > git-merge-octopus.sh are used as examples of e.g. merge drivers, and > invasive checks whose sole purpose is memory leak checking seems to > run counter to the purpose of being a simple example for users I'm not doing any of this now, but I'd think we could do something like this (i.e. new helpers in git-sh-setup.sh): diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh index 343fe7bccd0..f23344dfec8 100755 --- a/git-merge-resolve.sh +++ b/git-merge-resolve.sh @@ -37,15 +37,16 @@ then exit 2 fi -git update-index -q --refresh -git read-tree -u -m --aggressive $bases $head $remotes || exit 2 +run_git git update-index -q --refresh +run_git --exit-on-failure 2 git read-tree -u -m --aggressive $bases $head $remotes echo "Trying simple merge." -if result_tree=$(git write-tree 2>/dev/null) +result_tree=$(git write-tree 2>/dev/null) +if check_last_git_cmd $? then exit 0 else echo "Simple merge failed, trying Automatic merge." - if git merge-index -o git-merge-one-file -a + if run_git git merge-index -o git-merge-one-file -a then exit 0 else > * Wouldn't using errexit and pipefail be an easier way to accomplish > checking the exit status (avoiding the problems from the last few > bullets)? You'd still have to audit the code and write e.g. > shutupgrep wrappers (since grep reports whether it found certain > patterns in the input, rather than whether it was able to perform the > search on the input, and we often only care about the latter), but it > at least would automatically check future git invocations. Somewhat, but unless we're going to depend on "bash" I think we can at most use those during development to locate various issues, e.g. as I did in this series: https://lore.kernel.org/git/20210123130046.21975-1-avarab@gmail.com/ > * Are we running the risk of overloading special return codes (e.g. > 125 in git-bisect) No, we already deal with that as a potential problem in test_must_fail in the test suite, i.e. we just catch abort(), segfaults & the like. In bourn shells those are 134. > I do still think that "2" is the correct return code for the > shell-script merge strategies here, though I think it's feasible in > their cases to change the documentation to relax the return code > requirements in such a way to allow those scripts to utilize errexit > and pipefail. I think for any such interface it makes sense to exit with 0, 1 and 2 or whatever during normal circumstances. But if the program you just called segfaulted I think it makes sense to treat that as an exception. I.e. it's not just that the merge failed, but we should really abort() in the calling program too.. Anyway, this is all quite academic at this point, but since you asked...
diff --git a/builtin/merge.c b/builtin/merge.c index 23170f2d2a6..13884b8e836 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1599,6 +1599,26 @@ int cmd_merge(int argc, const char **argv, const char *prefix) */ refresh_cache(REFRESH_QUIET); if (allow_trivial && fast_forward != FF_ONLY) { + /* + * Must first ensure that index matches HEAD before + * attempting a trivial merge. + */ + struct tree *head_tree = get_commit_tree(head_commit); + struct strbuf sb = STRBUF_INIT; + + if (repo_index_has_changes(the_repository, head_tree, + &sb)) { + struct strbuf err = STRBUF_INIT; + strbuf_addstr(&err, "error: "); + strbuf_addf(&err, _("Your local changes to the following files would be overwritten by merge:\n %s"), + sb.buf); + strbuf_addch(&err, '\n'); + fputs(err.buf, stderr); + strbuf_release(&err); + strbuf_release(&sb); + return -1; + } + /* See if it is really trivial. */ git_committer_info(IDENT_STRICT); printf(_("Trying really trivial in-index merge...\n")); diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh index 343fe7bccd0..77e93121bf8 100755 --- a/git-merge-resolve.sh +++ b/git-merge-resolve.sh @@ -5,6 +5,16 @@ # # Resolve two trees, using enhanced multi-base read-tree. +. git-sh-setup + +# Abort if index does not match HEAD +if ! git diff-index --quiet --cached HEAD -- +then + gettextln "Error: Your local changes to the following files would be overwritten by merge" + git diff-index --cached --name-only HEAD -- | sed -e 's/^/ /' + exit 2 +fi + # The first parameters up to -- are merge bases; the rest are heads. bases= head= remotes= sep_seen= for arg diff --git a/t/t6424-merge-unrelated-index-changes.sh b/t/t6424-merge-unrelated-index-changes.sh index b6e424a427b..f35d3182b86 100755 --- a/t/t6424-merge-unrelated-index-changes.sh +++ b/t/t6424-merge-unrelated-index-changes.sh @@ -114,6 +114,32 @@ test_expect_success 'resolve, non-trivial' ' test_path_is_missing .git/MERGE_HEAD ' +test_expect_success 'resolve, trivial, related file removed' ' + git reset --hard && + git checkout B^0 && + + git rm a && + test_path_is_missing a && + + test_must_fail git merge -s resolve C^0 && + + test_path_is_missing a && + test_path_is_missing .git/MERGE_HEAD +' + +test_expect_success 'resolve, non-trivial, related file removed' ' + git reset --hard && + git checkout B^0 && + + git rm a && + test_path_is_missing a && + + test_must_fail git merge -s resolve D^0 && + + test_path_is_missing a && + test_path_is_missing .git/MERGE_HEAD +' + test_expect_success 'recursive' ' git reset --hard && git checkout B^0 &&