Message ID | dd86016b-3232-563b-940e-03bc36af917a@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | branch: operations on orphan branches | expand |
Rubén Justo <rjusto@gmail.com> writes: > In bcfc82bd48 (branch: description for non-existent branch errors, > 2022-10-08) we used "strcmp(head, branch)" to check if we're asked to > operate in a branch that is the current HEAD, and > "ref_exists(branch_ref)" to check if it points to a valid ref, i.e. it > is an orphan branch. We are doing this with the intention of avoiding > the confusing error: "No branch named..." I agree that it is good to notice "the user thinks the branch should already exist, for they have 'checked out' that branch, but there is no commit on the branch yet". I am not sure checked out on a separate worktree as an unborn branch is always the indication that the user thinks the branch should exist (e.g. "you allowed somebody else, or yourself, prepare a separate worktree to work on something a few weeks ago, but no single commit has been made there and you forgot about the worktree---should the branch still exist?"), but that is a separate topic. Let's assume that the two are equivalent. > diff --git a/builtin/branch.c b/builtin/branch.c > index f63fd45edb..954008e51d 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -528,8 +528,8 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int > die(_("Invalid branch name: '%s'"), oldname); > } > > - if ((copy || strcmp(head, oldname)) && !ref_exists(oldref.buf)) { > - if (copy && !strcmp(head, oldname)) > + if ((copy || !branch_checked_out(oldref.buf)) && !ref_exists(oldref.buf)) { > + if (copy && branch_checked_out(oldref.buf)) > die(_("No commit on branch '%s' yet."), oldname); > else > die(_("No branch named '%s'."), oldname); Isn't branch_checked_out() a fairly heavyweight operation when you have multiple worktrees? The original went to the filesystem (i.e. check ref_exists()) only after seeing that a condition that can be computed using only in-core data holds (i.e. the branch names are the same or we are doing a copy), which is an optimum order to avoid doing unnecessary work in most common cases, but I am not sure if the order the checks are done in the updated code optimizes for the common case. If branch_checked_out() is more expensive than a call to ref_exists() for a single brnch, that would change the equation. Calling such a heavyweight operation twice would make it even more expensive but that is a perfectly fine thing to do in the error codepath, inside the block that is entered after we noticed an error condition. > +test_expect_success 'error descriptions on orphan or unborn-yet branch' ' > + cat >expect <<-EOF && > + error: No commit on branch '\''orphan-branch'\'' yet. > + EOF > ... > +' > + > +test_expect_success 'fatal descriptions on orphan or unborn-yet branch' ' > + cat >expect <<-EOF && > + fatal: No commit on branch '\''orphan-branch'\'' yet. > + EOF > ... > +' Do we already cover existing "No branch named" case the same way in this test script, so that it is OK for these new tests to cover only the "not yet" cases? I am asking because if we have existing coverage, before and after the change to the C code in this patch, some of the existing tests would change the behaviour (i.e. they would have said "No branch named X" when somebody else created an unborn branch in a separate worktree, but now they would say "No commit on branch X yet"), but I see no such change in the test. If we lack existing coverage, we probably should --- otherwise we would not notice when somebody breaks the command to say "No commit on branch X yet" when it should say "No such branch X".
On 01-ene-2023 12:45:47, Junio C Hamano wrote: > Rubén Justo <rjusto@gmail.com> writes: > > Isn't branch_checked_out() a fairly heavyweight operation when you > have multiple worktrees? The original went to the filesystem > (i.e. check ref_exists()) only after seeing that a condition that > can be computed using only in-core data holds (i.e. the branch names > are the same or we are doing a copy), which is an optimum order to > avoid doing unnecessary work in most common cases, but I am not sure > if the order the checks are done in the updated code optimizes for > the common case. If branch_checked_out() is more expensive than a > call to ref_exists() for a single brnch, that would change the > equation. Calling such a heavyweight operation twice would make it > even more expensive but that is a perfectly fine thing to do in the > error codepath, inside the block that is entered after we noticed an > error condition. I share your concern, I thought about it. My thoughts evaluating the change were more or less: - presumably there should be many more references than worktrees, and more repositories with 0 or 1 workdirs than more, so, arbitrarily, calling ref_exists() last still sounds sensible - strcmp() to branch_checked_out() introduces little change in the logic - I like branch_checked_out(), it expresses better what we want there - branch_checked_out() considers refs, strcmp considers branch names (we have a corner case with @{-1} pointing to HEAD, that this resolves) - finally, perhaps branch_checked_out() has optimization possibilities. Maybe in the common case we can get close to the amount of work we are doing now. Here we can alleviate a bit by removing the unconditional resolve_refdup(HEAD) we are doing at the beginning of cmd_branch(). In the end, it seems to me that we have some places where we are considering HEAD and we may need to consider HEADs. And again, I agree, the change has somewhat profound implications. > > > +test_expect_success 'error descriptions on orphan or unborn-yet branch' ' > > + cat >expect <<-EOF && > > + error: No commit on branch '\''orphan-branch'\'' yet. > > + EOF > > ... > > +' > > + > > +test_expect_success 'fatal descriptions on orphan or unborn-yet branch' ' > > + cat >expect <<-EOF && > > + fatal: No commit on branch '\''orphan-branch'\'' yet. > > + EOF > > ... > > +' > > Do we already cover existing "No branch named" case the same way in > this test script, so that it is OK for these new tests to cover only > the "not yet" cases? I am asking because if we have existing > coverage, before and after the change to the C code in this patch, > some of the existing tests would change the behaviour (i.e. they > would have said "No branch named X" when somebody else created an > unborn branch in a separate worktree, but now they would say "No > commit on branch X yet"), but I see no such change in the test. If > we lack existing coverage, we probably should --- otherwise we would > not notice when somebody breaks the command to say "No commit on > branch X yet" when it should say "No such branch X". > I think we do, bcfc82bd (branch: description for non-existent branch errors). We have some pending changes to follow the CodingGuideLines in this messages that maybe we can resume: https://lore.kernel.org/git/eb3c689e-efeb-4468-a10f-dd32bc0ee37b@gmail.com/ Thank you for reading the change this way.
Rubén Justo <rjusto@gmail.com> writes: > On 01-ene-2023 12:45:47, Junio C Hamano wrote: >> Rubén Justo <rjusto@gmail.com> writes: >> >> Isn't branch_checked_out() a fairly heavyweight operation when you >> have multiple worktrees? The original went to the filesystem >> (i.e. check ref_exists()) only after seeing that a condition that >> can be computed using only in-core data holds (i.e. the branch names >> are the same or we are doing a copy), which is an optimum order to >> avoid doing unnecessary work in most common cases, but I am not sure >> if the order the checks are done in the updated code optimizes for >> the common case. If branch_checked_out() is more expensive than a >> call to ref_exists() for a single brnch, that would change the >> equation. Calling such a heavyweight operation twice would make it >> even more expensive but that is a perfectly fine thing to do in the >> error codepath, inside the block that is entered after we noticed an >> error condition. > > I share your concern, I thought about it. > > My thoughts evaluating the change were more or less: > > - presumably there should be many more references than worktrees, and > more repositories with 0 or 1 workdirs than more, so, arbitrarily, > calling ref_exists() last still sounds sensible > > - strcmp() to branch_checked_out() introduces little change in the > logic > > - I like branch_checked_out(), it expresses better what we want there > > - branch_checked_out() considers refs, strcmp considers branch names > (we have a corner case with @{-1} pointing to HEAD, that this > resolves) > > - finally, perhaps branch_checked_out() has optimization possibilities. > Maybe in the common case we can get close to the amount of work we > are doing now. Here we can alleviate a bit by removing the > unconditional resolve_refdup(HEAD) we are doing at the beginning of > cmd_branch(). > > In the end, it seems to me that we have some places where we are > considering HEAD and we may need to consider HEADs. I am not sure what you share with me after reading the above, though. As I already said, I am not opposed to the use of branch_checked_out(), or checking the state of other worktrees connected to the same repository, at all. I was merely looking at this: > - if ((copy || strcmp(head, oldname)) && !ref_exists(oldref.buf)) { > - if (copy && !strcmp(head, oldname)) > + if ((copy || !branch_checked_out(oldref.buf)) && !ref_exists(oldref.buf)) { and wondering if the evaluation order to call branch_checked_out() unconditionally and then calling ref_exists() still makes sense, or now the strcmp() part of the original has become so much more costly, if we are better off doing the same thing in a different order, e.g. if (!ref_exists(oldref.buf) && (copy || !branch_checked_out(oldref.buf))) { >> Do we already cover existing "No branch named" case the same way in >> this test script, so that it is OK for these new tests to cover only >> the "not yet" cases? I am asking because if we have existing >> coverage, before and after the change to the C code in this patch, >> some of the existing tests would change the behaviour (i.e. they >> would have said "No branch named X" when somebody else created an >> unborn branch in a separate worktree, but now they would say "No >> commit on branch X yet"), but I see no such change in the test. If >> we lack existing coverage, we probably should --- otherwise we would >> not notice when somebody breaks the command to say "No commit on >> branch X yet" when it should say "No such branch X". > > I think we do, bcfc82bd (branch: description for non-existent branch > errors). If we already have checks that current code triggers the "no branch named X" warning, and because the patch is changing the code to give that warning to instead say "branch X has no commits yet" in some cases, if the existing test coverage were thorough, some of the existing tests that expect "no branch named X" warning should now expect "branch X has no commits yet" warning. Your patch did not have any such change in the tests, which was an indication that the existing test coverage was lacking, no? And given that, do we now have a complete test coverage for all cases with the patch we are discussing? Thanks.
On 04-ene-2023 15:58:02, Junio C Hamano wrote: > > > - if ((copy || strcmp(head, oldname)) && !ref_exists(oldref.buf)) { > > - if (copy && !strcmp(head, oldname)) > > + if ((copy || !branch_checked_out(oldref.buf)) && !ref_exists(oldref.buf)) { > > and wondering if the evaluation order to call branch_checked_out() > unconditionally and then calling ref_exists() still makes sense, or > now the strcmp() part of the original has become so much more > costly, if we are better off doing the same thing in a different > order, e.g. > > if (!ref_exists(oldref.buf) && (copy || !branch_checked_out(oldref.buf))) { > Thinking of this as a whole, perhaps after this series we can add: -- >8 -- Subject: [PATCH] branch: copy_or_rename_branch() unconditionally calls In previous commits we have introduced changes to copy_or_rename_branch() that lead to unconditionally calling ref_exists(), twice in some circumstances. Optimize copy_or_rename_branch() so that it only calls ref_exists() once and reorder some conditionals to consider ref_exists() first and avoid unnecessarily calling other expensive functions. Signed-off-by: Rubén Justo <rjusto@gmail.com> --- builtin/branch.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index c14a7a42e6..6e70377a1a 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -515,7 +515,7 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int struct strbuf oldsection = STRBUF_INIT, newsection = STRBUF_INIT; const char *interpreted_oldname = NULL; const char *interpreted_newname = NULL; - int recovery = 0; + int recovery = 0, oldref_exists; if (strbuf_check_branch_ref(&oldref, oldname)) { /* @@ -523,12 +523,13 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int * ref that we used to allow to be created by accident. */ if (ref_exists(oldref.buf)) - recovery = 1; + oldref_exists = recovery = 1; else die(_("Invalid branch name: '%s'"), oldname); - } + } else + oldref_exists = ref_exists(oldref.buf); - if ((copy || !branch_checked_out(oldref.buf)) && !ref_exists(oldref.buf)) { + if (!oldref_exists && (copy || !branch_checked_out(oldref.buf))) { if (copy && branch_checked_out(oldref.buf)) die(_("No commit on branch '%s' yet."), oldname); else @@ -558,8 +559,7 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int strbuf_addf(&logmsg, "Branch: renamed %s to %s", oldref.buf, newref.buf); - if (!copy && - (!branch_checked_out(oldref.buf) || ref_exists(oldref.buf)) && + if (!copy && oldref_exists && rename_ref(oldref.buf, newref.buf, logmsg.buf)) die(_("Branch rename failed")); if (copy && copy_existing_ref(oldref.buf, newref.buf, logmsg.buf)) base-commit: 64b4d8c0eb1938fa10477b9bd9aead2773456e3e -- > >> Do we already cover existing "No branch named" case the same way in > >> this test script, so that it is OK for these new tests to cover only > >> the "not yet" cases? I am asking because if we have existing > >> coverage, before and after the change to the C code in this patch, > >> some of the existing tests would change the behaviour (i.e. they > >> would have said "No branch named X" when somebody else created an > >> unborn branch in a separate worktree, but now they would say "No > >> commit on branch X yet"), but I see no such change in the test. If > >> we lack existing coverage, we probably should --- otherwise we would > >> not notice when somebody breaks the command to say "No commit on > >> branch X yet" when it should say "No such branch X". > > > > I think we do, bcfc82bd (branch: description for non-existent branch > > errors). > > If we already have checks that current code triggers the "no branch > named X" warning, and because the patch is changing the code to give > that warning to instead say "branch X has no commits yet" in some > cases, if the existing test coverage were thorough, some of the > existing tests that expect "no branch named X" warning should now > expect "branch X has no commits yet" warning. Your patch did not > have any such change in the tests, which was an indication that the > existing test coverage was lacking, no? Yes. We did not have a test for 'No branch named' that implied an orphan branch. I think if we had tried that, we would have ended up doing what we're doing now. > > And given that, do we now have a complete test coverage for all > cases with the patch we are discussing? Considering 1/2 and 2/2, I think so. But if you're asking maybe you're realizing something...
Rubén Justo <rjusto@gmail.com> writes:
> Thinking of this as a whole, perhaps after this series we can add:
Why "after"? If we already know that the existing patches are
making things worse and need to fix the regression with a future
patch to make it usable again, why introduce a regression in the
first place?
Rubén Justo <rjusto@gmail.com> writes: > Considering 1/2 and 2/2, I think so. But if you're asking maybe > you're realizing something... No, I am trying to make sure that you have thought it through.
On 7/1/23 0:59, Junio C Hamano wrote: > Rubén Justo <rjusto@gmail.com> writes: > >> Thinking of this as a whole, perhaps after this series we can add: > > Why "after"? If we already know that the existing patches are > making things worse and need to fix the regression with a future > patch to make it usable again, why introduce a regression in the > first place? > I'm not sure if it is so worse, and if the optimization is a fix. We're actually paying for worktrees twice in: reject_rebase_or_bisedt_branch() and replace_each_worktree_head_symref(). Making the change this way makes more obvious IMHO what we are moving and why. Start moving the ref_exists() in 1/2, easily leads to 2/1 and this patch squashed with 1/2, for little gain (IMHO) and worse history. This is why I think it's a good sequence. But I understand your point and I'm not opposed to doing it as you suggest if you think the current way doesn't pay off.
diff --git a/builtin/branch.c b/builtin/branch.c index f63fd45edb..954008e51d 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -528,8 +528,8 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int die(_("Invalid branch name: '%s'"), oldname); } - if ((copy || strcmp(head, oldname)) && !ref_exists(oldref.buf)) { - if (copy && !strcmp(head, oldname)) + if ((copy || !branch_checked_out(oldref.buf)) && !ref_exists(oldref.buf)) { + if (copy && branch_checked_out(oldref.buf)) die(_("No commit on branch '%s' yet."), oldname); else die(_("No branch named '%s'."), oldname); @@ -806,7 +806,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) strbuf_addf(&branch_ref, "refs/heads/%s", branch_name); if (!ref_exists(branch_ref.buf)) - error((!argc || !strcmp(head, branch_name)) + error((!argc || branch_checked_out(branch_ref.buf)) ? _("No commit on branch '%s' yet.") : _("No branch named '%s'."), branch_name); @@ -851,7 +851,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) } if (!ref_exists(branch->refname)) { - if (!argc || !strcmp(head, branch->name)) + if (!argc || branch_checked_out(branch->refname)) die(_("No commit on branch '%s' yet."), branch->name); die(_("branch '%s' does not exist"), branch->name); } diff --git a/t/t3202-show-branch.sh b/t/t3202-show-branch.sh index ea7cfd1951..acaedac34e 100755 --- a/t/t3202-show-branch.sh +++ b/t/t3202-show-branch.sh @@ -221,4 +221,40 @@ test_expect_success 'fatal descriptions on non-existent branch' ' test_cmp expect actual ' +test_expect_success 'error descriptions on orphan or unborn-yet branch' ' + cat >expect <<-EOF && + error: No commit on branch '\''orphan-branch'\'' yet. + EOF + test_when_finished git worktree remove -f orphan-worktree && + git worktree add orphan-worktree --detach && + git -C orphan-worktree checkout --orphan orphan-branch && + test_must_fail git -C orphan-worktree branch --edit-description 2>actual && # implicit branch + test_cmp expect actual && + test_must_fail git -C orphan-worktree branch --edit-description orphan-branch 2>actual && # explicit branch + test_cmp expect actual && + test_must_fail git branch --edit-description orphan-branch 2>actual && # different worktree + test_cmp expect actual +' + +test_expect_success 'fatal descriptions on orphan or unborn-yet branch' ' + cat >expect <<-EOF && + fatal: No commit on branch '\''orphan-branch'\'' yet. + EOF + test_when_finished git worktree remove -f orphan-worktree && + git worktree add orphan-worktree --detach && + git -C orphan-worktree checkout --orphan orphan-branch && + test_must_fail git -C orphan-worktree branch --set-upstream-to=non-existent 2>actual && # implicit branch + test_cmp expect actual && + test_must_fail git -C orphan-worktree branch --set-upstream-to=non-existent orphan-branch 2>actual && # explicit branch + test_cmp expect actual && + test_must_fail git branch --set-upstream-to=non-existent orphan-branch 2>actual && # different worktree + test_cmp expect actual && + test_must_fail git -C orphan-worktree branch -c new-branch 2>actual && # implicit branch + test_cmp expect actual && + test_must_fail git -C orphan-worktree branch -c orphan-branch new-branch 2>actual && # explicit branch + test_cmp expect actual && + test_must_fail git branch -c orphan-branch new-branch 2>actual && # different worktree + test_cmp expect actual +' + test_done
In bcfc82bd48 (branch: description for non-existent branch errors, 2022-10-08) we used "strcmp(head, branch)" to check if we're asked to operate in a branch that is the current HEAD, and "ref_exists(branch_ref)" to check if it points to a valid ref, i.e. it is an orphan branch. We are doing this with the intention of avoiding the confusing error: "No branch named..." If we're asked to operate with an orphan branch, but it is on a different worktree, "strcmp(head, branch)" is not going to match: $ git worktree add orphan-worktree --detach $ git -C orphan-worktree checkout --orphan orpha-branch $ git branch -m orpha-branch orphan-branch fatal: No branch named 'orpha-branch'. Let's do the check for HEAD in any worktree in the repository, removing the "strcmp" and using the helper introduced in 31ad6b61bd (branch: add branch_checked_out() helper, 2022-06-15) This commit also extends the tests introduced on bcfc82bd48, in t3202-show-branch, to cover not just the initial branch but any orphan branch. Signed-off-by: Rubén Justo <rjusto@gmail.com> --- builtin/branch.c | 8 ++++---- t/t3202-show-branch.sh | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+), 4 deletions(-)