Message ID | 20191017162826.1064257-2-pjones@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] Make die_if_checked_out() ignore missing worktree checkouts. | expand |
On Thu, Oct 17, 2019 at 12:28 PM Peter Jones <pjones@redhat.com> wrote: > Currently, if you do: > > $ git branch zonk origin/master > $ git worktree add zonk zonk > $ rm -rf zonk > $ git branch -d zonk > > You get the following error: > > $ git branch -d zonk > error: Cannot delete branch 'zonk' checked out at '/home/pjones/devel/kernel.org/git/zonk' > > It isn't meaningfully checked out, the repo's data is just stale and no > longer reflects reality. Echoing SEZDER's comment on patch 1/2, this behavior is an intentional design choice and safety feature of the worktree implementation since worktrees may exist on removable media or remote filesystems which might not always be mounted; hence, the presence of commands "git worktree prune" and "git worktree remove". A couple comment regarding this patch... > Signed-off-by: Peter Jones <pjones@redhat.com> > --- > diff --git a/builtin/worktree.c b/builtin/worktree.c > @@ -133,6 +133,20 @@ static int prune_worktree(const char *id, struct strbuf *reason) > +int prune_worktree_if_missing(const struct worktree *wt) > +{ > + struct strbuf reason = STRBUF_INIT; > + > + if (access(wt->path, F_OK) >= 0 || > + (errno != ENOENT && errno == ENOTDIR)) { > + errno = EEXIST; > + return -1; > + } > + > + strbuf_addf(&reason, _("Removing worktrees/%s: worktree directory is not present"), wt->id); > + return prune_worktree(wt->id, &reason); > +} "git worktree" tries to clean up after itself as much as possible. For instance, it is careful to remove the .git/worktrees directory when the last worktree itself is removed (or pruned). So, the caller of this function would also want to call delete_worktrees_dir_if_empty() to follow suit. > diff --git a/worktree.h b/worktree.h > @@ -132,4 +132,10 @@ void strbuf_worktree_ref(const struct worktree *wt, > +/* > + * Prune a worktree if it is no longer present at the checked out location. > + * Returns < 0 if the checkout is there or if pruning fails. > + */ > +int prune_worktree_if_missing(const struct worktree *wt); It's rather ugly that this function is declared in top-level worktree.h whereas the actual implementation is in builtin/worktree.c. I'd expect to see a preparatory patch which moves prune_worktree() (and probably delete_worktrees_dir_if_empty()) to top-level worktree.c. These minor implementation comments aside, before considering this patch series, it would be nice to see a compelling argument as to why this change of behavior, which undercuts a deliberate design decision, is really desirable.
On Thu, Oct 17, 2019 at 06:44:26PM +0200, SZEDER Gábor wrote: > > if (!wt || (ignore_current_worktree && wt->is_current)) > > return; > > + if (access(wt->path, F_OK) < 0 && > > + (errno == ENOENT || errno == ENOTDIR)) > > + return; > > I think this check is insuffient: even if the directory of the working > tree is not present, the working tree might still exist, and should > not be ignored (or deleted/pruned in the second patch). > > See the description of 'git worktree lock' for details. Ah, thanks for that, I had not realized "lock" was relevant here as I have never used it. That explains some of what seemed to me like a very strange usage model. On Thu, Oct 17, 2019 at 01:28:09PM -0400, Eric Sunshine wrote: > Echoing SEZDER's comment on patch 1/2, this behavior is an intentional > design choice and safety feature of the worktree implementation since > worktrees may exist on removable media or remote filesystems which > might not always be mounted; hence, the presence of commands "git > worktree prune" and "git worktree remove". Okay, I see that use case now - I hadn't realized there was an intentional design decision here, and honestly that's anything but clear from the *code*. It's surprising, for example, that my patches didn't break a single test case. > A couple comment regarding this patch... Sure... > > diff --git a/builtin/worktree.c b/builtin/worktree.c > > @@ -133,6 +133,20 @@ static int prune_worktree(const char *id, struct strbuf *reason) > > +int prune_worktree_if_missing(const struct worktree *wt) > > +{ > > + struct strbuf reason = STRBUF_INIT; > > + > > + if (access(wt->path, F_OK) >= 0 || > > + (errno != ENOENT && errno == ENOTDIR)) { > > + errno = EEXIST; > > + return -1; > > + } > > + > > + strbuf_addf(&reason, _("Removing worktrees/%s: worktree directory is not present"), wt->id); > > + return prune_worktree(wt->id, &reason); > > +} > > "git worktree" tries to clean up after itself as much as possible. For > instance, it is careful to remove the .git/worktrees directory when > the last worktree itself is removed (or pruned). So, the caller of > this function would also want to call delete_worktrees_dir_if_empty() > to follow suit. Okay, will fix. > > diff --git a/worktree.h b/worktree.h > > @@ -132,4 +132,10 @@ void strbuf_worktree_ref(const struct worktree *wt, > > +/* > > + * Prune a worktree if it is no longer present at the checked out location. > > + * Returns < 0 if the checkout is there or if pruning fails. > > + */ > > +int prune_worktree_if_missing(const struct worktree *wt); > > It's rather ugly that this function is declared in top-level > worktree.h whereas the actual implementation is in builtin/worktree.c. I don't disagree, but I didn't want to move stuff into an exposed API if I didn't have to, and that seemed like an appropriate enough header. I can do it the other way though, no problem. > I'd expect to see a preparatory patch which moves prune_worktree() > (and probably delete_worktrees_dir_if_empty()) to top-level > worktree.c. Sure thing. > These minor implementation comments aside, before considering this > patch series, it would be nice to see a compelling argument as to why > this change of behavior, which undercuts a deliberate design decision, > is really desirable. Okay, so just for clarity, when you say there's a deliberate design decision, which behavior here are you talking about? If you mean making "lock" work, I don't have any issue with that. If you mean not cleaning up when we do other commands, then I don't see why that's a concern - after all, that's exactly what "lock" is for. Assuming it is the "lock" behavior we're talking about, I don't think I actually have any intention of breaking this design decision, just making my workflow (without "lock") nag at me less for what seem like pretty trivial issues. I can easily accommodate "git worktree lock". What bugs me though, is that using worktrees basically means I have to replace fairly regular filesystem activities with worktree commands, and it doesn't seem to be *necessary* in any way. And I'm going to forget. A lot. To me, there doesn't seem to be any reason these need to behave any different: $ git worktree add foo foo $ rm -rf foo vs $ git worktree add foo foo $ git worktree remove foo And in fact the only difference right now, aside from some very minuscule storage requirements that haven't gotten cleaned up, is the first one leaves an artifact that tells it to give me errors later until I run "git worktree prune" myself. I'll send another revision of this patchset as a reply to this mail, which should clear up some of our differences.
[cc:+duy] On Fri, Oct 18, 2019 at 3:43 PM Peter Jones <pjones@redhat.com> wrote: > On Thu, Oct 17, 2019 at 01:28:09PM -0400, Eric Sunshine wrote: > > Echoing SEZDER's comment on patch 1/2, this behavior is an intentional > > design choice and safety feature of the worktree implementation since > > worktrees may exist on removable media or remote filesystems which > > might not always be mounted; hence, the presence of commands "git > > worktree prune" and "git worktree remove". > > Okay, I see that use case now - I hadn't realized there was an > intentional design decision here, and honestly that's anything but clear > from the *code*. It can indeed sometimes be difficult to get a high-level functional overview by examining code in isolation. In this case, at least, git-worktree documentation tries to be clear about the "why" and "how" of the pruning behavior (which is not to say that the documentation -- or the code -- can't be improved to communicate this better). > It's surprising, for example, that my patches didn't break a single > test case. Tests suites are never perfect, and an attempt to prune a dangling worktree by deleting a branch likely never occurred to the git-worktree implementer(s). > > These minor implementation comments aside, before considering this > > patch series, it would be nice to see a compelling argument as to why > > this change of behavior, which undercuts a deliberate design decision, > > is really desirable. > > Okay, so just for clarity, when you say there's a deliberate design > decision, which behavior here are you talking about? If you mean making > "lock" work, I don't have any issue with that. If you mean not cleaning > up when we do other commands, then I don't see why that's a concern - > after all, that's exactly what "lock" is for. To clarify, I'm talking about Duy's deliberate design decision to model git-worktree auto-pruning after Git's own garbage-collection behavior. That model includes, not only explicit locking, but a grace period before dangling worktree administrative files can be pruned automatically (see the gc.worktreePruneExpire configuration). The point of git-worktree's grace period (just like git-gc's grace period) is to avoid deleting potentially precious information permanently. For instance, the worktree-local "index" file might have some changes staged but not yet committed. Under the existing model, those staged changes are immune from being accidentally deleted permanently until after the grace period expires or until they are thrown away deliberately (say, via "git worktree prune --expire=now"). > Assuming it is the "lock" behavior we're talking about, I don't think I > actually have any intention of breaking this design decision, just > making my workflow (without "lock") nag at me less for what seem like > pretty trivial issues. The ability to lock a worktree is an extra safety measure built atop the grace period mechanism to provide a way to completely override auto-pruning; it is not meant as an alternate or replacement safety mechanism to the grace period, but instead augments it. So, a behavior change which respects only one of those safety mechanisms but not the other is likely flawed. And, importantly, people may already be relying upon this behavior of having an automatic grace period -- without having to place a worktree lock manually -- so changing behavior arbitrarily could break existing workflows and result in data loss. > I can easily accommodate "git worktree lock". What bugs me though, is > that using worktrees basically means I have to replace fairly regular > filesystem activities with worktree commands, and it doesn't seem to be > *necessary* in any way. And I'm going to forget. A lot. > > To me, there doesn't seem to be any reason these need to behave any different: > > $ git worktree add foo foo > $ rm -rf foo > vs > $ git worktree add foo foo > $ git worktree remove foo > > And in fact the only difference right now, aside from some very > minuscule storage requirements that haven't gotten cleaned up, is the > first one leaves an artifact that tells it to give me errors later until > I run "git worktree prune" myself. I understand the pain point, but I also understand Duy's motivation for being very careful about pruning worktree administrative files automatically (so as to avoid data loss, such as changes already staged to a worktree-local "index" file). While the proposed change may address the pain point, it nevertheless creates the possibility of accidental loss which Duy was careful to avoid when designing worktree mechanics. Although annoying, the current behavior gives you the opportunity to avoid that accidental loss by forcing you to take deliberate action to remove the worktree administrative files. Perhaps there is some way to address the pain point without breaking the fundamental promise made by git-worktree about being careful with worktree metadata[*], but the changes proposed by this patch series seem insufficient (even if the patch is reworked to respect worktree locking). I've cc:'d Duy in case he wants to chime in. [*] For instance, perhaps before auto-pruning, it could check whether the index is recording staged changes or conflict information, and only allow auto-pruning if the index is clean. *But* there may be other ways for information to be lost permanently (beyond a dirty "index") which don't occur to me at present, so this has to be considered carefully.
On 08/11/2019 10:14, Eric Sunshine wrote: > [cc:+duy] > > On Fri, Oct 18, 2019 at 3:43 PM Peter Jones <pjones@redhat.com> wrote: >> On Thu, Oct 17, 2019 at 01:28:09PM -0400, Eric Sunshine wrote: >>> Echoing SEZDER's comment on patch 1/2, this behavior is an intentional >>> design choice and safety feature of the worktree implementation since >>> worktrees may exist on removable media or remote filesystems which >>> might not always be mounted; hence, the presence of commands "git >>> worktree prune" and "git worktree remove". >> >> Okay, I see that use case now - I hadn't realized there was an >> intentional design decision here, and honestly that's anything but clear >> from the *code*. > > It can indeed sometimes be difficult to get a high-level functional > overview by examining code in isolation. In this case, at least, > git-worktree documentation tries to be clear about the "why" and "how" > of the pruning behavior (which is not to say that the documentation -- > or the code -- can't be improved to communicate this better). > >> It's surprising, for example, that my patches didn't break a single >> test case. > > Tests suites are never perfect, and an attempt to prune a dangling > worktree by deleting a branch likely never occurred to the > git-worktree implementer(s). > >>> These minor implementation comments aside, before considering this >>> patch series, it would be nice to see a compelling argument as to why >>> this change of behavior, which undercuts a deliberate design decision, >>> is really desirable. >> >> Okay, so just for clarity, when you say there's a deliberate design >> decision, which behavior here are you talking about? If you mean making >> "lock" work, I don't have any issue with that. If you mean not cleaning >> up when we do other commands, then I don't see why that's a concern - >> after all, that's exactly what "lock" is for. > > To clarify, I'm talking about Duy's deliberate design decision to > model git-worktree auto-pruning after Git's own garbage-collection > behavior. That model includes, not only explicit locking, but a grace > period before dangling worktree administrative files can be pruned > automatically (see the gc.worktreePruneExpire configuration). > > The point of git-worktree's grace period (just like git-gc's grace > period) is to avoid deleting potentially precious information > permanently. For instance, the worktree-local "index" file might have > some changes staged but not yet committed. Under the existing model, > those staged changes are immune from being accidentally deleted > permanently until after the grace period expires or until they are > thrown away deliberately (say, via "git worktree prune --expire=now"). > >> Assuming it is the "lock" behavior we're talking about, I don't think I >> actually have any intention of breaking this design decision, just >> making my workflow (without "lock") nag at me less for what seem like >> pretty trivial issues. > > The ability to lock a worktree is an extra safety measure built atop > the grace period mechanism to provide a way to completely override > auto-pruning; it is not meant as an alternate or replacement safety > mechanism to the grace period, but instead augments it. So, a behavior > change which respects only one of those safety mechanisms but not the > other is likely flawed. > > And, importantly, people may already be relying upon this behavior of > having an automatic grace period -- without having to place a worktree > lock manually -- so changing behavior arbitrarily could break existing > workflows and result in data loss. > >> I can easily accommodate "git worktree lock". What bugs me though, is >> that using worktrees basically means I have to replace fairly regular >> filesystem activities with worktree commands, and it doesn't seem to be >> *necessary* in any way. And I'm going to forget. A lot. >> >> To me, there doesn't seem to be any reason these need to behave any different: >> >> $ git worktree add foo foo >> $ rm -rf foo >> vs >> $ git worktree add foo foo >> $ git worktree remove foo >> >> And in fact the only difference right now, aside from some very >> minuscule storage requirements that haven't gotten cleaned up, is the >> first one leaves an artifact that tells it to give me errors later until >> I run "git worktree prune" myself. > > I understand the pain point, but I also understand Duy's motivation > for being very careful about pruning worktree administrative files > automatically (so as to avoid data loss, such as changes already > staged to a worktree-local "index" file). While the proposed change > may address the pain point, it nevertheless creates the possibility of > accidental loss which Duy was careful to avoid when designing worktree > mechanics. Although annoying, the current behavior gives you the > opportunity to avoid that accidental loss by forcing you to take > deliberate action to remove the worktree administrative files. > > Perhaps there is some way to address the pain point without breaking > the fundamental promise made by git-worktree about being careful with > worktree metadata[*], but the changes proposed by this patch series > seem insufficient (even if the patch is reworked to respect worktree > locking). I've cc:'d Duy in case he wants to chime in. I agree that we want to preserve the safe guards in the worktree design. I wonder if detaching the HEAD of the missing worktree would solve the problem without losing data. In the case where something wants to checkout the same branch as the missing worktree then I think that is a good solution. I think it should be OK for branch deletion as well. Best Wishes Phillip > [*] For instance, perhaps before auto-pruning, it could check whether > the index is recording staged changes or conflict information, and > only allow auto-pruning if the index is clean. *But* there may be > other ways for information to be lost permanently (beyond a dirty > "index") which don't occur to me at present, so this has to be > considered carefully. >
On Fri, Nov 8, 2019 at 9:56 AM Phillip Wood <phillip.wood123@gmail.com> wrote: > On 08/11/2019 10:14, Eric Sunshine wrote: > > Perhaps there is some way to address the pain point without breaking > > the fundamental promise made by git-worktree about being careful with > > worktree metadata[*], but the changes proposed by this patch series > > seem insufficient (even if the patch is reworked to respect worktree > > locking). I've cc:'d Duy in case he wants to chime in. > > I agree that we want to preserve the safe guards in the worktree design. > I wonder if detaching the HEAD of the missing worktree would solve the > problem without losing data. In the case where something wants to > checkout the same branch as the missing worktree then I think that is a > good solution. I think it should be OK for branch deletion as well. I would feel very uncomfortable making "automatic HEAD detachment" (decapitation?) the default behavior. Although doing so may (in some fashion) safeguard precious information in .git/worktrees/<id>, it potentially brings its own difficulties. For instance, if someone takes an action which automatically detaches HEAD of a missing worktree which had some branch checked out (and possibly some changes staged in the worktree-specific "index"), and then builds more commits on that branch, then that worktree gets into a state akin to rebased upstream (for which git-rebase documentation devotes an entire section[1], "Recovering From Upstream Rebase"). While a power-user may be able to recover from such a state, allowing the general Git user to get into such a situation by default seems contraindicated. I'm not even convinced that hiding the suggested "auto-detach" behavior behind a configuration variable so power-users can enable it is entirely a good idea either since, while it may eliminate some pain, it also potentially allows abandoned worktree entries to accumulate. [1]: https://git-scm.com/docs/git-rebase#_recovering_from_upstream_rebase
diff --git a/builtin/branch.c b/builtin/branch.c index 2ef214632f0..d611f8183b4 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -236,7 +236,7 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, if (kinds == FILTER_REFS_BRANCHES) { const struct worktree *wt = find_shared_symref("HEAD", name); - if (wt) { + if (wt && prune_worktree_if_missing(wt) < 0) { error(_("Cannot delete branch '%s' " "checked out at '%s'"), bname.buf, wt->path); diff --git a/builtin/worktree.c b/builtin/worktree.c index 4de44f579af..b3ad915c3c3 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -133,6 +133,20 @@ static int prune_worktree(const char *id, struct strbuf *reason) return 0; } +int prune_worktree_if_missing(const struct worktree *wt) +{ + struct strbuf reason = STRBUF_INIT; + + if (access(wt->path, F_OK) >= 0 || + (errno != ENOENT && errno == ENOTDIR)) { + errno = EEXIST; + return -1; + } + + strbuf_addf(&reason, _("Removing worktrees/%s: worktree directory is not present"), wt->id); + return prune_worktree(wt->id, &reason); +} + static void prune_worktrees(void) { struct strbuf reason = STRBUF_INIT; diff --git a/worktree.h b/worktree.h index caecc7a281c..75762c25752 100644 --- a/worktree.h +++ b/worktree.h @@ -132,4 +132,10 @@ void strbuf_worktree_ref(const struct worktree *wt, const char *worktree_ref(const struct worktree *wt, const char *refname); +/* + * Prune a worktree if it is no longer present at the checked out location. + * Returns < 0 if the checkout is there or if pruning fails. + */ +int prune_worktree_if_missing(const struct worktree *wt); + #endif
Currently, if you do: $ git branch zonk origin/master $ git worktree add zonk zonk $ rm -rf zonk $ git branch -d zonk You get the following error: $ git branch -d zonk error: Cannot delete branch 'zonk' checked out at '/home/pjones/devel/kernel.org/git/zonk' It isn't meaningfully checked out, the repo's data is just stale and no longer reflects reality. This makes it so that if nothing is present where a worktree is supposedly checked out, deleting the branch will automatically prune it. Signed-off-by: Peter Jones <pjones@redhat.com> --- builtin/branch.c | 2 +- builtin/worktree.c | 14 ++++++++++++++ worktree.h | 6 ++++++ 3 files changed, 21 insertions(+), 1 deletion(-)