diff mbox series

[v2] fetch: Protect branches checked out in all worktrees

Message ID alpine.DEB.2.21.999.2111041709080.94135@tardis-on-the-dome.mit.edu (mailing list archive)
State Superseded
Headers show
Series [v2] fetch: Protect branches checked out in all worktrees | expand

Commit Message

Anders Kaseorg Nov. 4, 2021, 9:10 p.m. UTC
Refuse to fetch into the currently checked out branch of any working
tree, not just the current one.

Fixes this previously reported bug:

https://public-inbox.org/git/cb957174-5e9a-5603-ea9e-ac9b58a2eaad@mathema.de

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
---

(Eep! Sorry for the newline corruption in the previous message. 
Thunderbird has forsaken me.)

 builtin/fetch.c       | 22 +++++++++-------------
 t/t5516-fetch-push.sh | 10 ++++++++++
 2 files changed, 19 insertions(+), 13 deletions(-)

Comments

Jeff King Nov. 5, 2021, 8:38 a.m. UTC | #1
On Thu, Nov 04, 2021 at 05:10:52PM -0400, Anders Kaseorg wrote:

> Refuse to fetch into the currently checked out branch of any working
> tree, not just the current one.

I think that makes sense as a goal.

>  #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)
>  
> @@ -854,7 +855,6 @@ static int update_local_ref(struct ref *ref,
>  			    int summary_width)
>  {
>  	struct commit *current = NULL, *updated;
> -	struct branch *current_branch = branch_get(NULL);
>  	const char *pretty_ref = prettify_refname(ref->name);
>  	int fast_forward = 0;
>  
> @@ -868,9 +868,8 @@ static int update_local_ref(struct ref *ref,
>  		return 0;
>  	}
>  
> -	if (current_branch &&
> -	    !strcmp(ref->name, current_branch->name) &&
> -	    !(update_head_ok || is_bare_repository()) &&
> +	if (!update_head_ok &&
> +	    find_shared_symref("HEAD", ref->name) &&
>  	    !is_null_oid(&ref->old_oid)) {
>  		/*
>  		 * If this is the head, and it's not okay to update

OK, so this is where we'd say "no, I won't update the HEAD". That makes
sense, and the find_shared_symref() helper from worktree.h makes it
easy.

If we get the non-default worktree here, it might be nice to tell the
user which worktree has it checked out. Otherwise it may lead to
head-scratching as they peek at their current HEAD.

I also notice that find_shared_symref() will catch cases where we're on
a detached HEAD, but it's because we're rebasing or bisecting on a given
branch. Arguably that's a sensible thing to do for this check, but it is
a change from the current behavior (even if you are not using worktrees
at all).

> @@ -1387,16 +1386,13 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
>  
>  static void check_not_current_branch(struct ref *ref_map)
>  {
> -	struct branch *current_branch = branch_get(NULL);
> -
> -	if (is_bare_repository() || !current_branch)
> -		return;
> -
> +	const struct worktree *wt;
> [...]

So if the earlier hunk covered this case, then what is this hunk for? :)

It looks like we call it from do_fetch() when update_head_ok isn't set.
I'm not clear why we have two distinct code paths that check
update_head_ok. It seems like this one is the one that _actually_ kicks
in, because it's only later that we call update_local_ref(), several
layers down. If I put a BUG() in that one, it is never reached in the
test suite. Hmm.

It looks like this situation is quite old, and certainly it's nothing
new in your patch. So let's file it as a curiosity for now, and we can
deal with it separately (if at all).

>  	for (; ref_map; ref_map = ref_map->next)
> -		if (ref_map->peer_ref && !strcmp(current_branch->refname,
> -					ref_map->peer_ref->name))
> -			die(_("Refusing to fetch into current branch %s "
> -			    "of non-bare repository"), current_branch->refname);
> +		if (ref_map->peer_ref &&
> +		    (wt = find_shared_symref("HEAD", ref_map->peer_ref->name)))
> +			die(_("Refusing to fetch into branch '%s' "
> +			      "checked out at '%s'"),
> +			    ref_map->peer_ref->name, wt->path);

This one does update the message, which is nice (and from my analysis
above, the message from the other hunk doesn't matter at all!). It's a
bit more verbose if you're not using worktrees at all, though. I used to get:

  $ git init
  $ git commit --allow-empty -qm foo
  $ git fetch . main:main
  fatal: Refusing to fetch into current branch refs/heads/main of non-bare repository

but now get:

  $ git fetch . main:main
  fatal: Refusing to fetch into branch 'refs/heads/main' checked out at '/home/peff/tmp'

It's technically correct, but perhaps it would be nicer to keep the old
message if the worktree we found is the current one.

-Peff
Anders Kaseorg Nov. 5, 2021, 10:01 a.m. UTC | #2
On 11/5/21 01:38, Jeff King wrote:
> If we get the non-default worktree here, it might be nice to tell the
> user which worktree has it checked out. Otherwise it may lead to
> head-scratching as they peek at their current HEAD.

I’d agree if this case were reachable, but as per below, it seems not to 
be. So I didn’t bother figuring out how to shoehorn such a message into 
the arguments expected by format_display().

> I also notice that find_shared_symref() will catch cases where we're on
> a detached HEAD, but it's because we're rebasing or bisecting on a given
> branch. Arguably that's a sensible thing to do for this check, but it is
> a change from the current behavior (even if you are not using worktrees
> at all).

That’s a good observation. I agree it seems sensible, but should be 
noted in the commit message.

> So if the earlier hunk covered this case, then what is this hunk for? :)

Yeah, that duplication confused me too. The relevant call paths seem to be:

1. do_fetch calls check_not_current_branch (later hunk);
2. do_fetch calls fetch_and_consume_refs → store_updated_refs → 
update_local_ref (earlier hunk);
3. do_fetch calls backfill_tags → fetch_and_consume_refs → 
store_updated_refs → update_local_ref (earlier hunk);

in that order. That suggests there should be no way to trip the earlier 
hunk’s check without first tripping the latter hunk’s check.

>    $ git fetch . main:main
>    fatal: Refusing to fetch into branch 'refs/heads/main' checked out at '/home/peff/tmp'
> 
> It's technically correct, but perhaps it would be nicer to keep the old
> message if the worktree we found is the current one.

I note that this new message is more consistent with

$ git branch -d main
error: Cannot delete branch 'main' checked out at '/home/anders/tmp'


While looking for other related messages, I noticed another bug: 
receive-pack doesn’t stop you from updating or deleting a branch checked 
out in a worktree for a bare repository.

$ git init
$ git commit --allow-empty -qm foo
$ git clone --bare . bare.git
$ git -C bare.git worktree add wt
$ git -C bare.git/wt push .. :wt

To ..

  - [deleted]         wt


This is_bare_repository() check in update() in builtin/receive-pack.c 
seems wrong:

const struct worktree *worktree = is_bare_repository() ? NULL : 
find_shared_symref("HEAD", name);


Anders
Jeff King Nov. 5, 2021, 11:53 a.m. UTC | #3
On Fri, Nov 05, 2021 at 03:01:39AM -0700, Anders Kaseorg wrote:

> On 11/5/21 01:38, Jeff King wrote:
> > If we get the non-default worktree here, it might be nice to tell the
> > user which worktree has it checked out. Otherwise it may lead to
> > head-scratching as they peek at their current HEAD.
> 
> I’d agree if this case were reachable, but as per below, it seems not to be.
> So I didn’t bother figuring out how to shoehorn such a message into the
> arguments expected by format_display().

I was thinking of just:

diff --git a/builtin/fetch.c b/builtin/fetch.c
index a6549c2ab6..61c8fc9983 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -855,6 +855,7 @@ static int update_local_ref(struct ref *ref,
 			    int summary_width)
 {
 	struct commit *current = NULL, *updated;
+	const struct worktree *wt;
 	const char *pretty_ref = prettify_refname(ref->name);
 	int fast_forward = 0;
 
@@ -869,14 +870,16 @@ static int update_local_ref(struct ref *ref,
 	}
 
 	if (!update_head_ok &&
-	    find_shared_symref("HEAD", ref->name) &&
+	    (wt = find_shared_symref("HEAD", ref->name)) &&
 	    !is_null_oid(&ref->old_oid)) {
 		/*
 		 * If this is the head, and it's not okay to update
 		 * the head, and the old value of the head isn't empty...
 		 */
 		format_display(display, '!', _("[rejected]"),
-			       _("can't fetch in current branch"),
+			       wt->is_current ?
+			       _("can't fetch in current branch") :
+			       _("branch checked out in worktree"),
 			       remote, pretty_ref, summary_width);
 		return 1;
 	}

though of course that does not mention the working tree we found. We'd
probably have to format the string into an extra strbuf for that. But I
agree, that if we don't think this is reachable, it may not be worth
worrying about. But see below, where I get off on a bit of tangent.

> > So if the earlier hunk covered this case, then what is this hunk for? :)
> 
> Yeah, that duplication confused me too. The relevant call paths seem to be:
> 
> 1. do_fetch calls check_not_current_branch (later hunk);
> 2. do_fetch calls fetch_and_consume_refs → store_updated_refs →
> update_local_ref (earlier hunk);
> 3. do_fetch calls backfill_tags → fetch_and_consume_refs →
> store_updated_refs → update_local_ref (earlier hunk);
> 
> in that order. That suggests there should be no way to trip the earlier
> hunk’s check without first tripping the latter hunk’s check.

That matches what I found. I think that the check in update_local_ref()
became obsolete when we added check_not_current_branch() in 8ee5d73137
(Fix fetch/pull when run without --update-head-ok, 2008-10-13). That
commit confused me at first, because there's no mention of the existing
mechanism at all in the commit message nor the mailing list thread.

But it was indeed broken back then, and I think is even broken now! The
problem is that it was comparing against current_branch->name, which is
not the full refname, and thus would never match. The check added in
8ee5d73137 gets this right.

Your patch also ends up fixing this as a side effect, because it
switches to using find_shared_symref(), which eliminates the buggy code.

But all of that means we could actually drop check_not_current_branch()
in favor of the update_local_ref() check. Doing this:

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f7abbc31ff..c52c44684a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -869,7 +869,7 @@ static int update_local_ref(struct ref *ref,
 	}
 
 	if (current_branch &&
-	    !strcmp(ref->name, current_branch->name) &&
+	    !strcmp(ref->name, current_branch->refname) &&
 	    !(update_head_ok || is_bare_repository()) &&
 	    !is_null_oid(&ref->old_oid)) {
 		/*
@@ -1385,20 +1385,6 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
 	return result;
 }
 
-static void check_not_current_branch(struct ref *ref_map)
-{
-	struct branch *current_branch = branch_get(NULL);
-
-	if (is_bare_repository() || !current_branch)
-		return;
-
-	for (; ref_map; ref_map = ref_map->next)
-		if (ref_map->peer_ref && !strcmp(current_branch->refname,
-					ref_map->peer_ref->name))
-			die(_("Refusing to fetch into current branch %s "
-			    "of non-bare repository"), current_branch->refname);
-}
-
 static int truncate_fetch_head(void)
 {
 	const char *filename = git_path_fetch_head(the_repository);
@@ -1587,8 +1573,6 @@ static int do_fetch(struct transport *transport,
 
 	ref_map = get_ref_map(transport->remote, remote_refs, rs,
 			      tags, &autotags);
-	if (!update_head_ok)
-		check_not_current_branch(ref_map);
 
 	if (tags == TAGS_DEFAULT && autotags)
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");

passes the entire test suite except for one test which expects:

  git fetch . side:main

to fail. But that is only because "side" and "main" point to the same
commit, and thus the fetch is a noop. The code in update_local_ref()
covers that case before checking the HEAD case (which I would argue is
a completely reasonable outcome).

The reason I bring this up is that I think doing the check in
update_local_ref() makes much more sense. We don't abort the whole
fetch, but just treat it as a normal per-ref failure. That gives us the
usual status-table output (I thought it might also avoid wasting some
work of actually fetching objects, but I think the current check kicks
in before we actually fetch anything).

On the other hand, this is a rare-ish error, and nobody has cared much
about the distinction in the last 13 years. So it probably doesn't
really matter.

Now getting back to your patch and the message produced in the
update_local_ref() check. I am content to say: this code path is not
really reachable, but cleaning it up is out of scope, so let's leave it
for another day. But if we do eventually want to keep just the check in
update_local_ref(), then we might want to improve the message now while
we're thinking about it.

> > It's technically correct, but perhaps it would be nicer to keep the old
> > message if the worktree we found is the current one.
> 
> I note that this new message is more consistent with
> 
> $ git branch -d main
> error: Cannot delete branch 'main' checked out at '/home/anders/tmp'

Yeah, that may be a reason to keep things consistent. My main worry is
just that somebody who is not using worktrees but has a long repo
pathname may find the message a bit more overwhelming. But if we want to
deal with that, we should probably do so everywhere.

> While looking for other related messages, I noticed another bug:
> receive-pack doesn’t stop you from updating or deleting a branch checked out
> in a worktree for a bare repository.
> 
> $ git init
> $ git commit --allow-empty -qm foo
> $ git clone --bare . bare.git
> $ git -C bare.git worktree add wt
> $ git -C bare.git/wt push .. :wt
> 
> To ..
> 
>  - [deleted]         wt
> 
> 
> This is_bare_repository() check in update() in builtin/receive-pack.c seems
> wrong:
> 
> const struct worktree *worktree = is_bare_repository() ? NULL :
> find_shared_symref("HEAD", name);

Hmm. That is the same check that is on the fetch side, isn't it? In a
bare repository, we will not do any of these current-branch checks. What
is weird in your example is that you are adding a worktree to a bare
repository. Is that a supported workflow? Should that make it non-bare?

I notice that there is a worktree->is_bare flag, and that
find_shared_symref() would not report such a bare entry to us. But I'm
really unclear how any of that is supposed to work (how do you have a
bare worktree, and what does it mean?).

I think that's all orthogonal to the main purpose of your patch here.
You may want to post about it separately with a different subject to get
the attention of folks who work on worktrees.

-Peff
Junio C Hamano Nov. 5, 2021, 6:39 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

> But all of that means we could actually drop check_not_current_branch()
> in favor of the update_local_ref() check. Doing this:
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index f7abbc31ff..c52c44684a 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -869,7 +869,7 @@ static int update_local_ref(struct ref *ref,
>  	}
>  
>  	if (current_branch &&
> -	    !strcmp(ref->name, current_branch->name) &&
> +	    !strcmp(ref->name, current_branch->refname) &&
>  	    !(update_head_ok || is_bare_repository()) &&
>  	    !is_null_oid(&ref->old_oid)) {
>  		/*
> @@ -1385,20 +1385,6 @@ static int prune_refs(struct refspec *rs, struct ref *ref_map,
>  	return result;
>  }
>  
> -static void check_not_current_branch(struct ref *ref_map)
> -{
> -	struct branch *current_branch = branch_get(NULL);
> -
> -	if (is_bare_repository() || !current_branch)
> -		return;
> -
> -	for (; ref_map; ref_map = ref_map->next)
> -		if (ref_map->peer_ref && !strcmp(current_branch->refname,
> -					ref_map->peer_ref->name))
> -			die(_("Refusing to fetch into current branch %s "
> -			    "of non-bare repository"), current_branch->refname);
> -}
> -
>  static int truncate_fetch_head(void)
>  {
>  	const char *filename = git_path_fetch_head(the_repository);
> @@ -1587,8 +1573,6 @@ static int do_fetch(struct transport *transport,
>  
>  	ref_map = get_ref_map(transport->remote, remote_refs, rs,
>  			      tags, &autotags);
> -	if (!update_head_ok)
> -		check_not_current_branch(ref_map);
>  
>  	if (tags == TAGS_DEFAULT && autotags)
>  		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
>
> passes the entire test suite except for one test which expects:
>
>   git fetch . side:main
>
> to fail. But that is only because "side" and "main" point to the same
> commit, and thus the fetch is a noop. The code in update_local_ref()
> covers that case before checking the HEAD case (which I would argue is
> a completely reasonable outcome).

So we can lose redundant code that was added there only because the
original safety was broken by actually fixing the original safety?

That is very nice.

> The reason I bring this up is that I think doing the check in
> update_local_ref() makes much more sense. We don't abort the whole
> fetch, but just treat it as a normal per-ref failure. That gives us the
> usual status-table output (I thought it might also avoid wasting some
> work of actually fetching objects, but I think the current check kicks
> in before we actually fetch anything).

Yes I agree with that reasoning.  

Thanks for digging this to the bottom, both of you.
Anders Kaseorg Nov. 8, 2021, 8:12 p.m. UTC | #5
On 11/5/21 04:53, Jeff King wrote:
> +			       wt->is_current ?
> +			       _("can't fetch in current branch") :
> +			       _("branch checked out in worktree"),

Sure, I’ll go ahead and add this.  We can make further improvements here 
if we proceed with removing the other error path, which seems to have 
enough complications that we should consider it after merging this bug fix.

> Hmm. That is the same check that is on the fetch side, isn't it? In a
> bare repository, we will not do any of these current-branch checks. What
> is weird in your example is that you are adding a worktree to a bare
> repository. Is that a supported workflow? Should that make it non-bare?
> 
> I notice that there is a worktree->is_bare flag, and that
> find_shared_symref() would not report such a bare entry to us. But I'm
> really unclear how any of that is supposed to work (how do you have a
> bare worktree, and what does it mean?).

worktree->is_bare is only ever potentially set for the main working tree 
(.git/..).  Although a bare repository doesn’t have a main working tree, 
it can still have other working trees created with git worktree.  Making 
a temporary working tree can be useful to quickly inspect or change a 
bare repository.  It’s also useful for sharing refs, remotes, and object 
storage between several working trees where any of them might be created 
or deleted, and there isn’t one that should be considered primary.

The Zulip installer works this way to save some disk space and 
re-cloning time while allowing for simple rollbacks to a previous 
deployment directory (https://github.com/zulip/zulip/pull/18487), which 
is how my team stumbled on the issue.

> I think that's all orthogonal to the main purpose of your patch here.
> You may want to post about it separately with a different subject to get
> the attention of folks who work on worktrees.

I expect this subject line will have already caught the attention of 
those folks, but I’ll add a patch 2/2 with a more specific subject line 
for this issue.

Anders
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f7abbc31ff..a6549c2ab6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -28,6 +28,7 @@ 
 #include "promisor-remote.h"
 #include "commit-graph.h"
 #include "shallow.h"
+#include "worktree.h"
 
 #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)
 
@@ -854,7 +855,6 @@  static int update_local_ref(struct ref *ref,
 			    int summary_width)
 {
 	struct commit *current = NULL, *updated;
-	struct branch *current_branch = branch_get(NULL);
 	const char *pretty_ref = prettify_refname(ref->name);
 	int fast_forward = 0;
 
@@ -868,9 +868,8 @@  static int update_local_ref(struct ref *ref,
 		return 0;
 	}
 
-	if (current_branch &&
-	    !strcmp(ref->name, current_branch->name) &&
-	    !(update_head_ok || is_bare_repository()) &&
+	if (!update_head_ok &&
+	    find_shared_symref("HEAD", ref->name) &&
 	    !is_null_oid(&ref->old_oid)) {
 		/*
 		 * If this is the head, and it's not okay to update
@@ -1387,16 +1386,13 @@  static int prune_refs(struct refspec *rs, struct ref *ref_map,
 
 static void check_not_current_branch(struct ref *ref_map)
 {
-	struct branch *current_branch = branch_get(NULL);
-
-	if (is_bare_repository() || !current_branch)
-		return;
-
+	const struct worktree *wt;
 	for (; ref_map; ref_map = ref_map->next)
-		if (ref_map->peer_ref && !strcmp(current_branch->refname,
-					ref_map->peer_ref->name))
-			die(_("Refusing to fetch into current branch %s "
-			    "of non-bare repository"), current_branch->refname);
+		if (ref_map->peer_ref &&
+		    (wt = find_shared_symref("HEAD", ref_map->peer_ref->name)))
+			die(_("Refusing to fetch into branch '%s' "
+			      "checked out at '%s'"),
+			    ref_map->peer_ref->name, wt->path);
 }
 
 static int truncate_fetch_head(void)
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 8212ca56dc..c8cb62c05d 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1771,4 +1771,14 @@  test_expect_success 'denyCurrentBranch and worktrees' '
 	git -C cloned push origin HEAD:new-wt &&
 	test_must_fail git -C cloned push --delete origin new-wt
 '
+
+test_expect_success 'refuse fetch to current branch of worktree' '
+	test_commit -C cloned second &&
+	test_must_fail git fetch cloned HEAD:new-wt &&
+	git clone --bare . bare.git &&
+	git -C bare.git worktree add bare-wt &&
+	test_must_fail git -C bare.git fetch ../cloned HEAD:bare-wt &&
+	git fetch -u cloned HEAD:new-wt &&
+	git -C bare.git fetch -u ../cloned HEAD:bare-wt
+'
 test_done