[v2,2/3] send-pack: fix push nego. when remote has refs

Message ID	c8416933035849e40b88c29f1d5fa91064ca0c8a.1626370766.git.jonathantanmy@google.com (mailing list archive)
State	Accepted
Commit	54a03bc7d9a7f264d511d88166afe8da7f75e90a
Headers	Return-Path: <git-owner@kernel.org> Date: Thu, 15 Jul 2021 10:44:31 -0700 In-Reply-To: <cover.1626370766.git.jonathantanmy@google.com> Message-Id: <c8416933035849e40b88c29f1d5fa91064ca0c8a.1626370766.git.jonathantanmy@google.com> Mime-Version: 1.0 References: <cover.1624486920.git.jonathantanmy@google.com> <cover.1626370766.git.jonathantanmy@google.com> Subject: [PATCH v2 2/3] send-pack: fix push nego. when remote has refs From: Jonathan Tan <jonathantanmy@google.com> To: git@vger.kernel.org Cc: Jonathan Tan <jonathantanmy@google.com>, avarab@gmail.com, emilyshaffer@google.com, Junio C Hamano <gitster@pobox.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Push negotiation fixes: [v2,0/3] Push negotiation fixes [v2,1/3] send-pack: fix push.negotiate with remote helper [v2,2/3] send-pack: fix push nego. when remote has refs [v2,3/3] fetch: die on invalid --negotiation-tip hash

Commit Message

Jonathan Tan July 15, 2021, 5:44 p.m. UTC

Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
did not test the case in which a remote advertises at least one ref. In
such a case, "remote_refs" in get_commons_through_negotiation() in
send-pack.c would also contain those refs with a zero ref->new_oid (in
addition to the refs being pushed with a nonzero ref->new_oid). Passing
them as negotiation tips to "git fetch" causes an error, so filter them
out.

(The exact error that would happen in "git fetch" in this case is a
segmentation fault, which is unwanted. This will be fixed in the
subsequent commit.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 send-pack.c                | 6 ++++--
 t/t5516-fetch-push.sh      | 4 +++-
 t/t5549-fetch-push-http.sh | 1 +
 3 files changed, 8 insertions(+), 3 deletions(-)
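
The filtering described in the commit message can be illustrated outside of Git with a minimal, self-contained sketch. Everything here is hypothetical and for illustration only: `struct mock_ref`, `mock_is_null_oid()` and `mock_collect_tips()` are stand-ins, not Git's `struct ref`, `is_null_oid()` or `strvec` API, and the sketch assumes 40-character SHA-1 hex strings. It only shows the idea of skipping refs whose new OID is all zeros when building `--negotiation-tip=` arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MOCK_HEXSZ 40 /* SHA-1 hex length; an assumption of this sketch */
#define MOCK_ARGSZ 64 /* enough for "--negotiation-tip=" + hex + NUL */

/* Hypothetical stand-in for Git's struct ref; OIDs are kept as hex strings. */
struct mock_ref {
	const char *name;
	char new_oid_hex[MOCK_HEXSZ + 1];
	const struct mock_ref *next;
};

/* True when the hex string consists only of zeros (the "null" OID). */
static int mock_is_null_oid(const char *hex)
{
	return strspn(hex, "0") == (size_t)MOCK_HEXSZ;
}

/*
 * Write one "--negotiation-tip=<oid>" argument per ref that has a
 * nonzero new OID; refs that are merely advertised by the remote
 * (zero new OID) are skipped. Returns the number of arguments written.
 */
int mock_collect_tips(const struct mock_ref *refs,
		      char out[][MOCK_ARGSZ], int max)
{
	int n = 0;
	const struct mock_ref *ref;

	for (ref = refs; ref && n < max; ref = ref->next) {
		if (mock_is_null_oid(ref->new_oid_hex))
			continue;
		snprintf(out[n++], MOCK_ARGSZ, "--negotiation-tip=%s",
			 ref->new_oid_hex);
	}
	return n;
}

/* Usage: one ref being pushed, one ref only advertised by the remote. */
void mock_demo(void)
{
	struct mock_ref advertised = {
		"refs/heads/unrelated",
		"0000000000" "0000000000" "0000000000" "0000000000",
		NULL
	};
	struct mock_ref pushed = {
		"refs/remotes/origin/main",
		"abcdef0123" "abcdef0123" "abcdef0123" "abcdef0123",
		&advertised
	};
	char args[4][MOCK_ARGSZ];

	assert(mock_collect_tips(&pushed, args, 4) == 1);
	assert(!strcmp(args[0], "--negotiation-tip="
		       "abcdef0123abcdef0123abcdef0123abcdef0123"));
}
```

In this sketch only the ref being pushed yields an argument; the advertised-only ref is dropped, which mirrors the intent of the one-line check in the patch.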

Comments

Ævar Arnfjörð Bjarmason July 27, 2021, 8:09 a.m. UTC | #1

On Thu, Jul 15 2021, Jonathan Tan wrote:

> Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
> did not test the case in which a remote advertises at least one ref. In
> such a case, "remote_refs" in get_commons_through_negotiation() in
> send-pack.c would also contain those refs with a zero ref->new_oid (in
> addition to the refs being pushed with a nonzero ref->new_oid). Passing
> them as negotiation tips to "git fetch" causes an error, so filter them
> out.
>
> (The exact error that would happen in "git fetch" in this case is a
> segmentation fault, which is unwanted. This will be fixed in the
> subsequent commit.)

Let's add the test from the subsequent here as a test_expect_failure and
flip it to "success".

> @@ -425,8 +425,10 @@ static void get_commons_through_negotiation(const char *url,
>  	child.no_stdin = 1;
>  	child.out = -1;
>  	strvec_pushl(&child.args, "fetch", "--negotiate-only", NULL);
> -	for (ref = remote_refs; ref; ref = ref->next)
> -		strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
> +	for (ref = remote_refs; ref; ref = ref->next) {
> +		if (!is_null_oid(&ref->new_oid))
> +			strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
> +	}
>  	strvec_push(&child.args, url);

This will run into my eff40457a4 (fetch: fix segfault in
--negotiate-only without --negotiation-tip=*, 2021-07-08) if we supply a
--negotiate-only without --negotiation-tip=, but trying it, it looks like
even when you push to an empty repo and your repo is itself empty we'll
always add the tip you're pushing as the negotiation tip.

Let's add a test for that, i.e. I instrumented your test to check what
happens when I do the push without any remote/local refs, both for
one/both cases (and both combinations), it seems to work...

For code that's doing a loop over "refs" testing that seems to be
worthwhile, i.e. we don't actually depend on "refs" in the sense that
they exist, but the refs we've constructed in-memory to be created on
the remote, correct?

I.e. this on top would be OK (not saying you need this, but I for one
would find it easier to follow with this):

	diff --git a/send-pack.c b/send-pack.c
	index b3a495b7b1..d1e231076c 100644
	--- a/send-pack.c
	+++ b/send-pack.c
	@@ -420,15 +420,20 @@ static void get_commons_through_negotiation(const char *url,
	 	struct child_process child = CHILD_PROCESS_INIT;
	 	const struct ref *ref;
	 	int len = the_hash_algo->hexsz + 1; /* hash + NL */
	+	int got_tip = 0;

	 	child.git_cmd = 1;
	 	child.no_stdin = 1;
	 	child.out = -1;
	 	strvec_pushl(&child.args, "fetch", "--negotiate-only", NULL);
	 	for (ref = remote_refs; ref; ref = ref->next) {
	-		if (!is_null_oid(&ref->new_oid))
	-			strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
	+		if (is_null_oid(&ref->new_oid))
	+			continue;
	+		strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
	+		got_tip = 1;
	 	}
	+	if (!got_tip)
	+		BUG("should get at least one ref tip, even with no remote/local refs");
	 	strvec_push(&child.args, url);

	 	if (start_command(&child))
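
The invariant that the `got_tip` flag guards can also be sketched in isolation. This is a hedged illustration, not Git code: `struct tip_ref` and `tip_check()` are hypothetical names, OIDs are modelled as 40-character SHA-1 hex strings, and the sketch returns an error code where the diff above calls `BUG()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TIP_HEXSZ 40 /* SHA-1 hex length; an assumption of this sketch */

/* Hypothetical stand-in for Git's struct ref. */
struct tip_ref {
	char new_oid_hex[TIP_HEXSZ + 1];
	const struct tip_ref *next;
};

/*
 * Walk the list with the same early-continue shape as the diff above and
 * report whether at least one ref carries a nonzero new OID.
 * Returns 0 when a tip was found, -1 when the invariant is violated.
 */
int tip_check(const struct tip_ref *refs)
{
	int got_tip = 0;
	const struct tip_ref *ref;

	for (ref = refs; ref; ref = ref->next) {
		if (strspn(ref->new_oid_hex, "0") == (size_t)TIP_HEXSZ)
			continue; /* advertised only, nothing to negotiate with */
		got_tip = 1;
	}
	return got_tip ? 0 : -1;
}

/* Usage: a list with only null OIDs fails, a list with one real OID passes. */
void tip_demo(void)
{
	struct tip_ref null_only = {
		"0000000000" "0000000000" "0000000000" "0000000000", NULL
	};
	struct tip_ref with_tip = {
		"abcdef0123" "abcdef0123" "abcdef0123" "abcdef0123", &null_only
	};

	assert(tip_check(&null_only) == -1);
	assert(tip_check(&with_tip) == 0);
}
```

Whether a hard failure is the right response to an empty tip list is a design choice; the sketch only makes the "at least one tip" assumption explicit.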

But also: looking at the trace output we already have the ref
advertisement at this point, so in the case of an empty repo we'll see
it has no refs, but then we're going to provide a --negotiation-tip=*
pointing to our local OID anyway.

That seems like a fairly non-obvious edge case that should be called out
/ tested.

I.e. aren't we at least just going to engage in redundant work there in
trying to negotiate with empty repos, or is it going to noop anyway?

Or maybe we'll get lucky and they have the OID already, they just
recently deleted their reference(s), then we won't need to send as much
over? Is that what this is trying to do?

But hrm, won't that sort of thing increase the odds of repository
corruption?

I.e. now we make the implicit assumption that an OID we see in the
advertisement is one the server isn't going to aggressively prune while
our push is underway (Jeff King has a good E-Mail summarizing that
somewhere, not digging it up now, but I could...).

So such a remote will negotiate with us using that OID, but unlike with
advertised OIDs we can't safely assume that the OID won't be racily
deleted during our negotiation.

Or maybe I'm entirely wrong here....

Jeff King July 27, 2021, 4:46 p.m. UTC | #2

On Tue, Jul 27, 2021 at 10:09:35AM +0200, Ævar Arnfjörð Bjarmason wrote:

> I.e. now we make the implicit assumption that an OID we see in the
> advertisement is one the server isn't going to aggressively prune while
> our push is underway (Jeff King has a good E-Mail summarizing that
> somewhere, not digging it up now, but I could...).
> 
> So such a remote will negotiate with us using that OID, but unlike with
> advertised OIDs we can't safely assume that the OID won't be racily
> deleted during our negotiation.

I haven't been following the push-negotiation stuff closely, nor do I
have a specific email in mind that summarizes this. So take my input
with a grain of salt. But...

Wouldn't this also be a problem for multi-round fetch negotiation? An
object may become unreachable or even go away entirely during the course
of a fetch. I'd expect that to be rare, but when it does happen, for the
fetch to end up barfing (the server says "hey, I don't know about that
object").

-Peff

Jonathan Tan July 27, 2021, 9:11 p.m. UTC | #3

> On Thu, Jul 15 2021, Jonathan Tan wrote:
> 
> > Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
> > did not test the case in which a remote advertises at least one ref. In
> > such a case, "remote_refs" in get_commons_through_negotiation() in
> > send-pack.c would also contain those refs with a zero ref->new_oid (in
> > addition to the refs being pushed with a nonzero ref->new_oid). Passing
> > them as negotiation tips to "git fetch" causes an error, so filter them
> > out.
> >
> > (The exact error that would happen in "git fetch" in this case is a
> > segmentation fault, which is unwanted. This will be fixed in the
> > subsequent commit.)
> 
> Let's add the test from the subsequent here as a test_expect_failure and
> flip it to "success".

What is the subsequent?

> > @@ -425,8 +425,10 @@ static void get_commons_through_negotiation(const char *url,
> >  	child.no_stdin = 1;
> >  	child.out = -1;
> >  	strvec_pushl(&child.args, "fetch", "--negotiate-only", NULL);
> > -	for (ref = remote_refs; ref; ref = ref->next)
> > -		strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
> > +	for (ref = remote_refs; ref; ref = ref->next) {
> > +		if (!is_null_oid(&ref->new_oid))
> > +			strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
> > +	}
> >  	strvec_push(&child.args, url);
> 
> This will run into my eff40457a4 (fetch: fix segfault in
> --negotiate-only without --negotiation-tip=*, 2021-07-08) if we supply a
> --negotiate-only without --negotiation-tip=, but trying it, it looks like
> even when you push to an empty repo and your repo is itself empty we'll
> always add the tip you're pushing as the negotiation tip.
> 
> Let's add a test for that, i.e. I instrumented your test to check what
> happens when I do the push without any remote/local refs, both for
> one/both cases (and both combinations), it seems to work...

I'm not sure how useful this no-ref test will be, because if my existing
tests are correct, the thing we're pushing is guaranteed to be in this
list (so the list will be non-empty).

> For code that's doing a loop over "refs" testing that seems to be
> worthwhile, i.e. we don't actually depend on "refs" in the sense that
> they exist, but the refs we've constructed in-memory to be created on
> the remote, correct?

Yes.

> But also: looking at the trace output we already have the ref
> advertisement at this point, so in the case of an empty repo we'll see
> it has no refs, but then we're going to provide a --negotiation-tip=*
> pointing to our local OID anyway.

Hmm...are you running under protocol v0? In protocol v2, there should be
no ref advertisement at this point.

> That seems like a fairly non-obvious edge case that should be called out
> / tested.
> 
> I.e. aren't we at least just going to engage in redundant work there in
> trying to negotiate with empty repos, or is it going to noop anyway?
> 
> Or maybe we'll get lucky and they have the OID already, they just
> recently deleted their reference(s), then we won't need to send as much
> over? Is that what this is trying to do?
> 
> But hrm, won't that sort of thing increase the odds of repository
> corruption?

No, trying to be lucky in finding an OID that the server has no plans of
advertising is not the aim.

> I.e. now we make the implicit assumption that an OID we see in the
> advertisement is one the server isn't going to aggressively prune while
> our push is underway (Jeff King has a good E-Mail summarizing that
> somewhere, not digging it up now, but I could...).
> 
> So such a remote will negotiate with us using that OID, but unlike with
> advertised OIDs we can't safely assume that the OID won't be racily
> deleted during our negotiation.
> 
> Or maybe I'm entirely wrong here....

There's always the risk that the server will say it has something and
then aggressively prune it, but I think that all fetch/push code has to
deal with it. A more realistic scenario is that one server in a
load-balanced arrangement advertises a commit that the other does not
have, but we are unlikely to be affected by that here because the ref
negotiation would usually concern old commits that the local user has
built upon, not the very latest commits that someone else just pushed.

diff --git a/send-pack.c b/send-pack.c
index 9cb9f71650..85945becf0 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -425,8 +425,10 @@  static void get_commons_through_negotiation(const char *url,
 	child.no_stdin = 1;
 	child.out = -1;
 	strvec_pushl(&child.args, "fetch", "--negotiate-only", NULL);
-	for (ref = remote_refs; ref; ref = ref->next)
-		strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
+	for (ref = remote_refs; ref; ref = ref->next) {
+		if (!is_null_oid(&ref->new_oid))
+			strvec_pushf(&child.args, "--negotiation-tip=%s", oid_to_hex(&ref->new_oid));
+	}
 	strvec_push(&child.args, url);
 
 	if (start_command(&child))
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 0916f76302..4db8edd9c8 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -201,6 +201,7 @@  test_expect_success 'push with negotiation' '
 	# Without negotiation
 	mk_empty testrepo &&
 	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
 	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
 	echo now pushing without negotiation &&
 	GIT_TRACE2_EVENT="$(pwd)/event" git -c protocol.version=2 push testrepo refs/heads/main:refs/remotes/origin/main &&
@@ -210,6 +211,7 @@  test_expect_success 'push with negotiation' '
 	rm event &&
 	mk_empty testrepo &&
 	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
 	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
 	GIT_TRACE2_EVENT="$(pwd)/event" git -c protocol.version=2 -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main &&
 	grep_wrote 2 event # 1 commit, 1 tree
@@ -219,6 +221,7 @@  test_expect_success 'push with negotiation proceeds anyway even if negotiation f
 	rm event &&
 	mk_empty testrepo &&
 	git push testrepo $the_first_commit:refs/remotes/origin/first_commit &&
+	test_commit -C testrepo unrelated_commit &&
 	git -C testrepo config receive.hideRefs refs/remotes/origin/first_commit &&
 	GIT_TEST_PROTOCOL_VERSION=0 GIT_TRACE2_EVENT="$(pwd)/event" \
 		git -c push.negotiate=1 push testrepo refs/heads/main:refs/remotes/origin/main 2>err &&
@@ -1767,5 +1770,4 @@  test_expect_success 'denyCurrentBranch and worktrees' '
 	git -C cloned push origin HEAD:new-wt &&
 	test_must_fail git -C cloned push --delete origin new-wt
 '
-
 test_done
diff --git a/t/t5549-fetch-push-http.sh b/t/t5549-fetch-push-http.sh
index f50d584881..2cdebcb735 100755
--- a/t/t5549-fetch-push-http.sh
+++ b/t/t5549-fetch-push-http.sh
@@ -27,6 +27,7 @@  setup_client_and_server () {
 	git init "$SERVER" &&
 	test_when_finished 'rm -rf "$SERVER"' &&
 	test_config -C "$SERVER" http.receivepack true &&
+	test_commit -C "$SERVER" unrelated_commit &&
 	git -C client push "$URI" first_commit:refs/remotes/origin/first_commit &&
 	git -C "$SERVER" config receive.hideRefs refs/remotes/origin/first_commit
 }
