[4/4] promisor-remote: check advertised name or URL

Message ID 20240731134014.2299361-5-christian.couder@gmail.com (mailing list archive)
State New
Series Introduce a "promisor-remote" capability

Commit Message

Christian Couder July 31, 2024, 1:40 p.m. UTC
A previous commit introduced a "promisor.acceptFromServer" configuration
variable with only "None" or "All" as valid values.

Let's introduce "KnownName" and "KnownUrl" as valid values for this
configuration option to give more choice to a client about which
promisor remotes it might accept among those that the server advertised.

In case of "KnownName", the client will accept promisor remotes which
are already configured on the client and have the same name as those
advertised by the server.

In case of "KnownUrl", the client will accept promisor remotes which
have both the same name and the same URL configured on the client as the
name and URL advertised by the server.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/config/promisor.txt     | 11 +++--
 promisor-remote.c                     | 54 +++++++++++++++++++--
 t/t5710-promisor-remote-capability.sh | 68 +++++++++++++++++++++++++++
 3 files changed, 126 insertions(+), 7 deletions(-)

Comments

Junio C Hamano July 31, 2024, 6:35 p.m. UTC | #1
Christian Couder <christian.couder@gmail.com> writes:

> A previous commit introduced a "promisor.acceptFromServer" configuration
> variable with only "None" or "All" as valid values.
>
> Let's introduce "KnownName" and "KnownUrl" as valid values for this
> configuration option to give more choice to a client about which
> promisor remotes it might accept among those that the server advertised.

A malicious server can switch name and url correspondence.  The URLs
this repository uses to lazily fetch missing objects from are the
only thing that matters, and it does not matter what name the server
calls these URLs.  I am not sure what value, if any, KnownName has,
other than adding a potential security hole.
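
To make the concern concrete, here is a minimal standalone sketch (not
code from the patch) of a name-only check. The struct, the function name
accept_by_name_only() and the URLs are hypothetical, and it compares
with strcmp() for portability where the patch uses strcasecmp(). It only
illustrates that the answer does not depend on the URL the server
attaches to a name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one promisor remote configured on the client. */
struct known_remote {
	const char *name;
	const char *url;
};

/*
 * Name-only check: return 1 if 'advertised_name' matches a configured
 * remote, 0 otherwise. 'advertised_url' is deliberately ignored, so a
 * server that swaps which URL goes with which name gets the same answer.
 */
static int accept_by_name_only(const struct known_remote *known, size_t nr,
			       const char *advertised_name,
			       const char *advertised_url)
{
	(void)advertised_url; /* unused on purpose */

	for (size_t i = 0; i < nr; i++)
		if (!strcmp(known[i].name, advertised_name))
			return 1;
	return 0;
}
```

With remotes "server2" and "server3" configured, an advertisement that
pairs the name "server2" with the URL of "server3" is still accepted by
such a check.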

> In case of "KnownUrl", the client will accept promisor remotes which
> have both the same name and the same URL configured on the client as the
> name and URL advertised by the server.

This makes sense, especially if we had updates to documents I
suggested in my review of [3/4].  If the side effect of "accepting"
a suggested promisor remote were to only use it as a promisor remote
on this side, there is no reason to "accept" the same thing again,
but because the main effect at the protocol level of "accepting" is
to affect the behaviour of the server in such a way that it is now
allowed to omit objects that are requested but would be available
lazily from the promisor remotes in the response, we _do_ need to
be able to respond with the promisor remotes we are willing to use
and have been using.

This iteration does not seem to have the true server side support to
slim its response by omitting objects that are available elsewhere,
but I agree that it is a good approach to get the protocol support
right.

Thanks.

Christian Couder Sept. 10, 2024, 4:32 p.m. UTC | #2
On Wed, Jul 31, 2024 at 8:35 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > A previous commit introduced a "promisor.acceptFromServer" configuration
> > variable with only "None" or "All" as valid values.
> >
> > Let's introduce "KnownName" and "KnownUrl" as valid values for this
> > configuration option to give more choice to a client about which
> > promisor remotes it might accept among those that the server advertised.
>
> A malicious server can switch name and url correspondence.  The URLs
> this repository uses to lazily fetch missing objects from are the
> only thing that matters, and it does not matter what name the server
> calls these URLs.  I am not sure what value, if any, KnownName has,
> other than adding a potential security hole.

In a corporate setup where clients and servers trust each other to not
switch names and URLs, it could be valuable to still have a bit of
control in a simple way, for example:
  - if servers use many promisor remotes, but clients should only use
a subset of them, or:
  - if the URLs used by clients should not be the same as the URLs
used by servers
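
The two cases above can be sketched as a small standalone model (not
the code from the patch). The enum, struct and function names and the
example remotes are hypothetical, and strcmp() is used where the patch
uses strcasecmp():

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two new acceptance policies. */
enum accept_policy {
	POLICY_KNOWN_NAME,
	POLICY_KNOWN_URL
};

/* Hypothetical stand-in for a promisor remote configured on the client. */
struct client_remote {
	const char *name;
	const char *url;
};

/*
 * Return 1 if the client would accept the advertised (name, url) pair.
 * Both policies require the name to be configured on the client; only
 * POLICY_KNOWN_URL also requires the configured URL to match.
 */
static int client_accepts(enum accept_policy policy,
			  const struct client_remote *cfg, size_t nr,
			  const char *adv_name, const char *adv_url)
{
	for (size_t i = 0; i < nr; i++) {
		if (strcmp(cfg[i].name, adv_name))
			continue; /* not this remote */
		if (policy == POLICY_KNOWN_NAME)
			return 1;
		return !strcmp(cfg[i].url, adv_url);
	}
	return 0; /* name not configured on the client */
}
```

In this model, a client that configures only "lfs-a", with an internal
mirror URL, accepts "lfs-a" and rejects "lfs-b" under the name-based
policy, which covers the subset case. Under the URL-based policy it
rejects "lfs-a" as well whenever the server advertises a different URL,
which is why the second case relies on the name-based policy.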

In version 2, I have updated the "promisor.acceptFromServer"
documentation and the commit message of this patch to better explain
cases where the new "KnownName" and "KnownUrl" could be useful.

> > In case of "KnownUrl", the client will accept promisor remotes which
> > have both the same name and the same URL configured on the client as the
> > name and URL advertised by the server.
>
> This makes sense, especially if we had updates to documents I
> suggested in my review of [3/4].  If the side effect of "accepting"
> a suggested promisor remote were to only use it as a promisor remote
> on this side, there is no reason to "accept" the same thing again,
> but because the main effect at the protocol level of "accepting" is
> to affect the behaviour of the server in such a way that it is now
> allowed to omit objects that are requested but would be available
> lazily from the promisor remotes in the response, we _do_ need to
> be able to respond with the promisor remotes we are willing to use
> and have been using.

Yeah, it is better to let the server know.

> This iteration does not seem to have the true server side support to
> slim its response by omitting objects that are available elsewhere,

Yeah, in version 2, the commit message of patch 3/4 has been improved
to say that implementation of this case, which would require S to omit
in its response the objects available on X, is left for future
improvement.

> but I agree that it is a good approach to get the protocol support
> right.

Thanks.

Patch

diff --git a/Documentation/config/promisor.txt b/Documentation/config/promisor.txt
index e3939d83a9..fadf593621 100644
--- a/Documentation/config/promisor.txt
+++ b/Documentation/config/promisor.txt
@@ -11,6 +11,11 @@  promisor.advertise::
 promisor.acceptFromServer::
 	If set to "all", a client will accept all the promisor remotes
 	a server might advertise using the "promisor-remote"
-	capability, see linkgit:gitprotocol-v2[5]. Default is "none",
-	which means no promisor remote advertised by a server will be
-	accepted.
+	capability, see linkgit:gitprotocol-v2[5]. If set to
+	"knownName", the client will accept promisor remotes which are
+	already configured on the client and have the same name as
+	those advertised by the server. If set to "knownUrl", the
+	client will accept promisor remotes which have both the same
+	name and the same URL configured on the client as the name and
+	URL advertised by the server. Default is "none", which means
+	no promisor remote advertised by a server will be accepted.
diff --git a/promisor-remote.c b/promisor-remote.c
index d347f4d9b5..0ff26b835e 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -362,19 +362,54 @@  void promisor_remote_info(struct repository *repo, struct strbuf *buf)
 	strvec_clear(&urls);
 }
 
+/*
+ * Find first index of 'vec' where there is 'val'. 'val' is compared
+ * case insensitively to the strings in 'vec'. If not found 'vec->nr' is
+ * returned.
+ */
+static size_t strvec_find_index(struct strvec *vec, const char *val)
+{
+	for (size_t i = 0; i < vec->nr; i++)
+		if (!strcasecmp(vec->v[i], val))
+			return i;
+	return vec->nr;
+}
+
 enum accept_promisor {
 	ACCEPT_NONE = 0,
+	ACCEPT_KNOWN_URL,
+	ACCEPT_KNOWN_NAME,
 	ACCEPT_ALL
 };
 
 static int should_accept_remote(enum accept_promisor accept,
-				const char *remote_name UNUSED,
-				const char *remote_url UNUSED)
+				const char *remote_name, const char *remote_url,
+				struct strvec *names, struct strvec *urls)
 {
+	size_t i;
+
 	if (accept == ACCEPT_ALL)
 		return 1;
 
-	BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
+	i = strvec_find_index(names, remote_name);
+
+	if (i >= names->nr)
+		/* We don't know about that remote */
+		return 0;
+
+	if (accept == ACCEPT_KNOWN_NAME)
+		return 1;
+
+	if (accept != ACCEPT_KNOWN_URL)
+		BUG("Unhandled 'enum accept_promisor' value '%d'", accept);
+
+	if (!strcasecmp(urls->v[i], remote_url))
+		return 1;
+
+	warning(_("known remote named '%s' but with url '%s' instead of '%s'"),
+		remote_name, urls->v[i], remote_url);
+
+	return 0;
 }
 
 static void filter_promisor_remote(struct repository *repo,
@@ -384,10 +419,16 @@  static void filter_promisor_remote(struct repository *repo,
 	struct strbuf **remotes;
 	char *accept_str;
 	enum accept_promisor accept = ACCEPT_NONE;
+	struct strvec names = STRVEC_INIT;
+	struct strvec urls = STRVEC_INIT;
 
 	if (!git_config_get_string("promisor.acceptfromserver", &accept_str)) {
 		if (!accept_str || !*accept_str || !strcasecmp("None", accept_str))
 			accept = ACCEPT_NONE;
+		else if (!strcasecmp("KnownUrl", accept_str))
+			accept = ACCEPT_KNOWN_URL;
+		else if (!strcasecmp("KnownName", accept_str))
+			accept = ACCEPT_KNOWN_NAME;
 		else if (!strcasecmp("All", accept_str))
 			accept = ACCEPT_ALL;
 		else
@@ -398,6 +439,9 @@  static void filter_promisor_remote(struct repository *repo,
 	if (accept == ACCEPT_NONE)
 		return;
 
+	if (accept != ACCEPT_ALL)
+		promisor_info_vecs(repo, &names, &urls);
+
 	/* Parse remote info received */
 
 	remotes = strbuf_split_str(info, ';', 0);
@@ -423,7 +467,7 @@  static void filter_promisor_remote(struct repository *repo,
 
 		decoded_url = url_decode(remote_url);
 
-		if (should_accept_remote(accept, remote_name, decoded_url))
+		if (should_accept_remote(accept, remote_name, decoded_url, &names, &urls))
 			strvec_push(accepted, remote_name);
 
 		strbuf_list_free(elems);
@@ -431,6 +475,8 @@  static void filter_promisor_remote(struct repository *repo,
 	}
 
 	free(accept_str);
+	strvec_clear(&names);
+	strvec_clear(&urls);
 	strbuf_list_free(remotes);
 }
 
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index 7e44ad15ce..c2c83a5914 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -117,6 +117,74 @@  test_expect_success "fetch with promisor.acceptfromserver set to 'None'" '
 		--no-local --filter="blob:limit=5k" server client &&
 	test_when_finished "rm -rf client" &&
 
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server
+'
+
+test_expect_success "fetch with promisor.acceptfromserver set to 'KnownName'" '
+	git -C server config promisor.advertise true &&
+
+	# Clone from server to create a client
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \
+		-c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \
+		-c remote.server2.url="file://$(pwd)/server2" \
+		-c promisor.acceptfromserver=KnownName \
+		--no-local --filter="blob:limit=5k" server client &&
+	test_when_finished "rm -rf client" &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "fetch with 'KnownName' and different remote names" '
+	git -C server config promisor.advertise true &&
+
+	# Clone from server to create a client
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.serverTwo.promisor=true \
+		-c remote.serverTwo.fetch="+refs/heads/*:refs/remotes/server2/*" \
+		-c remote.serverTwo.url="file://$(pwd)/server2" \
+		-c promisor.acceptfromserver=KnownName \
+		--no-local --filter="blob:limit=5k" server client &&
+	test_when_finished "rm -rf client" &&
+
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server
+'
+
+test_expect_success "fetch with promisor.acceptfromserver set to 'KnownUrl'" '
+	git -C server config promisor.advertise true &&
+
+	# Clone from server to create a client
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \
+		-c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \
+		-c remote.server2.url="file://$(pwd)/server2" \
+		-c promisor.acceptfromserver=KnownUrl \
+		--no-local --filter="blob:limit=5k" server client &&
+	test_when_finished "rm -rf client" &&
+
+	# Check that the largest object is still missing on the server
+	check_missing_objects server 1 "$oid"
+'
+
+test_expect_success "fetch with 'KnownUrl' and different remote urls" '
+	ln -s server2 serverTwo &&
+
+	git -C server config promisor.advertise true &&
+
+	# Clone from server to create a client
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.server2.promisor=true \
+		-c remote.server2.fetch="+refs/heads/*:refs/remotes/server2/*" \
+		-c remote.server2.url="file://$(pwd)/serverTwo" \
+		-c promisor.acceptfromserver=KnownUrl \
+		--no-local --filter="blob:limit=5k" server client &&
+	test_when_finished "rm -rf client" &&
+
 	# Check that the largest object is not missing on the server
 	check_missing_objects server 0 ""
 '