diff mbox series

[6/9] upload-pack: disallow object-info capability by default

Message ID 20240228223858.GF1158131@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 8c735b11decc96d673140b268e4a257629e1dad4
Headers show
Series bound upload-pack memory allocations | expand

Commit Message

Jeff King Feb. 28, 2024, 10:38 p.m. UTC
From: Taylor Blau <me@ttaylorr.com>

We added an "object-info" capability to the v2 upload-pack protocol in
a2ba162cda (object-info: support for retrieving object info,
2021-04-20). In the almost 3 years since, we have not added any
client-side support, and it does not appear to exist in other
implementations either (JGit understands the verb on the server side,
but not on the client side).

Since this largely unused code is accessible over the network by
default, it increases the attack surface of upload-pack. I don't know of
any particularly severe problem, but one issue is that because of the
request/response nature of the v2 protocol, it will happily read an
unbounded number of packets, adding each one to a string list (without
regard to whether they are objects we know about, duplicates, etc).

This may be something we want to improve in the long run, but in the
short term it makes sense to disable the feature entirely. We'll add a
config option as an escape hatch for anybody who wants to develop the
feature further.

A more gentle option would be to add the config option to let people
disable it manually, but leave it enabled by default. But given that
there's no client side support, that seems like the wrong balance with
security.

Disabling by default will slow adoption a bit once client-side support
does become available (there were some patches[1] in 2022, but nothing
got merged and there's been nothing since). But clients have to deal
with older servers that do not understand the option anyway (and the
capability system handles that), so it will just be a matter of servers
flipping their config at that point (and hopefully once any unbounded
allocations have been addressed).

[jk: this is a patch that GitHub has been running for several years, but
     rebased forward and with a new commit message for upstream]

[1] https://lore.kernel.org/git/20220208231911.725273-1-calvinwan@google.com/

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/config/transfer.txt |  4 ++++
 serve.c                           | 14 +++++++++++++-
 t/t5555-http-smart-common.sh      |  1 -
 t/t5701-git-serve.sh              | 24 +++++++++++++++++++++++-
 4 files changed, 40 insertions(+), 3 deletions(-)

Comments

Patrick Steinhardt March 4, 2024, 8:34 a.m. UTC | #1
On Wed, Feb 28, 2024 at 05:38:58PM -0500, Jeff King wrote:
> From: Taylor Blau <me@ttaylorr.com>
> 
> We added an "object-info" capability to the v2 upload-pack protocol in
> a2ba162cda (object-info: support for retrieving object info,
> 2021-04-20). In the almost 3 years since, we have not added any
> client-side support, and it does not appear to exist in other
> implementations either (JGit understands the verb on the server side,
> but not on the client side).
> 
> Since this largely unused code is accessible over the network by
> default, it increases the attack surface of upload-pack. I don't know of
> any particularly severe problem, but one issue is that because of the
> request/response nature of the v2 protocol, it will happily read an
> unbounded number of packets, adding each one to a string list (without
> regard to whether they are objects we know about, duplicates, etc).
> 
> This may be something we want to improve in the long run, but in the
> short term it makes sense to disable the feature entirely. We'll add a
> config option as an escape hatch for anybody who wants to develop the
> feature further.
> 
> A more gentle option would be to add the config option to let people
> disable it manually, but leave it enabled by default. But given that
> there's no client side support, that seems like the wrong balance with
> security.
> 
> Disabling by default will slow adoption a bit once client-side support
> does become available (there were some patches[1] in 2022, but nothing
> got merged and there's been nothing since). But clients have to deal
> with older servers that do not understand the option anyway (and the
> capability system handles that), so it will just be a matter of servers
> flipping their config at that point (and hopefully once any unbounded
> allocations have been addressed).
> 
> [jk: this is a patch that GitHub has been running for several years, but
>      rebased forward and with a new commit message for upstream]
> 
> [1] https://lore.kernel.org/git/20220208231911.725273-1-calvinwan@google.com/
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  Documentation/config/transfer.txt |  4 ++++
>  serve.c                           | 14 +++++++++++++-
>  t/t5555-http-smart-common.sh      |  1 -
>  t/t5701-git-serve.sh              | 24 +++++++++++++++++++++++-
>  4 files changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
> index a9cbdb88a1..f1ce50f4a6 100644
> --- a/Documentation/config/transfer.txt
> +++ b/Documentation/config/transfer.txt
> @@ -121,3 +121,7 @@ transfer.bundleURI::
>  	information from the remote server (if advertised) and download
>  	bundles before continuing the clone through the Git protocol.
>  	Defaults to `false`.
> +
> +transfer.advertiseObjectInfo::
> +	When `true`, the `object-info` capability is advertised by
> +	servers. Defaults to false.
> diff --git a/serve.c b/serve.c
> index a1d71134d4..aa651b73e9 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -12,6 +12,7 @@
>  #include "trace2.h"
>  
>  static int advertise_sid = -1;
> +static int advertise_object_info = -1;
>  static int client_hash_algo = GIT_HASH_SHA1;
>  
>  static int always_advertise(struct repository *r UNUSED,
> @@ -67,6 +68,17 @@ static void session_id_receive(struct repository *r UNUSED,
>  	trace2_data_string("transfer", NULL, "client-sid", client_sid);
>  }
>  
> +static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
> +{
> +	if (advertise_object_info == -1 &&
> +	    repo_config_get_bool(r, "transfer.advertiseobjectinfo",
> +				 &advertise_object_info)) {
> +		/* disabled by default */
> +		advertise_object_info = 0;
> +	}
> +	return advertise_object_info;
> +}
> +
>  struct protocol_capability {
>  	/*
>  	 * The name of the capability.  The server uses this name when
> @@ -135,7 +147,7 @@ static struct protocol_capability capabilities[] = {
>  	},
>  	{
>  		.name = "object-info",
> -		.advertise = always_advertise,
> +		.advertise = object_info_advertise,
>  		.command = cap_object_info,
>  	},
>  	{
> diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
> index b1cfe8b7db..3dcb3340a3 100755
> --- a/t/t5555-http-smart-common.sh
> +++ b/t/t5555-http-smart-common.sh
> @@ -131,7 +131,6 @@ test_expect_success 'git upload-pack --advertise-refs: v2' '
>  	fetch=shallow wait-for-done
>  	server-option
>  	object-format=$(test_oid algo)
> -	object-info
>  	0000
>  	EOF
>  
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> index 3591bc2417..c48830de8f 100755
> --- a/t/t5701-git-serve.sh
> +++ b/t/t5701-git-serve.sh
> @@ -20,7 +20,6 @@ test_expect_success 'test capability advertisement' '
>  	fetch=shallow wait-for-done
>  	server-option
>  	object-format=$(test_oid algo)
> -	object-info
>  	EOF
>  	cat >expect.trailer <<-EOF &&
>  	0000
> @@ -323,6 +322,8 @@ test_expect_success 'unexpected lines are not allowed in fetch request' '
>  # Test the basics of object-info
>  #
>  test_expect_success 'basics of object-info' '
> +	test_config transfer.advertiseObjectInfo true &&
> +
>  	test-tool pkt-line pack >in <<-EOF &&
>  	command=object-info
>  	object-format=$(test_oid algo)
> @@ -380,4 +381,25 @@ test_expect_success 'basics of bundle-uri: dies if not enabled' '
>  	test_must_be_empty out
>  '
>  
> +test_expect_success 'object-info missing from capabilities when disabled' '
> +	test_config transfer.advertiseObjectInfo false &&
> +
> +	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
> +		--advertise-capabilities >out &&
> +	test-tool pkt-line unpack <out >actual &&
> +
> +	! grep object.info actual
> +'

Is it intentional that you grep for "object.info" instead of
"object-info"?

Patrick

> +test_expect_success 'object-info commands rejected when disabled' '
> +	test_config transfer.advertiseObjectInfo false &&
> +
> +	test-tool pkt-line pack >in <<-EOF &&
> +	command=object-info
> +	EOF
> +
> +	test_must_fail test-tool serve-v2 --stateless-rpc <in 2>err &&
> +	grep invalid.command err
> +'
> +
>  test_done
> -- 
> 2.44.0.rc2.424.gbdbf4d014b
> 
>
Jeff King March 4, 2024, 9:59 a.m. UTC | #2
On Mon, Mar 04, 2024 at 09:34:30AM +0100, Patrick Steinhardt wrote:

> > +test_expect_success 'object-info missing from capabilities when disabled' '
> > +	test_config transfer.advertiseObjectInfo false &&
> > +
> > +	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
> > +		--advertise-capabilities >out &&
> > +	test-tool pkt-line unpack <out >actual &&
> > +
> > +	! grep object.info actual
> > +'
> 
> Is it intentional that you grep for "object.info" instead of
> "object-info"?

I didn't even notice this. It should be equivalent because of the regex,
but I don't think there's a particular reason to be more loose (and
unlike single-quote, which we sometimes match with "." for shell
readability, it should be fine to say "object-info" here).

+cc Taylor, who wrote the original.

-Peff
Taylor Blau April 29, 2024, 8:49 p.m. UTC | #3
On Mon, Mar 04, 2024 at 04:59:07AM -0500, Jeff King wrote:
> On Mon, Mar 04, 2024 at 09:34:30AM +0100, Patrick Steinhardt wrote:
>
> > > +test_expect_success 'object-info missing from capabilities when disabled' '
> > > +	test_config transfer.advertiseObjectInfo false &&
> > > +
> > > +	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
> > > +		--advertise-capabilities >out &&
> > > +	test-tool pkt-line unpack <out >actual &&
> > > +
> > > +	! grep object.info actual
> > > +'
> >
> > Is it intentional that you grep for "object.info" instead of
> > "object-info"?
>
> I didn't even notice this. It should be equivalent because of the regex,
> but I don't think there's a particular reason to be more loose (and
> unlike single-quote, which we sometimes match with "." for shell
> readability, it should be fine to say "object-info" here).
>
> +cc Taylor, who wrote the original.

I can't think of any reason why I would have written "object.info"
instead of "object-info". Indeed if the missing character were a "'"
single-quote, then I would have used the "." wildcard to match it for
readability, but that's not the case here.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/Documentation/config/transfer.txt b/Documentation/config/transfer.txt
index a9cbdb88a1..f1ce50f4a6 100644
--- a/Documentation/config/transfer.txt
+++ b/Documentation/config/transfer.txt
@@ -121,3 +121,7 @@  transfer.bundleURI::
 	information from the remote server (if advertised) and download
 	bundles before continuing the clone through the Git protocol.
 	Defaults to `false`.
+
+transfer.advertiseObjectInfo::
+	When `true`, the `object-info` capability is advertised by
+	servers. Defaults to false.
diff --git a/serve.c b/serve.c
index a1d71134d4..aa651b73e9 100644
--- a/serve.c
+++ b/serve.c
@@ -12,6 +12,7 @@ 
 #include "trace2.h"
 
 static int advertise_sid = -1;
+static int advertise_object_info = -1;
 static int client_hash_algo = GIT_HASH_SHA1;
 
 static int always_advertise(struct repository *r UNUSED,
@@ -67,6 +68,17 @@  static void session_id_receive(struct repository *r UNUSED,
 	trace2_data_string("transfer", NULL, "client-sid", client_sid);
 }
 
+static int object_info_advertise(struct repository *r, struct strbuf *value UNUSED)
+{
+	if (advertise_object_info == -1 &&
+	    repo_config_get_bool(r, "transfer.advertiseobjectinfo",
+				 &advertise_object_info)) {
+		/* disabled by default */
+		advertise_object_info = 0;
+	}
+	return advertise_object_info;
+}
+
 struct protocol_capability {
 	/*
 	 * The name of the capability.  The server uses this name when
@@ -135,7 +147,7 @@  static struct protocol_capability capabilities[] = {
 	},
 	{
 		.name = "object-info",
-		.advertise = always_advertise,
+		.advertise = object_info_advertise,
 		.command = cap_object_info,
 	},
 	{
diff --git a/t/t5555-http-smart-common.sh b/t/t5555-http-smart-common.sh
index b1cfe8b7db..3dcb3340a3 100755
--- a/t/t5555-http-smart-common.sh
+++ b/t/t5555-http-smart-common.sh
@@ -131,7 +131,6 @@  test_expect_success 'git upload-pack --advertise-refs: v2' '
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
-	object-info
 	0000
 	EOF
 
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 3591bc2417..c48830de8f 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -20,7 +20,6 @@  test_expect_success 'test capability advertisement' '
 	fetch=shallow wait-for-done
 	server-option
 	object-format=$(test_oid algo)
-	object-info
 	EOF
 	cat >expect.trailer <<-EOF &&
 	0000
@@ -323,6 +322,8 @@  test_expect_success 'unexpected lines are not allowed in fetch request' '
 # Test the basics of object-info
 #
 test_expect_success 'basics of object-info' '
+	test_config transfer.advertiseObjectInfo true &&
+
 	test-tool pkt-line pack >in <<-EOF &&
 	command=object-info
 	object-format=$(test_oid algo)
@@ -380,4 +381,25 @@  test_expect_success 'basics of bundle-uri: dies if not enabled' '
 	test_must_be_empty out
 '
 
+test_expect_success 'object-info missing from capabilities when disabled' '
+	test_config transfer.advertiseObjectInfo false &&
+
+	GIT_TEST_SIDEBAND_ALL=0 test-tool serve-v2 \
+		--advertise-capabilities >out &&
+	test-tool pkt-line unpack <out >actual &&
+
+	! grep object.info actual
+'
+
+test_expect_success 'object-info commands rejected when disabled' '
+	test_config transfer.advertiseObjectInfo false &&
+
+	test-tool pkt-line pack >in <<-EOF &&
+	command=object-info
+	EOF
+
+	test_must_fail test-tool serve-v2 --stateless-rpc <in 2>err &&
+	grep invalid.command err
+'
+
 test_done