diff mbox series

promisor-remote: fix segfault when remote URL is missing

Message ID 20250310074053.1886097-1-christian.couder@gmail.com (mailing list archive)
State Superseded
Headers show
Series promisor-remote: fix segfault when remote URL is missing | expand

Commit Message

Christian Couder March 10, 2025, 7:40 a.m. UTC
Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a
kind of NULL terminated array that is designed to be compatible with
'argv' variables used on the command line.

So when an URL is missing from the config, let's push an empty string
instead of NULL into the 'strvec' that stores URLs.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---

This is a fix for cc/lop-remote. Sorry for sending it late in the cycle.

 promisor-remote.c                     | 10 +++++++---
 t/t5710-promisor-remote-capability.sh | 16 ++++++++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

Comments

Junio C Hamano March 10, 2025, 4:29 p.m. UTC | #1
Christian Couder <christian.couder@gmail.com> writes:

> Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a
> kind of NULL terminated array that is designed to be compatible with
> 'argv' variables used on the command line.

It is good that you corrected a caller that tries to make a strvec
into such a state, but shouldn't strvec_push() protect itself with a
BUG or something, I have to wonder.

> So when an URL is missing from the config, let's push an empty string
> instead of NULL into the 'strvec' that stores URLs.

How will these strings in the "names" strvec used?  When URLs are
present, I'm sure we are going to use it as URL, but when they are
missing, what happens?  The second hunk of this patch seems to
ignore such a broken entry with an empty URL, but that smells like
sweeping a problem under the rug.  Shouldn't such a promisor be
either diagnosed as an error, or skipped when you accumulate URLs to
be used into the strvec, so that the later code (i.e. the second
hunk) does not even have to worry about it?

> @@ -356,7 +360,7 @@ char *promisor_remote_info(struct repository *repo)
>  			strbuf_addch(&sb, ';');
>  		strbuf_addstr(&sb, "name=");
>  		strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized);
> -		if (urls.v[i]) {
> +		if (urls.v[i] && *urls.v[i]) {

Why does urls.v[i] need to be checked?  Didn't you just make sure
that the strvec would not have NULL in it?
Christian Couder March 11, 2025, 3:24 p.m. UTC | #2
On Mon, Mar 10, 2025 at 5:29 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a
> > kind of NULL terminated array that is designed to be compatible with
> > 'argv' variables used on the command line.
>
> It is good that you corrected a caller that tries to make a strvec
> into such a state, but shouldn't strvec_push() protect itself with a
> BUG or something, I have to wonder.

Actually strvec_push() uses xstrdup() on the value it is passed and
xstrdup() crashes if that value is NULL. So another way to avoid
crashes would be to make xstrdup() BUG when it's passed NULL. Or maybe
xstrdup() should just return NULL in this case?

Also it looks like strvec_push_nodup() kind of works if it is passed a
NULL. (It adds the NULL to the array and grows it.) So I wonder if the
right solution for strvec_push() would be to make it kind of work in
the same way.

Anyway I think these are separate issues that deserve their own
discussions and can wait for after the 2.49.0 release. Here I am just
providing a hotfix for the "promisor-remote" protocol capability.

> > So when an URL is missing from the config, let's push an empty string
> > instead of NULL into the 'strvec' that stores URLs.
>
> How will these strings in the "names" strvec used?  When URLs are
> present, I'm sure we are going to use it as URL, but when they are
> missing, what happens?

The 'names' strvec and 'urls' strvec contain what exists in the client
config. They are only used by the promisor remote code to compare the
names and maybe urls in the config with what the server advertises in
case 'promisor.acceptfromserver' is set to KnownName or KnownUrl. This
is done to decide if an advertised promisor remote is accepted or not
by the client.

When 'promisor.acceptfromserver' is set to KnownUrl, a remote should
be rejected in any of the following cases:

  - the server doesn't advertise an URL for that remote,
  - the client doesn't have an URL configured for that remote.

When 'promisor.acceptfromserver' is set to KnownName, URLs should not
be taken into account to decide if an advertised promisor remote is
accepted or not.

Note that the promisor remote code added by this series doesn't change
the code that actually uses the remote names and urls to clone or
fetch objects, so there is no change there. In particular, if the
client doesn't have an URL configured for a remote, even if the remote
is accepted and the server provides an URL, the client will not be
able to fetch or clone from the remote as it will not use the server
provided URL. The test called "clone with 'KnownName' and missing URL
in the config" shows that.

> The second hunk of this patch seems to
> ignore such a broken entry with an empty URL, but that smells like
> sweeping a problem under the rug.  Shouldn't such a promisor be
> either diagnosed as an error, or skipped when you accumulate URLs to
> be used into the strvec, so that the later code (i.e. the second
> hunk) does not even have to worry about it?

If 'promisor.acceptfromserver' is set to KnownName, I think it is
simpler to just ignore URLs altogether. Such a behavior is easier to
document and implement.

Also if we ever develop a mode where the advertised URL can be used
for cloning or fetching by the client, then it won't matter if no URL
is configured on the client side. In fact it might be the common case.

Skipping remotes with no URL would make it more difficult to explain
why a remote was rejected in the KnownUrl case. In the next version, I
have added checks and warnings to help diagnose why a remote is
rejected when a URL is missing.

I have also added a test case where the LOP URL is not configured on
the server side, so not advertised.

> > @@ -356,7 +360,7 @@ char *promisor_remote_info(struct repository *repo)
> >                       strbuf_addch(&sb, ';');
> >               strbuf_addstr(&sb, "name=");
> >               strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized);
> > -             if (urls.v[i]) {
> > +             if (urls.v[i] && *urls.v[i]) {
>
> Why does urls.v[i] need to be checked?  Didn't you just make sure
> that the strvec would not have NULL in it?

Yeah, right. I changed this to just `if (*urls.v[i])` in the next version.

Thanks for your review.
Junio C Hamano March 11, 2025, 4:57 p.m. UTC | #3
Christian Couder <christian.couder@gmail.com> writes:

> Actually strvec_push() uses xstrdup() on the value it is passed and
> xstrdup() crashes if that value is NULL. So another way to avoid
> crashes would be to make xstrdup() BUG when it's passed NULL. Or maybe
> xstrdup() should just return NULL in this case?
>
> Also it looks like strvec_push_nodup() kind of works if it is passed a
> NULL. (It adds the NULL to the array and grows it.) So I wonder if the
> right solution for strvec_push() would be to make it kind of work in
> the same way.

Meaning "xstrdup" -> "xstrdup_or_null"?  

It actually may be a reasonable thing to do, to remain parallel to
the _nodup() variant, but given strvec is about making it easier to
do argc/argv[], allowing NULL in it does not sound like it is in
line with its stated purpose.

I also do not think checking inside xstrdup() is a good idea; it is
at way too low a level.

> Anyway I think these are separate issues that deserve their own
> discussions and can wait for after the 2.49.0 release.

Absolutely.  

> Here I am just
> providing a hotfix for the "promisor-remote" protocol capability.

Ooof, I didn't realize that cc/lop-remote escaped
'seen' with such a bug X-<.

Thanks.
diff mbox series

Patch

diff --git a/promisor-remote.c b/promisor-remote.c
index 6a0a61382f..56567c6e45 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -323,11 +323,15 @@  static void promisor_info_vecs(struct repository *repo,
 	promisor_remote_init(repo);
 
 	for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
-		char *url;
+		char *url = NULL;
+		const char *url_pushed = "";
 		char *url_key = xstrfmt("remote.%s.url", r->name);
 
+		if (!git_config_get_string(url_key, &url) && url)
+			url_pushed = url;
+
 		strvec_push(names, r->name);
-		strvec_push(urls, git_config_get_string(url_key, &url) ? NULL : url);
+		strvec_push(urls, url_pushed);
 
 		free(url);
 		free(url_key);
@@ -356,7 +360,7 @@  char *promisor_remote_info(struct repository *repo)
 			strbuf_addch(&sb, ';');
 		strbuf_addstr(&sb, "name=");
 		strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized);
-		if (urls.v[i]) {
+		if (urls.v[i] && *urls.v[i]) {
 			strbuf_addstr(&sb, ",url=");
 			strbuf_addstr_urlencode(&sb, urls.v[i], allow_unsanitized);
 		}
diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh
index d2cc69a17e..d9c5676e4d 100755
--- a/t/t5710-promisor-remote-capability.sh
+++ b/t/t5710-promisor-remote-capability.sh
@@ -193,6 +193,22 @@  test_expect_success "clone with 'KnownName' and different remote names" '
 	initialize_server 1 "$oid"
 '
 
+test_expect_success "clone with 'KnownName' and missing URL in the config" '
+	git -C server config promisor.advertise true &&
+
+	# Clone from server to create a client
+	GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \
+		-c promisor.acceptfromserver=KnownName \
+		--no-local --filter="blob:limit=5k" server client &&
+	test_when_finished "rm -rf client" &&
+
+	# Check that the largest object is not missing on the server
+	check_missing_objects server 0 "" &&
+
+	# Reinitialize server so that the largest object is missing again
+	initialize_server 1 "$oid"
+'
+
 test_expect_success "clone with promisor.acceptfromserver set to 'KnownUrl'" '
 	git -C server config promisor.advertise true &&