Message ID | 20250310074053.1886097-1-christian.couder@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | promisor-remote: fix segfault when remote URL is missing | expand |
Christian Couder <christian.couder@gmail.com> writes: > Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a > kind of NULL terminated array that is designed to be compatible with > 'argv' variables used on the command line. It is good that you corrected a caller that tries to make a strvec into such a state, but shouldn't strvec_push() protect itself with a BUG or something, I have to wonder. > So when an URL is missing from the config, let's push an empty string > instead of NULL into the 'strvec' that stores URLs. How will these strings in the "names" strvec used? When URLs are present, I'm sure we are going to use it as URL, but when they are missing, what happens? The second hunk of this patch seems to ignore such a broken entry with an empty URL, but that smells like sweeping a problem under the rug. Shouldn't such a promisor be either diagnosed as an error, or skipped when you accumulate URLs to be used into the strvec, so that the later code (i.e. the second hunk) does not even have to worry about it? > @@ -356,7 +360,7 @@ char *promisor_remote_info(struct repository *repo) > strbuf_addch(&sb, ';'); > strbuf_addstr(&sb, "name="); > strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized); > - if (urls.v[i]) { > + if (urls.v[i] && *urls.v[i]) { Why does urls.v[i] need to be checked? Didn't you just make sure that the strvec would not have NULL in it?
On Mon, Mar 10, 2025 at 5:29 PM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a > > kind of NULL terminated array that is designed to be compatible with > > 'argv' variables used on the command line. > > It is good that you corrected a caller that tries to make a strvec > into such a state, but shouldn't strvec_push() protect itself with a > BUG or something, I have to wonder. Actually strvec_push() uses xstrdup() on the value it is passed and xstrdup() crashes if that value is NULL. So another way to avoid crashes would be to make xstrdup() BUG when it's passed NULL. Or maybe xstrdup() should just return NULL in this case? Also it looks like strvec_push_nodup() kind of works if it is passed a NULL. (It adds the NULL to the array and grows it.) So I wonder if the right solution for strvec_push() would be to make it kind of work in the same way. Anyway I think these are separate issues that deserve their own discussions and can wait for after the 2.49.0 release. Here I am just providing a hotfix for the "promisor-remote" protocol capability. > > So when an URL is missing from the config, let's push an empty string > > instead of NULL into the 'strvec' that stores URLs. > > How will these strings in the "names" strvec used? When URLs are > present, I'm sure we are going to use it as URL, but when they are > missing, what happens? The 'names' strvec and 'urls' strvec contain what exists in the client config. They are only used by the promisor remote code to compare the names and maybe urls in the config with what the server advertises in case 'promisor.acceptfromserver' is set to KnownName or KnownUrl. This is done to decide if an advertised promisor remote is accepted or not by the client. When 'promisor.acceptfromserver' is set to KnownUrl, a remote should be rejected in any of the following cases: - the server doesn't advertise an URL for that remote, - the client doesn't have an URL configured for that remote. When 'promisor.acceptfromserver' is set to KnownName, URLs should not be taken into account to decide if an advertised promisor remote is accepted or not. Note that the promisor remote code added by this series doesn't change the code that actually uses the remote names and urls to clone or fetch objects, so there is no change there. In particular, if the client doesn't have an URL configured for a remote, even if the remote is accepted and the server provides an URL, the client will not be able to fetch or clone from the remote as it will not use the server provided URL. The test called "clone with 'KnownName' and missing URL in the config" shows that. > The second hunk of this patch seems to > ignore such a broken entry with an empty URL, but that smells like > sweeping a problem under the rug. Shouldn't such a promisor be > either diagnosed as an error, or skipped when you accumulate URLs to > be used into the strvec, so that the later code (i.e. the second > hunk) does not even have to worry about it? If 'promisor.acceptfromserver' is set to KnownName, I think it is simpler to just ignore URLs altogether. Such a behavior is easier to document and implement. Also if we ever develop a mode where the advertised URL can be used for cloning or fetching by the client, then it won't matter if no URL is configured on the client side. In fact it might be the common case. Skipping remotes with no URL would make it more difficult to explain why a remote was rejected in the KnownUrl case. In the next version, I have added checks and warnings to help diagnose why a remote is rejected when a URL is missing. I have also added a test case where the LOP URL is not configured on the server side, so not advertised. > > @@ -356,7 +360,7 @@ char *promisor_remote_info(struct repository *repo) > > strbuf_addch(&sb, ';'); > > strbuf_addstr(&sb, "name="); > > strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized); > > - if (urls.v[i]) { > > + if (urls.v[i] && *urls.v[i]) { > > Why does urls.v[i] need to be checked? Didn't you just make sure > that the strvec would not have NULL in it? Yeah, right. I changed this to just `if (*urls.v[i])` in the next version. Thanks for your review.
Christian Couder <christian.couder@gmail.com> writes: > Actually strvec_push() uses xstrdup() on the value it is passed and > xstrdup() crashes if that value is NULL. So another way to avoid > crashes would be to make xstrdup() BUG when it's passed NULL. Or maybe > xstrdup() should just return NULL in this case? > > Also it looks like strvec_push_nodup() kind of works if it is passed a > NULL. (It adds the NULL to the array and grows it.) So I wonder if the > right solution for strvec_push() would be to make it kind of work in > the same way. Meaning "xstrdup" -> "xstrdup_or_null"? It actually may be a reasonable thing to do, to remain parallel to the _nodup() variant, but given strvec is about making it easier to do argc/argv[], allowing NULL in it does not sound like it is in line with its stated purpose. I also do not think checking inside xstrdup() is a good idea; it is at way too low a level. > Anyway I think these are separate issues that deserve their own > discussions and can wait for after the 2.49.0 release. Absolutely. > Here I am just > providing a hotfix for the "promisor-remote" protocol capability. Ooof, I didn't realize that cc/lop-remote escaped 'seen' with such a bug X-<. Thanks.
diff --git a/promisor-remote.c b/promisor-remote.c index 6a0a61382f..56567c6e45 100644 --- a/promisor-remote.c +++ b/promisor-remote.c @@ -323,11 +323,15 @@ static void promisor_info_vecs(struct repository *repo, promisor_remote_init(repo); for (r = repo->promisor_remote_config->promisors; r; r = r->next) { - char *url; + char *url = NULL; + const char *url_pushed = ""; char *url_key = xstrfmt("remote.%s.url", r->name); + if (!git_config_get_string(url_key, &url) && url) + url_pushed = url; + strvec_push(names, r->name); - strvec_push(urls, git_config_get_string(url_key, &url) ? NULL : url); + strvec_push(urls, url_pushed); free(url); free(url_key); @@ -356,7 +360,7 @@ char *promisor_remote_info(struct repository *repo) strbuf_addch(&sb, ';'); strbuf_addstr(&sb, "name="); strbuf_addstr_urlencode(&sb, names.v[i], allow_unsanitized); - if (urls.v[i]) { + if (urls.v[i] && *urls.v[i]) { strbuf_addstr(&sb, ",url="); strbuf_addstr_urlencode(&sb, urls.v[i], allow_unsanitized); } diff --git a/t/t5710-promisor-remote-capability.sh b/t/t5710-promisor-remote-capability.sh index d2cc69a17e..d9c5676e4d 100755 --- a/t/t5710-promisor-remote-capability.sh +++ b/t/t5710-promisor-remote-capability.sh @@ -193,6 +193,22 @@ test_expect_success "clone with 'KnownName' and different remote names" ' initialize_server 1 "$oid" ' +test_expect_success "clone with 'KnownName' and missing URL in the config" ' + git -C server config promisor.advertise true && + + # Clone from server to create a client + GIT_NO_LAZY_FETCH=0 git clone -c remote.lop.promisor=true \ + -c promisor.acceptfromserver=KnownName \ + --no-local --filter="blob:limit=5k" server client && + test_when_finished "rm -rf client" && + + # Check that the largest object is not missing on the server + check_missing_objects server 0 "" && + + # Reinitialize server so that the largest object is missing again + initialize_server 1 "$oid" +' + test_expect_success "clone with promisor.acceptfromserver set to 'KnownUrl'" ' git -C server config promisor.advertise true &&
Pushing NULL into a 'strvec' results in a segfault, as 'strvec' is a kind of NULL terminated array that is designed to be compatible with 'argv' variables used on the command line. So when an URL is missing from the config, let's push an empty string instead of NULL into the 'strvec' that stores URLs. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> --- This is a fix for cc/lop-remote. Sorry for sending it late in the cycle. promisor-remote.c | 10 +++++++--- t/t5710-promisor-remote-capability.sh | 16 ++++++++++++++++ 2 files changed, 23 insertions(+), 3 deletions(-)