diff mbox series

[v2,4/5] bundle-uri: add support for http(s):// and file://

Message ID e4f2dcc7a45388663aeac786e5abdcf2164cfe62.1659443384.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Bundle URIs II: git clone --bundle-uri | expand

Commit Message

Derrick Stolee Aug. 2, 2022, 12:29 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The previous change created the 'git clone --bundle-uri=<uri>' option.
Currently, <uri> must be a filename.

Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
and use git-remote-https as the way to download the data at that URI.
Otherwise, check to see if file:// is present and modify the prefix
accordingly.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c                | 70 +++++++++++++++++++++++++++++++++++--
 t/t5558-clone-bundle-uri.sh | 45 ++++++++++++++++++++++++
 2 files changed, 112 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Aug. 2, 2022, 9:32 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +static int copy_uri_to_file(const char *filename, const char *uri)
> +{
> +	const char *out;
> +
> +	if (istarts_with(uri, "https:") ||
> +	    istarts_with(uri, "http:"))

Let's be a bit more strict to avoid mistakes and make the code
immediately obvious, e.g.

	if (istarts_with(uri, "https://") ||
	    istarts_with(uri, "http://"))

> +		return download_https_uri_to_file(filename, uri);
> +
> +	if (!skip_prefix(uri, "file://", &out))
> +		out = uri;

If we are using istarts_with because URI scheme name is case
insensitive, shouldn't we do the same for "file://" URL, not
just for "http(s)://" URL?  IOW

	if (!skip_iprefix(uri, "file://", &out))

> +static int download_https_uri_to_file(const char *file, const char *uri)
>  {
> +	int result = 0;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	FILE *child_in = NULL, *child_out = NULL;
> +	struct strbuf line = STRBUF_INIT;
> +	int found_get = 0;
> +
> +	strvec_pushl(&cp.args, "git-remote-https", "origin", uri, NULL);

Does "git-remote-https" talk to a "http://" URL just fine when uri
parameter starts with "http://"?  Would it be the same if the uri
parameter begins with say "Http://"?

Other than that, looks sensible.
Derrick Stolee Aug. 4, 2022, 3:34 p.m. UTC | #2
On 8/2/2022 5:32 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> +static int copy_uri_to_file(const char *filename, const char *uri)
>> +{
>> +	const char *out;
>> +
>> +	if (istarts_with(uri, "https:") ||
>> +	    istarts_with(uri, "http:"))
> 
> Let's be a bit more strict to avoid mistakes and make the code
> immediately obvious, e.g.
> 
> 	if (istarts_with(uri, "https://") ||
> 	    istarts_with(uri, "http://"))
> 
>> +		return download_https_uri_to_file(filename, uri);
>> +
>> +	if (!skip_prefix(uri, "file://", &out))
>> +		out = uri;
> 
> If we are using istarts_with because URI scheme name is case
> insensitive, shouldn't we do the same for "file://" URL, not
> just for "http(s)://" URL?  IOW
> 
> 	if (!skip_iprefix(uri, "file://", &out))

Good ideas. Of course, we don't have a skip_iprefix(), but
I can use "istarts_with()" and then manually add the length.
If we see more need for that in the future, we can consider
adding it.

(It's interesting that these uses in bundle-uri.c are the
only uses of istarts_with() that I see in the codebase.)

>> +static int download_https_uri_to_file(const char *file, const char *uri)
>>  {
>> +	int result = 0;
>> +	struct child_process cp = CHILD_PROCESS_INIT;
>> +	FILE *child_in = NULL, *child_out = NULL;
>> +	struct strbuf line = STRBUF_INIT;
>> +	int found_get = 0;
>> +
>> +	strvec_pushl(&cp.args, "git-remote-https", "origin", uri, NULL);
> 
> Does "git-remote-https" talk to a "http://" URL just fine when uri
> parameter starts with "http://"?  Would it be the same if the uri
> parameter begins with say "Http://"?

I did a quick check of our HTTPS tests modifying the HTTPD_PROTO
variable in lib-httpd.sh to "HtTP" and we get this fun error:

+ git clone --filter=blob:limit=0 HtTP://127.0.0.1:5601/smart/server client
Cloning into 'client'...
git: 'remote-HtTP' is not a git command. See 'git --help'.

So I guess I can keep case-sensitive comparisons here.

Thanks,
-Stolee
Junio C Hamano Aug. 4, 2022, 6:19 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

>>> +	if (istarts_with(uri, "https:") ||
>>> +	    istarts_with(uri, "http:"))
>> 
>> Let's be a bit more strict to avoid mistakes and make the code
>> immediately obvious, e.g.
>> 
>> 	if (istarts_with(uri, "https://") ||
>> 	    istarts_with(uri, "http://"))

This part (i.e. check for "<scheme>://", not "<scheme>:") still
stands, as the latter could be an scp style Git URL that goes to the
host whose name is <scheme>.  As mentioned later, s/istarts/starts/
is probably a good thing to do here.

>> Does "git-remote-https" talk to a "http://" URL just fine when uri
>> parameter starts with "http://"?  Would it be the same if the uri
>> parameter begins with say "Http://"?
>
> I did a quick check of our HTTPS tests modifying the HTTPD_PROTO
> variable in lib-httpd.sh to "HtTP" and we get this fun error:
>
> + git clone --filter=blob:limit=0 HtTP://127.0.0.1:5601/smart/server client
> Cloning into 'client'...
> git: 'remote-HtTP' is not a git command. See 'git --help'.
>
> So I guess I can keep case-sensitive comparisons here.

Guarding them to lowercase-only may sound like a cop-out to purists,
but I think it is reasonable thing to do.  The only folks that would
be offended by are protocol lawyers, and as your check shows, we are
treating <scheme> case sensitively already.

An obvious alternative is to downcase the "<scheme>://" part but I
do not think it is worth it; we have to do that everywhere and we
need to be confident that we covered all the code paths---the latter
is expensive.

There do exist skip_iprefix() that are used in many places, like
convert, http, mailinfo, etc. by the way.  That is a moot point as
we are not doing case insensitive comparison anymore.

Thanks.
diff mbox series

Patch

diff --git a/bundle-uri.c b/bundle-uri.c
index b35babc36aa..930e995d194 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -23,10 +23,74 @@  static int find_temp_filename(struct strbuf *name)
 	return 0;
 }
 
-static int copy_uri_to_file(const char *file, const char *uri)
+static int download_https_uri_to_file(const char *file, const char *uri)
 {
-	/* File-based URIs only for now. */
-	return copy_file(file, uri, 0);
+	int result = 0;
+	struct child_process cp = CHILD_PROCESS_INIT;
+	FILE *child_in = NULL, *child_out = NULL;
+	struct strbuf line = STRBUF_INIT;
+	int found_get = 0;
+
+	strvec_pushl(&cp.args, "git-remote-https", "origin", uri, NULL);
+	cp.in = -1;
+	cp.out = -1;
+
+	if (start_command(&cp))
+		return 1;
+
+	child_in = fdopen(cp.in, "w");
+	if (!child_in) {
+		result = 1;
+		goto cleanup;
+	}
+
+	child_out = fdopen(cp.out, "r");
+	if (!child_out) {
+		result = 1;
+		goto cleanup;
+	}
+
+	fprintf(child_in, "capabilities\n");
+	fflush(child_in);
+
+	while (!strbuf_getline(&line, child_out)) {
+		if (!line.len)
+			break;
+		if (!strcmp(line.buf, "get"))
+			found_get = 1;
+	}
+	strbuf_release(&line);
+
+	if (!found_get) {
+		result = error(_("insufficient capabilities"));
+		goto cleanup;
+	}
+
+	fprintf(child_in, "get %s %s\n\n", uri, file);
+
+cleanup:
+	if (child_in)
+		fclose(child_in);
+	if (finish_command(&cp))
+		return 1;
+	if (child_out)
+		fclose(child_out);
+	return result;
+}
+
+static int copy_uri_to_file(const char *filename, const char *uri)
+{
+	const char *out;
+
+	if (istarts_with(uri, "https:") ||
+	    istarts_with(uri, "http:"))
+		return download_https_uri_to_file(filename, uri);
+
+	if (!skip_prefix(uri, "file://", &out))
+		out = uri;
+
+	/* Copy as a file */
+	return copy_file(filename, out, 0);
 }
 
 static int unbundle_from_file(struct repository *r, const char *file)
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index f709bcb729c..ad666a2d28a 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -33,4 +33,49 @@  test_expect_success 'clone with path bundle' '
 	test_cmp expect actual
 '
 
+test_expect_success 'clone with file:// bundle' '
+	git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
+		clone-from clone-file &&
+	git -C clone-file rev-parse refs/bundles/topic >actual &&
+	git -C clone-from rev-parse topic >expect &&
+	test_cmp expect actual
+'
+
+#########################################################################
+# HTTP tests begin here
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'fail to fetch from non-existent HTTP URL' '
+	test_when_finished rm -rf test &&
+	git clone --bundle-uri="$HTTPD_URL/does-not-exist" . test 2>err &&
+	grep "failed to download bundle from URI" err
+'
+
+test_expect_success 'fail to fetch from non-bundle HTTP URL' '
+	test_when_finished rm -rf test &&
+	echo bogus >"$HTTPD_DOCUMENT_ROOT_PATH/bogus" &&
+	git clone --bundle-uri="$HTTPD_URL/bogus" . test 2>err &&
+	grep "is not a bundle" err
+'
+
+test_expect_success 'clone HTTP bundle' '
+	cp clone-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
+
+	git clone --no-local --mirror clone-from \
+		"$HTTPD_DOCUMENT_ROOT_PATH/fetch.git" &&
+
+	git clone --bundle-uri="$HTTPD_URL/B.bundle" \
+		"$HTTPD_URL/smart/fetch.git" clone-http &&
+	git -C clone-http rev-parse refs/bundles/topic >actual &&
+	git -C clone-from rev-parse topic >expect &&
+	test_cmp expect actual &&
+
+	test_config -C clone-http log.excludedecoration refs/bundle/
+'
+
+# Do not add tests here unless they use the HTTP server, as they will
+# not run unless the HTTP dependencies exist.
+
 test_done