[v3,09/11] bundle-uri: allow relative URLs in bundle lists

Message ID: 48731438d6a289129a5768b80af856fb49692426.1670262639.git.gitgitgadget@gmail.com
State: Superseded
Commit: f6305ed17eaa2b52dfa598a58cad0ec9eb06a964
Series: Bundle URIs IV: advertise over protocol v2

Commit Message

Derrick Stolee Dec. 5, 2022, 5:50 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

Bundle providers may want to distribute bundle data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle-uri.c                | 16 +++++++-
 bundle-uri.h                | 14 +++++++
 t/helper/test-bundle-uri.c  |  2 +
 t/t5750-bundle-uri-parse.sh | 82 +++++++++++++++++++++++++++++++++++++
 transport.c                 |  3 ++
 5 files changed, 116 insertions(+), 1 deletion(-)
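
For readers following along, here is a minimal sketch of the base-URI derivation the commit message describes for a list fetched over HTTP: drop the filename portion of the list's URI so that relative bundle URIs resolve next to the list. The function name base_uri_from_list_uri() and the example URLs are assumptions made for illustration; the patch itself uses strbuf_strip_file_from_path() on a strbuf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): keep everything up to
 * and including the last '/' of the list URI. A URI that already ends
 * in '/' comes back unchanged. The caller frees the result.
 *
 * Assumes the URI has a path component; a bare "https://host" is not
 * handled specially in this sketch.
 */
static char *base_uri_from_list_uri(const char *list_uri)
{
	const char *slash;
	size_t len;
	char *base;

	assert(list_uri);

	slash = strrchr(list_uri, '/');
	len = slash ? (size_t)(slash - list_uri) + 1 : 0;

	base = malloc(len + 1);
	if (!base)
		return NULL;
	memcpy(base, list_uri, len);
	base[len] = '\0';
	return base;
}
```

Under these assumptions, a list served from "https://cdn.example.com/repo/list" would get the base "https://cdn.example.com/repo/", so a bundle advertised as "bundle.bdl" is looked up in the same directory as the list.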

Comments

Victoria Dye Dec. 5, 2022, 11:33 p.m. UTC | #1
Derrick Stolee via GitGitGadget wrote:
> Allow a bundle list to specify a relative URI for the bundles. This URI
> is based on where the client received the bundle list. For a list
> provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
> the base URI. Otherwise, the bundle list was provided from an HTTP URI
> not using the Git protocol, and that URI is the base URI. This allows
> easier distribution of bundle data.

Thanks, this clears up my confusion about the source of 'baseURI'.

> +	/**
> +	 * The baseURI of a bundle_list is the URI that provided the list.
> +	 *
> +	 * In the case of the 'bundle-uri' protocol v2 command, the base
> +	 * URI is the URI of the Git remote.
> +	 *
> +	 * Otherewise, the bundle list was downloaded over HTTP from some
> +	 * known URI.

s/Otherewise/Otherwise

Also, this sentence is a bit more vague than what was noted in the commit
message; it doesn't actually say what the base URI is set to in this
scenario. Feel free to ignore if you think it's overkill, but that could
probably be cleared up by adding another sentence after like "The base URI
is set to that known URI."

> +	 *
> +	 * The baseURI is used as the base for any relative URIs
> +	 * advertised by the bundle list at that location.
> +	 */
> +	char *baseURI;

...

> +	# TODO: We would prefer if parsing a bundle list would not cause
> +	# a die() and instead would give a warning and allow the rest of
> +	# a Git command to continue. This test_must_fail is necessary for
> +	# now until the interface for relative_url() allows for reporting
> +	# an error instead of die()ing.
> +	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
> +	grep "fatal: cannot strip one component off url" err

Thanks for adding this, I'm content to leave this as a TODO for now.

Derrick Stolee Dec. 7, 2022, 3:22 p.m. UTC | #2
On 12/5/2022 6:33 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> Allow a bundle list to specify a relative URI for the bundles. This URI
>> is based on where the client received the bundle list. For a list
>> provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
>> the base URI. Otherwise, the bundle list was provided from an HTTP URI
>> not using the Git protocol, and that URI is the base URI. This allows
>> easier distribution of bundle data.
> 
> Thanks, this clears up my confusion about the source of 'baseURI'.
> 
>> +	/**
>> +	 * The baseURI of a bundle_list is the URI that provided the list.
>> +	 *
>> +	 * In the case of the 'bundle-uri' protocol v2 command, the base
>> +	 * URI is the URI of the Git remote.
>> +	 *
>> +	 * Otherewise, the bundle list was downloaded over HTTP from some
>> +	 * known URI.
> 
> s/Otherewise/Otherwise
> 
> Also, this sentence is a bit more vague than what was noted in the commit
> message; it doesn't actually say what the base URI is set to in this
> scenario. Feel free to ignore if you think it's overkill, but that could
> probably be cleared up by adding another sentence after like "The base URI
> is set to that known URI."

Thanks for both of these suggestions.

-Stolee
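
As an illustration of the TODO discussed above (reporting an error instead of die()ing when a relative URI climbs too far), here is a rough sketch of a resolver that returns NULL on failure. It is not Git's relative_url() and does not reproduce its exact results (for example, it never strips into a "scheme://host" prefix); the name resolve_relative_sketch() and the example URLs are assumptions for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: resolve 'rel' against 'base' by consuming
 * leading "./" and "../" components. Returns a newly allocated string,
 * or NULL when "../" would climb above the base (or on allocation
 * failure), leaving the decision to warn or abort to the caller.
 */
static char *resolve_relative_sketch(const char *base, const char *rel)
{
	const char *scheme;
	size_t min_len, len, rel_len;
	char *out;

	assert(base && rel);

	/* Never strip into the "scheme://" prefix, if there is one. */
	scheme = strstr(base, "://");
	min_len = scheme ? (size_t)(scheme - base) + 3 : 0;

	/* Ignore trailing slashes on the base. */
	len = strlen(base);
	while (len > min_len && base[len - 1] == '/')
		len--;

	for (;;) {
		if (!strncmp(rel, "./", 2)) {
			rel += 2;
		} else if (!strncmp(rel, "../", 3)) {
			size_t i = len;

			/* Find the start of the last component of the base. */
			while (i > min_len && base[i - 1] != '/')
				i--;
			if (i <= min_len)
				return NULL; /* nothing left to strip */
			len = i - 1; /* also drop the separating '/' */
			rel += 3;
		} else {
			break;
		}
	}

	rel_len = strlen(rel);
	out = malloc(len + 1 + rel_len + 1);
	if (!out)
		return NULL;
	memcpy(out, base, len);
	out[len] = '/';
	memcpy(out + len + 1, rel, rel_len + 1); /* copies the NUL too */
	return out;
}
```

With a NULL return, a caller such as the bundle list parser could skip the offending entry with a warning and let the rest of the command continue, which is the behavior the TODO asks for.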

Patch

diff --git a/bundle-uri.c b/bundle-uri.c
index 6919f541085..80370992773 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -7,6 +7,7 @@ 
 #include "hashmap.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "remote.h"
 
 static int compare_bundles(const void *hashmap_cmp_fn_data,
 			   const struct hashmap_entry *he1,
@@ -49,6 +50,7 @@  void clear_bundle_list(struct bundle_list *list)
 
 	for_all_bundles_in_list(list, clear_remote_bundle_info, NULL);
 	hashmap_clear_and_free(&list->bundles, struct remote_bundle_info, ent);
+	free(list->baseURI);
 }
 
 int for_all_bundles_in_list(struct bundle_list *list,
@@ -163,7 +165,7 @@  static int bundle_list_update(const char *key, const char *value,
 	if (!strcmp(subkey, "uri")) {
 		if (bundle->uri)
 			return -1;
-		bundle->uri = xstrdup(value);
+		bundle->uri = relative_url(list->baseURI, value, NULL);
 		return 0;
 	}
 
@@ -190,6 +192,18 @@  int bundle_uri_parse_config_format(const char *uri,
 		.error_action = CONFIG_ERROR_ERROR,
 	};
 
+	if (!list->baseURI) {
+		struct strbuf baseURI = STRBUF_INIT;
+		strbuf_addstr(&baseURI, uri);
+
+		/*
+		 * If the URI does not end with a trailing slash, then
+		 * remove the filename portion of the path. This is
+		 * important for relative URIs.
+		 */
+		strbuf_strip_file_from_path(&baseURI);
+		list->baseURI = strbuf_detach(&baseURI, NULL);
+	}
 	result = git_config_from_file_with_options(config_to_bundle_list,
 						   filename, list,
 						   &opts);
diff --git a/bundle-uri.h b/bundle-uri.h
index 357111ecce8..e7e90a5f088 100644
--- a/bundle-uri.h
+++ b/bundle-uri.h
@@ -61,6 +61,20 @@  struct bundle_list {
 	int version;
 	enum bundle_list_mode mode;
 	struct hashmap bundles;
+
+	/**
+	 * The baseURI of a bundle_list is the URI that provided the list.
+	 *
+	 * In the case of the 'bundle-uri' protocol v2 command, the base
+	 * URI is the URI of the Git remote.
+	 *
+	 * Otherewise, the bundle list was downloaded over HTTP from some
+	 * known URI.
+	 *
+	 * The baseURI is used as the base for any relative URIs
+	 * advertised by the bundle list at that location.
+	 */
+	char *baseURI;
 };
 
 void init_bundle_list(struct bundle_list *list);
diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c
index f8159187014..5df5bc3b89e 100644
--- a/t/helper/test-bundle-uri.c
+++ b/t/helper/test-bundle-uri.c
@@ -40,6 +40,8 @@  static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
 
 	init_bundle_list(&list);
 
+	list.baseURI = xstrdup("<uri>");
+
 	switch (mode) {
 	case KEY_VALUE_PAIRS:
 		if (argc != 1)
diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh
index c2fe3f9c5a5..7b4f930e532 100755
--- a/t/t5750-bundle-uri-parse.sh
+++ b/t/t5750-bundle-uri-parse.sh
@@ -30,6 +30,58 @@  test_expect_success 'bundle_uri_parse_line() just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'bundle_uri_parse_line(): relative URIs' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-key-values in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
+test_expect_success 'bundle_uri_parse_line(): relative URIs and parent paths' '
+	cat >in <<-\EOF &&
+	bundle.one.uri=bundle.bdl
+	bundle.two.uri=../bundle.bdl
+	bundle.three.uri=../../bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/../bundle.bdl
+	EOF
+
+	# TODO: We would prefer if parsing a bundle list would not cause
+	# a die() and instead would give a warning and allow the rest of
+	# a Git command to continue. This test_must_fail is necessary for
+	# now until the interface for relative_url() allows for reporting
+	# an error instead of die()ing.
+	test_must_fail test-tool bundle-uri parse-key-values in >actual 2>err &&
+	grep "fatal: cannot strip one component off url" err
+'
+
 test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty key or value' '
 	cat >in <<-\EOF &&
 	=bogus-value
@@ -136,6 +188,36 @@  test_expect_success 'parse config format: just URIs' '
 	test_cmp_config_output expect actual
 '
 
+test_expect_success 'parse config format: relative URIs' '
+	cat >in <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = bundle.bdl
+	[bundle "two"]
+		uri = ../bundle.bdl
+	[bundle "three"]
+		uri = sub/dir/bundle.bdl
+	EOF
+
+	cat >expect <<-\EOF &&
+	[bundle]
+		version = 1
+		mode = all
+	[bundle "one"]
+		uri = <uri>/bundle.bdl
+	[bundle "two"]
+		uri = bundle.bdl
+	[bundle "three"]
+		uri = <uri>/sub/dir/bundle.bdl
+	EOF
+
+	test-tool bundle-uri parse-config in >actual 2>err &&
+	test_must_be_empty err &&
+	test_cmp_config_output expect actual
+'
+
 test_expect_success 'parse config format edge cases: empty key or value' '
 	cat >in1 <<-\EOF &&
 	= bogus-value
diff --git a/transport.c b/transport.c
index 97d395e10a3..957dca4923c 100644
--- a/transport.c
+++ b/transport.c
@@ -1539,6 +1539,9 @@  int transport_get_remote_bundle_uri(struct transport *transport)
 	    (git_config_get_bool("transfer.bundleuri", &value) || !value))
 		return 0;
 
+	if (!transport->bundles->baseURI)
+		transport->bundles->baseURI = xstrdup(transport->url);
+
 	if (!vtable->get_bundle_uri)
 		return error(_("bundle-uri operation not supported by protocol"));