Message ID | 1d1bd9c710327b4d705cfede017771da7fb6ec52.1661181174.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Bundle URIs III: Parse and download from bundle lists | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > To allow for the incremental fetch case, teach Git to understand a > bundle list that could be advertised at an independent bundle URI. Such > a bundle list is likely to be inspected by human readers, even if only > by the bundle provider creating the list. For this reason, we can take > our expected "key=value" pairs and instead format them using Git config > format. "can" does not explain why it is a good idea. "As a sequence of key=value pairs is a lot more dense and harder to read than the configuration file format, let's declare that it is the format we use in a file that holds a bundle-list" would be. I do not personally buy it, though. As I hinted in an earlier step, some trait we associate with our configuration fioe format, like the "last one wins" semantics, are undesirable ones, so even if we reuse the appearance of the text, the semantics would have to become different (including "syntax errors lead to die()" mentioned elsewhere in the proposed log message). > Update 'test-tool bundle-uri' to take this config file format as input. > It uses a filename instead of stdin because there is no existing way to > parse a FILE pointer in the config machinery. Using > git_config_from_mem() is overly complicated and more likely to introduce > bugs than this simpler version. I would rather have a slightly confusing > test helper than complicated product code. All the troubles described above seem to come from the initial mistake to try reusing the configuration file parser or reusing the configuration file format, at least to me.
On 8/22/2022 3:25 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> To allow for the incremental fetch case, teach Git to understand a >> bundle list that could be advertised at an independent bundle URI. Such >> a bundle list is likely to be inspected by human readers, even if only >> by the bundle provider creating the list. For this reason, we can take >> our expected "key=value" pairs and instead format them using Git config >> format. > > "can" does not explain why it is a good idea. "As a sequence of > key=value pairs is a lot more dense and harder to read than the > configuration file format, let's declare that it is the format we > use in a file that holds a bundle-list" would be. This "more dense and harder to read" was definitely my intention for wanting a different format. > I do not personally buy it, though. As I hinted in an earlier step, > some trait we associate with our configuration fioe format, like the > "last one wins" semantics, are undesirable ones, so even if we reuse > the appearance of the text, the semantics would have to become > different (including "syntax errors lead to die()" mentioned > elsewhere in the proposed log message). The points you made earlier about "last one wins" semantics are the biggest road-blocks to using the config file format, from what I've read so far. We could change those semantics to be different from my current implementation which respects the "last one wins" rule, and then that makes the config format match not as closely. That burden of avoiding multiple key values is not on the end-user but the bundle provider to match the new expectations. (There might be something we should be careful about when advertising the bundle list from our Git config in the 'bundle-uri' command in the next series.) The "syntax errors lead to die()" is mitigated by using CONFIG_ERROR_ERROR, which is what I meant by "Be careful to call..." I should have been more clear that we are _not_ going to die() based on the remote data. We might write an error message and then abort the bundle download. With all of these points in mind, I'd still prefer to use the config file format as described in the design document. If you still don't agree, then I'll change the format to be key=value pairs split with newlines, and update the design document accordingly. Thanks, -Stolee
Junio C Hamano <gitster@pobox.com> writes: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > To allow for the incremental fetch case, teach Git to understand a > > bundle list that could be advertised at an independent bundle URI. Such > > a bundle list is likely to be inspected by human readers, even if only > > by the bundle provider creating the list. For this reason, we can take > > our expected "key=value" pairs and instead format them using Git config > > format. > > "can" does not explain why it is a good idea. "As a sequence of > key=value pairs is a lot more dense and harder to read than the > configuration file format, let's declare that it is the format we > use in a file that holds a bundle-list" would be. > > I do not personally buy it, though. As I hinted in an earlier step, > some trait we associate with our configuration fioe format, like the > "last one wins" semantics, are undesirable ones, so even if we reuse > the appearance of the text, the semantics would have to become > different (including "syntax errors lead to die()" mentioned > elsewhere in the proposed log message). One reason for using the configuration file format (which perhaps could have been better explained in the commit message) is that we plan to have a way for a repo to advertise a list of bundles during fetch. I think that config is a natural place to put that, even with its "last one wins" semantics. It could be argued that we can just put a single URI in config and only allow advertising of a single URI (and then use a different format for the bundle lists with semantics that are stricter than "last one wins"), but that seems unnecessarily restrictive (and would make the client make one more network request). And if we're advertising multiple bundles, it seems reasonable to make all bundle lists have the same format (whether they are in config or in a separate file).
Derrick Stolee <derrickstolee@github.com> writes: > diff --git a/bundle-uri.h b/bundle-uri.h > index 41a1510a4ac..294ac804140 100644 > --- a/bundle-uri.h > +++ b/bundle-uri.h > @@ -71,6 +71,16 @@ int for_all_bundles_in_list(struct bundle_list *list, > struct FILE; > void print_bundle_list(FILE *fp, struct bundle_list *list); > > +/** > + * A bundle URI may point to a bundle list where the key=value > + * pairs are provided in config file format. This method is > + * exposed publicly for testing purposes. > + */ > + > +int parse_bundle_list_in_config_format(const char *uri, > + const char *filename, > + struct bundle_list *list); > + Although the comment clarifies the purpose of why to introduce "parse_bundle_list_in_config_format", but I think this API is useful if finally config format is supported. So far, we have a API names "bundle_uri_parse_line" which is used to parsing key-value pairs and package into bundle list, I think maybe we should rename the API name from "parse_bundle_list_in_config_format" to "bundle_uri_parse_config_format", maybe better in my opinion for more consistent naming. I think it doesnt break anything, feel free to accept or remain. Thanks.
diff --git a/bundle-uri.c b/bundle-uri.c index d56c5e33d5f..dca88ed1e89 100644 --- a/bundle-uri.c +++ b/bundle-uri.c @@ -6,6 +6,7 @@ #include "run-command.h" #include "hashmap.h" #include "pkt-line.h" +#include "config.h" static int compare_bundles(const void *hashmap_cmp_fn_data, const struct hashmap_entry *he1, @@ -172,6 +173,34 @@ static int bundle_list_update(const char *key, const char *value, return 0; } +static int config_to_bundle_list(const char *key, const char *value, void *data) +{ + struct bundle_list *list = data; + return bundle_list_update(key, value, list); +} + +int parse_bundle_list_in_config_format(const char *uri, + const char *filename, + struct bundle_list *list) +{ + int result; + struct config_options opts = { + .error_action = CONFIG_ERROR_ERROR, + }; + + list->mode = BUNDLE_MODE_NONE; + result = git_config_from_file_with_options(config_to_bundle_list, + filename, list, + &opts); + + if (!result && list->mode == BUNDLE_MODE_NONE) { + warning(_("bundle list at '%s' has no mode"), uri); + result = 1; + } + + return result; +} + static int find_temp_filename(struct strbuf *name) { int fd; diff --git a/bundle-uri.h b/bundle-uri.h index 41a1510a4ac..294ac804140 100644 --- a/bundle-uri.h +++ b/bundle-uri.h @@ -71,6 +71,16 @@ int for_all_bundles_in_list(struct bundle_list *list, struct FILE; void print_bundle_list(FILE *fp, struct bundle_list *list); +/** + * A bundle URI may point to a bundle list where the key=value + * pairs are provided in config file format. This method is + * exposed publicly for testing purposes. + */ + +int parse_bundle_list_in_config_format(const char *uri, + const char *filename, + struct bundle_list *list); + /** * Fetch data from the given 'uri' and unbundle the bundle data found * based on that information. diff --git a/t/helper/test-bundle-uri.c b/t/helper/test-bundle-uri.c index 5cb0c9196fa..23ce0eebca3 100644 --- a/t/helper/test-bundle-uri.c +++ b/t/helper/test-bundle-uri.c @@ -4,27 +4,52 @@ #include "strbuf.h" #include "string-list.h" -static int cmd__bundle_uri_parse_key_values(int argc, const char **argv) +enum input_mode { + KEY_VALUE_PAIRS, + CONFIG_FILE, +}; + +static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mode) { - const char *usage[] = { + const char *key_value_usage[] = { "test-tool bundle-uri parse-key-values <in", NULL }; + const char *config_usage[] = { + "test-tool bundle-uri parse-config <input>", + NULL + }; struct option options[] = { OPT_END(), }; + const char **usage = key_value_usage; struct strbuf sb = STRBUF_INIT; struct bundle_list list; int err = 0; - argc = parse_options(argc, argv, NULL, options, usage, 0); - if (argc) - goto usage; + if (mode == CONFIG_FILE) + usage = config_usage; + + argc = parse_options(argc, argv, NULL, options, usage, + PARSE_OPT_STOP_AT_NON_OPTION); init_bundle_list(&list); - while (strbuf_getline(&sb, stdin) != EOF) { - if (bundle_uri_parse_line(&list, sb.buf) < 0) - err = error("bad line: '%s'", sb.buf); + + switch (mode) { + case KEY_VALUE_PAIRS: + if (argc) + goto usage; + while (strbuf_getline(&sb, stdin) != EOF) { + if (bundle_uri_parse_line(&list, sb.buf) < 0) + err = error("bad line: '%s'", sb.buf); + } + break; + + case CONFIG_FILE: + if (argc != 1) + goto usage; + err = parse_bundle_list_in_config_format("<uri>", argv[0], &list); + break; } strbuf_release(&sb); @@ -55,7 +80,9 @@ int cmd__bundle_uri(int argc, const char **argv) goto usage; if (!strcmp(argv[1], "parse-key-values")) - return cmd__bundle_uri_parse_key_values(argc - 1, argv + 1); + return cmd__bundle_uri_parse(argc - 1, argv + 1, KEY_VALUE_PAIRS); + if (!strcmp(argv[1], "parse-config")) + return cmd__bundle_uri_parse(argc - 1, argv + 1, CONFIG_FILE); error("there is no test-tool bundle-uri tool '%s'", argv[1]); usage: diff --git a/t/t5750-bundle-uri-parse.sh b/t/t5750-bundle-uri-parse.sh index 675c1f1d2f4..dd9dc36bfd7 100755 --- a/t/t5750-bundle-uri-parse.sh +++ b/t/t5750-bundle-uri-parse.sh @@ -88,4 +88,54 @@ test_expect_success 'bundle_uri_parse_line() parsing edge cases: empty lines' ' test_cmp_config_output expect actual ' +test_expect_success 'parse config format: just URIs' ' + cat >expect <<-\EOF && + [bundle] + version = 1 + mode = all + [bundle "one"] + uri = http://example.com/bundle.bdl + [bundle "two"] + uri = https://example.com/bundle.bdl + [bundle "three"] + uri = file:///usr/share/git/bundle.bdl + EOF + + test-tool bundle-uri parse-config expect >actual 2>err && + test_must_be_empty err && + test_cmp_config_output expect actual +' + +test_expect_success 'parse config format edge cases: empty key or value' ' + cat >in1 <<-\EOF && + = bogus-value + EOF + + cat >err1 <<-EOF && + error: bad config line 1 in file in1 + EOF + + cat >expect <<-\EOF && + [bundle] + version = 1 + mode = <unknown> + EOF + + test_must_fail test-tool bundle-uri parse-config in1 >actual 2>err && + test_cmp err1 err && + test_cmp_config_output expect actual && + + cat >in2 <<-\EOF && + bogus-key = + EOF + + cat >err2 <<-EOF && + warning: bundle list at '\''<uri>'\'' has no mode + EOF + + test_must_fail test-tool bundle-uri parse-config in2 >actual 2>err && + test_cmp err2 err && + test_cmp_config_output expect actual +' + test_done