Message ID | 025f38290f5a705c80854a42e1abcaba9a9f336d.1646750359.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Partial bundles | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > diff --git a/bundle.h b/bundle.h > index 06009fe6b1f..7fef2108f43 100644 > --- a/bundle.h > +++ b/bundle.h > @@ -4,12 +4,14 @@ > #include "strvec.h" > #include "cache.h" > #include "string-list.h" > +#include "list-objects-filter-options.h" > > struct bundle_header { > unsigned version; > struct string_list prerequisites; > struct string_list references; > const struct git_hash_algo *hash_algo; > + struct list_objects_filter_options filter; > }; This used to be a pointer to the struct, with "NULL means do not filter" semantics, with .nr==0 as BUG(). Which was the same justification used when an earlier step added a pointer to the filter struct to rev_info. Should the same logic applies there to make it into an embedded struct in rev_info as well?
On Tue, Mar 08 2022, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <derrickstolee@github.com> > [...] > static const char v2_bundle_signature[] = "# v2 git bundle\n"; > static const char v3_bundle_signature[] = "# v3 git bundle\n"; > @@ -33,6 +33,7 @@ void bundle_header_release(struct bundle_header *header) > { > string_list_clear(&header->prerequisites, 1); > string_list_clear(&header->references, 1); > + list_objects_filter_release(&header->filter); > } > > static int parse_capability(struct bundle_header *header, const char *capability) > @@ -45,6 +46,10 @@ static int parse_capability(struct bundle_header *header, const char *capability > header->hash_algo = &hash_algos[algo]; > return 0; > } > + if (skip_prefix(capability, "filter=", &arg)) { > + parse_list_objects_filter(&header->filter, arg); > + return 0; > + } > return error(_("unknown capability '%s'"), capability); > } I haven't tested, but I did wonder if our purely "check reachability" (or equivalent) "verify" would be slowed down by doing whatever filter magic we're doing here, but then remembered/saw that we only parse the header, so it can't be that bad :) I.e. this is only checking the syntax of the filter, surely, and then spits it back at us. That makes sense. I think that this hunk from the subsequent 10/12 is in the wrong place though, and should be here when we change "verify" (not "create" later): diff --git a/Documentation/git-bundle.txt b/Documentation/git-bundle.txt index 72ab8139052..831c4788a94 100644 --- a/Documentation/git-bundle.txt +++ b/Documentation/git-bundle.txt @@ -75,8 +75,8 @@ verify <file>:: cleanly to the current repository. This includes checks on the bundle format itself as well as checking that the prerequisite commits exist and are fully linked in the current repository. - 'git bundle' prints a list of missing commits, if any, and exits - with a non-zero status. + 'git bundle' prints the bundle's object filter and its list of + missing commits, if any, and exits with a non-zero status. list-heads <file>:: Lists the references defined in the bundle. If followed by a I think instead of starting to list every header we might add in the future there it would make sense to just add after the pre-image full-stop something like: We'll also print out information about any known capabilities, such as "object filter". See "Capabilities" in technical/... Which future proofs it a bit... > [...] I think it would also be great to have tests for intentionally corrupting the header and seeing what the "verify" output is, i.e. do we die right away, proceed to validate the rest. So just (somewhat pseudocode): git bundle create [...] my.bdl && sed 's/@filter: .*/@filter: bad/' <my.bdl >bad.bdl && test_must_fail git bundle verify bad.bdl 2>actual && [...] test_cmp expect actual Overall this series looks really good at this point, and I'm down to minor shades of colors on the bikeshedding :)
On 3/8/2022 12:29 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> diff --git a/bundle.h b/bundle.h >> index 06009fe6b1f..7fef2108f43 100644 >> --- a/bundle.h >> +++ b/bundle.h >> @@ -4,12 +4,14 @@ >> #include "strvec.h" >> #include "cache.h" >> #include "string-list.h" >> +#include "list-objects-filter-options.h" >> >> struct bundle_header { >> unsigned version; >> struct string_list prerequisites; >> struct string_list references; >> const struct git_hash_algo *hash_algo; >> + struct list_objects_filter_options filter; >> }; > > This used to be a pointer to the struct, with "NULL means do not > filter" semantics, with .nr==0 as BUG(). Which was the same > justification used when an earlier step added a pointer to the > filter struct to rev_info. > > Should the same logic applies there to make it into an embedded > struct in rev_info as well? You're absolutely right. Making the change will make the range-diff look absolutely horrid, but the change isn't terribly hard for the most part. Thanks, -Stolee
diff --git a/Documentation/technical/bundle-format.txt b/Documentation/technical/bundle-format.txt index bac558d049a..b9be8644cf5 100644 --- a/Documentation/technical/bundle-format.txt +++ b/Documentation/technical/bundle-format.txt @@ -71,6 +71,11 @@ and the Git bundle v2 format cannot represent a shallow clone repository. == Capabilities Because there is no opportunity for negotiation, unknown capabilities cause 'git -bundle' to abort. The only known capability is `object-format`, which specifies -the hash algorithm in use, and can take the same values as the -`extensions.objectFormat` configuration value. +bundle' to abort. + +* `object-format` specifies the hash algorithm in use, and can take the same + values as the `extensions.objectFormat` configuration value. + +* `filter` specifies an object filter as in the `--filter` option in + linkgit:git-rev-list[1]. The resulting pack-file must be marked as a + `.promisor` pack-file after it is unbundled. diff --git a/bundle.c b/bundle.c index 7ba60a573d7..41922565627 100644 --- a/bundle.c +++ b/bundle.c @@ -11,7 +11,7 @@ #include "run-command.h" #include "refs.h" #include "strvec.h" - +#include "list-objects-filter-options.h" static const char v2_bundle_signature[] = "# v2 git bundle\n"; static const char v3_bundle_signature[] = "# v3 git bundle\n"; @@ -33,6 +33,7 @@ void bundle_header_release(struct bundle_header *header) { string_list_clear(&header->prerequisites, 1); string_list_clear(&header->references, 1); + list_objects_filter_release(&header->filter); } static int parse_capability(struct bundle_header *header, const char *capability) @@ -45,6 +46,10 @@ static int parse_capability(struct bundle_header *header, const char *capability header->hash_algo = &hash_algos[algo]; return 0; } + if (skip_prefix(capability, "filter=", &arg)) { + parse_list_objects_filter(&header->filter, arg); + return 0; + } return error(_("unknown capability '%s'"), capability); } @@ -220,6 +225,8 @@ int verify_bundle(struct repository *r, req_nr = revs.pending.nr; setup_revisions(2, argv, &revs, NULL); + revs.filter = &header->filter; + if (prepare_revision_walk(&revs)) die(_("revision walk setup failed")); @@ -259,6 +266,12 @@ int verify_bundle(struct repository *r, r->nr), r->nr); list_refs(r, 0, NULL); + + if (header->filter.choice != LOFC_DISABLED) { + printf_ln("The bundle uses this filter: %s", + list_objects_filter_spec(&header->filter)); + } + r = &header->prerequisites; if (!r->nr) { printf_ln(_("The bundle records a complete history.")); diff --git a/bundle.h b/bundle.h index 06009fe6b1f..7fef2108f43 100644 --- a/bundle.h +++ b/bundle.h @@ -4,12 +4,14 @@ #include "strvec.h" #include "cache.h" #include "string-list.h" +#include "list-objects-filter-options.h" struct bundle_header { unsigned version; struct string_list prerequisites; struct string_list references; const struct git_hash_algo *hash_algo; + struct list_objects_filter_options filter; }; #define BUNDLE_HEADER_INIT \ diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index fd8d59f653a..d8597cdee36 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -40,22 +40,7 @@ const char *list_object_filter_config_name(enum list_objects_filter_choice c) BUG("list_object_filter_config_name: invalid argument '%d'", c); } -/* - * Parse value of the argument to the "filter" keyword. - * On the command line this looks like: - * --filter=<arg> - * and in the pack protocol as: - * "filter" SP <arg> - * - * The filter keyword will be used by many commands. - * See Documentation/rev-list-options.txt for allowed values for <arg>. - * - * Capture the given arg as the "filter_spec". This can be forwarded to - * subordinate commands when necessary (although it's better to pass it through - * expand_list_objects_filter_spec() first). We also "intern" the arg for the - * convenience of the current command. - */ -static int gently_parse_list_objects_filter( +int gently_parse_list_objects_filter( struct list_objects_filter_options *filter_options, const char *arg, struct strbuf *errbuf) diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index da5b6737e27..f6fe6a3d2ca 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -72,6 +72,26 @@ struct list_objects_filter_options { /* Normalized command line arguments */ #define CL_ARG__FILTER "filter" +/* + * Parse value of the argument to the "filter" keyword. + * On the command line this looks like: + * --filter=<arg> + * and in the pack protocol as: + * "filter" SP <arg> + * + * The filter keyword will be used by many commands. + * See Documentation/rev-list-options.txt for allowed values for <arg>. + * + * Capture the given arg as the "filter_spec". This can be forwarded to + * subordinate commands when necessary (although it's better to pass it through + * expand_list_objects_filter_spec() first). We also "intern" the arg for the + * convenience of the current command. + */ +int gently_parse_list_objects_filter( + struct list_objects_filter_options *filter_options, + const char *arg, + struct strbuf *errbuf); + void list_objects_filter_die_if_populated( struct list_objects_filter_options *filter_options);