Message ID | 20201126012854.399-8-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC/PATCH,01/12] mktag: use default strbuf_read() hint | expand |
On Thu, Nov 26, 2020 at 02:28:49AM +0100, Ævar Arnfjörð Bjarmason wrote: > Add optional "extra" checks to fsck, these are needed to eventually > replace the custom not-quite-fsck code in mktag.c. > > The mktag checks differ from fsck_tag() in several ways, one of those > is that fsck doesn't know how to refuse an object with custom headers, > and isn't strict about header and body newline separation. > > Teach it how to optionally report these. I thought the best way to do > that given the current structure of the code was to add a new "extra" > category in addition to error/warn/info. Hmm, this new severity (and the extra options bit) feels a bit backwards. We are already passing the information on what we find to the report() callback. It seems like that is the place that should be deciding what is important and what is not. Unfortunately the defaults are somewhat backwards here. We'd have to teach the fsck callbacks to ignore these harmless entries, rather than teaching the mktag caller that they need to be respected. So probably the extra bit in options to say "do these extra tag checks" is the least-bad thing. But then why do we need to put them in their own EXTRA section? The only caller that wants them would treat them as errors. I'm slightly on the fence on whether mktag really needs to enforce the "unknown header" thing at all. Sure, we don't encourage them, but it's a plumbing tool one could use to experiment with new headers. I guess the downside is that a typo'd header would not be caught. > Under --strict the "info" > becomes a "warn" and "warn" becomes "error". Existing users of > fsck's (and others, e.g. index-pack) --strict option rely on this. Yeah, this is a weirdness I think we should eventually fix (along with re-prioritizing some of the existing checks). I'm wary of doing anything that further cements that somewhat broken world-view (keep in mind that "index-pack --strict" is not "do fsck more strictly" but "do fsck at all"). > I'm not changing fsck_commit() to validate commit objects like this > either, we could do that, but unlike in the tag case that code > wouldn't be used anywhere. If someone wants to write a "mkcommit" they > which behaves like "mktag" they can borrow or refactor this logic for > use in fsck_commit(). Seems reasonable. -Peff
On Thu, Nov 26 2020, Jeff King wrote: > On Thu, Nov 26, 2020 at 02:28:49AM +0100, Ævar Arnfjörð Bjarmason wrote: > >> Add optional "extra" checks to fsck, these are needed to eventually >> replace the custom not-quite-fsck code in mktag.c. >> >> The mktag checks differ from fsck_tag() in several ways, one of those >> is that fsck doesn't know how to refuse an object with custom headers, >> and isn't strict about header and body newline separation. >> >> Teach it how to optionally report these. I thought the best way to do >> that given the current structure of the code was to add a new "extra" >> category in addition to error/warn/info. > > Hmm, this new severity (and the extra options bit) feels a bit > backwards. We are already passing the information on what we find to the > report() callback. It seems like that is the place that should be > deciding what is important and what is not. > > Unfortunately the defaults are somewhat backwards here. We'd have to > teach the fsck callbacks to ignore these harmless entries, rather than > teaching the mktag caller that they need to be respected. > > So probably the extra bit in options to say "do these extra tag checks" > is the least-bad thing. But then why do we need to put them in their own > EXTRA section? The only caller that wants them would treat them as > errors. Right, it'll be hidden behind options->extra, so I could just make them ERROR. I guess I was thinking it would be confusing to stick stuff in the middle of ERROR that wasn't on by default, e.g. I've sometimes skimmed that macro definition and saw "ah, bad parent sha1 is an error", as in transfer.fsckObjects would reject it. So I'm slightly on the fence about keeping it as it is, what do you think? > I'm slightly on the fence on whether mktag really needs to enforce the > "unknown header" thing at all. Sure, we don't encourage them, but it's a > plumbing tool one could use to experiment with new headers. I guess the > downside is that a typo'd header would not be caught. The problem is that since verify_headers() in fsck.c wants to allow it, there's no way for it to distinguish a fat-fingerd "didn't separate the body from the headers" v.s. actually wanting a custom header in some cases. >> Under --strict the "info" >> becomes a "warn" and "warn" becomes "error". Existing users of >> fsck's (and others, e.g. index-pack) --strict option rely on this. > > Yeah, this is a weirdness I think we should eventually fix (along with > re-prioritizing some of the existing checks). I'm wary of doing anything > that further cements that somewhat broken world-view (keep in mind that > "index-pack --strict" is not "do fsck more strictly" but "do fsck at > all"). *nod*, will note that. >> I'm not changing fsck_commit() to validate commit objects like this >> either, we could do that, but unlike in the tag case that code >> wouldn't be used anywhere. If someone wants to write a "mkcommit" they >> which behaves like "mktag" they can borrow or refactor this logic for >> use in fsck_commit(). > > Seems reasonable. > > -Peff
diff --git a/fsck.c b/fsck.c index f82e2fe9e3..3c25df2244 100644 --- a/fsck.c +++ b/fsck.c @@ -80,7 +80,10 @@ static struct oidset gitmodules_done = OIDSET_INIT; /* infos (reported as warnings, but ignored by default) */ \ FUNC(GITMODULES_PARSE, INFO) \ FUNC(BAD_TAG_NAME, INFO) \ - FUNC(MISSING_TAGGER_ENTRY, INFO) + FUNC(MISSING_TAGGER_ENTRY, INFO) \ + /* extra (only reported when requested) */ \ + FUNC(EXTRA_HEADER_ENTRY, EXTRA) \ + FUNC(EXTRA_HEADER_BODY_NEWLINE, EXTRA) #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { @@ -975,6 +978,33 @@ static int fsck_tag(const struct object_id *oid, const char *buffer, else ret = fsck_ident(&buffer, oid, OBJ_TAG, options); + if (options->extra && *buffer) { + if (!starts_with(buffer, "\n")) { + /* + * The verify_headers() check will allow + * e.g. "[...]tagger <tagger>\nsome + * garbage\n\nmessage" to pass, thinking "some + * garbage" could be a custom + * header. E.g. "mktag" doesn't want any + * unknown headers. + */ + ret = report(options, oid, OBJ_TAG, FSCK_MSG_EXTRA_HEADER_ENTRY, "invalid format - extra header(s) after 'tagger'"); + if (ret) + goto done; + } + if (starts_with(buffer, "\n\n")) { + /* + * Some callers such as "mktag" want to + * disallow "[...]tagger + * <tagger>\n\n\nmessage", only allowing a + * single newline for separation. + */ + ret = report(options, oid, OBJ_TAG, FSCK_MSG_EXTRA_HEADER_BODY_NEWLINE, "invalid format - headers separated body by more than one newline"); + if (ret) + goto done; + } + } + done: strbuf_release(&sb); return ret; diff --git a/fsck.h b/fsck.h index 69cf715e79..110efc65fd 100644 --- a/fsck.h +++ b/fsck.h @@ -6,6 +6,7 @@ #define FSCK_ERROR 1 #define FSCK_WARN 2 #define FSCK_IGNORE 3 +#define FSCK_EXTRA 4 struct fsck_options; struct object; @@ -40,6 +41,7 @@ struct fsck_options { unsigned strict:1; int *msg_type; struct oidset skiplist; + unsigned extra:1; kh_oid_map_t *object_names; };
Add optional "extra" checks to fsck, these are needed to eventually replace the custom not-quite-fsck code in mktag.c. The mktag checks differ from fsck_tag() in several ways, one of those is that fsck doesn't know how to refuse an object with custom headers, and isn't strict about header and body newline separation. Teach it how to optionally report these. I thought the best way to do that given the current structure of the code was to add a new "extra" category in addition to error/warn/info. Under --strict the "info" becomes a "warn" and "warn" becomes "error". Existing users of fsck's (and others, e.g. index-pack) --strict option rely on this. By adding an "extra" category and only reporting it based on a flag in fsck_options callers can opt-in to these "extra" messages, which they'll then need to deal with in their own "error_func". No tests are being added for this new functionality, they're added in a subsequent commit where we teach "mktag" to use this new validation mode. I'm not changing fsck_commit() to validate commit objects like this either, we could do that, but unlike in the tag case that code wouldn't be used anywhere. If someone wants to write a "mkcommit" they which behaves like "mktag" they can borrow or refactor this logic for use in fsck_commit(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- fsck.c | 32 +++++++++++++++++++++++++++++++- fsck.h | 2 ++ 2 files changed, 33 insertions(+), 1 deletion(-)