From patchwork Fri Oct 18 04:42:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197513 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E470A112B for ; Fri, 18 Oct 2019 05:08:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CD84B21835 for ; Fri, 18 Oct 2019 05:08:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404906AbfJRFIz (ORCPT ); Fri, 18 Oct 2019 01:08:55 -0400 Received: from cloud.peff.net ([104.130.231.41]:51796 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1732067AbfJRFIz (ORCPT ); Fri, 18 Oct 2019 01:08:55 -0400 Received: (qmail 9312 invoked by uid 109); 18 Oct 2019 04:42:15 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:42:15 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14035 invoked by uid 111); 18 Oct 2019 04:45:19 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:45:19 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:42:13 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 01/23] parse_commit_buffer(): treat lookup_commit() failure as parse error Message-ID: <20191018044212.GA17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org While parsing the parents of a commit, if we are able to parse an actual oid but lookup_commit() fails on it (because we previously saw it in this process as a different object type), we silently omit the parent and do not report any error to the caller. The caller has no way of knowing this happened, because even an empty parent list is a valid parse result. As a result, it's possible to fool our "rev-list" connectivity check into accepting a corrupted set of objects. There's a test for this case already in t6102, but unfortunately it has a slight error. It creates a broken commit with a parent line pointing to a blob, and then checks that rev-list notices the problem in two cases: 1. the "lone" case: we traverse the broken commit by itself (here we try to actually load the blob from disk and find out that it's not a commit) 2. the "seen" case: we parse the blob earlier in the process, and then when calling lookup_commit() we realize immediately that it's not a commit The "seen" variant for this test mistakenly parsed another commit instead of the blob, meaning that we were actually just testing the "lone" case again. Changing that reveals the breakage (and shows that this fixes it). Signed-off-by: Jeff King --- commit.c | 11 ++++++++--- t/t6102-rev-list-unexpected-objects.sh | 2 +- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/commit.c b/commit.c index 40890ae7ce..6467c9e175 100644 --- a/commit.c +++ b/commit.c @@ -432,8 +432,11 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b if (graft && (graft->nr_parent < 0 || grafts_replace_parents)) continue; new_parent = lookup_commit(r, &parent); - if (new_parent) - pptr = &commit_list_insert(new_parent, pptr)->next; + if (!new_parent) + return error("bad parent %s in commit %s", + oid_to_hex(&parent), + oid_to_hex(&item->object.oid)); + pptr = &commit_list_insert(new_parent, pptr)->next; } if (graft) { int i; @@ -442,7 +445,9 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b new_parent = lookup_commit(r, &graft->parent[i]); if (!new_parent) - continue; + return error("bad graft parent %s in commit %s", + oid_to_hex(&graft->parent[i]), + oid_to_hex(&item->object.oid)); pptr = &commit_list_insert(new_parent, pptr)->next; } } diff --git a/t/t6102-rev-list-unexpected-objects.sh b/t/t6102-rev-list-unexpected-objects.sh index 28611c978e..52cde097dd 100755 --- a/t/t6102-rev-list-unexpected-objects.sh +++ b/t/t6102-rev-list-unexpected-objects.sh @@ -52,7 +52,7 @@ test_expect_success 'traverse unexpected non-commit parent (lone)' ' ' test_expect_success 'traverse unexpected non-commit parent (seen)' ' - test_must_fail git rev-list --objects $commit $broken_commit \ + test_must_fail git rev-list --objects $blob $broken_commit \ >output 2>&1 && test_i18ngrep "not a commit" output ' From patchwork Fri Oct 18 04:43:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197517 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4034B15AB for ; Fri, 18 Oct 2019 05:10:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2833A222C6 for ; Fri, 18 Oct 2019 05:10:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389797AbfJRFKN (ORCPT ); Fri, 18 Oct 2019 01:10:13 -0400 Received: from cloud.peff.net ([104.130.231.41]:51802 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1727606AbfJRFKL (ORCPT ); Fri, 18 Oct 2019 01:10:11 -0400 Received: (qmail 9317 invoked by uid 109); 18 Oct 2019 04:43:31 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:43:31 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14040 invoked by uid 111); 18 Oct 2019 04:46:35 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:46:35 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:43:29 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 02/23] parse_commit_buffer(): treat lookup_tree() failure as parse error Message-ID: <20191018044328.GB17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org If parsing a commit yields a valid tree oid, but we've seen that same oid as a non-tree in the same process, the resulting commit struct will end up with a NULL tree pointer, but not otherwise report a parsing failure. That's perhaps convenient for callers which want to operate on even partially corrupt commits (e.g., by still looking at the parents). But it leaves a potential trap for most callers, who now have to manually check for a NULL tree. Most do not, and it's likely that there are possible segfaults lurking. I say "possible" because there are many candidates, and I don't think it's worth following through on reproducing them when we can just fix them all in one spot. And certainly we _have_ seen real-world cases, such as the one fixed by 806278dead (commit-graph.c: handle corrupt/missing trees, 2019-09-05). Note that we can't quite drop the check in the caller added by that commit yet, as there's some subtlety with repeated parsings (which will be addressed in a future commit). Signed-off-by: Jeff King --- commit.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/commit.c b/commit.c index 6467c9e175..810419a168 100644 --- a/commit.c +++ b/commit.c @@ -401,6 +401,7 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b struct commit_graft *graft; const int tree_entry_len = the_hash_algo->hexsz + 5; const int parent_entry_len = the_hash_algo->hexsz + 7; + struct tree *tree; if (item->object.parsed) return 0; @@ -412,7 +413,12 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b if (get_oid_hex(bufptr + 5, &parent) < 0) return error("bad tree pointer in commit %s", oid_to_hex(&item->object.oid)); - set_commit_tree(item, lookup_tree(r, &parent)); + tree = lookup_tree(r, &parent); + if (!tree) + return error("bad tree pointer %s in commit %s", + oid_to_hex(&parent), + oid_to_hex(&item->object.oid)); + set_commit_tree(item, tree); bufptr += tree_entry_len + 1; /* "tree " + "hex sha1" + "\n" */ pptr = &item->parents; From patchwork Fri Oct 18 04:45:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197523 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E267F15AB for ; Fri, 18 Oct 2019 05:12:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C841D21835 for ; Fri, 18 Oct 2019 05:12:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403981AbfJRFMR (ORCPT ); Fri, 18 Oct 2019 01:12:17 -0400 Received: from cloud.peff.net ([104.130.231.41]:51808 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1732046AbfJRFMR (ORCPT ); Fri, 18 Oct 2019 01:12:17 -0400 Received: (qmail 9327 invoked by uid 109); 18 Oct 2019 04:45:37 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:45:37 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14045 invoked by uid 111); 18 Oct 2019 04:48:41 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:48:41 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:45:35 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 03/23] parse_tag_buffer(): treat NULL tag pointer as parse error Message-ID: <20191018044534.GC17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When parsing a tag, we may end up with a NULL "tagged" field when there's a type mismatch (e.g., the tag claims to point to object X as a commit, but we previously saw X as a blob in the same process), but we do not otherwise indicate a parse failure to the caller. This is similar to the case discussed in the previous commit, where a commit could end up with a NULL tree field: while slightly convenient for callers who want to overlook a corrupt object, it means that normal callers have to explicitly deal with this case (rather than just relying on the return code from parsing). And most don't, leading to segfault fixes like the one in c77722b3ea (use get_tagged_oid(), 2019-09-05). Let's address this more centrally, by returning an error code from the parse itself, which most callers would already notice (adventurous callers are free to ignore the error and continue looking at the struct). This also covers the case where the tag contains a nonsensical "type" field (there we produced a user-visible error but still returned success to the caller; now we'll produce a slightly better message and return an error). Signed-off-by: Jeff King --- It's possible that c77722b3ea fixed all of the segfaults here, though of course new callers must remember to use it instead of accessing tag->tagged directly. We could _possibly_ get rid of that check now, but that assumes that all of the caller notice parse_tag() or parse_object() failing. Having both gives us belt-and-suspenders protection against a segfault. tag.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/tag.c b/tag.c index bfa0e31435..6a51efda8d 100644 --- a/tag.c +++ b/tag.c @@ -167,10 +167,15 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u } else if (!strcmp(type, tag_type)) { item->tagged = (struct object *)lookup_tag(r, &oid); } else { - error("Unknown type %s", type); - item->tagged = NULL; + return error("unknown tag type '%s' in %s", + type, oid_to_hex(&item->object.oid)); } + if (!item->tagged) + return error("bad tag pointer to %s in %s", + oid_to_hex(&oid), + oid_to_hex(&item->object.oid)); + if (bufptr + 4 < tail && starts_with(bufptr, "tag ")) ; /* good */ else From patchwork Fri Oct 18 04:47:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197457 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3EAB4112B for ; Fri, 18 Oct 2019 04:54:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 278BE222C3 for ; Fri, 18 Oct 2019 04:54:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726032AbfJREyE (ORCPT ); Fri, 18 Oct 2019 00:54:04 -0400 Received: from cloud.peff.net ([104.130.231.41]:51664 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1725973AbfJREyD (ORCPT ); Fri, 18 Oct 2019 00:54:03 -0400 Received: (qmail 9346 invoked by uid 109); 18 Oct 2019 04:47:23 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:47:23 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14054 invoked by uid 111); 18 Oct 2019 04:50:27 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:50:27 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:47:21 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 04/23] remember commit/tag parse failures Message-ID: <20191018044721.GD17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org If we can't parse a commit, then parse_commit() will return an error code. But it _also_ sets the "parsed" flag, which tells us not to bother trying to re-parse the object. That means that subsequent parses have no idea that the information in the struct may be bogus. I.e., doing this: parse_commit(commit); ... if (parse_commit(commit) < 0) die("commit is broken"); will never trigger the die(). The second parse_commit() will see the "parsed" flag and quietly return success. There are two obvious ways to fix this: 1. Stop setting "parsed" until we've successfully parsed. 2. Keep a second "corrupt" flag to indicate that we saw an error (and when the parsed flag is set, return 0/-1 depending on the corrupt flag). This patch does option 1. The obvious downside versus option 2 is that we might continually re-parse a broken object. But in practice, corruption like this is rare, and we typically die() or return an error in the caller. So it's OK not to worry about optimizing for corruption. And it's much simpler: we don't need to use an extra bit in the object struct, and callers which check the "parsed" flag don't need to learn about the corrupt bit, too. There's no new test here, because this case is already covered in t5318. Note that we do need to update the expected message there, because we now detect the problem in the return from "parse_commit()", and not with a separate check for a NULL tree. In fact, we can now ditch that explicit tree check entirely, as we're covered robustly by this change (and the previous recent change to treat a NULL tree as a parse error). We'll also give tags the same treatment. I don't know offhand of any cases where the problem can be triggered (it implies somebody ignoring a parse error earlier in the process), but consistently returning an error should cause the least surprise. Signed-off-by: Jeff King --- commit-graph.c | 3 --- commit.c | 14 +++++++++++++- t/t5318-commit-graph.sh | 2 +- tag.c | 12 +++++++++++- 4 files changed, 25 insertions(+), 6 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index fc4a43b8d6..852b9c39e6 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -855,9 +855,6 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len, die(_("unable to parse commit %s"), oid_to_hex(&(*list)->object.oid)); tree = get_commit_tree_oid(*list); - if (!tree) - die(_("unable to get tree for %s"), - oid_to_hex(&(*list)->object.oid)); hashwrite(f, tree->hash, hash_len); parent = (*list)->parents; diff --git a/commit.c b/commit.c index 810419a168..e12e7998ad 100644 --- a/commit.c +++ b/commit.c @@ -405,7 +405,18 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b if (item->object.parsed) return 0; - item->object.parsed = 1; + + if (item->parents) { + /* + * Presumably this is leftover from an earlier failed parse; + * clear it out in preparation for us re-parsing (we'll hit the + * same error, but that's good, since it lets our caller know + * the result cannot be trusted. + */ + free_commit_list(item->parents); + item->parents = NULL; + } + tail += size; if (tail <= bufptr + tree_entry_len + 1 || memcmp(bufptr, "tree ", 5) || bufptr[tree_entry_len] != '\n') @@ -462,6 +473,7 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b if (check_graph) load_commit_graph_info(r, item); + item->object.parsed = 1; return 0; } diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index d42b3efe39..127b404856 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -660,7 +660,7 @@ test_expect_success 'corrupt commit-graph write (missing tree)' ' git commit-tree -p "$broken" -m "good" "$tree" >good && test_must_fail git commit-graph write --stdin-commits \ test_err && - test_i18ngrep "unable to get tree for" test_err + test_i18ngrep "unable to parse commit" test_err ) ' diff --git a/tag.c b/tag.c index 6a51efda8d..71b544467e 100644 --- a/tag.c +++ b/tag.c @@ -141,7 +141,16 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u if (item->object.parsed) return 0; - item->object.parsed = 1; + + if (item->tag) { + /* + * Presumably left over from a previous failed parse; + * clear it out in preparation for re-parsing (we'll probably + * hit the same error, which lets us tell our current caller + * about the problem). + */ + FREE_AND_NULL(item->tag); + } if (size < the_hash_algo->hexsz + 24) return -1; @@ -192,6 +201,7 @@ int parse_tag_buffer(struct repository *r, struct tag *item, const void *data, u else item->date = 0; + item->object.parsed = 1; return 0; } From patchwork Fri Oct 18 04:48:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197461 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E063319A1 for ; Fri, 18 Oct 2019 04:55:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C7364222CB for ; Fri, 18 Oct 2019 04:55:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725973AbfJREzD (ORCPT ); Fri, 18 Oct 2019 00:55:03 -0400 Received: from cloud.peff.net ([104.130.231.41]:51676 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1725926AbfJREzD (ORCPT ); Fri, 18 Oct 2019 00:55:03 -0400 Received: (qmail 9356 invoked by uid 109); 18 Oct 2019 04:48:23 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:48:23 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14057 invoked by uid 111); 18 Oct 2019 04:51:26 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:51:26 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:48:20 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 05/23] fsck: stop checking commit->tree value Message-ID: <20191018044820.GE17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We check in fsck_commit_buffer() that commit->tree isn't NULL, which in turn generally comes from a previous parse by parse_commit(). But this isn't really accomplishing anything. The two things we might care about are: - was there a syntactically valid "tree " line in the object? But we've just done our own parse in fsck_commit_buffer() to check this. - does it point to a valid tree object? But checking the "tree" pointer here doesn't actually accomplish that; it just shows that lookup_tree() didn't return NULL, which only means that we haven't yet seen that oid as a non-tree in this process. A real connectivity check would exhaustively walk all graph links, and we do that already in a separate function. So this code isn't helping anything. And it makes the fsck code slightly more confusing and rigid (e.g., it requires that any commit structs have already been parsed). Let's drop it. As a bit of history, the presence of this code looks like a leftover from early fsck code (which did rely on parse_commit() to do most of the parsing). The check comes from ff5ebe39b0 (Port fsck-cache to use parsing functions, 2005-04-18), but we later added an explicit walk in 355885d531 (add generic, type aware object chain walker, 2008-02-25). Signed-off-by: Jeff King --- fsck.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/fsck.c b/fsck.c index cdb7d8db03..6dfc533fb0 100644 --- a/fsck.c +++ b/fsck.c @@ -800,11 +800,6 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, err = fsck_ident(&buffer, &commit->object, options); if (err) return err; - if (!get_commit_tree(commit)) { - err = report(options, &commit->object, FSCK_MSG_BAD_TREE, "could not load commit's tree %s", oid_to_hex(&tree_oid)); - if (err) - return err; - } if (memchr(buffer_begin, '\0', size)) { err = report(options, &commit->object, FSCK_MSG_NUL_IN_COMMIT, "NUL byte in the commit object body"); From patchwork Fri Oct 18 04:49:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197467 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F043F112B for ; Fri, 18 Oct 2019 04:55:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D7D75222CB for ; Fri, 18 Oct 2019 04:55:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2409473AbfJREzx (ORCPT ); Fri, 18 Oct 2019 00:55:53 -0400 Received: from cloud.peff.net ([104.130.231.41]:51682 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2392808AbfJREzx (ORCPT ); Fri, 18 Oct 2019 00:55:53 -0400 Received: (qmail 9362 invoked by uid 109); 18 Oct 2019 04:49:13 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:49:13 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14062 invoked by uid 111); 18 Oct 2019 04:52:16 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:52:16 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:49:10 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 06/23] fsck: stop checking commit->parent counts Message-ID: <20191018044910.GF17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In 4516338243 (builtin-fsck: reports missing parent commits, 2008-02-25), we added code to check that fsck found the same number of parents from parsing the commit itself as we see in the commit struct we got from parse_commit_buffer(). Back then the rationale was that the normal commit parser might skip some bad parents. But earlier in this series, we started treating that reliably as a parsing error, meaning that we'd complain about it before we even hit the code in fsck.c. Let's drop this code, which now makes fsck_commit_buffer() completely independent of any parsed values in the commit struct (that's conceptually cleaner, and also opens up more refactoring options). Note that we can also drop the MISSING_PARENT and MISSING_GRAFT fsck identifiers. This is no loss, as these would not trigger reliably anyway. We'd hit them only when lookup_commit() failed, which occurs only if we happen to have seen the object with another type already in the same process. In most cases, we'd actually run into the problem during the connectivity walk, not here. Signed-off-by: Jeff King --- fsck.c | 23 +---------------------- 1 file changed, 1 insertion(+), 22 deletions(-) diff --git a/fsck.c b/fsck.c index 6dfc533fb0..a0f8ae7650 100644 --- a/fsck.c +++ b/fsck.c @@ -43,10 +43,8 @@ static struct oidset gitmodules_done = OIDSET_INIT; FUNC(MISSING_AUTHOR, ERROR) \ FUNC(MISSING_COMMITTER, ERROR) \ FUNC(MISSING_EMAIL, ERROR) \ - FUNC(MISSING_GRAFT, ERROR) \ FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ FUNC(MISSING_OBJECT, ERROR) \ - FUNC(MISSING_PARENT, ERROR) \ FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ FUNC(MISSING_TAG, ERROR) \ @@ -739,8 +737,7 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, unsigned long size, struct fsck_options *options) { struct object_id tree_oid, oid; - struct commit_graft *graft; - unsigned parent_count, parent_line_count = 0, author_count; + unsigned author_count; int err; const char *buffer_begin = buffer; const char *p; @@ -763,24 +760,6 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, return err; } buffer = p + 1; - parent_line_count++; - } - graft = lookup_commit_graft(the_repository, &commit->object.oid); - parent_count = commit_list_count(commit->parents); - if (graft) { - if (graft->nr_parent == -1 && !parent_count) - ; /* shallow commit */ - else if (graft->nr_parent != parent_count) { - err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing"); - if (err) - return err; - } - } else { - if (parent_count != parent_line_count) { - err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing"); - if (err) - return err; - } } author_count = 0; while (skip_prefix(buffer, "author ", &buffer)) { From patchwork Fri Oct 18 04:51:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197475 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1003112B for ; Fri, 18 Oct 2019 04:58:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CDB6A222C5 for ; Fri, 18 Oct 2019 04:58:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2407184AbfJRE6B (ORCPT ); Fri, 18 Oct 2019 00:58:01 -0400 Received: from cloud.peff.net ([104.130.231.41]:51710 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1730688AbfJRE6B (ORCPT ); Fri, 18 Oct 2019 00:58:01 -0400 Received: (qmail 9373 invoked by uid 109); 18 Oct 2019 04:51:21 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:51:21 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14065 invoked by uid 111); 18 Oct 2019 04:54:25 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:54:25 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:51:19 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 07/23] fsck: stop checking tag->tagged Message-ID: <20191018045118.GG17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Way back in 92d4c85d24 (fsck-cache: fix SIGSEGV on bad tag object, 2005-05-03), we added an fsck check that the "tagged" field of a tag struct isn't NULL. But that was mainly protecting the printing code for "--tags", and that code wasn't moved along with the check as part of ba002f3b28 (builtin-fsck: move common object checking code to fsck.c, 2008-02-25). It could also serve to detect type mismatch problems (where a tag points to object X as a commit, but really X is a blob), but it couldn't do so reliably (we'd call lookup_commit(X), but it will only notice the problem if we happen to have previously called lookup_blob(X) in the same process). And as of a commit earlier in this series, we'd consider that a parse error and complain about the object even before getting to this point anyway. So let's drop this "tag->tagged" check. It's not helping anything, and getting rid of it makes the function conceptually cleaner, as it really is just checking the buffer we feed it. In fact, we can get rid of our one-line wrapper and just unify fsck_tag() and fsck_tag_buffer(). Signed-off-by: Jeff King --- fsck.c | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/fsck.c b/fsck.c index a0f8ae7650..79ce3a97c8 100644 --- a/fsck.c +++ b/fsck.c @@ -798,8 +798,8 @@ static int fsck_commit(struct commit *commit, const char *data, return ret; } -static int fsck_tag_buffer(struct tag *tag, const char *data, - unsigned long size, struct fsck_options *options) +static int fsck_tag(struct tag *tag, const char *data, + unsigned long size, struct fsck_options *options) { struct object_id oid; int ret = 0; @@ -893,17 +893,6 @@ static int fsck_tag_buffer(struct tag *tag, const char *data, return ret; } -static int fsck_tag(struct tag *tag, const char *data, - unsigned long size, struct fsck_options *options) -{ - struct object *tagged = tag->tagged; - - if (!tagged) - return report(options, &tag->object, FSCK_MSG_BAD_TAG_OBJECT, "could not load tagged object"); - - return fsck_tag_buffer(tag, data, size, options); -} - struct fsck_gitmodules_data { struct object *obj; struct fsck_options *options; From patchwork Fri Oct 18 04:54:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197459 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F74F19A1 for ; Fri, 18 Oct 2019 04:54:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 47A30222C3 for ; Fri, 18 Oct 2019 04:54:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2441842AbfJREyP (ORCPT ); Fri, 18 Oct 2019 00:54:15 -0400 Received: from cloud.peff.net ([104.130.231.41]:51666 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1728053AbfJREyN (ORCPT ); Fri, 18 Oct 2019 00:54:13 -0400 Received: (qmail 9389 invoked by uid 109); 18 Oct 2019 04:54:14 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:54:14 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14121 invoked by uid 111); 18 Oct 2019 04:57:18 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:57:18 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:54:12 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 08/23] fsck: require an actual buffer for non-blobs Message-ID: <20191018045412.GH17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The fsck_object() function takes in a buffer, but also a "struct object". The rules for using these vary between types: - for a commit, we'll use the provided buffer; if it's NULL, we'll fall back to get_commit_buffer(), which loads from either an in-memory cache or from disk. If the latter fails, we'd die(), which is non-ideal for fsck. - for a tag, a NULL buffer will fall back to loading the object from disk (and failure would lead to an fsck error) - for a tree, we _never_ look at the provided buffer, and always use tree->buffer - for a blob, we usually don't look at the buffer at all, unless it has been marked as a .gitmodule file. In that case we check the buffer given to us, or assume a NULL buffer is a very large blob (and complain about it) This is much more complex than it needs to be. It turns out that nobody ever feeds a NULL buffer that isn't a blob: - git-fsck calls fsck_object() only from fsck_obj(). That in turn is called by one of: - fsck_obj_buffer(), which is a callback to verify_pack(), which unpacks everything except large blobs into a buffer (see pack-check.c, lines 131-141). - fsck_loose(), which hits a BUG() on non-blobs with a NULL buffer (builtin/fsck.c, lines 639-640) And in either case, we'll have just called parse_object_buffer() anyway, which would segfault on a NULL buffer for commits or tags (not for trees, but it would install a NULL tree->buffer which would later cause a segfault) - git-index-pack asserts that the buffer is non-NULL unless the object is a blob (see builtin/index-pack.c, line 832) - git-unpack-objects always writes a non-NULL buffer into its obj_buffer hash, which is then fed to fsck_object(). (There is actually a funny thing here where it does not store blob buffers at all, nor does it call fsck on them; it does check any needed blobs via fsck_finish() though). Let's make the rules simpler, which reduces the amount of code and gives us more flexibility in refactoring the fsck code. The new rules are: - only blobs are allowed to pass a NULL buffer - we always use the provided buffer, never pulling information from the object struct We don't have to adjust any callers, because they were already adhering to these. Note that we do drop a few fsck identifiers for missing tags, but that was all dead code (because nobody passed a NULL tag buffer). Signed-off-by: Jeff King --- fsck.c | 51 +++++++++------------------------------------------ fsck.h | 6 +++++- 2 files changed, 14 insertions(+), 43 deletions(-) diff --git a/fsck.c b/fsck.c index 79ce3a97c8..347a0ef5c9 100644 --- a/fsck.c +++ b/fsck.c @@ -49,13 +49,11 @@ static struct oidset gitmodules_done = OIDSET_INIT; FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ FUNC(MISSING_TAG, ERROR) \ FUNC(MISSING_TAG_ENTRY, ERROR) \ - FUNC(MISSING_TAG_OBJECT, ERROR) \ FUNC(MISSING_TREE, ERROR) \ FUNC(MISSING_TREE_OBJECT, ERROR) \ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ FUNC(ZERO_PADDED_DATE, ERROR) \ @@ -541,7 +539,9 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con return c1 < c2 ? 0 : TREE_UNORDERED; } -static int fsck_tree(struct tree *item, struct fsck_options *options) +static int fsck_tree(struct tree *item, + const char *buffer, unsigned long size, + struct fsck_options *options) { int retval = 0; int has_null_sha1 = 0; @@ -558,7 +558,7 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) unsigned o_mode; const char *o_name; - if (init_tree_desc_gently(&desc, item->buffer, item->size)) { + if (init_tree_desc_gently(&desc, buffer, size)) { retval += report(options, &item->object, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); return retval; } @@ -733,8 +733,8 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option return 0; } -static int fsck_commit_buffer(struct commit *commit, const char *buffer, - unsigned long size, struct fsck_options *options) +static int fsck_commit(struct commit *commit, const char *buffer, + unsigned long size, struct fsck_options *options) { struct object_id tree_oid, oid; unsigned author_count; @@ -788,47 +788,15 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, return 0; } -static int fsck_commit(struct commit *commit, const char *data, - unsigned long size, struct fsck_options *options) -{ - const char *buffer = data ? data : get_commit_buffer(commit, &size); - int ret = fsck_commit_buffer(commit, buffer, size, options); - if (!data) - unuse_commit_buffer(commit, buffer); - return ret; -} - -static int fsck_tag(struct tag *tag, const char *data, +static int fsck_tag(struct tag *tag, const char *buffer, unsigned long size, struct fsck_options *options) { struct object_id oid; int ret = 0; - const char *buffer; - char *to_free = NULL, *eol; + char *eol; struct strbuf sb = STRBUF_INIT; const char *p; - if (data) - buffer = data; - else { - enum object_type type; - - buffer = to_free = - read_object_file(&tag->object.oid, &type, &size); - if (!buffer) - return report(options, &tag->object, - FSCK_MSG_MISSING_TAG_OBJECT, - "cannot read tag object"); - - if (type != OBJ_TAG) { - ret = report(options, &tag->object, - FSCK_MSG_TAG_OBJECT_NOT_TAG, - "expected tag got %s", - type_name(type)); - goto done; - } - } - ret = verify_headers(buffer, size, &tag->object, options); if (ret) goto done; @@ -889,7 +857,6 @@ static int fsck_tag(struct tag *tag, const char *data, done: strbuf_release(&sb); - free(to_free); return ret; } @@ -979,7 +946,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, if (obj->type == OBJ_BLOB) return fsck_blob((struct blob *)obj, data, size, options); if (obj->type == OBJ_TREE) - return fsck_tree((struct tree *) obj, options); + return fsck_tree((struct tree *) obj, data, size, options); if (obj->type == OBJ_COMMIT) return fsck_commit((struct commit *) obj, (const char *) data, size, options); diff --git a/fsck.h b/fsck.h index b95595ae5f..e479461075 100644 --- a/fsck.h +++ b/fsck.h @@ -52,7 +52,11 @@ struct fsck_options { * 0 everything OK */ int fsck_walk(struct object *obj, void *data, struct fsck_options *options); -/* If NULL is passed for data, we assume the object is local and read it. */ + +/* + * Blob objects my pass a NULL data pointer, which indicates they are too large + * to fit in memory. All other types must pass a real buffer. + */ int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options); From patchwork Fri Oct 18 04:56:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197469 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDF79112B for ; Fri, 18 Oct 2019 04:56:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C2BCB222CB for ; Fri, 18 Oct 2019 04:56:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391270AbfJRE4P (ORCPT ); Fri, 18 Oct 2019 00:56:15 -0400 Received: from cloud.peff.net ([104.130.231.41]:51688 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726597AbfJRE4P (ORCPT ); Fri, 18 Oct 2019 00:56:15 -0400 Received: (qmail 9414 invoked by uid 109); 18 Oct 2019 04:56:15 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:56:15 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14192 invoked by uid 111); 18 Oct 2019 04:59:19 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:59:19 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:56:13 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 09/23] fsck: unify object-name code Message-ID: <20191018045613.GI17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Commit 90cf590f53 (fsck: optionally show more helpful info for broken links, 2016-07-17) added a system for decorating objects with names. The code is split across builtin/fsck.c (which gives the initial names) and fsck.c (which adds to the names as it traverses the object graph). This leads to some duplication, where both sites have near-identical describe_object() functions (the difference being that the one in builtin/fsck.c uses a circular array of buffers to allow multiple calls in a single printf). Let's provide a unified object_name API for fsck. That lets us drop the duplication, as well as making the interface boundaries more clear (which will let us refactor the implementation more in a future patch). We'll leave describe_object() in builtin/fsck.c as a thin wrapper around the new API, as it relies on a static global to make its many callers a bit shorter. We'll also convert the bare add_decoration() calls in builtin/fsck.c to put_object_name(). This fixes two minor bugs: 1. We leak many small strings. add_decoration() has a last-one-wins approach: it updates the decoration to the new string and returns the old one. But we ignore the return value, leaking the old string. This is quite common to trigger, since we look at reflogs: the tip of any ref will be described both by looking at the actual ref, as well as the latest reflog entry. So we'd always end up leaking one of those strings. 2. The last-one-wins approach gives us lousy names. For instance, we first look at all of the refs, and then all of the reflogs. So rather than seeing "refs/heads/master", we're likely to overwrite it with "HEAD@{12345678}". We're generally better off using the first name we find. And indeed, the test in t1450 expects this ugly HEAD@{} name. After this patch, we've switched to using fsck_put_object_name()'s first-one-wins semantics, and we output the more human-friendly "refs/tags/julius" (and the test is updated accordingly). Signed-off-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- builtin/fsck.c | 50 ++++++++---------------------- fsck.c | 82 ++++++++++++++++++++++++++++++------------------- fsck.h | 24 +++++++++++++++ t/t1450-fsck.sh | 2 +- 4 files changed, 89 insertions(+), 69 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 18403a94fa..237643cc1d 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -52,24 +52,7 @@ static int name_objects; static const char *describe_object(struct object *obj) { - static struct strbuf bufs[] = { - STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT - }; - static int b = 0; - struct strbuf *buf; - char *name = NULL; - - if (name_objects) - name = lookup_decoration(fsck_walk_options.object_names, obj); - - buf = bufs + b; - b = (b + 1) % ARRAY_SIZE(bufs); - strbuf_reset(buf); - strbuf_addstr(buf, oid_to_hex(&obj->oid)); - if (name) - strbuf_addf(buf, " (%s)", name); - - return buf->buf; + return fsck_describe_object(&fsck_walk_options, obj); } static const char *printable_type(struct object *obj) @@ -499,10 +482,10 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid, if (!is_null_oid(oid)) { obj = lookup_object(the_repository, oid); if (obj && (obj->flags & HAS_OBJ)) { - if (timestamp && name_objects) - add_decoration(fsck_walk_options.object_names, - obj, - xstrfmt("%s@{%"PRItime"}", refname, timestamp)); + if (timestamp) + fsck_put_object_name(&fsck_walk_options, obj, + "%s@{%"PRItime"}", + refname, timestamp); obj->flags |= USED; mark_object_reachable(obj); } else if (!is_promisor_object(oid)) { @@ -566,9 +549,8 @@ static int fsck_handle_ref(const char *refname, const struct object_id *oid, } default_refs++; obj->flags |= USED; - if (name_objects) - add_decoration(fsck_walk_options.object_names, - obj, xstrdup(refname)); + fsck_put_object_name(&fsck_walk_options, + obj, "%s", refname); mark_object_reachable(obj); return 0; @@ -742,9 +724,7 @@ static int fsck_cache_tree(struct cache_tree *it) return 1; } obj->flags |= USED; - if (name_objects) - add_decoration(fsck_walk_options.object_names, - obj, xstrdup(":")); + fsck_put_object_name(&fsck_walk_options, obj, ":"); mark_object_reachable(obj); if (obj->type != OBJ_TREE) err |= objerror(obj, _("non-tree in cache-tree")); @@ -830,8 +810,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } if (name_objects) - fsck_walk_options.object_names = - xcalloc(1, sizeof(struct decoration)); + fsck_enable_object_names(&fsck_walk_options); git_config(fsck_config, NULL); @@ -890,9 +869,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } obj->flags |= USED; - if (name_objects) - add_decoration(fsck_walk_options.object_names, - obj, xstrdup(arg)); + fsck_put_object_name(&fsck_walk_options, obj, + "%s", arg); mark_object_reachable(obj); continue; } @@ -928,10 +906,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) continue; obj = &blob->object; obj->flags |= USED; - if (name_objects) - add_decoration(fsck_walk_options.object_names, - obj, - xstrfmt(":%s", active_cache[i]->name)); + fsck_put_object_name(&fsck_walk_options, obj, + ":%s", active_cache[i]->name); mark_object_reachable(obj); } if (active_cache_tree) diff --git a/fsck.c b/fsck.c index 347a0ef5c9..ecd5957362 100644 --- a/fsck.c +++ b/fsck.c @@ -312,15 +312,21 @@ static int report(struct fsck_options *options, struct object *object, return result; } -static char *get_object_name(struct fsck_options *options, struct object *obj) +void fsck_enable_object_names(struct fsck_options *options) +{ + if (!options->object_names) + options->object_names = xcalloc(1, sizeof(struct decoration)); +} + +const char *fsck_get_object_name(struct fsck_options *options, struct object *obj) { if (!options->object_names) return NULL; return lookup_decoration(options->object_names, obj); } -static void put_object_name(struct fsck_options *options, struct object *obj, - const char *fmt, ...) +void fsck_put_object_name(struct fsck_options *options, struct object *obj, + const char *fmt, ...) { va_list ap; struct strbuf buf = STRBUF_INIT; @@ -337,17 +343,27 @@ static void put_object_name(struct fsck_options *options, struct object *obj, va_end(ap); } -static const char *describe_object(struct fsck_options *o, struct object *obj) +const char *fsck_describe_object(struct fsck_options *options, + struct object *obj) { - static struct strbuf buf = STRBUF_INIT; - char *name; - - strbuf_reset(&buf); - strbuf_addstr(&buf, oid_to_hex(&obj->oid)); - if (o->object_names && (name = lookup_decoration(o->object_names, obj))) - strbuf_addf(&buf, " (%s)", name); + static struct strbuf bufs[] = { + STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT + }; + static int b = 0; + struct strbuf *buf; + char *name = NULL; + + if (options->object_names) + name = lookup_decoration(options->object_names, obj); + + buf = bufs + b; + b = (b + 1) % ARRAY_SIZE(bufs); + strbuf_reset(buf); + strbuf_addstr(buf, oid_to_hex(&obj->oid)); + if (name) + strbuf_addf(buf, " (%s)", name); - return buf.buf; + return buf->buf; } static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options) @@ -360,7 +376,7 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op if (parse_tree(tree)) return -1; - name = get_object_name(options, &tree->object); + name = fsck_get_object_name(options, &tree->object); if (init_tree_desc_gently(&desc, tree->buffer, tree->size)) return -1; while (tree_entry_gently(&desc, &entry)) { @@ -373,20 +389,21 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op if (S_ISDIR(entry.mode)) { obj = (struct object *)lookup_tree(the_repository, &entry.oid); if (name && obj) - put_object_name(options, obj, "%s%s/", name, - entry.path); + fsck_put_object_name(options, obj, "%s%s/", + name, entry.path); result = options->walk(obj, OBJ_TREE, data, options); } else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode)) { obj = (struct object *)lookup_blob(the_repository, &entry.oid); if (name && obj) - put_object_name(options, obj, "%s%s", name, - entry.path); + fsck_put_object_name(options, obj, "%s%s", + name, entry.path); result = options->walk(obj, OBJ_BLOB, data, options); } else { result = error("in tree %s: entry %s has bad mode %.6o", - describe_object(options, &tree->object), entry.path, entry.mode); + fsck_describe_object(options, &tree->object), + entry.path, entry.mode); } if (result < 0) return result; @@ -407,10 +424,10 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio if (parse_commit(commit)) return -1; - name = get_object_name(options, &commit->object); + name = fsck_get_object_name(options, &commit->object); if (name) - put_object_name(options, &get_commit_tree(commit)->object, - "%s:", name); + fsck_put_object_name(options, &get_commit_tree(commit)->object, + "%s:", name); result = options->walk((struct object *)get_commit_tree(commit), OBJ_TREE, data, options); @@ -441,13 +458,15 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio struct object *obj = &parents->item->object; if (counter++) - put_object_name(options, obj, "%s^%d", - name, counter); + fsck_put_object_name(options, obj, "%s^%d", + name, counter); else if (generation > 0) - put_object_name(options, obj, "%.*s~%d", - name_prefix_len, name, generation + 1); + fsck_put_object_name(options, obj, "%.*s~%d", + name_prefix_len, name, + generation + 1); else - put_object_name(options, obj, "%s^", name); + fsck_put_object_name(options, obj, "%s^", + name); } result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options); if (result < 0) @@ -461,12 +480,12 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options) { - char *name = get_object_name(options, &tag->object); + const char *name = fsck_get_object_name(options, &tag->object); if (parse_tag(tag)) return -1; if (name) - put_object_name(options, tag->tagged, "%s", name); + fsck_put_object_name(options, tag->tagged, "%s", name); return options->walk(tag->tagged, OBJ_ANY, data, options); } @@ -488,7 +507,8 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options) case OBJ_TAG: return fsck_walk_tag((struct tag *)obj, data, options); default: - error("Unknown object type for %s", describe_object(options, obj)); + error("Unknown object type for %s", + fsck_describe_object(options, obj)); return -1; } } @@ -962,10 +982,10 @@ int fsck_error_function(struct fsck_options *o, struct object *obj, int msg_type, const char *message) { if (msg_type == FSCK_WARN) { - warning("object %s: %s", describe_object(o, obj), message); + warning("object %s: %s", fsck_describe_object(o, obj), message); return 0; } - error("object %s: %s", describe_object(o, obj), message); + error("object %s: %s", fsck_describe_object(o, obj), message); return 1; } diff --git a/fsck.h b/fsck.h index e479461075..f6f0c40060 100644 --- a/fsck.h +++ b/fsck.h @@ -67,4 +67,28 @@ int fsck_object(struct object *obj, void *data, unsigned long size, */ int fsck_finish(struct fsck_options *options); +/* + * Subsystem for storing human-readable names for each object. + * + * If fsck_enable_object_names() has not been called, all other functions are + * noops. + * + * Use put_object_name() to seed initial names (e.g. from refnames); the fsck + * code will extend that while walking trees, etc. + * + * Use get_object_name() to get a single name (or NULL if none). Or the more + * convenient describe_object(), which always produces an output string with + * the oid combined with the name (if any). Note that the return value points + * to a rotating array of static buffers, and may be invalidated by a + * subsequent call. + */ +void fsck_enable_object_names(struct fsck_options *options); +const char *fsck_get_object_name(struct fsck_options *options, + struct object *obj); +__attribute__((format (printf,3,4))) +void fsck_put_object_name(struct fsck_options *options, struct object *obj, + const char *fmt, ...); +const char *fsck_describe_object(struct fsck_options *options, + struct object *obj); + #endif diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 50d28e6fdb..7c7ff7e961 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -616,7 +616,7 @@ test_expect_success 'fsck --name-objects' ' remove_object $(git rev-parse julius:caesar.t) && test_must_fail git fsck --name-objects >out && tree=$(git rev-parse --verify julius:) && - test_i18ngrep -E "$tree \((refs/heads/master|HEAD)@\{[0-9]*\}:" out + test_i18ngrep "$tree (refs/tags/julius:" out ) ' From patchwork Fri Oct 18 04:56:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197471 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7FE3E19A1 for ; Fri, 18 Oct 2019 04:56:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 674832070B for ; Fri, 18 Oct 2019 04:56:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2393029AbfJRE4j (ORCPT ); Fri, 18 Oct 2019 00:56:39 -0400 Received: from cloud.peff.net ([104.130.231.41]:51694 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2389158AbfJRE4j (ORCPT ); Fri, 18 Oct 2019 00:56:39 -0400 Received: (qmail 9425 invoked by uid 109); 18 Oct 2019 04:56:40 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:56:40 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14208 invoked by uid 111); 18 Oct 2019 04:59:44 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 00:59:44 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:56:38 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 10/23] fsck_describe_object(): build on our get_object_name() primitive Message-ID: <20191018045638.GJ17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This isolates the implementation detail of using the decoration code to our put/get functions. Signed-off-by: Jeff King --- Arguably this could be squashed into the previous commit. By not doing so, it made describe_object() more of a pure code movement. fsck.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/fsck.c b/fsck.c index ecd5957362..b0c4de67c9 100644 --- a/fsck.c +++ b/fsck.c @@ -351,10 +351,7 @@ const char *fsck_describe_object(struct fsck_options *options, }; static int b = 0; struct strbuf *buf; - char *name = NULL; - - if (options->object_names) - name = lookup_decoration(options->object_names, obj); + const char *name = fsck_get_object_name(options, obj); buf = bufs + b; b = (b + 1) % ARRAY_SIZE(bufs); From patchwork Fri Oct 18 04:57:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197473 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B0DA9112B for ; Fri, 18 Oct 2019 04:57:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8C1652070B for ; Fri, 18 Oct 2019 04:57:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392865AbfJRE5j (ORCPT ); Fri, 18 Oct 2019 00:57:39 -0400 Received: from cloud.peff.net ([104.130.231.41]:51700 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1730688AbfJRE5j (ORCPT ); Fri, 18 Oct 2019 00:57:39 -0400 Received: (qmail 9445 invoked by uid 109); 18 Oct 2019 04:57:39 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:57:39 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14234 invoked by uid 111); 18 Oct 2019 05:00:43 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:00:43 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:57:37 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 11/23] fsck: use oids rather than objects for object_name API Message-ID: <20191018045737.GK17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We don't actually care about having object structs; we only need to look up decorations by oid. Let's accept this more limited form, which will give our callers more flexibility. Note that the decoration API we rely on uses object structs itself (even though it only looks at their oids). We can solve this by switching to a kh_oid_map (we could also use the hashmap oidmap, but it's more awkward for the simple case of just storing a void pointer). Signed-off-by: Jeff King --- builtin/fsck.c | 12 +++++----- fsck.c | 61 ++++++++++++++++++++++++++++---------------------- fsck.h | 9 ++++---- 3 files changed, 45 insertions(+), 37 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 237643cc1d..66fa727c14 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -52,7 +52,7 @@ static int name_objects; static const char *describe_object(struct object *obj) { - return fsck_describe_object(&fsck_walk_options, obj); + return fsck_describe_object(&fsck_walk_options, &obj->oid); } static const char *printable_type(struct object *obj) @@ -483,7 +483,7 @@ static void fsck_handle_reflog_oid(const char *refname, struct object_id *oid, obj = lookup_object(the_repository, oid); if (obj && (obj->flags & HAS_OBJ)) { if (timestamp) - fsck_put_object_name(&fsck_walk_options, obj, + fsck_put_object_name(&fsck_walk_options, oid, "%s@{%"PRItime"}", refname, timestamp); obj->flags |= USED; @@ -550,7 +550,7 @@ static int fsck_handle_ref(const char *refname, const struct object_id *oid, default_refs++; obj->flags |= USED; fsck_put_object_name(&fsck_walk_options, - obj, "%s", refname); + oid, "%s", refname); mark_object_reachable(obj); return 0; @@ -724,7 +724,7 @@ static int fsck_cache_tree(struct cache_tree *it) return 1; } obj->flags |= USED; - fsck_put_object_name(&fsck_walk_options, obj, ":"); + fsck_put_object_name(&fsck_walk_options, &it->oid, ":"); mark_object_reachable(obj); if (obj->type != OBJ_TREE) err |= objerror(obj, _("non-tree in cache-tree")); @@ -869,7 +869,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) } obj->flags |= USED; - fsck_put_object_name(&fsck_walk_options, obj, + fsck_put_object_name(&fsck_walk_options, &oid, "%s", arg); mark_object_reachable(obj); continue; @@ -906,7 +906,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) continue; obj = &blob->object; obj->flags |= USED; - fsck_put_object_name(&fsck_walk_options, obj, + fsck_put_object_name(&fsck_walk_options, &obj->oid, ":%s", active_cache[i]->name); mark_object_reachable(obj); } diff --git a/fsck.c b/fsck.c index b0c4de67c9..124c0184d4 100644 --- a/fsck.c +++ b/fsck.c @@ -315,48 +315,56 @@ static int report(struct fsck_options *options, struct object *object, void fsck_enable_object_names(struct fsck_options *options) { if (!options->object_names) - options->object_names = xcalloc(1, sizeof(struct decoration)); + options->object_names = kh_init_oid_map(); } -const char *fsck_get_object_name(struct fsck_options *options, struct object *obj) +const char *fsck_get_object_name(struct fsck_options *options, + const struct object_id *oid) { + khiter_t pos; if (!options->object_names) return NULL; - return lookup_decoration(options->object_names, obj); + pos = kh_get_oid_map(options->object_names, *oid); + if (pos >= kh_end(options->object_names)) + return NULL; + return kh_value(options->object_names, pos); } -void fsck_put_object_name(struct fsck_options *options, struct object *obj, +void fsck_put_object_name(struct fsck_options *options, + const struct object_id *oid, const char *fmt, ...) { va_list ap; struct strbuf buf = STRBUF_INIT; - char *existing; + khiter_t pos; + int hashret; if (!options->object_names) return; - existing = lookup_decoration(options->object_names, obj); - if (existing) + + pos = kh_put_oid_map(options->object_names, *oid, &hashret); + if (!hashret) return; va_start(ap, fmt); strbuf_vaddf(&buf, fmt, ap); - add_decoration(options->object_names, obj, strbuf_detach(&buf, NULL)); + kh_value(options->object_names, pos) = strbuf_detach(&buf, NULL); va_end(ap); } const char *fsck_describe_object(struct fsck_options *options, - struct object *obj) + const struct object_id *oid) { static struct strbuf bufs[] = { STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT }; static int b = 0; struct strbuf *buf; - const char *name = fsck_get_object_name(options, obj); + const char *name = fsck_get_object_name(options, oid); buf = bufs + b; b = (b + 1) % ARRAY_SIZE(bufs); strbuf_reset(buf); - strbuf_addstr(buf, oid_to_hex(&obj->oid)); + strbuf_addstr(buf, oid_to_hex(oid)); if (name) strbuf_addf(buf, " (%s)", name); @@ -373,7 +381,7 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op if (parse_tree(tree)) return -1; - name = fsck_get_object_name(options, &tree->object); + name = fsck_get_object_name(options, &tree->object.oid); if (init_tree_desc_gently(&desc, tree->buffer, tree->size)) return -1; while (tree_entry_gently(&desc, &entry)) { @@ -386,20 +394,20 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op if (S_ISDIR(entry.mode)) { obj = (struct object *)lookup_tree(the_repository, &entry.oid); if (name && obj) - fsck_put_object_name(options, obj, "%s%s/", + fsck_put_object_name(options, &entry.oid, "%s%s/", name, entry.path); result = options->walk(obj, OBJ_TREE, data, options); } else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode)) { obj = (struct object *)lookup_blob(the_repository, &entry.oid); if (name && obj) - fsck_put_object_name(options, obj, "%s%s", + fsck_put_object_name(options, &entry.oid, "%s%s", name, entry.path); result = options->walk(obj, OBJ_BLOB, data, options); } else { result = error("in tree %s: entry %s has bad mode %.6o", - fsck_describe_object(options, &tree->object), + fsck_describe_object(options, &tree->object.oid), entry.path, entry.mode); } if (result < 0) @@ -421,9 +429,9 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio if (parse_commit(commit)) return -1; - name = fsck_get_object_name(options, &commit->object); + name = fsck_get_object_name(options, &commit->object.oid); if (name) - fsck_put_object_name(options, &get_commit_tree(commit)->object, + fsck_put_object_name(options, get_commit_tree_oid(commit), "%s:", name); result = options->walk((struct object *)get_commit_tree(commit), @@ -452,18 +460,17 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio while (parents) { if (name) { - struct object *obj = &parents->item->object; + struct object_id *oid = &parents->item->object.oid; if (counter++) - fsck_put_object_name(options, obj, "%s^%d", + fsck_put_object_name(options, oid, "%s^%d", name, counter); else if (generation > 0) - fsck_put_object_name(options, obj, "%.*s~%d", + fsck_put_object_name(options, oid, "%.*s~%d", name_prefix_len, name, generation + 1); else - fsck_put_object_name(options, obj, "%s^", - name); + fsck_put_object_name(options, oid, "%s^", name); } result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options); if (result < 0) @@ -477,12 +484,12 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options) { - const char *name = fsck_get_object_name(options, &tag->object); + const char *name = fsck_get_object_name(options, &tag->object.oid); if (parse_tag(tag)) return -1; if (name) - fsck_put_object_name(options, tag->tagged, "%s", name); + fsck_put_object_name(options, &tag->tagged->oid, "%s", name); return options->walk(tag->tagged, OBJ_ANY, data, options); } @@ -505,7 +512,7 @@ int fsck_walk(struct object *obj, void *data, struct fsck_options *options) return fsck_walk_tag((struct tag *)obj, data, options); default: error("Unknown object type for %s", - fsck_describe_object(options, obj)); + fsck_describe_object(options, &obj->oid)); return -1; } } @@ -979,10 +986,10 @@ int fsck_error_function(struct fsck_options *o, struct object *obj, int msg_type, const char *message) { if (msg_type == FSCK_WARN) { - warning("object %s: %s", fsck_describe_object(o, obj), message); + warning("object %s: %s", fsck_describe_object(o, &obj->oid), message); return 0; } - error("object %s: %s", fsck_describe_object(o, obj), message); + error("object %s: %s", fsck_describe_object(o, &obj->oid), message); return 1; } diff --git a/fsck.h b/fsck.h index f6f0c40060..4397cab20f 100644 --- a/fsck.h +++ b/fsck.h @@ -38,7 +38,7 @@ struct fsck_options { unsigned strict:1; int *msg_type; struct oidset skiplist; - struct decoration *object_names; + kh_oid_map_t *object_names; }; #define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL, OIDSET_INIT } @@ -84,11 +84,12 @@ int fsck_finish(struct fsck_options *options); */ void fsck_enable_object_names(struct fsck_options *options); const char *fsck_get_object_name(struct fsck_options *options, - struct object *obj); + const struct object_id *oid); __attribute__((format (printf,3,4))) -void fsck_put_object_name(struct fsck_options *options, struct object *obj, +void fsck_put_object_name(struct fsck_options *options, + const struct object_id *oid, const char *fmt, ...); const char *fsck_describe_object(struct fsck_options *options, - struct object *obj); + const struct object_id *oid); #endif From patchwork Fri Oct 18 04:58:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197477 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5B35319A1 for ; Fri, 18 Oct 2019 04:58:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3A25C222C3 for ; Fri, 18 Oct 2019 04:58:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2409568AbfJRE6J (ORCPT ); Fri, 18 Oct 2019 00:58:09 -0400 Received: from cloud.peff.net ([104.130.231.41]:51716 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2409565AbfJRE6I (ORCPT ); Fri, 18 Oct 2019 00:58:08 -0400 Received: (qmail 9459 invoked by uid 109); 18 Oct 2019 04:58:09 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:58:09 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14348 invoked by uid 111); 18 Oct 2019 05:01:13 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:01:13 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:58:07 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 12/23] fsck: don't require object structs for display functions Message-ID: <20191018045807.GL17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Our printable_type() and describe_object() functions take whole object structs, but they really only care about the oid and type. Let's take those individually in order to give our callers more flexibility. Signed-off-by: Jeff King --- builtin/fsck.c | 69 +++++++++++++++++++++++++++----------------------- 1 file changed, 37 insertions(+), 32 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 66fa727c14..59c77c1baa 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -50,23 +50,20 @@ static int name_objects; #define ERROR_REFS 010 #define ERROR_COMMIT_GRAPH 020 -static const char *describe_object(struct object *obj) +static const char *describe_object(const struct object_id *oid) { - return fsck_describe_object(&fsck_walk_options, &obj->oid); + return fsck_describe_object(&fsck_walk_options, oid); } -static const char *printable_type(struct object *obj) +static const char *printable_type(const struct object_id *oid, + enum object_type type) { const char *ret; - if (obj->type == OBJ_NONE) { - enum object_type type = oid_object_info(the_repository, - &obj->oid, NULL); - if (type > 0) - object_as_type(the_repository, obj, type, 0); - } + if (type == OBJ_NONE) + type = oid_object_info(the_repository, oid, NULL); - ret = type_name(obj->type); + ret = type_name(type); if (!ret) ret = _("unknown"); @@ -101,7 +98,8 @@ static int objerror(struct object *obj, const char *err) errors_found |= ERROR_OBJECT; /* TRANSLATORS: e.g. error in tree 01bfda: */ fprintf_ln(stderr, _("error in %s %s: %s"), - printable_type(obj), describe_object(obj), err); + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid), err); return -1; } @@ -112,12 +110,14 @@ static int fsck_error_func(struct fsck_options *o, case FSCK_WARN: /* TRANSLATORS: e.g. warning in tree 01bfda: */ fprintf_ln(stderr, _("warning in %s %s: %s"), - printable_type(obj), describe_object(obj), message); + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid), message); return 0; case FSCK_ERROR: /* TRANSLATORS: e.g. error in tree 01bfda: */ fprintf_ln(stderr, _("error in %s %s: %s"), - printable_type(obj), describe_object(obj), message); + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid), message); return 1; default: BUG("%d (FSCK_IGNORE?) should never trigger this callback", type); @@ -138,7 +138,8 @@ static int mark_object(struct object *obj, int type, void *data, struct fsck_opt if (!obj) { /* ... these references to parent->fld are safe here */ printf_ln(_("broken link from %7s %s"), - printable_type(parent), describe_object(parent)); + printable_type(&parent->oid, parent->type), + describe_object(&parent->oid)); printf_ln(_("broken link from %7s %s"), (type == OBJ_ANY ? _("unknown") : type_name(type)), _("unknown")); @@ -166,10 +167,10 @@ static int mark_object(struct object *obj, int type, void *data, struct fsck_opt if (parent && !has_object_file(&obj->oid)) { printf_ln(_("broken link from %7s %s\n" " to %7s %s"), - printable_type(parent), - describe_object(parent), - printable_type(obj), - describe_object(obj)); + printable_type(&parent->oid, parent->type), + describe_object(&parent->oid), + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid)); errors_found |= ERROR_REACHABLE; } return 1; @@ -275,8 +276,9 @@ static void check_reachable_object(struct object *obj) return; if (has_object_pack(&obj->oid)) return; /* it is in pack - forget about it */ - printf_ln(_("missing %s %s"), printable_type(obj), - describe_object(obj)); + printf_ln(_("missing %s %s"), + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid)); errors_found |= ERROR_REACHABLE; return; } @@ -301,8 +303,9 @@ static void check_unreachable_object(struct object *obj) * since this is something that is prunable. */ if (show_unreachable) { - printf_ln(_("unreachable %s %s"), printable_type(obj), - describe_object(obj)); + printf_ln(_("unreachable %s %s"), + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid)); return; } @@ -320,12 +323,13 @@ static void check_unreachable_object(struct object *obj) */ if (!(obj->flags & USED)) { if (show_dangling) - printf_ln(_("dangling %s %s"), printable_type(obj), - describe_object(obj)); + printf_ln(_("dangling %s %s"), + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid)); if (write_lost_and_found) { char *filename = git_pathdup("lost-found/%s/%s", obj->type == OBJ_COMMIT ? "commit" : "other", - describe_object(obj)); + describe_object(&obj->oid)); FILE *f; if (safe_create_leading_directories_const(filename)) { @@ -338,7 +342,7 @@ static void check_unreachable_object(struct object *obj) if (stream_blob_to_fd(fileno(f), &obj->oid, NULL, 1)) die_errno(_("could not write '%s'"), filename); } else - fprintf(f, "%s\n", describe_object(obj)); + fprintf(f, "%s\n", describe_object(&obj->oid)); if (fclose(f)) die_errno(_("could not finish '%s'"), filename); @@ -357,7 +361,7 @@ static void check_unreachable_object(struct object *obj) static void check_object(struct object *obj) { if (verbose) - fprintf_ln(stderr, _("Checking %s"), describe_object(obj)); + fprintf_ln(stderr, _("Checking %s"), describe_object(&obj->oid)); if (obj->flags & REACHABLE) check_reachable_object(obj); @@ -415,7 +419,8 @@ static int fsck_obj(struct object *obj, void *buffer, unsigned long size) if (verbose) fprintf_ln(stderr, _("Checking %s %s"), - printable_type(obj), describe_object(obj)); + printable_type(&obj->oid, obj->type), + describe_object(&obj->oid)); if (fsck_walk(obj, NULL, &fsck_obj_options)) objerror(obj, _("broken links")); @@ -428,7 +433,7 @@ static int fsck_obj(struct object *obj, void *buffer, unsigned long size) if (!commit->parents && show_root) printf_ln(_("root %s"), - describe_object(&commit->object)); + describe_object(&commit->object.oid)); } if (obj->type == OBJ_TAG) { @@ -436,10 +441,10 @@ static int fsck_obj(struct object *obj, void *buffer, unsigned long size) if (show_tags && tag->tagged) { printf_ln(_("tagged %s %s (%s) in %s"), - printable_type(tag->tagged), - describe_object(tag->tagged), + printable_type(&tag->tagged->oid, tag->tagged->type), + describe_object(&tag->tagged->oid), tag->tag, - describe_object(&tag->object)); + describe_object(&tag->object.oid)); } } From patchwork Fri Oct 18 04:58:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197479 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2ED5E112B for ; Fri, 18 Oct 2019 04:58:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 14D01222C3 for ; Fri, 18 Oct 2019 04:58:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2407185AbfJRE6m (ORCPT ); Fri, 18 Oct 2019 00:58:42 -0400 Received: from cloud.peff.net ([104.130.231.41]:51722 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1728791AbfJRE6l (ORCPT ); Fri, 18 Oct 2019 00:58:41 -0400 Received: (qmail 9466 invoked by uid 109); 18 Oct 2019 04:58:43 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:58:43 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14365 invoked by uid 111); 18 Oct 2019 05:01:46 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:01:46 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:58:40 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 13/23] fsck: only provide oid/type in fsck_error callback Message-ID: <20191018045840.GM17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org None of the callbacks actually care about having a "struct object"; they're happy with just the oid and type information. So let's give ourselves more flexibility to avoid having a "struct object" by just passing the broken-down fields. Note that the callback already takes a "type" field for the fsck message type. We'll rename that to "msg_type" (and use "object_type" for the object type) to make the distinction explicit. Signed-off-by: Jeff King --- builtin/fsck.c | 17 ++++++++++------- fsck.c | 11 +++++++---- fsck.h | 6 ++++-- 3 files changed, 21 insertions(+), 13 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 59c77c1baa..8d13794b14 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -104,23 +104,26 @@ static int objerror(struct object *obj, const char *err) } static int fsck_error_func(struct fsck_options *o, - struct object *obj, int type, const char *message) + const struct object_id *oid, + enum object_type object_type, + int msg_type, const char *message) { - switch (type) { + switch (msg_type) { case FSCK_WARN: /* TRANSLATORS: e.g. warning in tree 01bfda: */ fprintf_ln(stderr, _("warning in %s %s: %s"), - printable_type(&obj->oid, obj->type), - describe_object(&obj->oid), message); + printable_type(oid, object_type), + describe_object(oid), message); return 0; case FSCK_ERROR: /* TRANSLATORS: e.g. error in tree 01bfda: */ fprintf_ln(stderr, _("error in %s %s: %s"), - printable_type(&obj->oid, obj->type), - describe_object(&obj->oid), message); + printable_type(oid, object_type), + describe_object(oid), message); return 1; default: - BUG("%d (FSCK_IGNORE?) should never trigger this callback", type); + BUG("%d (FSCK_IGNORE?) should never trigger this callback", + msg_type); } } diff --git a/fsck.c b/fsck.c index 124c0184d4..c036ba09ab 100644 --- a/fsck.c +++ b/fsck.c @@ -305,7 +305,8 @@ static int report(struct fsck_options *options, struct object *object, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); - result = options->error_func(options, object, msg_type, sb.buf); + result = options->error_func(options, &object->oid, object->type, + msg_type, sb.buf); strbuf_release(&sb); va_end(ap); @@ -983,13 +984,15 @@ int fsck_object(struct object *obj, void *data, unsigned long size, } int fsck_error_function(struct fsck_options *o, - struct object *obj, int msg_type, const char *message) + const struct object_id *oid, + enum object_type object_type, + int msg_type, const char *message) { if (msg_type == FSCK_WARN) { - warning("object %s: %s", fsck_describe_object(o, &obj->oid), message); + warning("object %s: %s", fsck_describe_object(o, oid), message); return 0; } - error("object %s: %s", fsck_describe_object(o, &obj->oid), message); + error("object %s: %s", fsck_describe_object(o, oid), message); return 1; } diff --git a/fsck.h b/fsck.h index 4397cab20f..f81aedf12f 100644 --- a/fsck.h +++ b/fsck.h @@ -27,10 +27,12 @@ typedef int (*fsck_walk_func)(struct object *obj, int type, void *data, struct f /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */ typedef int (*fsck_error)(struct fsck_options *o, - struct object *obj, int type, const char *message); + const struct object_id *oid, enum object_type object_type, + int msg_type, const char *message); int fsck_error_function(struct fsck_options *o, - struct object *obj, int type, const char *message); + const struct object_id *oid, enum object_type object_type, + int msg_type, const char *message); struct fsck_options { fsck_walk_func walk; From patchwork Fri Oct 18 04:58:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197481 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2FDD219A1 for ; Fri, 18 Oct 2019 04:58:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 178E8222CB for ; Fri, 18 Oct 2019 04:58:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2439295AbfJRE6x (ORCPT ); Fri, 18 Oct 2019 00:58:53 -0400 Received: from cloud.peff.net ([104.130.231.41]:51728 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2436473AbfJRE6w (ORCPT ); Fri, 18 Oct 2019 00:58:52 -0400 Received: (qmail 9473 invoked by uid 109); 18 Oct 2019 04:58:53 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:58:53 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14381 invoked by uid 111); 18 Oct 2019 05:01:57 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:01:57 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:58:51 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 14/23] fsck: only require an oid for skiplist functions Message-ID: <20191018045851.GN17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The skiplist is inherently an oidset, so we don't need a full object struct. Let's take just the oid to give our callers more flexibility. Signed-off-by: Jeff King --- fsck.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fsck.c b/fsck.c index c036ba09ab..2309c40a11 100644 --- a/fsck.c +++ b/fsck.c @@ -277,9 +277,10 @@ static void append_msg_id(struct strbuf *sb, const char *msg_id) strbuf_addstr(sb, ": "); } -static int object_on_skiplist(struct fsck_options *opts, struct object *obj) +static int object_on_skiplist(struct fsck_options *opts, + const struct object_id *oid) { - return opts && obj && oidset_contains(&opts->skiplist, &obj->oid); + return opts && oid && oidset_contains(&opts->skiplist, oid); } __attribute__((format (printf, 4, 5))) @@ -293,7 +294,7 @@ static int report(struct fsck_options *options, struct object *object, if (msg_type == FSCK_IGNORE) return 0; - if (object_on_skiplist(options, object)) + if (object_on_skiplist(options, &object->oid)) return 0; if (msg_type == FSCK_FATAL) @@ -935,7 +936,7 @@ static int fsck_blob(struct blob *blob, const char *buf, return 0; oidset_insert(&gitmodules_done, &blob->object.oid); - if (object_on_skiplist(options, &blob->object)) + if (object_on_skiplist(options, &blob->object.oid)) return 0; if (!buf) { From patchwork Fri Oct 18 04:59:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197483 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FD99112B for ; Fri, 18 Oct 2019 04:59:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 28ABE222CD for ; Fri, 18 Oct 2019 04:59:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2439354AbfJRE7S (ORCPT ); Fri, 18 Oct 2019 00:59:18 -0400 Received: from cloud.peff.net ([104.130.231.41]:51734 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S2395281AbfJRE7R (ORCPT ); Fri, 18 Oct 2019 00:59:17 -0400 Received: (qmail 9480 invoked by uid 109); 18 Oct 2019 04:59:17 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:59:17 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14397 invoked by uid 111); 18 Oct 2019 05:02:21 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:02:21 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:59:15 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 15/23] fsck: don't require an object struct for report() Message-ID: <20191018045915.GO17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The report() function really only cares about the oid and type of the object, not the full object struct. Let's convert it to take those two items separately, which gives our callers more flexibility. This makes some already-long lines even longer. I've mostly left them, as our eventual goal is to shrink these down as we continue refactoring (e.g., "&item->object" becomes "&item->object.oid, item->object.type", but will eventually shrink down to "oid, type"). Signed-off-by: Jeff King --- fsck.c | 128 +++++++++++++++++++++++++++++++-------------------------- 1 file changed, 69 insertions(+), 59 deletions(-) diff --git a/fsck.c b/fsck.c index 2309c40a11..465247be71 100644 --- a/fsck.c +++ b/fsck.c @@ -283,9 +283,10 @@ static int object_on_skiplist(struct fsck_options *opts, return opts && oid && oidset_contains(&opts->skiplist, oid); } -__attribute__((format (printf, 4, 5))) -static int report(struct fsck_options *options, struct object *object, - enum fsck_msg_id id, const char *fmt, ...) +__attribute__((format (printf, 5, 6))) +static int report(struct fsck_options *options, + const struct object_id *oid, enum object_type object_type, + enum fsck_msg_id id, const char *fmt, ...) { va_list ap; struct strbuf sb = STRBUF_INIT; @@ -294,7 +295,7 @@ static int report(struct fsck_options *options, struct object *object, if (msg_type == FSCK_IGNORE) return 0; - if (object_on_skiplist(options, &object->oid)) + if (object_on_skiplist(options, oid)) return 0; if (msg_type == FSCK_FATAL) @@ -306,7 +307,7 @@ static int report(struct fsck_options *options, struct object *object, va_start(ap, fmt); strbuf_vaddf(&sb, fmt, ap); - result = options->error_func(options, &object->oid, object->type, + result = options->error_func(options, oid, object_type, msg_type, sb.buf); strbuf_release(&sb); va_end(ap); @@ -585,7 +586,7 @@ static int fsck_tree(struct tree *item, const char *o_name; if (init_tree_desc_gently(&desc, buffer, size)) { - retval += report(options, &item->object, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); return retval; } @@ -611,13 +612,14 @@ static int fsck_tree(struct tree *item, if (!S_ISLNK(mode)) oidset_insert(&gitmodules_found, oid); else - retval += report(options, &item->object, + retval += report(options, + &item->object.oid, item->object.type, FSCK_MSG_GITMODULES_SYMLINK, ".gitmodules is a symbolic link"); } if (update_tree_entry_gently(&desc)) { - retval += report(options, &item->object, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); break; } @@ -662,25 +664,25 @@ static int fsck_tree(struct tree *item, } if (has_null_sha1) - retval += report(options, &item->object, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1"); if (has_full_path) - retval += report(options, &item->object, FSCK_MSG_FULL_PATHNAME, "contains full pathnames"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_FULL_PATHNAME, "contains full pathnames"); if (has_empty_name) - retval += report(options, &item->object, FSCK_MSG_EMPTY_NAME, "contains empty pathname"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_EMPTY_NAME, "contains empty pathname"); if (has_dot) - retval += report(options, &item->object, FSCK_MSG_HAS_DOT, "contains '.'"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOT, "contains '.'"); if (has_dotdot) - retval += report(options, &item->object, FSCK_MSG_HAS_DOTDOT, "contains '..'"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOTDOT, "contains '..'"); if (has_dotgit) - retval += report(options, &item->object, FSCK_MSG_HAS_DOTGIT, "contains '.git'"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOTGIT, "contains '.git'"); if (has_zero_pad) - retval += report(options, &item->object, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes"); if (has_bad_modes) - retval += report(options, &item->object, FSCK_MSG_BAD_FILEMODE, "contains bad file modes"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_FILEMODE, "contains bad file modes"); if (has_dup_entries) - retval += report(options, &item->object, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries"); if (not_properly_sorted) - retval += report(options, &item->object, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted"); + retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted"); return retval; } @@ -693,7 +695,7 @@ static int verify_headers(const void *data, unsigned long size, for (i = 0; i < size; i++) { switch (buffer[i]) { case '\0': - return report(options, obj, + return report(options, &obj->oid, obj->type, FSCK_MSG_NUL_IN_HEADER, "unterminated header: NUL at offset %ld", i); case '\n': @@ -711,7 +713,7 @@ static int verify_headers(const void *data, unsigned long size, if (size && buffer[size - 1] == '\n') return 0; - return report(options, obj, + return report(options, &obj->oid, obj->type, FSCK_MSG_UNTERMINATED_HEADER, "unterminated header"); } @@ -725,28 +727,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option (*ident)++; if (*p == '<') - return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); + return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); p += strcspn(p, "<>\n"); if (*p == '>') - return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name"); + return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name"); if (*p != '<') - return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email"); + return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email"); if (p[-1] != ' ') - return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); + return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); p++; p += strcspn(p, "<>\n"); if (*p != '>') - return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email"); + return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email"); p++; if (*p != ' ') - return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date"); + return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date"); p++; if (*p == '0' && p[1] != ' ') - return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date"); + return report(options, &obj->oid, obj->type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date"); if (date_overflows(parse_timestamp(p, &end, 10))) - return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow"); + return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow"); if ((end == p || *end != ' ')) - return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date"); + return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date"); p = end + 1; if ((*p != '+' && *p != '-') || !isdigit(p[1]) || @@ -754,7 +756,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option !isdigit(p[3]) || !isdigit(p[4]) || (p[5] != '\n')) - return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone"); + return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone"); p += 6; return 0; } @@ -772,16 +774,16 @@ static int fsck_commit(struct commit *commit, const char *buffer, return -1; if (!skip_prefix(buffer, "tree ", &buffer)) - return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line"); + return report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line"); if (parse_oid_hex(buffer, &tree_oid, &p) || *p != '\n') { - err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1"); + err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1"); if (err) return err; } buffer = p + 1; while (skip_prefix(buffer, "parent ", &buffer)) { if (parse_oid_hex(buffer, &oid, &p) || *p != '\n') { - err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1"); + err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1"); if (err) return err; } @@ -795,18 +797,18 @@ static int fsck_commit(struct commit *commit, const char *buffer, return err; } if (author_count < 1) - err = report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line"); + err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line"); else if (author_count > 1) - err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines"); + err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines"); if (err) return err; if (!skip_prefix(buffer, "committer ", &buffer)) - return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line"); + return report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line"); err = fsck_ident(&buffer, &commit->object, options); if (err) return err; if (memchr(buffer_begin, '\0', size)) { - err = report(options, &commit->object, FSCK_MSG_NUL_IN_COMMIT, + err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_NUL_IN_COMMIT, "NUL byte in the commit object body"); if (err) return err; @@ -828,45 +830,46 @@ static int fsck_tag(struct tag *tag, const char *buffer, goto done; if (!skip_prefix(buffer, "object ", &buffer)) { - ret = report(options, &tag->object, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line"); goto done; } if (parse_oid_hex(buffer, &oid, &p) || *p != '\n') { - ret = report(options, &tag->object, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1"); if (ret) goto done; } buffer = p + 1; if (!skip_prefix(buffer, "type ", &buffer)) { - ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line"); goto done; } eol = strchr(buffer, '\n'); if (!eol) { - ret = report(options, &tag->object, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line"); goto done; } if (type_from_string_gently(buffer, eol - buffer, 1) < 0) - ret = report(options, &tag->object, FSCK_MSG_BAD_TYPE, "invalid 'type' value"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_BAD_TYPE, "invalid 'type' value"); if (ret) goto done; buffer = eol + 1; if (!skip_prefix(buffer, "tag ", &buffer)) { - ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line"); goto done; } eol = strchr(buffer, '\n'); if (!eol) { - ret = report(options, &tag->object, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line"); goto done; } strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer); if (check_refname_format(sb.buf, 0)) { - ret = report(options, &tag->object, FSCK_MSG_BAD_TAG_NAME, - "invalid 'tag' name: %.*s", - (int)(eol - buffer), buffer); + ret = report(options, &tag->object.oid, tag->object.type, + FSCK_MSG_BAD_TAG_NAME, + "invalid 'tag' name: %.*s", + (int)(eol - buffer), buffer); if (ret) goto done; } @@ -874,7 +877,7 @@ static int fsck_tag(struct tag *tag, const char *buffer, if (!skip_prefix(buffer, "tagger ", &buffer)) { /* early tags do not contain 'tagger' lines; warn only */ - ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line"); + ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line"); if (ret) goto done; } @@ -905,19 +908,22 @@ static int fsck_gitmodules_fn(const char *var, const char *value, void *vdata) name = xmemdupz(subsection, subsection_len); if (check_submodule_name(name) < 0) - data->ret |= report(data->options, data->obj, + data->ret |= report(data->options, + &data->obj->oid, data->obj->type, FSCK_MSG_GITMODULES_NAME, "disallowed submodule name: %s", name); if (!strcmp(key, "url") && value && looks_like_command_line_option(value)) - data->ret |= report(data->options, data->obj, + data->ret |= report(data->options, + &data->obj->oid, data->obj->type, FSCK_MSG_GITMODULES_URL, "disallowed submodule url: %s", value); if (!strcmp(key, "path") && value && looks_like_command_line_option(value)) - data->ret |= report(data->options, data->obj, + data->ret |= report(data->options, + &data->obj->oid, data->obj->type, FSCK_MSG_GITMODULES_PATH, "disallowed submodule path: %s", value); @@ -945,7 +951,7 @@ static int fsck_blob(struct blob *blob, const char *buf, * blob too gigantic to load into memory. Let's just consider * that an error. */ - return report(options, &blob->object, + return report(options, &blob->object.oid, blob->object.type, FSCK_MSG_GITMODULES_LARGE, ".gitmodules too large to parse"); } @@ -956,7 +962,7 @@ static int fsck_blob(struct blob *blob, const char *buf, config_opts.error_action = CONFIG_ERROR_SILENT; if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, ".gitmodules", buf, size, &data, &config_opts)) - data.ret |= report(options, &blob->object, + data.ret |= report(options, &blob->object.oid, blob->object.type, FSCK_MSG_GITMODULES_PARSE, "could not parse gitmodules blob"); @@ -967,7 +973,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, struct fsck_options *options) { if (!obj) - return report(options, obj, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck"); + return report(options, NULL, OBJ_NONE, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck"); if (obj->type == OBJ_BLOB) return fsck_blob((struct blob *)obj, data, size, options); @@ -980,8 +986,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size, return fsck_tag((struct tag *) obj, (const char *) data, size, options); - return report(options, obj, FSCK_MSG_UNKNOWN_TYPE, "unknown type '%d' (internal fsck error)", - obj->type); + return report(options, &obj->oid, obj->type, + FSCK_MSG_UNKNOWN_TYPE, + "unknown type '%d' (internal fsck error)", + obj->type); } int fsck_error_function(struct fsck_options *o, @@ -1016,7 +1024,7 @@ int fsck_finish(struct fsck_options *options) blob = lookup_blob(the_repository, oid); if (!blob) { struct object *obj = lookup_unknown_object(oid); - ret |= report(options, obj, + ret |= report(options, &obj->oid, obj->type, FSCK_MSG_GITMODULES_BLOB, "non-blob found at .gitmodules"); continue; @@ -1026,7 +1034,8 @@ int fsck_finish(struct fsck_options *options) if (!buf) { if (is_promisor_object(&blob->object.oid)) continue; - ret |= report(options, &blob->object, + ret |= report(options, + &blob->object.oid, blob->object.type, FSCK_MSG_GITMODULES_MISSING, "unable to read .gitmodules blob"); continue; @@ -1035,7 +1044,8 @@ int fsck_finish(struct fsck_options *options) if (type == OBJ_BLOB) ret |= fsck_blob(blob, buf, size, options); else - ret |= report(options, &blob->object, + ret |= report(options, + &blob->object.oid, blob->object.type, FSCK_MSG_GITMODULES_BLOB, "non-blob found at .gitmodules"); free(buf); From patchwork Fri Oct 18 04:59:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197485 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 51A0219A1 for ; Fri, 18 Oct 2019 04:59:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3A55A222C5 for ; Fri, 18 Oct 2019 04:59:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2395281AbfJRE7a (ORCPT ); Fri, 18 Oct 2019 00:59:30 -0400 Received: from cloud.peff.net ([104.130.231.41]:51740 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1727020AbfJRE7a (ORCPT ); Fri, 18 Oct 2019 00:59:30 -0400 Received: (qmail 9486 invoked by uid 109); 18 Oct 2019 04:59:31 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:59:31 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14413 invoked by uid 111); 18 Oct 2019 05:02:35 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:02:35 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:59:29 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 16/23] fsck: accept an oid instead of a "struct blob" for fsck_blob() Message-ID: <20191018045928.GP17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We don't actually need any information from the object struct except its oid (and the type, of course, but that's implicitly OBJ_BLOB). This gives our callers more flexibility to drop the object structs, too. Signed-off-by: Jeff King --- fsck.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/fsck.c b/fsck.c index 465247be71..6e9640a1a6 100644 --- a/fsck.c +++ b/fsck.c @@ -890,7 +890,7 @@ static int fsck_tag(struct tag *tag, const char *buffer, } struct fsck_gitmodules_data { - struct object *obj; + const struct object_id *oid; struct fsck_options *options; int ret; }; @@ -909,21 +909,21 @@ static int fsck_gitmodules_fn(const char *var, const char *value, void *vdata) name = xmemdupz(subsection, subsection_len); if (check_submodule_name(name) < 0) data->ret |= report(data->options, - &data->obj->oid, data->obj->type, + data->oid, OBJ_BLOB, FSCK_MSG_GITMODULES_NAME, "disallowed submodule name: %s", name); if (!strcmp(key, "url") && value && looks_like_command_line_option(value)) data->ret |= report(data->options, - &data->obj->oid, data->obj->type, + data->oid, OBJ_BLOB, FSCK_MSG_GITMODULES_URL, "disallowed submodule url: %s", value); if (!strcmp(key, "path") && value && looks_like_command_line_option(value)) data->ret |= report(data->options, - &data->obj->oid, data->obj->type, + data->oid, OBJ_BLOB, FSCK_MSG_GITMODULES_PATH, "disallowed submodule path: %s", value); @@ -932,17 +932,17 @@ static int fsck_gitmodules_fn(const char *var, const char *value, void *vdata) return 0; } -static int fsck_blob(struct blob *blob, const char *buf, +static int fsck_blob(const struct object_id *oid, const char *buf, unsigned long size, struct fsck_options *options) { struct fsck_gitmodules_data data; struct config_options config_opts = { 0 }; - if (!oidset_contains(&gitmodules_found, &blob->object.oid)) + if (!oidset_contains(&gitmodules_found, oid)) return 0; - oidset_insert(&gitmodules_done, &blob->object.oid); + oidset_insert(&gitmodules_done, oid); - if (object_on_skiplist(options, &blob->object.oid)) + if (object_on_skiplist(options, oid)) return 0; if (!buf) { @@ -951,18 +951,18 @@ static int fsck_blob(struct blob *blob, const char *buf, * blob too gigantic to load into memory. Let's just consider * that an error. */ - return report(options, &blob->object.oid, blob->object.type, + return report(options, oid, OBJ_BLOB, FSCK_MSG_GITMODULES_LARGE, ".gitmodules too large to parse"); } - data.obj = &blob->object; + data.oid = oid; data.options = options; data.ret = 0; config_opts.error_action = CONFIG_ERROR_SILENT; if (git_config_from_mem(fsck_gitmodules_fn, CONFIG_ORIGIN_BLOB, ".gitmodules", buf, size, &data, &config_opts)) - data.ret |= report(options, &blob->object.oid, blob->object.type, + data.ret |= report(options, oid, OBJ_BLOB, FSCK_MSG_GITMODULES_PARSE, "could not parse gitmodules blob"); @@ -976,7 +976,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, return report(options, NULL, OBJ_NONE, FSCK_MSG_BAD_OBJECT_SHA1, "no valid object to fsck"); if (obj->type == OBJ_BLOB) - return fsck_blob((struct blob *)obj, data, size, options); + return fsck_blob(&obj->oid, data, size, options); if (obj->type == OBJ_TREE) return fsck_tree((struct tree *) obj, data, size, options); if (obj->type == OBJ_COMMIT) @@ -1042,7 +1042,7 @@ int fsck_finish(struct fsck_options *options) } if (type == OBJ_BLOB) - ret |= fsck_blob(blob, buf, size, options); + ret |= fsck_blob(&blob->object.oid, buf, size, options); else ret |= report(options, &blob->object.oid, blob->object.type, From patchwork Fri Oct 18 04:59:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197487 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3C7CD19A1 for ; Fri, 18 Oct 2019 04:59:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 169D2222C5 for ; Fri, 18 Oct 2019 04:59:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2395205AbfJRE7z (ORCPT ); Fri, 18 Oct 2019 00:59:55 -0400 Received: from cloud.peff.net ([104.130.231.41]:51746 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726315AbfJRE7z (ORCPT ); Fri, 18 Oct 2019 00:59:55 -0400 Received: (qmail 9494 invoked by uid 109); 18 Oct 2019 04:59:56 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 04:59:56 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14429 invoked by uid 111); 18 Oct 2019 05:03:00 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:03:00 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 00:59:54 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 17/23] fsck: drop blob struct from fsck_finish() Message-ID: <20191018045953.GQ17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Since fsck_blob() no longer requires us to have a "struct blob", we don't need to create one. Which also means we don't need to worry about handling the case that lookup_blob() returns NULL (we'll still catch wrongly-identified blobs when we read the actual object contents and type from disk). Signed-off-by: Jeff King --- fsck.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/fsck.c b/fsck.c index 6e9640a1a6..4ff0ceb4ac 100644 --- a/fsck.c +++ b/fsck.c @@ -1013,7 +1013,6 @@ int fsck_finish(struct fsck_options *options) oidset_iter_init(&gitmodules_found, &iter); while ((oid = oidset_iter_next(&iter))) { - struct blob *blob; enum object_type type; unsigned long size; char *buf; @@ -1021,31 +1020,22 @@ int fsck_finish(struct fsck_options *options) if (oidset_contains(&gitmodules_done, oid)) continue; - blob = lookup_blob(the_repository, oid); - if (!blob) { - struct object *obj = lookup_unknown_object(oid); - ret |= report(options, &obj->oid, obj->type, - FSCK_MSG_GITMODULES_BLOB, - "non-blob found at .gitmodules"); - continue; - } - buf = read_object_file(oid, &type, &size); if (!buf) { - if (is_promisor_object(&blob->object.oid)) + if (is_promisor_object(oid)) continue; ret |= report(options, - &blob->object.oid, blob->object.type, + oid, OBJ_BLOB, FSCK_MSG_GITMODULES_MISSING, "unable to read .gitmodules blob"); continue; } if (type == OBJ_BLOB) - ret |= fsck_blob(&blob->object.oid, buf, size, options); + ret |= fsck_blob(oid, buf, size, options); else ret |= report(options, - &blob->object.oid, blob->object.type, + oid, type, FSCK_MSG_GITMODULES_BLOB, "non-blob found at .gitmodules"); free(buf); From patchwork Fri Oct 18 05:00:04 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197489 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2DF8313BD for ; Fri, 18 Oct 2019 05:00:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0C307222C3 for ; Fri, 18 Oct 2019 05:00:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2441655AbfJRFAG (ORCPT ); Fri, 18 Oct 2019 01:00:06 -0400 Received: from cloud.peff.net ([104.130.231.41]:51752 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726315AbfJRFAF (ORCPT ); Fri, 18 Oct 2019 01:00:05 -0400 Received: (qmail 9500 invoked by uid 109); 18 Oct 2019 05:00:06 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:00:06 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14447 invoked by uid 111); 18 Oct 2019 05:03:10 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:03:10 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:00:04 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 18/23] fsck: don't require an object struct for fsck_ident() Message-ID: <20191018050004.GR17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The only thing we do with the struct is pass its oid and type to report(). We can just take those explicitly, which gives our callers more flexibility. Signed-off-by: Jeff King --- fsck.c | 30 ++++++++++++++++-------------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/fsck.c b/fsck.c index 4ff0ceb4ac..e1d06fb210 100644 --- a/fsck.c +++ b/fsck.c @@ -717,7 +717,9 @@ static int verify_headers(const void *data, unsigned long size, FSCK_MSG_UNTERMINATED_HEADER, "unterminated header"); } -static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options) +static int fsck_ident(const char **ident, + const struct object_id *oid, enum object_type type, + struct fsck_options *options) { const char *p = *ident; char *end; @@ -727,28 +729,28 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option (*ident)++; if (*p == '<') - return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); + return report(options, oid, type, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); p += strcspn(p, "<>\n"); if (*p == '>') - return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name"); + return report(options, oid, type, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name"); if (*p != '<') - return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email"); + return report(options, oid, type, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email"); if (p[-1] != ' ') - return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); + return report(options, oid, type, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email"); p++; p += strcspn(p, "<>\n"); if (*p != '>') - return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email"); + return report(options, oid, type, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email"); p++; if (*p != ' ') - return report(options, &obj->oid, obj->type, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date"); + return report(options, oid, type, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date"); p++; if (*p == '0' && p[1] != ' ') - return report(options, &obj->oid, obj->type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date"); + return report(options, oid, type, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date"); if (date_overflows(parse_timestamp(p, &end, 10))) - return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow"); + return report(options, oid, type, FSCK_MSG_BAD_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow"); if ((end == p || *end != ' ')) - return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date"); + return report(options, oid, type, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date"); p = end + 1; if ((*p != '+' && *p != '-') || !isdigit(p[1]) || @@ -756,7 +758,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option !isdigit(p[3]) || !isdigit(p[4]) || (p[5] != '\n')) - return report(options, &obj->oid, obj->type, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone"); + return report(options, oid, type, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone"); p += 6; return 0; } @@ -792,7 +794,7 @@ static int fsck_commit(struct commit *commit, const char *buffer, author_count = 0; while (skip_prefix(buffer, "author ", &buffer)) { author_count++; - err = fsck_ident(&buffer, &commit->object, options); + err = fsck_ident(&buffer, &commit->object.oid, commit->object.type, options); if (err) return err; } @@ -804,7 +806,7 @@ static int fsck_commit(struct commit *commit, const char *buffer, return err; if (!skip_prefix(buffer, "committer ", &buffer)) return report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line"); - err = fsck_ident(&buffer, &commit->object, options); + err = fsck_ident(&buffer, &commit->object.oid, commit->object.type, options); if (err) return err; if (memchr(buffer_begin, '\0', size)) { @@ -882,7 +884,7 @@ static int fsck_tag(struct tag *tag, const char *buffer, goto done; } else - ret = fsck_ident(&buffer, &tag->object, options); + ret = fsck_ident(&buffer, &tag->object.oid, tag->object.type, options); done: strbuf_release(&sb); From patchwork Fri Oct 18 05:00:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197493 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F36813BD for ; Fri, 18 Oct 2019 05:00:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8767F21897 for ; Fri, 18 Oct 2019 05:00:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2393320AbfJRFAw (ORCPT ); Fri, 18 Oct 2019 01:00:52 -0400 Received: from cloud.peff.net ([104.130.231.41]:51760 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1727627AbfJRFAv (ORCPT ); Fri, 18 Oct 2019 01:00:51 -0400 Received: (qmail 9509 invoked by uid 109); 18 Oct 2019 05:00:53 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:00:53 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14476 invoked by uid 111); 18 Oct 2019 05:03:56 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:03:56 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:00:50 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 19/23] fsck: don't require an object struct in verify_headers() Message-ID: <20191018050050.GS17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We only need the oid and type to pass on to report(). Let's accept the broken-out parameters to give our callers more flexibility. Signed-off-by: Jeff King --- fsck.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/fsck.c b/fsck.c index e1d06fb210..50c93200ed 100644 --- a/fsck.c +++ b/fsck.c @@ -687,7 +687,8 @@ static int fsck_tree(struct tree *item, } static int verify_headers(const void *data, unsigned long size, - struct object *obj, struct fsck_options *options) + const struct object_id *oid, enum object_type type, + struct fsck_options *options) { const char *buffer = (const char *)data; unsigned long i; @@ -695,7 +696,7 @@ static int verify_headers(const void *data, unsigned long size, for (i = 0; i < size; i++) { switch (buffer[i]) { case '\0': - return report(options, &obj->oid, obj->type, + return report(options, oid, type, FSCK_MSG_NUL_IN_HEADER, "unterminated header: NUL at offset %ld", i); case '\n': @@ -713,7 +714,7 @@ static int verify_headers(const void *data, unsigned long size, if (size && buffer[size - 1] == '\n') return 0; - return report(options, &obj->oid, obj->type, + return report(options, oid, type, FSCK_MSG_UNTERMINATED_HEADER, "unterminated header"); } @@ -772,7 +773,8 @@ static int fsck_commit(struct commit *commit, const char *buffer, const char *buffer_begin = buffer; const char *p; - if (verify_headers(buffer, size, &commit->object, options)) + if (verify_headers(buffer, size, &commit->object.oid, + commit->object.type, options)) return -1; if (!skip_prefix(buffer, "tree ", &buffer)) @@ -827,7 +829,7 @@ static int fsck_tag(struct tag *tag, const char *buffer, struct strbuf sb = STRBUF_INIT; const char *p; - ret = verify_headers(buffer, size, &tag->object, options); + ret = verify_headers(buffer, size, &tag->object.oid, tag->object.type, options); if (ret) goto done; From patchwork Fri Oct 18 05:00:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197495 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B6301112B for ; Fri, 18 Oct 2019 05:01:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 93E8A21897 for ; Fri, 18 Oct 2019 05:01:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390308AbfJRFBA (ORCPT ); Fri, 18 Oct 2019 01:01:00 -0400 Received: from cloud.peff.net ([104.130.231.41]:51762 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1727627AbfJRFBA (ORCPT ); Fri, 18 Oct 2019 01:01:00 -0400 Received: (qmail 9512 invoked by uid 109); 18 Oct 2019 05:01:01 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:01:01 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14481 invoked by uid 111); 18 Oct 2019 05:04:05 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:04:05 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:00:59 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 20/23] fsck: rename vague "oid" local variables Message-ID: <20191018050059.GT17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In fsck_commit() and fsck_tag(), we have local "oid" variables used for parsing parent and tagged-object oids. Let's give these more specific names in preparation for the functions taking an "oid" parameter for the object we're checking. Signed-off-by: Jeff King --- fsck.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fsck.c b/fsck.c index 50c93200ed..42e7d1f71f 100644 --- a/fsck.c +++ b/fsck.c @@ -767,7 +767,7 @@ static int fsck_ident(const char **ident, static int fsck_commit(struct commit *commit, const char *buffer, unsigned long size, struct fsck_options *options) { - struct object_id tree_oid, oid; + struct object_id tree_oid, parent_oid; unsigned author_count; int err; const char *buffer_begin = buffer; @@ -786,7 +786,7 @@ static int fsck_commit(struct commit *commit, const char *buffer, } buffer = p + 1; while (skip_prefix(buffer, "parent ", &buffer)) { - if (parse_oid_hex(buffer, &oid, &p) || *p != '\n') { + if (parse_oid_hex(buffer, &parent_oid, &p) || *p != '\n') { err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1"); if (err) return err; @@ -823,7 +823,7 @@ static int fsck_commit(struct commit *commit, const char *buffer, static int fsck_tag(struct tag *tag, const char *buffer, unsigned long size, struct fsck_options *options) { - struct object_id oid; + struct object_id tagged_oid; int ret = 0; char *eol; struct strbuf sb = STRBUF_INIT; @@ -837,7 +837,7 @@ static int fsck_tag(struct tag *tag, const char *buffer, ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line"); goto done; } - if (parse_oid_hex(buffer, &oid, &p) || *p != '\n') { + if (parse_oid_hex(buffer, &tagged_oid, &p) || *p != '\n') { ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1"); if (ret) goto done; From patchwork Fri Oct 18 05:01:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197497 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 702B313BD for ; Fri, 18 Oct 2019 05:01:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 581FC222C3 for ; Fri, 18 Oct 2019 05:01:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2407379AbfJRFB2 (ORCPT ); Fri, 18 Oct 2019 01:01:28 -0400 Received: from cloud.peff.net ([104.130.231.41]:51772 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1733292AbfJRFB2 (ORCPT ); Fri, 18 Oct 2019 01:01:28 -0400 Received: (qmail 9525 invoked by uid 109); 18 Oct 2019 05:01:28 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:01:28 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14510 invoked by uid 111); 18 Oct 2019 05:04:32 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:04:32 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:01:26 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 21/23] fsck: accept an oid instead of a "struct tag" for fsck_tag() Message-ID: <20191018050125.GU17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We don't actually look at the tag struct in fsck_tag() beyond its oid and type (which is obviously OBJ_TAG). Just taking an oid gives our callers more flexibility to avoid creating or parsing a struct, and makes it clear that we are fscking just what is in the buffer, not any pre-parsed bits from the struct. Signed-off-by: Jeff King --- fsck.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/fsck.c b/fsck.c index 42e7d1f71f..38be501278 100644 --- a/fsck.c +++ b/fsck.c @@ -820,7 +820,7 @@ static int fsck_commit(struct commit *commit, const char *buffer, return 0; } -static int fsck_tag(struct tag *tag, const char *buffer, +static int fsck_tag(const struct object_id *oid, const char *buffer, unsigned long size, struct fsck_options *options) { struct object_id tagged_oid; @@ -829,48 +829,48 @@ static int fsck_tag(struct tag *tag, const char *buffer, struct strbuf sb = STRBUF_INIT; const char *p; - ret = verify_headers(buffer, size, &tag->object.oid, tag->object.type, options); + ret = verify_headers(buffer, size, oid, OBJ_TAG, options); if (ret) goto done; if (!skip_prefix(buffer, "object ", &buffer)) { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_OBJECT, "invalid format - expected 'object' line"); goto done; } if (parse_oid_hex(buffer, &tagged_oid, &p) || *p != '\n') { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_OBJECT_SHA1, "invalid 'object' line format - bad sha1"); if (ret) goto done; } buffer = p + 1; if (!skip_prefix(buffer, "type ", &buffer)) { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE_ENTRY, "invalid format - expected 'type' line"); goto done; } eol = strchr(buffer, '\n'); if (!eol) { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TYPE, "invalid format - unexpected end after 'type' line"); goto done; } if (type_from_string_gently(buffer, eol - buffer, 1) < 0) - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_BAD_TYPE, "invalid 'type' value"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_TYPE, "invalid 'type' value"); if (ret) goto done; buffer = eol + 1; if (!skip_prefix(buffer, "tag ", &buffer)) { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAG_ENTRY, "invalid format - expected 'tag' line"); goto done; } eol = strchr(buffer, '\n'); if (!eol) { - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAG, "invalid format - unexpected end after 'type' line"); goto done; } strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer); if (check_refname_format(sb.buf, 0)) { - ret = report(options, &tag->object.oid, tag->object.type, + ret = report(options, oid, OBJ_TAG, FSCK_MSG_BAD_TAG_NAME, "invalid 'tag' name: %.*s", (int)(eol - buffer), buffer); @@ -881,12 +881,12 @@ static int fsck_tag(struct tag *tag, const char *buffer, if (!skip_prefix(buffer, "tagger ", &buffer)) { /* early tags do not contain 'tagger' lines; warn only */ - ret = report(options, &tag->object.oid, tag->object.type, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line"); + ret = report(options, oid, OBJ_TAG, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line"); if (ret) goto done; } else - ret = fsck_ident(&buffer, &tag->object.oid, tag->object.type, options); + ret = fsck_ident(&buffer, oid, OBJ_TAG, options); done: strbuf_release(&sb); @@ -987,8 +987,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, return fsck_commit((struct commit *) obj, (const char *) data, size, options); if (obj->type == OBJ_TAG) - return fsck_tag((struct tag *) obj, (const char *) data, - size, options); + return fsck_tag(&obj->oid, data, size, options); return report(options, &obj->oid, obj->type, FSCK_MSG_UNKNOWN_TYPE, From patchwork Fri Oct 18 05:01:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197499 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 985EB112B for ; Fri, 18 Oct 2019 05:01:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 814A0222C5 for ; Fri, 18 Oct 2019 05:01:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732200AbfJRFBt (ORCPT ); Fri, 18 Oct 2019 01:01:49 -0400 Received: from cloud.peff.net ([104.130.231.41]:51778 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726139AbfJRFBt (ORCPT ); Fri, 18 Oct 2019 01:01:49 -0400 Received: (qmail 9531 invoked by uid 109); 18 Oct 2019 05:01:50 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:01:50 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14526 invoked by uid 111); 18 Oct 2019 05:04:54 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:04:54 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:01:48 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 22/23] fsck: accept an oid instead of a "struct commit" for fsck_commit() Message-ID: <20191018050148.GV17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We don't actually look at the commit struct in fsck_commit() beyond its oid and type (which is obviously OBJ_COMMIT). Just taking an oid gives our callers more flexibility to avoid creating or parsing a struct, and makes it clear that we are fscking just what is in the buffer, not any pre-parsed bits from the struct. Signed-off-by: Jeff King --- fsck.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/fsck.c b/fsck.c index 38be501278..f8c5bbe891 100644 --- a/fsck.c +++ b/fsck.c @@ -764,8 +764,9 @@ static int fsck_ident(const char **ident, return 0; } -static int fsck_commit(struct commit *commit, const char *buffer, - unsigned long size, struct fsck_options *options) +static int fsck_commit(const struct object_id *oid, + const char *buffer, unsigned long size, + struct fsck_options *options) { struct object_id tree_oid, parent_oid; unsigned author_count; @@ -773,21 +774,20 @@ static int fsck_commit(struct commit *commit, const char *buffer, const char *buffer_begin = buffer; const char *p; - if (verify_headers(buffer, size, &commit->object.oid, - commit->object.type, options)) + if (verify_headers(buffer, size, oid, OBJ_COMMIT, options)) return -1; if (!skip_prefix(buffer, "tree ", &buffer)) - return report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line"); + return report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line"); if (parse_oid_hex(buffer, &tree_oid, &p) || *p != '\n') { - err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1"); + err = report(options, oid, OBJ_COMMIT, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1"); if (err) return err; } buffer = p + 1; while (skip_prefix(buffer, "parent ", &buffer)) { if (parse_oid_hex(buffer, &parent_oid, &p) || *p != '\n') { - err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1"); + err = report(options, oid, OBJ_COMMIT, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1"); if (err) return err; } @@ -796,23 +796,23 @@ static int fsck_commit(struct commit *commit, const char *buffer, author_count = 0; while (skip_prefix(buffer, "author ", &buffer)) { author_count++; - err = fsck_ident(&buffer, &commit->object.oid, commit->object.type, options); + err = fsck_ident(&buffer, oid, OBJ_COMMIT, options); if (err) return err; } if (author_count < 1) - err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line"); + err = report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line"); else if (author_count > 1) - err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines"); + err = report(options, oid, OBJ_COMMIT, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines"); if (err) return err; if (!skip_prefix(buffer, "committer ", &buffer)) - return report(options, &commit->object.oid, commit->object.type, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line"); - err = fsck_ident(&buffer, &commit->object.oid, commit->object.type, options); + return report(options, oid, OBJ_COMMIT, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line"); + err = fsck_ident(&buffer, oid, OBJ_COMMIT, options); if (err) return err; if (memchr(buffer_begin, '\0', size)) { - err = report(options, &commit->object.oid, commit->object.type, FSCK_MSG_NUL_IN_COMMIT, + err = report(options, oid, OBJ_COMMIT, FSCK_MSG_NUL_IN_COMMIT, "NUL byte in the commit object body"); if (err) return err; @@ -984,8 +984,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, if (obj->type == OBJ_TREE) return fsck_tree((struct tree *) obj, data, size, options); if (obj->type == OBJ_COMMIT) - return fsck_commit((struct commit *) obj, (const char *) data, - size, options); + return fsck_commit(&obj->oid, data, size, options); if (obj->type == OBJ_TAG) return fsck_tag(&obj->oid, data, size, options); From patchwork Fri Oct 18 05:02:08 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11197501 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B7448112B for ; Fri, 18 Oct 2019 05:02:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A1FB4222C5 for ; Fri, 18 Oct 2019 05:02:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390336AbfJRFCJ (ORCPT ); Fri, 18 Oct 2019 01:02:09 -0400 Received: from cloud.peff.net ([104.130.231.41]:51784 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726139AbfJRFCJ (ORCPT ); Fri, 18 Oct 2019 01:02:09 -0400 Received: (qmail 9542 invoked by uid 109); 18 Oct 2019 05:02:10 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Fri, 18 Oct 2019 05:02:10 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 14542 invoked by uid 111); 18 Oct 2019 05:05:14 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 18 Oct 2019 01:05:14 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 18 Oct 2019 01:02:08 -0400 From: Jeff King To: git@vger.kernel.org Subject: [PATCH 23/23] fsck: accept an oid instead of a "struct tree" for fsck_tree() Message-ID: <20191018050208.GW17879@sigill.intra.peff.net> References: <20191018044103.GA17625@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20191018044103.GA17625@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We don't actually look at the tree struct in fsck_tree() beyond its oid and type (which is obviously OBJ_TREE). Just taking an oid gives our callers more flexibility to avoid creating a struct, and makes it clear that we are fscking just what is in the buffer, not any pre-parsed bits from the struct. Signed-off-by: Jeff King --- fsck.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/fsck.c b/fsck.c index f8c5bbe891..ac4ba4c8e8 100644 --- a/fsck.c +++ b/fsck.c @@ -566,7 +566,7 @@ static int verify_ordered(unsigned mode1, const char *name1, unsigned mode2, con return c1 < c2 ? 0 : TREE_UNORDERED; } -static int fsck_tree(struct tree *item, +static int fsck_tree(const struct object_id *oid, const char *buffer, unsigned long size, struct fsck_options *options) { @@ -586,7 +586,7 @@ static int fsck_tree(struct tree *item, const char *o_name; if (init_tree_desc_gently(&desc, buffer, size)) { - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); return retval; } @@ -613,13 +613,13 @@ static int fsck_tree(struct tree *item, oidset_insert(&gitmodules_found, oid); else retval += report(options, - &item->object.oid, item->object.type, + oid, OBJ_TREE, FSCK_MSG_GITMODULES_SYMLINK, ".gitmodules is a symbolic link"); } if (update_tree_entry_gently(&desc)) { - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); break; } @@ -664,25 +664,25 @@ static int fsck_tree(struct tree *item, } if (has_null_sha1) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1"); if (has_full_path) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_FULL_PATHNAME, "contains full pathnames"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_FULL_PATHNAME, "contains full pathnames"); if (has_empty_name) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_EMPTY_NAME, "contains empty pathname"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_EMPTY_NAME, "contains empty pathname"); if (has_dot) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOT, "contains '.'"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_HAS_DOT, "contains '.'"); if (has_dotdot) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOTDOT, "contains '..'"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_HAS_DOTDOT, "contains '..'"); if (has_dotgit) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_HAS_DOTGIT, "contains '.git'"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_HAS_DOTGIT, "contains '.git'"); if (has_zero_pad) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_ZERO_PADDED_FILEMODE, "contains zero-padded file modes"); if (has_bad_modes) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_BAD_FILEMODE, "contains bad file modes"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_BAD_FILEMODE, "contains bad file modes"); if (has_dup_entries) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_DUPLICATE_ENTRIES, "contains duplicate file entries"); if (not_properly_sorted) - retval += report(options, &item->object.oid, item->object.type, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted"); + retval += report(options, oid, OBJ_TREE, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted"); return retval; } @@ -982,7 +982,7 @@ int fsck_object(struct object *obj, void *data, unsigned long size, if (obj->type == OBJ_BLOB) return fsck_blob(&obj->oid, data, size, options); if (obj->type == OBJ_TREE) - return fsck_tree((struct tree *) obj, data, size, options); + return fsck_tree(&obj->oid, data, size, options); if (obj->type == OBJ_COMMIT) return fsck_commit(&obj->oid, data, size, options); if (obj->type == OBJ_TAG)