From patchwork Thu Dec 1 19:27:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13061749
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 58D39C43217
 for ; Thu, 1 Dec 2022 19:27:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S230106AbiLAT1m (ORCPT );
 Thu, 1 Dec 2022 14:27:42 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48136 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229835AbiLAT1j (ORCPT );
 Thu, 1 Dec 2022 14:27:39 -0500
Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com
 [IPv6:2607:f8b0:4864:20::549])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0C4BCF9
 for ; Thu, 1 Dec 2022 11:27:37 -0800 (PST)
Received: by mail-pg1-x549.google.com with SMTP id
 11-20020a63000b000000b004776fe2eebfso2482854pga.9
 for ; Thu, 01 Dec 2022 11:27:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:from:to:cc:subject:date:message-id:reply-to;
 bh=IhFvxOTtDVsVIvDDtiSjZfhYyykMUe4az/D4qrfxjNA=;
 b=hyzgiZiuRtPLn5kbibdFqGDbiTSEFghpKdtJffqX1lzOuD8ZJ5oXXkpjUqUHLVwv0T
 3FCf8VDVlqtskyYqz7CqAsVs1rGuZQMmQlz6Ck95q7rrtQh9kSzDwEEMpoT4GKPovqRy
 DFNH786/gHF3i+IABSepV96AO5ReDBzg15DD9Hy6SpoksHS886JPhivYI8jFDtFHxSUJ
 RNM9pQYyPuw+1Ov1DSmoLlc1AdQnk7s8NZMcnkIfBkyX9ixfJzE+hOUePTm0Xg8EukHL
 C2Zo038CsUoTMJJsHV1Q6p0vRSv1IBGm34AlB1v9jrJEolWKYoTdxDLo5QBdnM+k7Y2V
 zigQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=IhFvxOTtDVsVIvDDtiSjZfhYyykMUe4az/D4qrfxjNA=;
 b=pgjLpyLlXn46/lGIarhiHeNHS6g7Dcmfh5ncrYHOoYm5jZML9G/wGN66AKndLCAYAK
 7Ha4MWTMFediZNut4fvEdqkiCOmhNkf+tGeclYn9zS2YZICc4+m/mMWpwL5IN3CHYIQ8
 9yoGpWokpkMmjcFnZvMpAckk4zBOqRiIwgmid9yCAyh1D5DVZoMfoE9Wio2tcoYidlYn
 Hid4/obUKpGROTB81atDLn5g+OkCEFhvLHAS8LaQXwYC0p+542dznzTAPwGo6KszN02V
 hmBX8BD58hC+NBUTAy8mIlD7WNd8NmJi2HlFPpfPtUqMM4euYmqzMDYyJKb+4gmkDqfa
 RGMQ==
X-Gm-Message-State: ANoB5pkO1Wor8P6jBW+ubp3rTnZ2G+/v6yh+d5GdLVGUT041Q2kOuyAe
 c0RIunh1FTp5/ySX29OAuzlFvPV6dN1rIW0Sx7aVrLDT4g7Q6fmFu0azTh6506WtjCqEWoeYfUX
 4ikPa6OOteoOIOGdC5aLzoXBlQhQYCgrS0hfO7qD+t+WPv5XDJr1Kf1P4JCkJ69zbvnjiwdyEcS
 xR
X-Google-Smtp-Source: 
 AA0mqf50QLy3JdENXh3+ai/1cYPHfx965CkP1UviVZtTPxNMUl8gSoRbJ3InZOCT1wZfb3Uw2ld0sDm0zhPOe+cQ/nVX
X-Received: from twelve4.c.googlers.com
 ([fda3:e722:ac3:cc00:24:72f4:c0a8:437a]) (user=jonathantanmy job=sendgmr) by
 2002:a63:cf0e:0:b0:477:b603:f754 with SMTP id
 j14-20020a63cf0e000000b00477b603f754mr39375734pgg.232.1669922857282; Thu, 01
 Dec 2022 11:27:37 -0800 (PST)
Date: Thu, 1 Dec 2022 11:27:30 -0800
In-Reply-To:
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog
Message-ID: <604160e79cef94fd8e03fe025990c999bb795395.1669922792.git.jonathantanmy@google.com>
Subject: [PATCH v2 1/4] object-file: reread object with exact same args
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , peff@peff.net, gitster@pobox.com
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

When an object in do_oid_object_info_extended() is found in a packfile,
but corrupt, that packfile entry is marked as bad and the read is
retried. Currently, this is done by invoking the function again but with
the replace target of the object and with no flags.

This currently works, but will be clumsy when a later patch modifies
this function to also return the "real" object being read (that is, the
replace target). It does not make sense to pass a pointer in order to
receive this information when no replace lookups are requested, which is
exactly what the reinvocation does.

Therefore, change this reinvocation to pass exactly the arguments which
were originally passed. This also makes us forwards compatible with
future flags that may change the behavior of this function.

This does slow down the case when packfile corruption is detected, but
that is expected to be a very rare case.

Signed-off-by: Jonathan Tan
---
 object-file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 26290554bb..1cde477267 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1621,7 +1621,7 @@ static int do_oid_object_info_extended(struct repository *r,
 	rtype = packed_object_info(r, e.p, e.offset, oi);
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real);
-		return do_oid_object_info_extended(r, real, oi, 0);
+		return do_oid_object_info_extended(r, oid, oi, flags);
 	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;

From patchwork Thu Dec 1 19:27:31 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13061751
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 7756AC43217
 for ; Thu, 1 Dec 2022 19:27:48 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S229943AbiLAT1r (ORCPT );
 Thu, 1 Dec 2022 14:27:47 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48220 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229708AbiLAT1l (ORCPT );
 Thu, 1 Dec 2022 14:27:41 -0500
Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com
 [IPv6:2607:f8b0:4864:20::104a])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0488E0ED
 for ; Thu, 1 Dec 2022 11:27:39 -0800 (PST)
Received: by mail-pj1-x104a.google.com with SMTP id
 om6-20020a17090b3a8600b0021965c06195so6649128pjb.2
 for ; Thu, 01 Dec 2022 11:27:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:from:to:cc:subject:date:message-id:reply-to;
 bh=6TodTgJTHAfLxIfznVa94Ovf+scmC3jr+AlIcU9Dzm4=;
 b=pDKF/tpBIVDMEj6owmRAVD6W9Mhf4aR0Wy6qDAF1EuTECUKbh8Dr2X6xOhI68ntEq5
 TzSdWz5Q8dhzinzq6pOltBrYHRdiZeAmmCcMM3ikJIi8xJ+e+vUrOGNGofDK9b2FcRuC
 aXg1TPwqjy5jWDlM9/iiaVskyIoo3wZWufb7uKmWM70NuPws1R6+SMYDiTVBlbwIR3Xz
 BxfIoidiqJ9t8q/p4qVZOxm/lCzlha9WjjN6Q6ZCD0d1FTo3w7w9f9S7XOyMow9Gk2g7
 GP8N1Ekc0bpv4hz8NdCKComYN5RpxrRx3cXRVFrKLiRS8mOXeJ0CCdGsgfMSHMLbZzeB
 6nlA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=6TodTgJTHAfLxIfznVa94Ovf+scmC3jr+AlIcU9Dzm4=;
 b=thYRQfkkG2UBELEpGGj/YbChkzIusvc6XSCRm0XzUXRPCb9WBqjFXr28DYV1rZ0ZZt
 LrjrDWaAnoCWzWqzcRvHbe8PPceHrsFfFXGmLgn087ec8zD4uoPXfwxGm+FIVb8S1Kqq
 en+6kQM7yj6XsKVE3BF/hGv37c/e9q3epjDY68bLWrH9kYFtLNBV0MiVqDZStWuU4gjO
 xTmPCthsQKyVsWjlxMBLxpOd3PquACI21pV+Kim2T8jloO2JFLMXQQJi8ZtcbkSfh+bd
 +O7z7a1jxNpavcy50nraFYcWoqVuw7PF4rMG9DigWeBvT+VaOD/dyOipofv6caaQ5B7q
 QaeQ==
X-Gm-Message-State: ANoB5pmUw3yxIPD/leXt2U2cpgpxpUvqntlbfCf+7QnYc5YZg0wNP67U
 X8sOrJm/WFf+B86/i0npWNVnrMgpdsbgTYMin0dbXNKpAMeUxoWnd9IeuzPbf6cB1nwMcoZMby9
 H9AyGEdqqLcJS+X7pGSlpPJY3lGgDYfG1o5quHq4iB4zSOtArXOE/dYBHg20bVXQNIyq1oYAiYZ
 n9
X-Google-Smtp-Source: 
 AA0mqf6xd5ciR4YBH5S8DBrWbjY38Ijmrryj2H9mjmHytxFmTN4agLy72Dz5L1INtw9UoMdtjo2/oHSrCWXlvedL/sUG
X-Received: from twelve4.c.googlers.com
 ([fda3:e722:ac3:cc00:24:72f4:c0a8:437a]) (user=jonathantanmy job=sendgmr) by
 2002:a17:90b:f89:b0:219:5b3b:2b9f with SMTP id
 ft9-20020a17090b0f8900b002195b3b2b9fmr1267857pjb.2.1669922859141; Thu, 01 Dec
 2022 11:27:39 -0800 (PST)
Date: Thu, 1 Dec 2022 11:27:31 -0800
In-Reply-To:
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog
Message-ID: <1be60f1bf2f368f5e5c8b6550b3e4d4f3efe1496.1669922792.git.jonathantanmy@google.com>
Subject: [PATCH v2 2/4] object-file: refactor corrupt object diagnosis
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , peff@peff.net, gitster@pobox.com
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

This functionality will be used from another file in a subsequent patch,
so refactor it into a public function.

Signed-off-by: Jonathan Tan
---
 object-file.c  | 29 ++++++++++++++++++-----------
 object-store.h |  9 +++++++++
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/object-file.c b/object-file.c
index 1cde477267..36f81c7958 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1705,9 +1705,6 @@ void *read_object_file_extended(struct repository *r,
 				int lookup_replace)
 {
 	void *data;
-	const struct packed_git *p;
-	const char *path;
-	struct stat st;
 	const struct object_id *repl = lookup_replace ?
 		lookup_replace_object(r, oid) : oid;
 
@@ -1715,26 +1712,36 @@ void *read_object_file_extended(struct repository *r,
 	data = read_object(r, repl, type, size);
 	if (data)
 		return data;
+	die_if_corrupt(r, oid, repl);
+
+	return NULL;
+}
+
+void die_if_corrupt(struct repository *r,
+		    const struct object_id *oid,
+		    const struct object_id *real_oid)
+{
+	const struct packed_git *p;
+	const char *path;
+	struct stat st;
 
 	obj_read_lock();
 	if (errno && errno != ENOENT)
 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
 
 	/* die if we replaced an object with one that does not exist */
-	if (repl != oid)
+	if (!oideq(real_oid, oid))
 		die(_("replacement %s not found for %s"),
-		    oid_to_hex(repl), oid_to_hex(oid));
+		    oid_to_hex(real_oid), oid_to_hex(oid));
 
-	if (!stat_loose_object(r, repl, &st, &path))
+	if (!stat_loose_object(r, real_oid, &st, &path))
 		die(_("loose object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), path);
+		    oid_to_hex(real_oid), path);
 
-	if ((p = has_packed_and_bad(r, repl)))
+	if ((p = has_packed_and_bad(r, real_oid)))
 		die(_("packed object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), p->pack_name);
+		    oid_to_hex(real_oid), p->pack_name);
 	obj_read_unlock();
-
-	return NULL;
 }
 
 void *read_object_with_reference(struct repository *r,
diff --git a/object-store.h b/object-store.h
index 1be57abaf1..88c879c61e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -256,6 +256,15 @@ static inline void *repo_read_object_file(struct repository *r,
 #define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size)
 #endif
 
+/*
+ * Dies if real_oid is corrupt, not just missing.
+ *
+ * real_oid should be an oid that could not be read.
+ */
+void die_if_corrupt(struct repository *r,
+		    const struct object_id *oid,
+		    const struct object_id *real_oid);
+
 /* Read and unpack an object file into memory, write memory to an object file */
 int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
 

From patchwork Thu Dec 1 19:27:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13061750
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 77EE1C47088
 for ; Thu, 1 Dec 2022 19:27:46 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S230257AbiLAT1p (ORCPT );
 Thu, 1 Dec 2022 14:27:45 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48362 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S230123AbiLAT1n (ORCPT );
 Thu, 1 Dec 2022 14:27:43 -0500
Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com
 [IPv6:2607:f8b0:4864:20::b49])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A03C61208C
 for ; Thu, 1 Dec 2022 11:27:41 -0800 (PST)
Received: by mail-yb1-xb49.google.com with SMTP id
 187-20020a250dc4000000b006f8cd26bfcfso1063710ybn.13
 for ; Thu, 01 Dec 2022 11:27:41 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:from:to:cc:subject:date:message-id:reply-to;
 bh=AjpFZiyVD04CAbZg7L/3S1F1hg/bH47GALCNYOb6SvM=;
 b=hiEmxEQpzrA+t2r+d5C1QT07SSEuybhGGrH8HVTeqX+neK7ia+DEJe1gImuwdslKf8
 bDX1JQcnzigPwn4Bvh4A2LwMfFvSVc5dJuWr+6Zbja7x7uMCmiRnO0dfWBJ//Ir3sxi4
 wVu2RbjVn520z1OlUjsLuKG9Lp+jGvcPYkdBtT7CJc4CgNHHJ80V3YdrNJH3FriRQHMf
 Zoel4SocdliAeginqhOTksxmbc1d3rzWa/6q7d2zJLge/RpM4EJc+g28Xm31IFlAPrjJ
 EgwWElXhdv8WIVl44on7f6zAFEctM2wV3Cx16szlwvblFBGs7c5s/RmQ52ohDLGQERKj
 BSqQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=AjpFZiyVD04CAbZg7L/3S1F1hg/bH47GALCNYOb6SvM=;
 b=SxzHFolOozr21DY7gy54DmmfP635it684BWQcTtMf77xuwJJagjdVNbBBAvxdwCcKi
 G5hF+2Bc/Xf3oCA4uq0rnGVPQ0/5QY/E2lFrT5sbEDx2nW1rgSmkWvsJ9MVG2Gsfg50L
 WYtIBzajH7ZJbOhIwzYIh3Ej+KYQfeWol0T+HhW7x63qKG9lllVZCTFo4EIIoo0XqE2H
 iJSY60FbXJiMeItlifNGvWO+IAKqBuTjV4DsJo+0HRQDLRVXgOUO90v9BXlR3lgyKE8y
 kUhF2ik/sax1dodjutPkjo69uUdLU9/yfpni2WJZZczcRPyKN4slZhvxDUE6UjZ3cv0Y
 KZ0A==
X-Gm-Message-State: ANoB5pm618NzHCYVsyHL0i+4aSeTtEkSGzTa/emzgwm3ujpWZjHNYExq
 2zDOhwIefXcG4uoODWfYZFAG32ib4NL7B2zCInbhnLo7GnWdAUMnJq7U1NQI3a2+wh3roacVDFs
 JEWw1k9ZtIor92VKFSbtKKdTGvJyUoobsEk8ezuewLOqCrN1RIQSpPGOhwtxJARphAA/AdabM35
 tt
X-Google-Smtp-Source: 
 AA0mqf4kgJRO74CAhSWphjFhN6Zx7dXVUdHEnqWvYFrdXoRNn/r+yTARH15XN5ky9eEAC+Pti5kjrdIUMnR5eXtJsCgw
X-Received: from twelve4.c.googlers.com
 ([fda3:e722:ac3:cc00:24:72f4:c0a8:437a]) (user=jonathantanmy job=sendgmr) by
 2002:a81:6344:0:b0:352:5ccb:2273 with SMTP id
 x65-20020a816344000000b003525ccb2273mr62146184ywb.315.1669922860840; Thu, 01
 Dec 2022 11:27:40 -0800 (PST)
Date: Thu, 1 Dec 2022 11:27:32 -0800
In-Reply-To:
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog
Message-ID: <28935ba1b0132fa4d1a9f93e1d65835a8da12455.1669922792.git.jonathantanmy@google.com>
Subject: [PATCH v2 3/4] object-file: refactor replace object lookup
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , peff@peff.net, gitster@pobox.com
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Move the replace object lookup (specifically, the ability for the caller
to know the result of the lookup) from read_object_file_extended() to
one of the functions that it indirectly calls,
do_oid_object_info_extended(), because a subsequent patch will need that
ability from the latter.

Signed-off-by: Jonathan Tan
---
 object-file.c  | 28 +++++++++++++++++++++-------
 object-store.h |  1 +
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/object-file.c b/object-file.c
index 36f81c7958..8adef99a7c 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1546,6 +1546,11 @@ static int do_oid_object_info_extended(struct repository *r,
 
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(r, oid);
+	if (oi && oi->real_oidp) {
+		if (!(flags & OBJECT_INFO_LOOKUP_REPLACE))
+			BUG("specifying real_oidp does not make sense without OBJECT_INFO_LOOKUP_REPLACE");
+		*oi->real_oidp = real;
+	}
 
 	if (is_null_oid(real))
 		return -1;
@@ -1659,17 +1664,27 @@ int oid_object_info(struct repository *r,
 	return type;
 }
 
+/*
+ * If real_oid is not NULL, check if oid has a replace object and store the
+ * object that we end up using there.
+ */
 static void *read_object(struct repository *r,
 			 const struct object_id *oid, enum object_type *type,
-			 unsigned long *size)
+			 unsigned long *size, const struct object_id **real_oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	void *content;
+	unsigned int flags = 0;
 	oi.typep = type;
 	oi.sizep = size;
 	oi.contentp = &content;
 
-	if (oid_object_info_extended(r, oid, &oi, 0) < 0)
+	if (real_oid) {
+		flags |= OBJECT_INFO_LOOKUP_REPLACE;
+		oi.real_oidp = real_oid;
+	}
+
+	if (oid_object_info_extended(r, oid, &oi, flags) < 0)
 		return NULL;
 	return content;
 }
@@ -1705,14 +1720,13 @@ void *read_object_file_extended(struct repository *r,
 				int lookup_replace)
 {
 	void *data;
-	const struct object_id *repl = lookup_replace ?
-		lookup_replace_object(r, oid) : oid;
+	const struct object_id *real_oid;
 
 	errno = 0;
-	data = read_object(r, repl, type, size);
+	data = read_object(r, oid, type, size, &real_oid);
 	if (data)
 		return data;
-	die_if_corrupt(r, oid, repl);
+	die_if_corrupt(r, oid, real_oid);
 
 	return NULL;
 }
@@ -2283,7 +2297,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(the_repository, oid, &type, &len);
+	buf = read_object(the_repository, oid, &type, &len, NULL);
 	if (!buf)
 		return error(_("cannot read object for %s"), oid_to_hex(oid));
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
diff --git a/object-store.h b/object-store.h
index 88c879c61e..9684562eb2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -406,6 +406,7 @@ struct object_info {
 	struct object_id *delta_base_oid;
 	struct strbuf *type_name;
 	void **contentp;
+	const struct object_id **real_oidp;
 
 	/* Response */
 	enum {

From patchwork Thu Dec 1 19:27:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13061752
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 9C676C4321E
 for ; Thu, 1 Dec 2022 19:27:58 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S229708AbiLAT15 (ORCPT );
 Thu, 1 Dec 2022 14:27:57 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48464 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S230256AbiLAT1p (ORCPT );
 Thu, 1 Dec 2022 14:27:45 -0500
Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com
 [IPv6:2607:f8b0:4864:20::1149])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC41329C92
 for ; Thu, 1 Dec 2022 11:27:43 -0800 (PST)
Received:
 by mail-yw1-x1149.google.com with SMTP id
 00721157ae682-3b48b605351so25902017b3.22
 for ; Thu, 01 Dec 2022 11:27:43 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:from:to:cc:subject:date:message-id:reply-to;
 bh=RN/hHorXE4h1LGy1HPs5RWPP6Hyui6R8TjabuvFI1BI=;
 b=Z8S39NQUWEGmZk+i2iiIKsQXbaLf+IY/zS2oC6lprUfANhefXo1vyys+Dk7I7SKxh8
 9IFYOEHavw8LBAHMP15r6IZLdMNuLn6FBPt8mNoPgxpHjpSLUeF8X+n18+/Gxi44WdzY
 2kHK1xGkDVwijGLtLsqhmcfBlPoS61PxAtbX/lErtpuliMgwNzCkm4O3onHGPgh0neda
 zW7L48g0wgoptlg32Qk7TIn7v9vkCqIxzTmFgOoVH1Bh5yDuddxUfwk3hMbdmWqIgic8
 igGuLlfagsjwQ5KXC2mebW784j1Ty4uNFr+SRsjswWQbzNYY8eGvxR6FAk2qzpVNlLwm
 pZYw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
 :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
 bh=RN/hHorXE4h1LGy1HPs5RWPP6Hyui6R8TjabuvFI1BI=;
 b=7nJOdhF+JN7osbZ40YupvKXlJArh6OsupvWOOwEj/ywUHpWRvqjfsou867uQOGIVV4
 BD740IZeYfiWF/tlzCJIOK7aX+jyIY7qqfp+27IUQIahiIXX+6RqNm48tnWkBoJPZRT/
 IRVex3pyuEWp5zTKbWGmcElHhE+JypbgnqkAkTuvHU3vZtQGVR+YejQZhBxinA3V5Obz
 OY2sOcPzVYPGtFjB/VEFhVwOVWFfPswf4KOUg4C8iAEOXEOfl5OWVu+9jse/gyqFXHzU
 6t3Huk8hK/LxLYAv2ojpaxJavEaJ+2oTrQlWWU+bsKFDefTSuiKb0/Xr75VZSZCnGvlf
 moyg==
X-Gm-Message-State: ANoB5pnC90lkk5rOTeoX8ZNHHFM9WJZUFfzdYAblYosXNRcZxwjAgDkD
 izGGQLjoUnbxZ3q6AKSo39ku6WqLewa5IN8y4TLamie6kYpOTKXHSRfH3sRx+aqRnWJCjXBJI2Y
 kju/fXxdcb6E9gwIHIqMwHDX92eMO35qMYK6X4sgB/zCZr+bHBqBLqP89KaW8adU9ppOxmdA59L
 gj
X-Google-Smtp-Source: 
 AA0mqf4whqJsTt5kmp3pbAtP9j0pIBaWjp9+XSmpTntzZcJ1LwMGrcubq9VlKXXThn2v/+vDn2miYwTElCmEYVr+5eoz
X-Received: from twelve4.c.googlers.com
 ([fda3:e722:ac3:cc00:24:72f4:c0a8:437a]) (user=jonathantanmy job=sendgmr) by
 2002:a25:5d5:0:b0:6f9:5e19:4729 with SMTP id
 204-20020a2505d5000000b006f95e194729mr12850595ybf.311.1669922862409; Thu, 01
 Dec 2022 11:27:42 -0800 (PST)
Date: Thu, 1 Dec 2022 11:27:33 -0800
In-Reply-To:
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.39.0.rc0.267.gcb52ba06e7-goog
Message-ID:
Subject: [PATCH v2 4/4] commit: don't lazy-fetch commits
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , peff@peff.net, gitster@pobox.com
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

When parsing commits, fail fast when the commit is missing or
corrupt, instead of attempting to fetch them. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.

This is motivated by a situation in which through a bug (not necessarily
through Git), there was corruption in the object store of a partial
clone. In this particular case, the problem was exposed when "git gc"
tried to expire reflogs, which calls repo_parse_commit(), which triggers
fetches of the missing commits.

(There are other possible solutions to this problem including passing an
argument from "git gc" to "git reflog" to inhibit all lazy fetches, but
I think that this fix is at the wrong level - fixing "git reflog" means
that this particular command works fine, or so we think (it will fail if
it somehow needs to read a legitimately missing blob, say, a .gitmodules
file), but fixing repo_parse_commit() will fix a whole class of bugs.)

Signed-off-by: Jonathan Tan
---
 commit.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 572301b80a..17e71f5be4 100644
--- a/commit.c
+++ b/commit.c
@@ -508,6 +508,13 @@ int repo_parse_commit_internal(struct repository *r,
 	enum object_type type;
 	void *buffer;
 	unsigned long size;
+	const struct object_id *real_oid;
+	struct object_info oi = {
+		.typep = &type,
+		.sizep = &size,
+		.contentp = &buffer,
+		.real_oidp = &real_oid,
+	};
 	int ret;
 
 	if (!item)
@@ -516,11 +523,18 @@ int repo_parse_commit_internal(struct repository *r,
 		return 0;
 	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
-	if (!buffer)
+
+	/*
+	 * Git does not support partial clones that exclude commits, so set
+	 * OBJECT_INFO_SKIP_FETCH_OBJECT to fail fast when an object is missing.
+	 */
+	if (oid_object_info_extended(r, &item->object.oid, &oi,
+	    OBJECT_INFO_LOOKUP_REPLACE | OBJECT_INFO_SKIP_FETCH_OBJECT) < 0) {
+		die_if_corrupt(r, &item->object.oid, real_oid);
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
 			     oid_to_hex(&item->object.oid));
+	}
 	if (type != OBJ_COMMIT) {
 		free(buffer);
 		return error("Object %s not a commit",