From patchwork Wed Nov 30 20:30:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13060428
Date: Wed, 30 Nov 2022 12:30:46 -0800
X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog
Message-ID: <604160e79cef94fd8e03fe025990c999bb795395.1669839849.git.jonathantanmy@google.com>
Subject: [PATCH 1/4] object-file: reread object with exact same args
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan
X-Mailing-List: git@vger.kernel.org

When an object in do_oid_object_info_extended() is found in a packfile,
but corrupt, that packfile entry is marked as bad and the read is
retried.

Currently, this is done by invoking the function again but with the
replace target of the object and with no flags. This currently works,
but will be clumsy when a later patch modifies this function to also
return the "real" object being read (that is, the replace target).
It does not make sense to pass a pointer in order to receive this
information when no replace lookups are requested, which is exactly what
the reinvocation does.

Therefore, change this reinvocation to pass exactly the arguments which
were originally passed. This also makes us forwards compatible with
future flags that may change the behavior of this function.

This does slow down the case when packfile corruption is detected, but
that is expected to be a very rare case.

Signed-off-by: Jonathan Tan
---
 object-file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 26290554bb..1cde477267 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1621,7 +1621,7 @@ static int do_oid_object_info_extended(struct repository *r,
 	rtype = packed_object_info(r, e.p, e.offset, oi);
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real);
-		return do_oid_object_info_extended(r, real, oi, 0);
+		return do_oid_object_info_extended(r, oid, oi, flags);
 	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;

From patchwork Wed Nov 30 20:30:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13060429
Date: Wed, 30 Nov 2022 12:30:47 -0800
X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog
Subject: [PATCH 2/4] object-file: refactor corrupt object diagnosis
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan
X-Mailing-List: git@vger.kernel.org

This functionality will be used from another file in a subsequent patch,
so refactor it into a public function.

Signed-off-by: Jonathan Tan
---
 object-file.c  | 29 ++++++++++++++++++-----------
 object-store.h |  9 +++++++++
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/object-file.c b/object-file.c
index 1cde477267..37468bc256 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1705,9 +1705,6 @@ void *read_object_file_extended(struct repository *r,
 			       int lookup_replace)
 {
 	void *data;
-	const struct packed_git *p;
-	const char *path;
-	struct stat st;
 	const struct object_id *repl = lookup_replace ?
 		lookup_replace_object(r, oid) : oid;
 
@@ -1715,26 +1712,36 @@ void *read_object_file_extended(struct repository *r,
 	data = read_object(r, repl, type, size);
 	if (data)
 		return data;
+	die_if_corrupt(r, oid, repl);
+
+	return NULL;
+}
+
+void die_if_corrupt(struct repository *r,
+		    const struct object_id *oid,
+		    const struct object_id *real_oid)
+{
+	const struct packed_git *p;
+	const char *path;
+	struct stat st;
 
 	obj_read_lock();
 	if (errno && errno != ENOENT)
 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
 
 	/* die if we replaced an object with one that does not exist */
-	if (repl != oid)
+	if (real_oid != oid)
 		die(_("replacement %s not found for %s"),
-		    oid_to_hex(repl), oid_to_hex(oid));
+		    oid_to_hex(real_oid), oid_to_hex(oid));
 
-	if (!stat_loose_object(r, repl, &st, &path))
+	if (!stat_loose_object(r, real_oid, &st, &path))
 		die(_("loose object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), path);
+		    oid_to_hex(real_oid), path);
 
-	if ((p = has_packed_and_bad(r, repl)))
+	if ((p = has_packed_and_bad(r, real_oid)))
 		die(_("packed object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), p->pack_name);
+		    oid_to_hex(real_oid), p->pack_name);
 	obj_read_unlock();
-
-	return NULL;
 }
 
 void *read_object_with_reference(struct repository *r,
diff --git a/object-store.h b/object-store.h
index 1be57abaf1..88c879c61e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -256,6 +256,15 @@ static inline void *repo_read_object_file(struct repository *r,
 #define read_object_file(oid, type, size) repo_read_object_file(the_repository, oid, type, size)
 #endif
 
+/*
+ * Dies if real_oid is corrupt, not just missing.
+ *
+ * real_oid should be an oid that could not be read.
+ */
+void die_if_corrupt(struct repository *r,
+		    const struct object_id *oid,
+		    const struct object_id *real_oid);
+
 /* Read and unpack an object file into memory, write memory to an object file */
 int oid_object_info(struct repository *r, const struct object_id *, unsigned long *);
 

From patchwork Wed Nov 30 20:30:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13060430
Date: Wed, 30 Nov 2022 12:30:48 -0800
X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog
Message-ID: <940396307fea59b434d33edbf2c7f98adc62c053.1669839849.git.jonathantanmy@google.com>
Subject: [PATCH 3/4] object-file: refactor replace object lookup
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan
X-Mailing-List: git@vger.kernel.org

Move the replace object lookup (specifically, the ability for the caller
to know the result of the lookup) from read_object_file_extended() to
one of the functions that it indirectly calls,
do_oid_object_info_extended(), because a subsequent patch will need that
ability from the latter.

Signed-off-by: Jonathan Tan
---
 object-file.c  | 28 +++++++++++++++++++++-------
 object-store.h |  1 +
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/object-file.c b/object-file.c
index 37468bc256..fd394f1ace 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1546,6 +1546,11 @@ static int do_oid_object_info_extended(struct repository *r,
 
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(r, oid);
+	if (oi && oi->real_oidp) {
+		if (!(flags & OBJECT_INFO_LOOKUP_REPLACE))
+			BUG("specifying real_oidp does not make sense without OBJECT_INFO_LOOKUP_REPLACE");
+		*oi->real_oidp = real;
+	}
 
 	if (is_null_oid(real))
 		return -1;
@@ -1659,17 +1664,27 @@ int oid_object_info(struct repository *r,
 	return type;
 }
 
+/*
+ * If real_oid is not NULL, check if oid has a replace object and store the
+ * object that we end up using there.
+ */
 static void *read_object(struct repository *r,
 			 const struct object_id *oid, enum object_type *type,
-			 unsigned long *size)
+			 unsigned long *size, const struct object_id **real_oid)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	void *content;
+	unsigned int flags = 0;
 	oi.typep = type;
 	oi.sizep = size;
 	oi.contentp = &content;
 
-	if (oid_object_info_extended(r, oid, &oi, 0) < 0)
+	if (real_oid) {
+		flags |= OBJECT_INFO_LOOKUP_REPLACE;
+		oi.real_oidp = real_oid;
+	}
+
+	if (oid_object_info_extended(r, oid, &oi, flags) < 0)
 		return NULL;
 	return content;
 }
@@ -1705,14 +1720,13 @@ void *read_object_file_extended(struct repository *r,
 			       int lookup_replace)
 {
 	void *data;
-	const struct object_id *repl = lookup_replace ?
-		lookup_replace_object(r, oid) : oid;
+	const struct object_id *real_oid;
 
 	errno = 0;
-	data = read_object(r, repl, type, size);
+	data = read_object(r, oid, type, size, &real_oid);
 	if (data)
 		return data;
-	die_if_corrupt(r, oid, repl);
+	die_if_corrupt(r, oid, real_oid);
 
 	return NULL;
 }
@@ -2283,7 +2297,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(the_repository, oid, &type, &len);
+	buf = read_object(the_repository, oid, &type, &len, NULL);
 	if (!buf)
 		return error(_("cannot read object for %s"), oid_to_hex(oid));
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
diff --git a/object-store.h b/object-store.h
index 88c879c61e..9684562eb2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -406,6 +406,7 @@ struct object_info {
 	struct object_id *delta_base_oid;
 	struct strbuf *type_name;
 	void **contentp;
+	const struct object_id **real_oidp;
 
 	/* Response */
 	enum {

From patchwork Wed Nov 30 20:30:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13060431
Date: Wed, 30 Nov 2022 12:30:49 -0800
X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog
Message-ID: <6af8dcebd14d803fc8d2a01fbcc7f42ff380719d.1669839849.git.jonathantanmy@google.com>
Subject: [PATCH 4/4] commit: don't lazy-fetch commits
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan
X-Mailing-List: git@vger.kernel.org

When parsing a commit, fail fast when the commit is missing or corrupt,
instead of attempting to fetch it. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.

This is motivated by a situation in which, through a bug (not
necessarily through Git), there was corruption in the object store of a
partial clone. In this particular case, the problem was exposed when
"git gc" tried to expire reflogs, which calls repo_parse_commit(), which
triggers fetches of the missing commits.

(There are other possible solutions to this problem, including passing
an argument from "git gc" to "git reflog" to inhibit all lazy fetches,
but I think that such a fix is at the wrong level: fixing "git reflog"
means that this particular command works fine, or so we think (it will
fail if it somehow needs to read a legitimately missing blob, say, a
.gitmodules file), but fixing repo_parse_commit() will fix a whole class
of bugs.)
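
For illustration only, the fail-fast behavior described above can be
sketched outside of Git as follows. This is a minimal sketch under
stated assumptions, not the code in this patch: sketch_store,
sketch_read() and SKETCH_SKIP_FETCH are hypothetical names that merely
stand in for the object store, oid_object_info_extended() and
OBJECT_INFO_SKIP_FETCH_OBJECT, and the "lazy fetch" is simulated by a
counter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag, standing in for OBJECT_INFO_SKIP_FETCH_OBJECT. */
#define SKETCH_SKIP_FETCH 1u

/* Hypothetical in-memory "object store"; not a Git data structure. */
struct sketch_store {
	const char *local[4];	/* ids of objects present locally */
	size_t nr_local;	/* number of valid entries in local[] */
	int fetch_attempts;	/* counts simulated lazy fetches */
};

/* Returns 1 if id is present locally, 0 otherwise. */
static int sketch_has_local(const struct sketch_store *s, const char *id)
{
	for (size_t i = 0; i < s->nr_local; i++)
		if (!strcmp(s->local[i], id))
			return 1;
	return 0;
}

/*
 * Returns 0 if the object is available locally, -1 if it is missing.
 * Without SKETCH_SKIP_FETCH, a missing object triggers a (simulated,
 * always failing) lazy fetch before the error is reported.
 */
static int sketch_read(struct sketch_store *s, const char *id, unsigned flags)
{
	assert(s && id);
	if (sketch_has_local(s, id))
		return 0;
	if (flags & SKETCH_SKIP_FETCH)
		return -1;	/* fail fast: no fetch is attempted */
	s->fetch_attempts++;	/* simulated lazy fetch of the missing object */
	return -1;
}
```

With a store that only contains "aaa", reading "bbb" with
SKETCH_SKIP_FETCH returns -1 and leaves fetch_attempts untouched, while
the same read without the flag also returns -1 but only after a
simulated fetch has been attempted.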

Signed-off-by: Jonathan Tan
---
 commit.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 572301b80a..17e71f5be4 100644
--- a/commit.c
+++ b/commit.c
@@ -508,6 +508,13 @@ int repo_parse_commit_internal(struct repository *r,
 	enum object_type type;
 	void *buffer;
 	unsigned long size;
+	const struct object_id *real_oid;
+	struct object_info oi = {
+		.typep = &type,
+		.sizep = &size,
+		.contentp = &buffer,
+		.real_oidp = &real_oid,
+	};
 	int ret;
 
 	if (!item)
@@ -516,11 +523,18 @@ int repo_parse_commit_internal(struct repository *r,
 		return 0;
 	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
-	if (!buffer)
+
+	/*
+	 * Git does not support partial clones that exclude commits, so set
+	 * OBJECT_INFO_SKIP_FETCH_OBJECT to fail fast when an object is missing.
+	 */
+	if (oid_object_info_extended(r, &item->object.oid, &oi,
+	    OBJECT_INFO_LOOKUP_REPLACE | OBJECT_INFO_SKIP_FETCH_OBJECT) < 0) {
+		die_if_corrupt(r, &item->object.oid, real_oid);
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
 			     oid_to_hex(&item->object.oid));
+	}
 	if (type != OBJ_COMMIT) {
 		free(buffer);
 		return error("Object %s not a commit",