From patchwork Thu Dec 8 20:57:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13068891
Date: Thu, 8 Dec 2022 12:57:05 -0800
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Subject: [PATCH v3 1/4] object-file: remove OBJECT_INFO_IGNORE_LOOSE
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, avarab@gmail.com, gitster@pobox.com
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Its last user was removed in 97b2fa08b6 (fetch-pack: drop custom loose
object cache, 2018-11-12), so we can remove it.

Helped-by: Jeff King
Signed-off-by: Jonathan Tan
---
 object-file.c  | 3 ---
 object-store.h | 4 +---
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 26290554bb..cf724bc19b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1575,9 +1575,6 @@ static int do_oid_object_info_extended(struct repository *r,
 		if (find_pack_entry(r, real, &e))
 			break;
 
-		if (flags & OBJECT_INFO_IGNORE_LOOSE)
-			return -1;
-
 		/* Most likely it's a loose object. */
 		if (!loose_object_info(r, real, oi, flags))
 			return 0;
diff --git a/object-store.h b/object-store.h
index 1be57abaf1..b1ec0bde82 100644
--- a/object-store.h
+++ b/object-store.h
@@ -434,13 +434,11 @@ struct object_info {
 #define OBJECT_INFO_ALLOW_UNKNOWN_TYPE 2
 /* Do not retry packed storage after checking packed and loose storage */
 #define OBJECT_INFO_QUICK 8
-/* Do not check loose object */
-#define OBJECT_INFO_IGNORE_LOOSE 16
 /*
  * Do not attempt to fetch the object if missing (even if fetch_is_missing is
  * nonzero).
  */
-#define OBJECT_INFO_SKIP_FETCH_OBJECT 32
+#define OBJECT_INFO_SKIP_FETCH_OBJECT 16
 /*
  * This is meant for bulk prefetching of missing blobs in a partial
  * clone. Implies OBJECT_INFO_SKIP_FETCH_OBJECT and OBJECT_INFO_QUICK

From patchwork Thu Dec 8 20:57:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13068892
Date: Thu, 8 Dec 2022 12:57:06 -0800
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <7419e4ac7053ab2d89a4cdc4612e5baeca48ce9f.1670532905.git.jonathantanmy@google.com>
Subject: [PATCH v3 2/4] object-file: refactor map_loose_object_1()
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, avarab@gmail.com, gitster@pobox.com
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

This function can do 3 things:
 1. Gets an fd given a path
 2. Simultaneously gets a path and fd given an OID
 3. Memory maps an fd

Split this function up. Only one caller needs 1, so inline that. As for
2, a future patch will also need this functionality and, in addition,
the calculated path, so extract this into a separate function with an
out parameter for the path.
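A minimal, self-contained sketch of the split described above, using plain POSIX calls instead of Git's wrappers. The names sketch_map_fd(), sketch_open_named() and sketch_map_named() are hypothetical stand-ins for map_fd(), open_loose_object() and map_loose_object_1(). The directory/name lookup is an assumption made only so that the example compiles on its own; it is not how Git derives loose object paths.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map and close "fd". "path" is used only for error reporting.
 * Returns NULL if the file is empty or cannot be mapped.
 */
static void *sketch_map_fd(int fd, const char *path, size_t *size)
{
	void *map = NULL;
	struct stat st;

	assert(fd >= 0); /* opening the file is the caller's job */
	if (!fstat(fd, &st)) {
		*size = (size_t)st.st_size;
		if (!*size) {
			/* a zero-length mmap() fails, so report it here */
			fprintf(stderr, "error: object file %s is empty\n", path);
			close(fd);
			return NULL;
		}
		map = mmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
		if (map == MAP_FAILED)
			map = NULL;
	}
	close(fd);
	return map;
}

/*
 * Hypothetical stand-in for open_loose_object(): compute a path from
 * "dir" and "name", open it, and hand the computed path back.
 */
static int sketch_open_named(const char *dir, const char *name,
			     const char **path)
{
	static char buf[4096];
	int len = snprintf(buf, sizeof(buf), "%s/%s", dir, name);

	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;
	*path = buf;
	return open(buf, O_RDONLY);
}

/* Open by name, then map; "path" is an optional out parameter. */
static void *sketch_map_named(const char *dir, const char *name,
			      size_t *size, const char **path)
{
	const char *p;
	int fd = sketch_open_named(dir, name, &p);

	if (fd < 0)
		return NULL;
	if (path)
		*path = p;
	return sketch_map_fd(fd, p, size);
}
```

A caller that already holds a path opens it itself and calls sketch_map_fd() directly. A caller that starts from a name uses sketch_map_named() and can pass NULL for "path" when it does not need the computed location.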

Signed-off-by: Jonathan Tan
---
 object-file.c | 60 +++++++++++++++++++++++++++++----------------------
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/object-file.c b/object-file.c
index cf724bc19b..d99d05839f 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1211,43 +1211,48 @@ static int quick_has_loose(struct repository *r,
 }
 
 /*
- * Map the loose object at "path" if it is not NULL, or the path found by
- * searching for a loose object named "oid".
+ * Map and close the given loose object fd. The path argument is used for
+ * error reporting.
  */
-static void *map_loose_object_1(struct repository *r, const char *path,
-			     const struct object_id *oid, unsigned long *size)
+static void *map_fd(int fd, const char *path, unsigned long *size)
 {
-	void *map;
-	int fd;
-
-	if (path)
-		fd = git_open(path);
-	else
-		fd = open_loose_object(r, oid, &path);
-	map = NULL;
-	if (fd >= 0) {
-		struct stat st;
+	void *map = NULL;
+	struct stat st;
 
-		if (!fstat(fd, &st)) {
-			*size = xsize_t(st.st_size);
-			if (!*size) {
-				/* mmap() is forbidden on empty files */
-				error(_("object file %s is empty"), path);
-				close(fd);
-				return NULL;
-			}
-			map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
+	if (!fstat(fd, &st)) {
+		*size = xsize_t(st.st_size);
+		if (!*size) {
+			/* mmap() is forbidden on empty files */
+			error(_("object file %s is empty"), path);
+			close(fd);
+			return NULL;
 		}
-		close(fd);
+		map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
 	}
+	close(fd);
 	return map;
 }
 
+static void *map_loose_object_1(struct repository *r,
+				const struct object_id *oid,
+				unsigned long *size,
+				const char **path)
+{
+	const char *p;
+	int fd = open_loose_object(r, oid, &p);
+
+	if (fd < 0)
+		return NULL;
+	if (path)
+		*path = p;
+	return map_fd(fd, p, size);
+}
+
 void *map_loose_object(struct repository *r,
 		       const struct object_id *oid,
 		       unsigned long *size)
 {
-	return map_loose_object_1(r, NULL, oid, size);
+	return map_loose_object_1(r, oid, size, NULL);
 }
 
 enum unpack_loose_header_result unpack_loose_header(git_zstream *stream,
@@ -2789,13 +2794,16 @@ int read_loose_object(const char *path,
 		      struct object_info *oi)
 {
 	int ret = -1;
+	int fd;
 	void *map = NULL;
 	unsigned long mapsize;
 	git_zstream stream;
 	char hdr[MAX_HEADER_LEN];
 	unsigned long *size = oi->sizep;
 
-	map = map_loose_object_1(the_repository, path, NULL, &mapsize);
+	fd = git_open(path);
+	if (fd >= 0)
+		map = map_fd(fd, path, &mapsize);
 	if (!map) {
 		error_errno(_("unable to mmap %s"), path);
 		goto out;

From patchwork Thu Dec 8 20:57:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13068893
Date: Thu, 8 Dec 2022 12:57:07 -0800
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <7c9ed861e7431352df864c8d2c3bec7dee6e3905.1670532905.git.jonathantanmy@google.com>
Subject: [PATCH v3 3/4] object-file: emit corruption errors when detected
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, avarab@gmail.com, gitster@pobox.com
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Instead of relying on errno being preserved across function calls, teach
do_oid_object_info_extended() to itself report object corruption when
it first detects it. There are 3 types of corruption being detected:
 - when a replacement object is missing
 - when a loose object is corrupt
 - when a packed object is corrupt and the object cannot be read
   in another way

Note that in the RHS of this patch's diff, a check for ENOENT that was
introduced in 3ba7a06552 (A loose object is not corrupt if it cannot be
read due to EMFILE, 2010-10-28) is also removed. The purpose of this
check is to avoid a false report of corruption if the errno contains
something like EMFILE (or anything that is not ENOENT), in which case a
more generic report is presented. Because, as of this patch, we no
longer rely on such a heuristic to determine corruption, but surface the
error message at the point when we read something that we did not
expect, this check is no longer necessary.

Besides being more resilient, this also prepares for a future patch in
which an indirect caller of do_oid_object_info_extended() will need
such functionality.
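The contract of the new flag, dying on corruption while still returning -1 quietly for a merely missing object, can be illustrated with a toy model. Everything here is hypothetical: struct sketch_object, the SKETCH_* names and sketch_object_info() only mimic the shape of do_oid_object_info_extended(), and the object state is supplied by the caller rather than discovered on disk. The missing-replacement case is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_DIE_IF_CORRUPT 32u

enum sketch_state {
	SKETCH_PRESENT,
	SKETCH_MISSING,
	SKETCH_CORRUPT_LOOSE,
	SKETCH_CORRUPT_PACKED
};

struct sketch_object {
	const char *hex;       /* object name, for messages */
	const char *stored_in; /* file that holds the bad copy, if any */
	enum sketch_state state;
};

/* Report corruption where it is detected, then exit like die() does. */
static void sketch_die_corrupt(const char *kind,
			       const struct sketch_object *obj)
{
	fprintf(stderr, "fatal: %s object %s (stored in %s) is corrupt\n",
		kind, obj->hex, obj->stored_in);
	exit(128);
}

/*
 * Returns 0 if the object is readable and -1 otherwise. With
 * SKETCH_DIE_IF_CORRUPT, corruption is fatal; a missing object is not.
 */
static int sketch_object_info(const struct sketch_object *obj,
			      unsigned flags)
{
	assert(obj);
	switch (obj->state) {
	case SKETCH_PRESENT:
		return 0;
	case SKETCH_CORRUPT_LOOSE:
		if (flags & SKETCH_DIE_IF_CORRUPT)
			sketch_die_corrupt("loose", obj);
		return -1;
	case SKETCH_CORRUPT_PACKED:
		if (flags & SKETCH_DIE_IF_CORRUPT)
			sketch_die_corrupt("packed", obj);
		return -1;
	case SKETCH_MISSING:
		break;
	}
	return -1;
}
```

Because the message is produced at the point where the bad state is seen, the caller does not have to guess afterwards from errno whether a -1 meant "missing" or "corrupt".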

Signed-off-by: Jonathan Tan
---
 object-file.c  | 48 ++++++++++++++++++++++--------------------------
 object-store.h |  3 +++
 2 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/object-file.c b/object-file.c
index d99d05839f..f166065f32 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1433,6 +1433,7 @@ static int loose_object_info(struct repository *r,
 {
 	int status = 0;
 	unsigned long mapsize;
+	const char *path = NULL;
 	void *map;
 	git_zstream stream;
 	char hdr[MAX_HEADER_LEN];
@@ -1464,7 +1465,7 @@ static int loose_object_info(struct repository *r,
 		return 0;
 	}
 
-	map = map_loose_object(r, oid, &mapsize);
+	map = map_loose_object_1(r, oid, &mapsize, &path);
 	if (!map)
 		return -1;
 
@@ -1502,6 +1503,10 @@ static int loose_object_info(struct repository *r,
 		break;
 	}
 
+	if (status && (flags & OBJECT_INFO_DIE_IF_CORRUPT))
+		die(_("loose object %s (stored in %s) is corrupt"),
+		    oid_to_hex(oid), path);
+
 	git_inflate_end(&stream);
 cleanup:
 	munmap(map, mapsize);
@@ -1611,6 +1616,15 @@ static int do_oid_object_info_extended(struct repository *r,
 			continue;
 		}
 
+		if (flags & OBJECT_INFO_DIE_IF_CORRUPT) {
+			const struct packed_git *p;
+			if ((flags & OBJECT_INFO_LOOKUP_REPLACE) && !oideq(real, oid))
+				die(_("replacement %s not found for %s"),
+				    oid_to_hex(real), oid_to_hex(oid));
+			if ((p = has_packed_and_bad(r, real)))
+				die(_("packed object %s (stored in %s) is corrupt"),
+				    oid_to_hex(real), p->pack_name);
+		}
 		return -1;
 	}
 
@@ -1663,7 +1677,8 @@ int oid_object_info(struct repository *r,
 static void *read_object(struct repository *r,
 			 const struct object_id *oid,
 			 enum object_type *type,
-			 unsigned long *size)
+			 unsigned long *size,
+			 int die_if_corrupt)
 {
 	struct object_info oi = OBJECT_INFO_INIT;
 	void *content;
@@ -1671,7 +1686,9 @@ static void *read_object(struct repository *r,
 	oi.sizep = size;
 	oi.contentp = &content;
 
-	if (oid_object_info_extended(r, oid, &oi, 0) < 0)
+	if (oid_object_info_extended(r, oid, &oi,
+				     die_if_corrupt ? OBJECT_INFO_DIE_IF_CORRUPT : 0)
+	    < 0)
 		return NULL;
 	return content;
 }
@@ -1707,35 +1724,14 @@ void *read_object_file_extended(struct repository *r,
 				int lookup_replace)
 {
 	void *data;
-	const struct packed_git *p;
-	const char *path;
-	struct stat st;
 	const struct object_id *repl = lookup_replace ?
 		lookup_replace_object(r, oid) : oid;
 
 	errno = 0;
-	data = read_object(r, repl, type, size);
+	data = read_object(r, repl, type, size, 1);
 	if (data)
 		return data;
 
-	obj_read_lock();
-	if (errno && errno != ENOENT)
-		die_errno(_("failed to read object %s"), oid_to_hex(oid));
-
-	/* die if we replaced an object with one that does not exist */
-	if (repl != oid)
-		die(_("replacement %s not found for %s"),
-		    oid_to_hex(repl), oid_to_hex(oid));
-
-	if (!stat_loose_object(r, repl, &st, &path))
-		die(_("loose object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), path);
-
-	if ((p = has_packed_and_bad(r, repl)))
-		die(_("packed object %s (stored in %s) is corrupt"),
-		    oid_to_hex(repl), p->pack_name);
-	obj_read_unlock();
-
 	return NULL;
 }
 
@@ -2278,7 +2274,7 @@ int force_object_loose(const struct object_id *oid, time_t mtime)
 
 	if (has_loose_object(oid))
 		return 0;
-	buf = read_object(the_repository, oid, &type, &len);
+	buf = read_object(the_repository, oid, &type, &len, 0);
 	if (!buf)
 		return error(_("cannot read object for %s"), oid_to_hex(oid));
 	hdrlen = format_object_header(hdr, sizeof(hdr), type, len);
diff --git a/object-store.h b/object-store.h
index b1ec0bde82..98c1d67946 100644
--- a/object-store.h
+++ b/object-store.h
@@ -445,6 +445,9 @@ struct object_info {
  */
 #define OBJECT_INFO_FOR_PREFETCH (OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK)
 
+/* Die if object corruption (not just an object being missing) was detected. */
+#define OBJECT_INFO_DIE_IF_CORRUPT 32
+
 int oid_object_info_extended(struct repository *r,
 			     const struct object_id *,
 			     struct object_info *, unsigned flags);

From patchwork Thu Dec 8 20:57:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 13068894
Date: Thu, 8 Dec 2022 12:57:08 -0800
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
Message-ID: <5924a5120bc8e0bf529fc1cde5c23724550f72a4.1670532905.git.jonathantanmy@google.com>
Subject: [PATCH v3 4/4] commit: don't lazy-fetch commits
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, avarab@gmail.com, gitster@pobox.com
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

When parsing commits, fail fast when the commit is missing or corrupt,
instead of attempting to fetch them. This is done by inlining
repo_read_object_file() and setting the flag that prevents fetching.

This is motivated by a situation in which through a bug (not necessarily
through Git), there was corruption in the object store of a partial
clone.
In this particular case, the problem was exposed when "git gc" tried to
expire reflogs, which calls repo_parse_commit(), which triggers fetches
of the missing commits.

(There are other possible solutions to this problem including passing an
argument from "git gc" to "git reflog" to inhibit all lazy fetches, but
I think that this fix is at the wrong level - fixing "git reflog" means
that this particular command works fine, or so we think (it will fail if
it somehow needs to read a legitimately missing blob, say, a .gitmodules
file), but fixing repo_parse_commit() will fix a whole class of bugs.)

Signed-off-by: Jonathan Tan
Signed-off-by: Junio C Hamano
---
 commit.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/commit.c b/commit.c
index 572301b80a..a02723f06b 100644
--- a/commit.c
+++ b/commit.c
@@ -508,6 +508,17 @@ int repo_parse_commit_internal(struct repository *r,
 	enum object_type type;
 	void *buffer;
 	unsigned long size;
+	struct object_info oi = {
+		.typep = &type,
+		.sizep = &size,
+		.contentp = &buffer,
+	};
+	/*
+	 * Git does not support partial clones that exclude commits, so set
+	 * OBJECT_INFO_SKIP_FETCH_OBJECT to fail fast when an object is missing.
+	 */
+	int flags = OBJECT_INFO_LOOKUP_REPLACE | OBJECT_INFO_SKIP_FETCH_OBJECT |
+		OBJECT_INFO_DIE_IF_CORRUPT;
 	int ret;
 
 	if (!item)
@@ -516,8 +527,8 @@ int repo_parse_commit_internal(struct repository *r,
 		return 0;
 	if (use_commit_graph && parse_commit_in_graph(r, item))
 		return 0;
-	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
-	if (!buffer)
+
+	if (oid_object_info_extended(r, &item->object.oid, &oi, flags) < 0)
 		return quiet_on_missing ? -1 :
 			error("Could not read %s",
 			     oid_to_hex(&item->object.oid));
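
To see why OBJECT_INFO_SKIP_FETCH_OBJECT makes the lookup fail fast, here is a toy model of the call made in the last hunk. All SKETCH_* names, struct sketch_info, sketch_object_info_extended() and sketch_parse_commit() are hypothetical, and the numeric flag values are assumptions modelled on this series. The lazy fetch is reduced to a counter so that its absence can be observed; in this toy a fetch attempt never succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, modelled on object-store.h as of this series. */
#define SKETCH_LOOKUP_REPLACE 1u
#define SKETCH_SKIP_FETCH_OBJECT 16u
#define SKETCH_DIE_IF_CORRUPT 32u

/* Out parameters, in the spirit of struct object_info. */
struct sketch_info {
	unsigned long *sizep;
	const void **contentp;
};

/* Counts how often the toy store would have contacted a remote. */
static int sketch_fetch_attempts;

/*
 * "local" is the object's content if it exists locally, else NULL.
 * Returns 0 on success and -1 if the object is missing.
 */
static int sketch_object_info_extended(const char *local,
				       struct sketch_info *oi,
				       unsigned flags)
{
	assert(oi);
	if (!local) {
		/* stand-in for a lazy fetch from a promisor remote */
		if (!(flags & SKETCH_SKIP_FETCH_OBJECT))
			sketch_fetch_attempts++;
		return -1;
	}
	if (oi->sizep)
		*oi->sizep = strlen(local);
	if (oi->contentp)
		*oi->contentp = local;
	return 0;
}

/* Mirrors the shape of the new call in repo_parse_commit_internal(). */
static int sketch_parse_commit(const char *local, unsigned long *size)
{
	const void *buffer;
	struct sketch_info oi = {
		.sizep = size,
		.contentp = &buffer,
	};
	unsigned flags = SKETCH_LOOKUP_REPLACE | SKETCH_SKIP_FETCH_OBJECT |
			 SKETCH_DIE_IF_CORRUPT;

	return sketch_object_info_extended(local, &oi, flags);
}
```

With the skip flag set, a missing commit yields -1 without touching the fetch counter. The same lookup with flags of 0 would count one fetch attempt, which is the behaviour "git gc" ran into.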