From patchwork Tue Jan 24 14:14:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13114202 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 615A2C25B4E for ; Tue, 24 Jan 2023 14:14:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233728AbjAXOOm (ORCPT ); Tue, 24 Jan 2023 09:14:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233594AbjAXOOk (ORCPT ); Tue, 24 Jan 2023 09:14:40 -0500 Received: from mail-yw1-x1130.google.com (mail-yw1-x1130.google.com [IPv6:2607:f8b0:4864:20::1130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9C6213DE9 for ; Tue, 24 Jan 2023 06:14:39 -0800 (PST) Received: by mail-yw1-x1130.google.com with SMTP id 00721157ae682-4b718cab0e4so219245587b3.9 for ; Tue, 24 Jan 2023 06:14:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=github.com; s=google; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=lT1H7OqJs+cakHt26XnlIGG53LeF/i07z0Q7GCWWu64=; b=Y7sNf8EPmyUrCn2gV7HjgBg+IFDIKDg3f2wGAjn8jn8qfNt+g2gab4QOE7Un8i2bZG G1MKAUsAFujZzQT5ousUw9LH94Dt1wc+R+nX4Empp/Chuly+fX8I4T9hyUgI8AKoEkhA 5ILcvBiHpX/R4DrSaae1z0WX/eG7tu+EV6B5/XCdgTaPwLvQlgjymZfDgYvlXIJ0qHie /MBO9Dg95OZl7RLsOlTfk9IOrVJEufi5Mg4MZnoI6AqX5vpTS59DBBl5RTeZWJwLeqXm WlIudkaY0iqjAaN5BhhSKf6yIEEOCNVCFNJuR1T2imGq8RubWvjuFFheAicCrA5RnsoZ CiLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=lT1H7OqJs+cakHt26XnlIGG53LeF/i07z0Q7GCWWu64=; b=cDwsFOncx4wlzhtne42qeUfYhGKmQl7WdavzaoFGZm3mOQZMpxyQkLEB5LtxzzzWUj P/nOAsUszoqDz8CnROMD4rR70YYy9eYVVEAW4+q2Va4DWDyWxLJB3uTOnx+LfsfDVGgs bjvs8kyedL3obQg7+xCzCkVFRBT7ENkb8bftHbpnjzbb2CkirySeLCMgyr1iNHJVTgSj VIWyKs7w1CAH+m/CgOgeQM2CsTsXhrT5vQK0aJvd0Z+4luxftvpZAz4mcvwhyYmg4VFD 6md4cSUDhVg3k91lMB3ThcgI//JYEuK6reQoqWHZb6VkyIaqG6i00UHI1LLzkmhMVFni bVyg== X-Gm-Message-State: AFqh2kob0Lts4PTBmlYpz0Zp1DQDltA+Oq0zQ2vcbmmuqvI5anaGAzMI 7Fzub/R1+4TCKbY/FLHNM9qT X-Google-Smtp-Source: AMrXdXuQwbEpNKGYvFZiKIiA3rPFbwjyehyiV6GtBInZgLrLszo42Lk3eJDLPD7hh/fAY5Um3Eudcw== X-Received: by 2002:a05:690c:a99:b0:4d1:da3:c18f with SMTP id ci25-20020a05690c0a9900b004d10da3c18fmr18034022ywb.39.1674569677915; Tue, 24 Jan 2023 06:14:37 -0800 (PST) Received: from ?IPV6:2600:1700:e72:80a0:70c6:2d77:ae17:eae3? ([2600:1700:e72:80a0:70c6:2d77:ae17:eae3]) by smtp.gmail.com with ESMTPSA id e127-20020a376985000000b006f9ddaaf01esm1452786qkc.102.2023.01.24.06.14.36 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 24 Jan 2023 06:14:37 -0800 (PST) Message-ID: <01f97aff-58a1-ef2c-e668-d37ea513c64e@github.com> Date: Tue, 24 Jan 2023 09:14:36 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Subject: [PATCH v2.5 01/11] bundle: test unbundling with incomplete history From: Derrick Stolee To: Junio C Hamano Cc: Derrick Stolee via GitGitGadget , git@vger.kernel.org, me@ttaylorr.com, vdye@github.com, avarab@gmail.com, steadmon@google.com, chooglen@google.com References: <771a2993-85bd-0831-0977-24204f84e206@github.com> Content-Language: en-US In-Reply-To: <771a2993-85bd-0831-0977-24204f84e206@github.com> Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On 1/24/2023 7:27 AM, Derrick Stolee wrote: > I'll focus on this area today and see what I can learn and how I > can approach this problem in a different way. The first thing I did was try to figure out how things work today, so I created this test case. It appears we were not testing this at all previously. This is just a candidate replacement for v3, so don't worry about applying it until I re-roll. Thanks, -Stolee --- >8 --- From f9b0cc872ac44892fe6b1c973f16b35edfdc5b20 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 24 Jan 2023 08:47:19 -0500 Subject: [PATCH v2.5 01/11] bundle: test unbundling with incomplete history When verifying a bundle, Git checks first that all prerequisite commits exist in the object store, then adds an additional check: those prerequisite commits must be reachable from references in the repository. This check is stronger than what is checked for refs being added during 'git fetch', which simply guarantees that the new refs have a complete history up to the point where it intersects with the current reachable history. However, we also do not have any tests that check the behavior under this condition. Create a test that demonstrates its behavior. In order to construct a broken history, perform a shallow clone of a repository with a linear history, but whose default branch ('base') has a single commit, so dropping the shallow markers leaves a complete history from that reference. However, the 'tip' reference adds a shallow commit whose parent is missing in the cloned repository. Trying to unbundle a bundle with the 'tip' as a prerequisite will succeed past the object store check and move into the reachability check. The two errors that are reported are of this form: error: Could not read fatal: Failed to traverse parents of commit These messages are not particularly helpful for the person running the unbundle command, but they do prevent the command from succeeding. Signed-off-by: Derrick Stolee --- t/t6020-bundle-misc.sh | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) -- 2.39.1.vfs.0.0 diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh index 3a1cf30b1d7..38dbbf89155 100755 --- a/t/t6020-bundle-misc.sh +++ b/t/t6020-bundle-misc.sh @@ -566,4 +566,44 @@ test_expect_success 'cloning from filtered bundle has useful error' ' grep "cannot clone from filtered bundle" err ' +test_expect_success 'verify catches unreachable, broken prerequisites' ' + test_when_finished rm -rf clone-from clone-to && + git init clone-from && + ( + cd clone-from && + git checkout -b base && + test_commit A && + git checkout -b tip && + git commit --allow-empty -m "will drop by shallow" && + git commit --allow-empty -m "will keep by shallow" && + git commit --allow-empty -m "for bundle, not clone" && + git bundle create tip.bundle tip~1..tip && + git reset --hard HEAD~1 && + git checkout base + ) && + BAD_OID=$(git -C clone-from rev-parse tip~1) && + TIP_OID=$(git -C clone-from rev-parse tip) && + git clone --depth=1 --no-single-branch \ + "file://$(pwd)/clone-from" clone-to && + ( + cd clone-to && + + # Set up broken history by removing shallow markers + git update-ref -d refs/remotes/origin/tip && + rm .git/shallow && + + # Verify should fail + test_must_fail git bundle verify \ + ../clone-from/tip.bundle 2>err && + grep "Could not read $BAD_OID" err && + grep "Failed to traverse parents of commit $TIP_OID" err && + + # Unbundling should fail + test_must_fail git bundle unbundle \ + ../clone-from/tip.bundle 2>err && + grep "Could not read $BAD_OID" err && + grep "Failed to traverse parents of commit $TIP_OID" err + ) +' + test_done From patchwork Tue Jan 24 14:16:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13114203 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61CE5C38142 for ; Tue, 24 Jan 2023 14:16:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234839AbjAXOQi (ORCPT ); Tue, 24 Jan 2023 09:16:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234315AbjAXOQg (ORCPT ); Tue, 24 Jan 2023 09:16:36 -0500 Received: from mail-qt1-x82e.google.com (mail-qt1-x82e.google.com [IPv6:2607:f8b0:4864:20::82e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1B51485A2 for ; Tue, 24 Jan 2023 06:16:32 -0800 (PST) Received: by mail-qt1-x82e.google.com with SMTP id j9so13133075qtv.4 for ; Tue, 24 Jan 2023 06:16:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=github.com; s=google; h=content-transfer-encoding:in-reply-to:references:cc:to:from :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=hJQIML/+X9b3V8r+JvMgcLagaPX3JwpMZSNIOg5llQE=; b=SyrLbXoPSIpmItiqEZMiVrzP2q2+o48LdP2CVJvlzS3B4k5ZOPD7cAYDZyfqbq7eaG hol1p3xjE5hFKnVYTBLeWcqafphgPjdbqywJzbquywLqtj7vcdwnlUeXHdmgOKFcTrJ3 F90QQA5aMxQ4CYK7FpO1uazCS5qQPL638C1kKOisqEOKkoX4BSancnHzH74VIDNRx4GH mwOV55gBD89ivsyTqY94XoUfotnEKf/WJQEzjohUGHs/qzy8SWEuZp6EN/DRAfw/dLf4 J7ulpWJXaYWWrZYQepyAuw3oWxuUFh4nAd6LKrpdeYq4HCMCWXbhkM2NVuGrdQx6ee6N QaTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:references:cc:to:from :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hJQIML/+X9b3V8r+JvMgcLagaPX3JwpMZSNIOg5llQE=; b=b871JwuBo8qpqbF+CkOAwxtniy3H6H5ONpWtvzJhn2R4RUTNDLmh1w6X9ZnKKnGvQV kJNj+5UgCFOHLFtUJgj3EXUZ9nOS11n+nwoIC8UGOWtVJj5Ok4XC9HCcKyW+hGTLVKrh O0Wi1a4mrufxMRCwsGs2ukijjxKkHCG7qHZdB8D0Q8NlBOuWItEkE7VeeQqpBGT8zcfN UADgxegqXpqo5x26WQr75NenfJxy67OnMfR+fOpF4Kzb1tKF2Ad7DjI+PwOpLQfYtFsB k7ZSexKWS5/3Mgv+eIgi2DvxP9o4hcqkpOQnSlMKwR7foiqowU4lP9ptdsttJj8rtNCG dI8Q== X-Gm-Message-State: AFqh2kqboSMdrUJxPynOBpz6bOlVT3RT1km9jKOvm3EQzJzcrtfcorYh aZvP3iNHH5FDHCRSbXNMcJqw X-Google-Smtp-Source: AMrXdXs6bY5VoFXoRlxzwJB+oS0F4R1FEmQeryB/5WADbDjWFhtCx4bBXQhL/TkEavdKR3aTA0LbMQ== X-Received: by 2002:ac8:4cc7:0:b0:3ab:2a7f:83e4 with SMTP id l7-20020ac84cc7000000b003ab2a7f83e4mr37222211qtv.35.1674569791999; Tue, 24 Jan 2023 06:16:31 -0800 (PST) Received: from ?IPV6:2600:1700:e72:80a0:70c6:2d77:ae17:eae3? ([2600:1700:e72:80a0:70c6:2d77:ae17:eae3]) by smtp.gmail.com with ESMTPSA id b10-20020ac812ca000000b003a7f3c4dcdfsm1301697qtj.47.2023.01.24.06.16.31 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 24 Jan 2023 06:16:31 -0800 (PST) Message-ID: Date: Tue, 24 Jan 2023 09:16:30 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Subject: [PATCH v2.5 02/11] bundle: verify using connected() Content-Language: en-US From: Derrick Stolee To: Junio C Hamano Cc: Derrick Stolee via GitGitGadget , git@vger.kernel.org, me@ttaylorr.com, vdye@github.com, avarab@gmail.com, steadmon@google.com, chooglen@google.com References: <771a2993-85bd-0831-0977-24204f84e206@github.com> In-Reply-To: <771a2993-85bd-0831-0977-24204f84e206@github.com> Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On 1/24/2023 7:27 AM, Derrick Stolee wrote: > I'll focus on this area today and see what I can learn and how I > can approach this problem in a different way. > 2. Find out how to modify verify_bundle() so it can do the more > relaxed connectivity check. And here is the modification to verify_bundle() to depend on check_connected() instead. This also improves (in my opinion) the error reporting from this situation, as seen in the test edits. Again, this is a placeholder before I re-roll this series into an inevitable v3, so don't bother applying this patch until then. Thanks, -Stolee --- >8 --- From 6a3d64761e9691994f9310b9ce2338f49aa72d48 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 24 Jan 2023 08:47:00 -0500 Subject: [PATCH v2.5 02/11] bundle: verify using connected() When Git verifies a bundle to see if it is safe for unbundling, it first looks to see if the prerequisite commits are in the object store. This is usually a sufficient filter, and those missing commits are indicated clearly in the error messages. However, if the commits are present in the object store, then there could still be issues if those commits are not reachable from the repository's references. The repository only has guarantees that its object store is closed under reachability for the objects that are reachable from references. Thus, the code in verify_bundle() has previously had the additional check that all prerequisite commits are reachable from repository references. This is done via a revision walk from all references, stopping only if all prerequisite commits are discovered or all commits are walked. This uses a custom walk to verify_bundle(). This check is more strict than what Git applies even to fetched pack-files. In the fetch case, Git guarantees that the new references are closed under reachability by walking from the new references until walking commits that are reachable from repository refs. This is done through the well-used check_connected() method. To better align with the restrictions required by 'git fetch', reimplement this check in verify_bundle() to use check_connected(). This also simplifies the code significantly. The previous change added a test that verified the behavior of 'git bundle verify' and 'git bundle unbundle' in this case, and the error messages looked like this: error: Could not read fatal: Failed to traverse parents of commit However, by changing the revision walk slightly within check_connected() and using its quiet mode, we can omit those messages. Instead, we get only this message, tailored to describing the current state of the repository: error: some prerequisite commits exist in the object store, but are not connected to the repository's history (Line break added here for the commit message formatting, only.) While this message does not include any object IDs, there is no guarantee that those object IDs would help the user diagnose what is going on, as they could be separated from the prerequisite commits by some distance. At minimum, this situation describes the situation in a more informative way than the previous error messages. Signed-off-by: Derrick Stolee --- bundle.c | 75 ++++++++++++++++-------------------------- t/t6020-bundle-misc.sh | 8 ++--- 2 files changed, 33 insertions(+), 50 deletions(-) -- 2.39.1.vfs.0.0 diff --git a/bundle.c b/bundle.c index 4ef7256aa11..76c3a904898 100644 --- a/bundle.c +++ b/bundle.c @@ -12,6 +12,7 @@ #include "refs.h" #include "strvec.h" #include "list-objects-filter-options.h" +#include "connected.h" static const char v2_bundle_signature[] = "# v2 git bundle\n"; static const char v3_bundle_signature[] = "# v3 git bundle\n"; @@ -187,6 +188,21 @@ static int list_refs(struct string_list *r, int argc, const char **argv) /* Remember to update object flag allocation in object.h */ #define PREREQ_MARK (1u<<16) +struct string_list_iterator { + struct string_list *list; + size_t cur; +}; + +static const struct object_id *iterate_ref_map(void *cb_data) +{ + struct string_list_iterator *iter = cb_data; + + if (iter->cur >= iter->list->nr) + return NULL; + + return iter->list->items[iter->cur++].util; +} + int verify_bundle(struct repository *r, struct bundle_header *header, enum verify_bundle_flags flags) @@ -196,26 +212,25 @@ int verify_bundle(struct repository *r, * to be verbose about the errors */ struct string_list *p = &header->prerequisites; - struct rev_info revs = REV_INFO_INIT; - const char *argv[] = {NULL, "--all", NULL}; - struct commit *commit; - int i, ret = 0, req_nr; + int i, ret = 0; const char *message = _("Repository lacks these prerequisite commits:"); + struct string_list_iterator iter = { + .list = p, + }; + struct check_connected_options opts = { + .quiet = 1, + }; if (!r || !r->objects || !r->objects->odb) return error(_("need a repository to verify a bundle")); - repo_init_revisions(r, &revs, NULL); for (i = 0; i < p->nr; i++) { struct string_list_item *e = p->items + i; const char *name = e->string; struct object_id *oid = e->util; struct object *o = parse_object(r, oid); - if (o) { - o->flags |= PREREQ_MARK; - add_pending_object(&revs, o, name); + if (o) continue; - } ret++; if (flags & VERIFY_BUNDLE_QUIET) continue; @@ -223,37 +238,14 @@ int verify_bundle(struct repository *r, error("%s", message); error("%s %s", oid_to_hex(oid), name); } - if (revs.pending.nr != p->nr) + if (ret) goto cleanup; - req_nr = revs.pending.nr; - setup_revisions(2, argv, &revs, NULL); - - list_objects_filter_copy(&revs.filter, &header->filter); - - if (prepare_revision_walk(&revs)) - die(_("revision walk setup failed")); - i = req_nr; - while (i && (commit = get_revision(&revs))) - if (commit->object.flags & PREREQ_MARK) - i--; - - for (i = 0; i < p->nr; i++) { - struct string_list_item *e = p->items + i; - const char *name = e->string; - const struct object_id *oid = e->util; - struct object *o = parse_object(r, oid); - assert(o); /* otherwise we'd have returned early */ - if (o->flags & SHOWN) - continue; - ret++; - if (flags & VERIFY_BUNDLE_QUIET) - continue; - if (ret == 1) - error("%s", message); - error("%s %s", oid_to_hex(oid), name); - } + if ((ret = check_connected(iterate_ref_map, &iter, &opts))) + error(_("some prerequisite commits exist in the object store, " + "but are not connected to the repository's history")); + /* TODO: preserve this verbose language. */ if (flags & VERIFY_BUNDLE_VERBOSE) { struct string_list *r; @@ -282,15 +274,6 @@ int verify_bundle(struct repository *r, list_objects_filter_spec(&header->filter)); } cleanup: - /* Clean up objects used, as they will be reused. */ - for (i = 0; i < p->nr; i++) { - struct string_list_item *e = p->items + i; - struct object_id *oid = e->util; - commit = lookup_commit_reference_gently(r, oid, 1); - if (commit) - clear_commit_marks(commit, ALL_REV_FLAGS | PREREQ_MARK); - } - release_revisions(&revs); return ret; } diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh index 38dbbf89155..7d40994991e 100755 --- a/t/t6020-bundle-misc.sh +++ b/t/t6020-bundle-misc.sh @@ -595,14 +595,14 @@ test_expect_success 'verify catches unreachable, broken prerequisites' ' # Verify should fail test_must_fail git bundle verify \ ../clone-from/tip.bundle 2>err && - grep "Could not read $BAD_OID" err && - grep "Failed to traverse parents of commit $TIP_OID" err && + grep "some prerequisite commits .* are not connected" err && + test_line_count = 1 err && # Unbundling should fail test_must_fail git bundle unbundle \ ../clone-from/tip.bundle 2>err && - grep "Could not read $BAD_OID" err && - grep "Failed to traverse parents of commit $TIP_OID" err + grep "some prerequisite commits .* are not connected" err && + test_line_count = 1 err ) '