From patchwork Tue Sep 7 21:17:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12479349
Date: Tue, 7 Sep 2021 17:17:58 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 2/4] pack-bitmap.c: propagate namehash values from existing bitmaps
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

When an old bitmap exists while writing a new one, we load it and build
a "reposition" table which maps bit positions of objects from the old
bitmap to their respective positions in the new bitmap.
This can help when we encounter a commit which was selected in both the
old and new bitmap, since we only need to permute its bit (not recompute
it from scratch).

We do not, however, repurpose existing namehash values in the case of
the hash-cache extension. There has been thus far no good reason to do
so, since all of the namehash values for objects in the new bitmap would
be populated during the traversal that was just performed by
pack-objects when generating single-pack reachability bitmaps.

But this isn't the case for multi-pack bitmaps, which are written via
`git multi-pack-index write --bitmap` and do not perform any traversal.
In this case all namehash values are set to zero, but we don't even
bother to check the `pack.writeBitmapHashcache` option anyway, so it
fails to matter.

There are two approaches we could take to fill in non-zero hash-cache
values:

  - have either the multi-pack-index builtin run its own
    traversal to attempt to fill in some values, or let a hypothetical
    caller (like `pack-objects` when `repack` eventually drives the
    `multi-pack-index` builtin) fill in the values they found during
    their traversal

  - or copy any existing namehash values that were stored in an
    existing bitmap to their corresponding positions in the new bitmap

In a system where a repository is generally repacked with `git repack
--geometric=` and occasionally repacked with `git repack -a`, the
hash-cache coverage will tend towards all objects.

Since populating the hash-cache is additive (i.e., doing so only helps
our delta search), any intermediate lack of full coverage is just fine.
So let's start by just propagating any values from the existing
hash-cache if we see one.

The next patch will respect the `pack.writeBitmapHashcache` option while
writing MIDX bitmaps, and then test this new behavior.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index e44af36933..cb876e7e9d 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1815,18 +1815,20 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git,
 	for (i = 0; i < num_objects; ++i) {
 		struct object_id oid;
 		struct object_entry *oe;
+		uint32_t index_pos;
 
 		if (bitmap_is_midx(bitmap_git))
-			nth_midxed_object_oid(&oid,
-					      bitmap_git->midx,
-					      pack_pos_to_midx(bitmap_git->midx, i));
+			index_pos = pack_pos_to_midx(bitmap_git->midx, i);
 		else
-			nth_packed_object_id(&oid, bitmap_git->pack,
-					     pack_pos_to_index(bitmap_git->pack, i));
+			index_pos = pack_pos_to_index(bitmap_git->pack, i);
+		nth_bitmap_object_oid(bitmap_git, &oid, index_pos);
 		oe = packlist_find(mapping, &oid);
 
-		if (oe)
+		if (oe) {
 			reposition[i] = oe_in_pack_pos(mapping, oe) + 1;
+			if (bitmap_git->hashes && !oe->hash)
+				oe->hash = get_be32(bitmap_git->hashes + index_pos);
+		}
 	}
 
 	return reposition;
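
The propagation rule in the hunk above can be sketched in isolation. This
is a minimal, self-contained illustration under stated assumptions, not
Git's code: `sketch_entry`, `sketch_get_be32()` and
`sketch_propagate_hashes()` are hypothetical names, the old-to-new mapping
is a plain array standing in for the packlist lookup, and the old hashes
are indexed by loop position rather than by index position as the patch
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object entry; only the namehash field. */
struct sketch_entry {
	uint32_t hash; /* 0 means "no namehash known yet" */
};

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * old_hashes: the old bitmap's hash-cache as num_old big-endian 32-bit
 *             values, or NULL when the old bitmap has no hash-cache.
 * old_to_new: for old object i, its index in entries, or -1 when the
 *             object is absent from the new bitmap.
 *
 * Copies an old namehash only into entries whose hash is still zero, so
 * values found by a traversal are never overwritten.
 */
static void sketch_propagate_hashes(const unsigned char *old_hashes,
				    size_t num_old,
				    const long *old_to_new,
				    struct sketch_entry *entries)
{
	size_t i;

	assert(old_to_new && entries);
	if (!old_hashes)
		return; /* nothing to propagate */

	for (i = 0; i < num_old; i++) {
		struct sketch_entry *e;

		if (old_to_new[i] < 0)
			continue; /* object not carried over */
		e = &entries[old_to_new[i]];
		if (!e->hash)
			e->hash = sketch_get_be32(old_hashes + 4 * i);
	}
}
```

Because the copy only fills entries that are still zero, running it after
a traversal cannot make the hash-cache worse, which matches the "additive"
argument in the commit message.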