From patchwork Mon Jun 20 12:33:09 2022
X-Patchwork-Submitter: Abhradeep Chakraborty
X-Patchwork-Id: 12887520
Message-Id: <2e22ca5069af617fe23072d78efb08b26d6130be.1655728395.git.gitgitgadget@gmail.com>
Date: Mon, 20 Jun 2022 12:33:09 +0000
Subject: [PATCH 1/6] Documentation/technical: describe bitmap lookup table extension
To: git@vger.kernel.org
Cc: Taylor Blau, Kaartic Sivaraam, Abhradeep Chakraborty
From: Abhradeep Chakraborty

When reading a bitmap file, git loads every bitmap one by one, even
when only some of them are required. A "bitmap lookup table" extension
to the bitmap format can reduce this overhead. The table stores the
list of bitmapped commit oids along with each bitmap's offset and xor
offset, so git can load only the necessary bitmaps without first
loading the bitmaps that precede them.
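
To make the layout concrete, here is a minimal standalone sketch of how
a reader might locate the table and pull one (offset, xor position)
pair out of it. It is not the code from this series: the struct and
function names are hypothetical, and it assumes the buffer passed in
ends exactly where the table ends (in a real `.bitmap` other trailing
data follows and has to be skipped first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_XOR 0xffffffffu /* "not xor'd with anything" marker */

/* Hypothetical view over a lookup table; not a struct from git. */
struct lookup_table_view {
	const unsigned char *oids;  /* nr_entries * hash_size bytes */
	const unsigned char *pairs; /* nr_entries * 8 bytes */
	uint32_t nr_entries;
	size_t hash_size;
};

/* Read a 4-byte integer stored in network byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* oids, then (offset, xor position) pairs, then one flags word. */
static size_t lookup_table_size(uint32_t nr_entries, size_t hash_size)
{
	return (size_t)nr_entries * hash_size +
	       (size_t)nr_entries * 2 * sizeof(uint32_t) +
	       sizeof(uint32_t);
}

/* Point the view at the table occupying the last bytes of buf. */
static int lookup_table_init(struct lookup_table_view *t,
			     const unsigned char *buf, size_t len,
			     uint32_t nr_entries, size_t hash_size)
{
	size_t size = lookup_table_size(nr_entries, hash_size);

	if (size > len)
		return -1; /* too short to hold the table */

	t->oids = buf + (len - size);
	t->pairs = t->oids + (size_t)nr_entries * hash_size;
	t->nr_entries = nr_entries;
	t->hash_size = hash_size;
	return 0;
}

/* Offset from which the bitmap of the entry at pos can be read. */
static uint32_t entry_offset(const struct lookup_table_view *t, uint32_t pos)
{
	assert(pos < t->nr_entries);
	return read_be32(t->pairs + (size_t)pos * 8);
}

/* Table position of the commit this bitmap is xor'd with, or NO_XOR. */
static uint32_t entry_xor_pos(const struct lookup_table_view *t, uint32_t pos)
{
	assert(pos < t->nr_entries);
	return read_be32(t->pairs + (size_t)pos * 8 + 4);
}
```

With SHA-1 (20-byte object names) and three bitmapped commits, the
table would occupy 3 * 20 + 3 * 8 + 4 = 88 bytes.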

Add some information for the new "bitmap lookup table" extension in the
bitmap-format documentation.

Co-Authored-by: Taylor Blau
Mentored-by: Taylor Blau
Co-Mentored-by: Kaartic Sivaraam
Signed-off-by: Abhradeep Chakraborty
---
 Documentation/technical/bitmap-format.txt | 31 +++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 04b3ec21785..34e98787b78 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -67,6 +67,14 @@ MIDXs, both the bit-cache and rev-cache extensions are required.
 			pack/MIDX. The format and meaning of the name-hash is
 			described below.
 
+		** {empty}
+		BITMAP_OPT_LOOKUP_TABLE (0x10) : :::
+		If present, the end of the bitmap file contains a table
+		containing a list of `N` object ids, a list of pairs of
+		offset and xor offset of respective objects, and a 4-byte
+		integer denoting the flags (currently none). The format
+		and meaning of the table is described below.
+
 	4-byte entry count (network byte order)
 
 		The total count of entries (bitmapped commits) in this bitmap index.
@@ -205,3 +213,26 @@ Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag.
 If implementations want to choose a different hashing scheme, they are
 free to do so, but MUST allocate a new header flag (because comparing
 hashes made under two different schemes would be pointless).
+
+Commit lookup table
+-------------------
+
+If the BITMAP_OPT_LOOKUP_TABLE flag is set, the end of the `.bitmap`
+contains a lookup table specifying the positions of commits which have a
+bitmap.
+
+For a `.bitmap` containing `nr_entries` reachability bitmaps, the format
+is as follows:
+
+	- `nr_entries` object names.
+
+	- `nr_entries` pairs of 4-byte integers, each in network order.
+	  The first holds the offset from which that commit's bitmap can
+	  be read. The second number holds the position of the commit
+	  whose bitmap the current bitmap is xor'd with in lexicographic
+	  order, or 0xffffffff if the current commit is not xor'd with
+	  anything.
+
+	- One 4-byte network byte order integer specifying
+	  table-specific flags. None exist currently, so this is always
+	  "0".

From patchwork Mon Jun 20 12:33:10 2022
X-Patchwork-Submitter: Abhradeep Chakraborty
X-Patchwork-Id: 12887522
Date: Mon, 20 Jun 2022 12:33:10 +0000
Subject: [PATCH 2/6] pack-bitmap: prepare to read lookup table extension
To: git@vger.kernel.org
Cc: Taylor Blau, Kaartic Sivaraam, Abhradeep Chakraborty
From: Abhradeep Chakraborty

The bitmap lookup table extension lets git parse only the necessary
bitmaps instead of loading all the preceding bitmaps one by one.

Teach git to read and use the bitmap lookup table extension.
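
The read path can be pictured with a small standalone sketch, under
the assumption that the object names in the table are sorted. It finds
a commit by binary search and then follows the stored xor positions to
learn which bitmaps have to be loaded for that one commit. The names
`find_oid` and `xor_chain` are hypothetical, and the hand-rolled search
is only a stand-in so that the example needs no global hash state; it
is not the implementation in the diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NO_XOR 0xffffffffu /* "not xor'd with anything" marker */

/* Read a 4-byte integer stored in network byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Binary search over nr sorted object names of hash_size bytes each.
 * Returns 1 and stores the table position in *pos on a hit, else 0.
 */
static int find_oid(const unsigned char *oids, uint32_t nr, size_t hash_size,
		    const unsigned char *want, uint32_t *pos)
{
	uint32_t lo = 0, hi = nr;

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(want, oids + (size_t)mid * hash_size,
				 hash_size);

		if (!cmp) {
			*pos = mid;
			return 1;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return 0;
}

/*
 * Collect the positions whose bitmaps are needed for the entry at pos,
 * target first, by following the xor positions stored in the
 * (offset, xor position) pairs. Returns the chain length, or -1 if a
 * position is out of range or the chain exceeds max entries (which
 * also guards against a cycle in a corrupt table).
 */
static int xor_chain(const unsigned char *pairs, uint32_t nr, uint32_t pos,
		     uint32_t *chain, int max)
{
	int n = 0;

	while (pos != NO_XOR) {
		if (pos >= nr || n == max)
			return -1;
		chain[n++] = pos;
		pos = read_be32(pairs + (size_t)pos * 8 + 4);
	}
	return n;
}
```

Only the bitmaps on the returned chain need to be parsed; every other
entry in the file can stay untouched.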

Co-Authored-by: Taylor Blau
Mentored-by: Taylor Blau
Co-Mentored-by: Kaartic Sivaraam
Signed-off-by: Abhradeep Chakraborty
---
 pack-bitmap.c | 172 ++++++++++++++++++++++++++++++++++++++++++++++++--
 pack-bitmap.h |   1 +
 2 files changed, 166 insertions(+), 7 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 36134222d7a..d5e5973a79f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -15,6 +15,7 @@
 #include "list-objects-filter-options.h"
 #include "midx.h"
 #include "config.h"
+#include "hash-lookup.h"
 
 /*
  * An entry on the bitmap index, representing the bitmap for a given
@@ -82,6 +83,13 @@ struct bitmap_index {
 	/* The checksum of the packfile or MIDX; points into map. */
 	const unsigned char *checksum;
 
+	/*
+	 * If not NULL, these point into the various commit table sections
+	 * (within map).
+	 */
+	unsigned char *table_lookup;
+	unsigned char *table_offsets;
+
 	/*
 	 * Extended index.
 	 *
@@ -185,6 +193,24 @@ static int load_bitmap_header(struct bitmap_index *index)
 			index->hashes = (void *)(index_end - cache_size);
 			index_end -= cache_size;
 		}
+
+		if (flags & BITMAP_OPT_LOOKUP_TABLE &&
+		    git_env_bool("GIT_READ_COMMIT_TABLE", 1)) {
+			uint32_t entry_count = ntohl(header->entry_count);
+			uint32_t table_size =
+				(entry_count * the_hash_algo->rawsz) /* oids */ +
+				(entry_count * sizeof(uint32_t)) /* offsets */ +
+				(entry_count * sizeof(uint32_t)) /* xor offsets */ +
+				(sizeof(uint32_t)) /* flags */;
+
+			if (table_size > index_end - index->map - header_size)
+				return error("corrupted bitmap index file (too short to fit commit table)");
+
+			index->table_lookup = (void *)(index_end - table_size);
+			index->table_offsets = index->table_lookup + the_hash_algo->rawsz * entry_count;
+
+			index_end -= table_size;
+		}
 	}
 
 	index->entry_count = ntohl(header->entry_count);
@@ -470,7 +496,7 @@ static int load_bitmap(struct bitmap_index *bitmap_git)
 		!(bitmap_git->tags = read_bitmap_1(bitmap_git)))
 		goto failed;
 
-	if (load_bitmap_entries_v1(bitmap_git) < 0)
+	if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0)
 		goto failed;
 
 	return 0;
@@ -557,14 +583,145 @@ struct include_data {
 	struct bitmap *seen;
 };
 
-struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
-				      struct commit *commit)
+static struct stored_bitmap *stored_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						      struct commit *commit,
+						      uint32_t *pos_hint);
+
+static inline const unsigned char *bitmap_oid_pos(struct bitmap_index *bitmap_git,
+						  uint32_t pos)
+{
+	return bitmap_git->table_lookup + (pos * the_hash_algo->rawsz);
+}
+
+static inline const void *bitmap_offset_pos(struct bitmap_index *bitmap_git,
+					    uint32_t pos)
+{
+	return bitmap_git->table_offsets + (pos * 2 * sizeof(uint32_t));
+}
+
+static inline const void *xor_position_pos(struct bitmap_index *bitmap_git,
+					   uint32_t pos)
+{
+	return (unsigned char*) bitmap_offset_pos(bitmap_git, pos) + sizeof(uint32_t);
+}
+
+static int bitmap_lookup_cmp(const void *_va, const void *_vb)
+{
+	return hashcmp(_va, _vb);
+}
+
+static int bitmap_table_lookup(struct bitmap_index *bitmap_git,
+			       struct object_id *oid,
+			       uint32_t *commit_pos)
+{
+	unsigned char *found = bsearch(oid->hash, bitmap_git->table_lookup,
+				       bitmap_git->entry_count,
+				       the_hash_algo->rawsz, bitmap_lookup_cmp);
+	if (found)
+		*commit_pos = (found - bitmap_git->table_lookup) / the_hash_algo->rawsz;
+	return !!found;
+}
+
+static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						    struct object_id *oid,
+						    uint32_t commit_pos)
+{
+	uint32_t xor_pos;
+	off_t bitmap_ofs;
+
+	int flags;
+	struct ewah_bitmap *bitmap;
+	struct stored_bitmap *xor_bitmap;
+
+	bitmap_ofs = get_be32(bitmap_offset_pos(bitmap_git, commit_pos));
+	xor_pos = get_be32(xor_position_pos(bitmap_git, commit_pos));
+
+	/*
+	 * Lazily load the xor'd bitmap if required (and we haven't done so
+	 * already). Make sure to pass the xor'd bitmap's position along as a
+	 * hint to avoid an unnecessary binary search in
+	 * stored_bitmap_for_commit().
+	 */
+	if (xor_pos == 0xffffffff) {
+		xor_bitmap = NULL;
+	} else {
+		struct commit *xor_commit;
+		struct object_id xor_oid;
+
+		oidread(&xor_oid, bitmap_oid_pos(bitmap_git, xor_pos));
+
+		xor_commit = lookup_commit(the_repository, &xor_oid);
+		if (!xor_commit)
+			return NULL;
+
+		xor_bitmap = stored_bitmap_for_commit(bitmap_git, xor_commit,
+						      &xor_pos);
+	}
+
+	/*
+	 * Don't bother reading the commit's index position or its xor
+	 * offset:
+	 *
+	 *   - The commit's index position is irrelevant to us, since
+	 *     load_bitmap_entries_v1 only uses it to learn the object
+	 *     id which is used to compute the hashmap's key. We already
+	 *     have an object id, so no need to look it up again.
+	 *
+	 *   - The xor_offset is unusable for us, since it specifies how
+	 *     many entries previous to ours we should look at. This
+	 *     makes sense when reading the bitmaps sequentially (as in
+	 *     load_bitmap_entries_v1()), since we can keep track of
+	 *     each bitmap as we read them.
+	 *
+	 *     But it can't work for us, since the bitmaps don't have a
+	 *     fixed size. So we learn the position of the xor'd bitmap
+	 *     from the commit table (and resolve it to a bitmap in the
+	 *     above if-statement).
+	 *
+	 * Instead, we can skip ahead and immediately read the flags and
+	 * ewah bitmap.
+	 */
+	bitmap_git->map_pos = bitmap_ofs + sizeof(uint32_t) + sizeof(uint8_t);
+	flags = read_u8(bitmap_git->map, &bitmap_git->map_pos);
+	bitmap = read_bitmap_1(bitmap_git);
+	if (!bitmap)
+		return NULL;
+
+	return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags);
+}
+
+static struct stored_bitmap *stored_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						      struct commit *commit,
+						      uint32_t *pos_hint)
 {
 	khiter_t hash_pos = kh_get_oid_map(bitmap_git->bitmaps,
 					   commit->object.oid);
-	if (hash_pos >= kh_end(bitmap_git->bitmaps))
+	if (hash_pos >= kh_end(bitmap_git->bitmaps)) {
+		uint32_t commit_pos;
+		if (!bitmap_git->table_lookup)
+			return NULL;
+
+		/* NEEDSWORK: cache misses aren't recorded. */
+		if (pos_hint)
+			commit_pos = *pos_hint;
+		else if (!bitmap_table_lookup(bitmap_git,
+					      &commit->object.oid,
+					      &commit_pos))
+			return NULL;
+		return lazy_bitmap_for_commit(bitmap_git, &commit->object.oid,
+					      commit_pos);
+	}
+	return kh_value(bitmap_git->bitmaps, hash_pos);
+}
+
+struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
+				      struct commit *commit)
+{
+	struct stored_bitmap *sb = stored_bitmap_for_commit(bitmap_git, commit,
+							    NULL);
+	if (!sb)
 		return NULL;
-	return lookup_stored_bitmap(kh_value(bitmap_git->bitmaps, hash_pos));
+	return lookup_stored_bitmap(sb);
 }
 
 static inline int bitmap_position_extended(struct bitmap_index *bitmap_git,
@@ -1699,8 +1856,9 @@ void test_bitmap_walk(struct rev_info *revs)
 	if (revs->pending.nr != 1)
 		die("you must specify exactly one commit to test");
 
-	fprintf(stderr, "Bitmap v%d test (%d entries loaded)\n",
-		bitmap_git->version, bitmap_git->entry_count);
+	if (!bitmap_git->table_lookup)
+		fprintf(stderr, "Bitmap v%d test (%d entries loaded)\n",
+			bitmap_git->version, bitmap_git->entry_count);
 
 	root = revs->pending.objects[0].item;
 	bm = bitmap_for_commit(bitmap_git, (struct commit *)root);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3d3ddd77345..37f86787a4d 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -26,6 +26,7 @@ struct bitmap_disk_header {
 enum pack_bitmap_opts {
 	BITMAP_OPT_FULL_DAG = 1,
 	BITMAP_OPT_HASH_CACHE = 4,
+	BITMAP_OPT_LOOKUP_TABLE = 16,
 };
 
 enum pack_bitmap_flags {

From patchwork Mon Jun 20 12:33:11 2022
X-Patchwork-Submitter: Abhradeep Chakraborty
X-Patchwork-Id: 12887523
Date: Mon, 20 Jun 2022 12:33:11 +0000
Subject: [PATCH 3/6] pack-bitmap-write.c: write lookup table extension
To: git@vger.kernel.org
Cc: Taylor Blau, Kaartic Sivaraam, Abhradeep Chakraborty
From: Abhradeep Chakraborty

Teach git to write the bitmap lookup table extension. The table
contains the following information:

- The object ids of the `N` bitmapped commits
- A list of `N` (offset, xor-offset) pairs; the i'th pair gives the
  offset and xor-offset of the i'th commit in the previous list
- A 4-byte integer denoting the flags

Co-authored-by: Taylor Blau
Mentored-by: Taylor Blau
Co-mentored-by: Kaartic Sivaraam
Signed-off-by: Abhradeep Chakraborty
---
 pack-bitmap-write.c | 59 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c43375bd344..9e88a64dd65 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -650,7 +650,8 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 
 static void write_selected_commits_v1(struct hashfile *f,
 				      struct pack_idx_entry **index,
-				      uint32_t index_nr)
+				      uint32_t index_nr,
+				      off_t *offsets)
 {
 	int i;
 
@@ -663,6 +664,9 @@ static void write_selected_commits_v1(struct hashfile *f,
 		if (commit_pos < 0)
 			BUG("trying to write commit not in index");
 
+		if (offsets)
+			offsets[i] = hashfile_total(f);
+
 		hashwrite_be32(f, commit_pos);
 		hashwrite_u8(f, stored->xor_offset);
 		hashwrite_u8(f, stored->flags);
@@ -671,6 +675,49 @@ static void write_selected_commits_v1(struct hashfile *f,
 	}
 }
 
+static int table_cmp(const void *_va, const void *_vb)
+{
+	return oidcmp(&writer.selected[*(uint32_t*)_va].commit->object.oid,
+		      &writer.selected[*(uint32_t*)_vb].commit->object.oid);
+}
+
+static void write_lookup_table(struct hashfile *f,
+			       off_t *offsets)
+{
+	uint32_t i;
+	uint32_t flags = 0;
+	uint32_t *table, *table_inv;
+
+	ALLOC_ARRAY(table, writer.selected_nr);
+	ALLOC_ARRAY(table_inv, writer.selected_nr);
+
+	for (i = 0; i < writer.selected_nr; i++)
+		table[i] = i;
+	QSORT(table, writer.selected_nr, table_cmp);
+	for (i = 0; i < writer.selected_nr; i++)
+		table_inv[table[i]] = i;
+
+	for (i = 0; i < writer.selected_nr; i++) {
+		struct bitmapped_commit *selected = &writer.selected[table[i]];
+		struct object_id *oid = &selected->commit->object.oid;
+
+		hashwrite(f, oid->hash, the_hash_algo->rawsz);
+	}
+	for (i = 0; i < writer.selected_nr; i++) {
+		struct bitmapped_commit *selected = &writer.selected[table[i]];
+
+		hashwrite_be32(f, offsets[table[i]]);
+		hashwrite_be32(f, selected->xor_offset
+			       ? table_inv[table[i] - selected->xor_offset]
+			       : 0xffffffff);
+	}
+
+	hashwrite_be32(f, flags);
+
+	free(table);
+	free(table_inv);
+}
+
 static void write_hash_cache(struct hashfile *f,
 			     struct pack_idx_entry **index,
 			     uint32_t index_nr)
@@ -695,6 +742,7 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 {
 	static uint16_t default_version = 1;
 	static uint16_t flags = BITMAP_OPT_FULL_DAG;
+	off_t *offsets = NULL;
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
 
@@ -715,8 +763,14 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	dump_bitmap(f, writer.trees);
 	dump_bitmap(f, writer.blobs);
 	dump_bitmap(f, writer.tags);
-	write_selected_commits_v1(f, index, index_nr);
 
+	if (options & BITMAP_OPT_LOOKUP_TABLE)
+		CALLOC_ARRAY(offsets, index_nr);
+
+	write_selected_commits_v1(f, index, index_nr, offsets);
+
+	if (options & BITMAP_OPT_LOOKUP_TABLE)
+		write_lookup_table(f, offsets);
 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
 
@@ -730,4 +784,5 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		die_errno("unable to rename temporary bitmap file to '%s'", filename);
 
 	strbuf_release(&tmp_file);
+	free(offsets);
 }

From patchwork Mon Jun 20 12:33:12 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12887525
Message-Id: <661c1137e1c918619f6624d2e331bafd9c3281dc.1655728395.git.gitgitgadget@gmail.com>
Date: Mon, 20 Jun 2022 12:33:12 +0000
Subject: [PATCH 4/6] builtin/pack-objects.c: learn pack.writeBitmapLookupTable
To: git@vger.kernel.org
Cc: Taylor Blau, Kaartic Sivaraam, Abhradeep Chakraborty
From: Taylor Blau

Teach git a config option named 'writeBitmapLookupTable' that lets
users enable or disable writing the bitmap lookup table extension.

Signed-off-by: Taylor Blau
Signed-off-by: Abhradeep Chakraborty
---
 Documentation/config/pack.txt | 7 +++++++
 builtin/pack-objects.c        | 8 ++++++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt
index ad7f73a1ead..e12008d2415 100644
--- a/Documentation/config/pack.txt
+++ b/Documentation/config/pack.txt
@@ -164,6 +164,13 @@ When writing a multi-pack reachability bitmap, no new namehashes are
 computed; instead, any namehashes stored in an existing bitmap are
 permuted into their appropriate location when writing a new bitmap.
 
+pack.writeBitmapLookupTable::
+	When true, git will include a "lookup table" section in the
+	bitmap index (if one is written). This table is used to defer
+	loading individual bitmaps as late as possible. This can be
+	beneficial in repositories which have relatively large bitmap
+	indexes. Defaults to false.
+
 pack.writeReverseIndex::
 	When true, git will write a corresponding .rev file (see:
 	link:../technical/pack-format.html[Documentation/technical/pack-format.txt])
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index cc5f41086da..3ba20301980 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3148,6 +3148,14 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		else
 			write_bitmap_options &= ~BITMAP_OPT_HASH_CACHE;
 	}
+
+	if (!strcmp(k, "pack.writebitmaplookuptable")) {
+		if (git_config_bool(k, v))
+			write_bitmap_options |= BITMAP_OPT_LOOKUP_TABLE;
+		else
+			write_bitmap_options &= ~BITMAP_OPT_LOOKUP_TABLE;
+	}
+
 	if (!strcmp(k, "pack.usebitmaps")) {
 		use_bitmap_index_default = git_config_bool(k, v);
 		return 0;

From patchwork Mon Jun 20 12:33:13 2022
X-Patchwork-Submitter: Abhradeep Chakraborty
X-Patchwork-Id: 12887524
Date: Mon, 20 Jun 2022 12:33:13 +0000
Subject: [PATCH 5/6] bitmap-commit-table: add tests for the bitmap lookup table
To: git@vger.kernel.org
Cc: Taylor Blau, Kaartic Sivaraam, Abhradeep Chakraborty
From: Abhradeep Chakraborty

Add tests that exercise the newly implemented lookup table.

Mentored-by: Taylor Blau
Co-Mentored-by: Kaartic Sivaraam
Signed-off-by: Abhradeep Chakraborty
---
 t/t5310-pack-bitmaps.sh       | 14 ++++++++++++++
 t/t5326-multi-pack-bitmaps.sh | 19 +++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index f775fc1ce69..f05d3e6ace7 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -43,6 +43,20 @@ test_expect_success 'full repack creates bitmaps' '
 
 basic_bitmap_tests
 
+test_expect_success 'using lookup table does not affect basic bitmap tests' '
+	test_config pack.writeBitmapLookupTable true &&
+	git repack -adb
+'
+basic_bitmap_tests
+
+test_expect_success 'using lookup table does not parse every entry one by one' '
+	test_config pack.writeBitmapLookupTable true &&
+	git repack -adb &&
+	git rev-list --test-bitmap HEAD 2>out &&
+	grep "Found bitmap for" out &&
+	! grep "Bitmap v1 test " out
+'
+
 test_expect_success 'incremental repack fails when bitmaps are requested' '
 	test_commit more-1 &&
 	test_must_fail git repack -d 2>err &&
diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index 4fe57414c13..85fbdf5e4bb 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -306,5 +306,24 @@ test_expect_success 'graceful fallback when missing reverse index' '
 		! grep "ignoring extra bitmap file" err
 	)
 '
+test_expect_success 'multi-pack-index write --bitmap writes lookup table if enabled' '
+	rm -fr repo &&
+	git init repo &&
+	test_when_finished "rm -fr repo" &&
+	(
+		cd repo &&
+		test_commit_bulk 106 &&
+
+		git repack -d &&
+
+		git config pack.writeBitmapLookupTable true &&
+		git multi-pack-index write --bitmap &&
+
+		git rev-list --test-bitmap HEAD 2>out &&
+		grep "Found bitmap for" out &&
+		! grep "Bitmap v1 test " out
+
+	)
+'
 
 test_done

From patchwork Mon Jun 20 12:33:14 2022
X-Patchwork-Submitter: Abhradeep Chakraborty
X-Patchwork-Id: 12887526
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=I1WccOGBcRQKSV7lB8p9fhYXV6oEGnh8EPVB1xQnH2Q=; b=emBK3En6Nzn8izcEx+piD65OrE7RFeO7ikZ0yi3AzspExdzkMk8AQQ1AFg4mI/SSCp
BbKtX1CikMb0k6U17X7Fx79yRDX1rsz3nFZznBG2v4r5N3p8bIjZNBy377HuxgWvsM19 KgdYwLt6Zv11sVnQ3RrV0cf/ysUR7xD+5F2IwsyYK59Ct2Ucvp4r5x5dYANhu9rVPcnu MwS+18zCz82A4FsdDf7dSHYVpE/D7o/z7NZOhAENIsvSb55GxsvqEu9cXuRbPCC47NkP +l9cI4Hm8MdxAORjod9ZSueG5H5XF8EgCUbFoRHr4KIw+3pS4V7BTWZ/Q+VIyDErsbNl sGUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=I1WccOGBcRQKSV7lB8p9fhYXV6oEGnh8EPVB1xQnH2Q=; b=FjzRjOVVqlANqzOMPT7OfcFbpgIUTZZ1TtxYqZ/ZNXZfB5Fm5Rr1aRotGepVyBdJvl fouUfyxFtvCLmVPtvq5o3iMF5mNqHtuQ6+05dIxCPihzEFtfn7CYxhvS2sOi3/F/U1xy fuVf1Ce1KKDgQUexuhuSl6n28Ijt99WqolnVtlKe28W4vP0lRbO7+8AR23fGdO2+lgO5 GdjM5QYmBjR4tU3kk7IT8gBoq/3YF7w378ihtH8WbOgPSALLbvGg4sjDbnY5QewmoUVq Q4QGgXosvwm92BMdtuDAlxpyaLmGJXjfkvgl7qiidYVQZdtGeptEhzDnQLZ7VtT39RfB mXOA== X-Gm-Message-State: AJIora+mw/orTksr3oGX20mG7jY6OsFcYRu9nN/SObrwaM4KwydhZs82 nSxJN2hjXtDxdX9UBc6cRMMPakkdZkkrIA== X-Google-Smtp-Source: AGRyM1sMXTXsv6WV1ayr35kwXLRpr7JTk8JmODzwcSQbf286GYdl/FBnSbT5/rnBgE2BFeDTuhpKGA== X-Received: by 2002:a5d:5234:0:b0:21b:829c:3058 with SMTP id i20-20020a5d5234000000b0021b829c3058mr14292226wra.13.1655728404102; Mon, 20 Jun 2022 05:33:24 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k17-20020a5d6e91000000b0021a39f5ba3bsm12372258wrz.7.2022.06.20.05.33.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Jun 2022 05:33:23 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Mon, 20 Jun 2022 12:33:14 +0000 Subject: [PATCH 6/6] bitmap-lookup-table: add performance tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty Add performance tests for bitmap lookup table extension. 

Mentored-by: Taylor Blau
Co-mentored-by: Kaartic Sivaraam
Signed-off-by: Abhradeep Chakraborty
---
 t/perf/p5310-pack-bitmaps.sh       | 60 +++++++++++++++++++-----------
 t/perf/p5326-multi-pack-bitmaps.sh | 55 +++++++++++++++++----------
 2 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index 7ad4f237bc3..a8d9414de92 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -10,10 +10,11 @@ test_perf_large_repo
 # since we want to be able to compare bitmap-aware
 # git versus non-bitmap git
 #
-# We intentionally use the deprecated pack.writebitmaps
+# We intentionally use the deprecated pack.writeBitmaps
 # config so that we can test against older versions of git.
 test_expect_success 'setup bitmap config' '
-	git config pack.writebitmaps true
+	git config pack.writeBitmaps true &&
+	git config pack.writeReverseIndex true
 '
 
 # we need to create the tag up front such that it is covered by the repack and
@@ -28,27 +29,42 @@ test_perf 'repack to disk' '
 
 test_full_bitmap
 
-test_expect_success 'create partial bitmap state' '
-	# pick a commit to represent the repo tip in the past
-	cutoff=$(git rev-list HEAD~100 -1) &&
-	orig_tip=$(git rev-parse HEAD) &&
-
-	# now kill off all of the refs and pretend we had
-	# just the one tip
-	rm -rf .git/logs .git/refs/* .git/packed-refs &&
-	git update-ref HEAD $cutoff &&
-
-	# and then repack, which will leave us with a nice
-	# big bitmap pack of the "old" history, and all of
-	# the new history will be loose, as if it had been pushed
-	# up incrementally and exploded via unpack-objects
-	git repack -Ad &&
-
-	# and now restore our original tip, as if the pushes
-	# had happened
-	git update-ref HEAD $orig_tip
+test_expect_success 'use lookup table' '
+	git config pack.writeBitmapLookupTable true
 '
 
-test_partial_bitmap
+test_perf 'repack to disk (lookup table)' '
+	git repack -adb
+'
+
+test_full_bitmap
+
+for i in false true
+do
+	$i && lookup=" (lookup table)"
+	test_expect_success "create partial bitmap state$lookup" '
+		git config pack.writeBitmapLookupTable '"$i"' &&
+		# pick a commit to represent the repo tip in the past
+		cutoff=$(git rev-list HEAD~100 -1) &&
+		orig_tip=$(git rev-parse HEAD) &&
+
+		# now kill off all of the refs and pretend we had
+		# just the one tip
+		rm -rf .git/logs .git/refs/* .git/packed-refs &&
+		git update-ref HEAD $cutoff &&
+
+		# and then repack, which will leave us with a nice
+		# big bitmap pack of the "old" history, and all of
+		# the new history will be loose, as if it had been pushed
+		# up incrementally and exploded via unpack-objects
+		git repack -Ad &&
+
+		# and now restore our original tip, as if the pushes
+		# had happened
+		git update-ref HEAD $orig_tip
+	'
+
+	test_partial_bitmap
+done
 
 test_done
diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh
index f2fa228f16a..9001eb4533e 100755
--- a/t/perf/p5326-multi-pack-bitmaps.sh
+++ b/t/perf/p5326-multi-pack-bitmaps.sh
@@ -26,27 +26,42 @@ test_expect_success 'drop pack bitmap' '
 
 test_full_bitmap
 
-test_expect_success 'create partial bitmap state' '
-	# pick a commit to represent the repo tip in the past
-	cutoff=$(git rev-list HEAD~100 -1) &&
-	orig_tip=$(git rev-parse HEAD) &&
-
-	# now pretend we have just one tip
-	rm -rf .git/logs .git/refs/* .git/packed-refs &&
-	git update-ref HEAD $cutoff &&
-
-	# and then repack, which will leave us with a nice
-	# big bitmap pack of the "old" history, and all of
-	# the new history will be loose, as if it had been pushed
-	# up incrementally and exploded via unpack-objects
-	git repack -Ad &&
-	git multi-pack-index write --bitmap &&
-
-	# and now restore our original tip, as if the pushes
-	# had happened
-	git update-ref HEAD $orig_tip
+test_expect_success 'use lookup table' '
+	git config pack.writeBitmapLookupTable true
 '
 
-test_partial_bitmap
+test_perf 'setup multi-pack-index (lookup table)' '
+	git multi-pack-index write --bitmap
+'
+
+test_full_bitmap
+
+for i in false true
+do
+	$i && lookup=" (lookup table)"
+	test_expect_success "create partial bitmap state$lookup" '
+		git config pack.writeBitmapLookupTable '"$i"' &&
+		# pick a commit to represent the repo tip in the past
+		cutoff=$(git rev-list HEAD~100 -1) &&
+		orig_tip=$(git rev-parse HEAD) &&
+
+		# now pretend we have just one tip
+		rm -rf .git/logs .git/refs/* .git/packed-refs &&
+		git update-ref HEAD $cutoff &&
+
+		# and then repack, which will leave us with a nice
+		# big bitmap pack of the "old" history, and all of
+		# the new history will be loose, as if it had been pushed
+		# up incrementally and exploded via unpack-objects
+		git repack -Ad &&
+		git multi-pack-index write --bitmap &&
+
+		# and now restore our original tip, as if the pushes
+		# had happened
+		git update-ref HEAD $orig_tip
+	'
+
+	test_partial_bitmap
+done
 
 test_done
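
To try the option by hand, outside the test suite, something like the following sketch may help. It assumes a git build that already carries this series (an older git silently ignores `pack.writeBitmapLookupTable` as an unknown config key); the scratch directory, the file name and the `demo` identity are placeholders chosen for illustration. `--test-bitmap` is the same test-only `git rev-list` flag that the tests above rely on.

```shell
#!/bin/sh
# Sketch only: build a one-commit scratch repository, repack it with a
# bitmap, and let git self-check the bitmap for HEAD.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo hello >file
git add file
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Honoured only by a git that carries this series (assumption).
git config pack.writeBitmapLookupTable true

# -a: repack everything, -d: drop old packs, -b: write a .bitmap file.
git repack -q -a -d -b

# The diagnostics go to stderr; keep them in "out" as the tests do.
git rev-list --test-bitmap HEAD 2>out
cat out
```

Comparing the contents of `out` with the option set to `false` and to `true` is one way to see which code path loaded the bitmap.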