From patchwork Sun Jun 26 13:10:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895764 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3C07C43334 for ; Sun, 26 Jun 2022 13:10:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234447AbiFZNKY (ORCPT ); Sun, 26 Jun 2022 09:10:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49754 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234280AbiFZNKX (ORCPT ); Sun, 26 Jun 2022 09:10:23 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3189DB7C1 for ; Sun, 26 Jun 2022 06:10:22 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id i25so3882988wrc.13 for ; Sun, 26 Jun 2022 06:10:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=KYUcpt9cHi2agWK3matXO5G6OMok3RbTL6LJ8CCZF1w=; b=adE4bxGfTJYy3a5AZGmnH31oZ+WcGiFKy/N3CT32uSBx0cheiUmoJMRE4ocEGjX8au k3Ee3fvyazy5YR9qKbTQV8O77HTUjy6mBY+HNrFHcutrfEm8+mS8kQUffKxwpbAC/Hvr p9imOrp7qBk5g/N+IFb7774t3cWp3XWg15C7MynXOZ6tDf904NGQrOtSe5wmIH966gPe TcmOOJtISl5Y8Ln1veH+ityOCq7tdrmgCvwm4Jdvpzf2I6cEdSgMD1tEh9vLoeJ2gsfo ko00OHdrNWcizByla6U7p0LIogNZurImX2Namp44OCLp4nuZOZe2/FWvE/naSBvu+/G/ Y91Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=KYUcpt9cHi2agWK3matXO5G6OMok3RbTL6LJ8CCZF1w=; b=X64oSBUZUhwLbpWJffkdjUN1wRMgfY3W1vF+GOBqqRu6aCN9yFZqPT4jvtCqB4hOHG ZTHShyrRAMK9fg67q8VKetr/tRR8Kgcm6BcLWl3esNGVjVPpa/AHyh1/YIB1iN5mV80T QNGQJ3/ep4Qlov+Ps2FNAOgbPw/HP1MhzLkihbkolhdKH0zrI0h2V/oAk/0GxOIg3W+W 3ukK1WnEsynMErbTNUIbisbCWTjeF+YgeuXlu+s9a+11lVqUqlc/WembSsklQLlaL9vV zSBReJGGxAvD/JotHJJErru8RX9hL78iP/QSt4IA6o9WZne4p6K8cxQyOCMaLEquuKCM nOSg== X-Gm-Message-State: AJIora9KdLdVrTogu68Q5yGX8Yjj+dIFMlwcxq38P+Ov16j3yEhYeIvG 6AAmrr+VqMrAJfPVBVFqRBTpyzJQcWcq8g== X-Google-Smtp-Source: AGRyM1vly/WZFwwDQtm/dFJ7bO+F8H9mqQw94tSkDC9Mvq9Vd6Cc2B8GbsE/AZmtZx6txX1zK07q9A== X-Received: by 2002:a05:6000:798:b0:21b:8ca1:9b52 with SMTP id bu24-20020a056000079800b0021b8ca19b52mr7901286wrb.374.1656249020301; Sun, 26 Jun 2022 06:10:20 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r68-20020a1c2b47000000b0039c4b518df4sm11798044wmr.5.2022.06.26.06.10.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:19 -0700 (PDT) Message-Id: <4d11be66cfa2cd667df56ab9260903a37900bd4c.1656249017.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:12 +0000 Subject: [PATCH v2 1/6] Documentation/technical: describe bitmap lookup table extension Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty When reading bitmap file, git loads each and every bitmap one by one even if all the bitmaps are not required. A "bitmap lookup table" extension to the bitmap format can reduce the overhead of loading bitmaps which stores a list of bitmapped commit id pos (in the midx or pack, along with their offset and xor offset. This way git can load only the neccesary bitmaps without loading the previous bitmaps. The older version of Git ignores the lookup table extension and doesn't throw any kind of warning or error while parsing the bitmap file. Add some information for the new "bitmap lookup table" extension in the bitmap-format documentation. Co-Authored-by: Taylor Blau Mentored-by: Taylor Blau Co-Mentored-by: Kaartic Sivaraam Signed-off-by: Abhradeep Chakraborty --- Documentation/technical/bitmap-format.txt | 41 +++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt index 04b3ec21785..7d4e450d3d8 100644 --- a/Documentation/technical/bitmap-format.txt +++ b/Documentation/technical/bitmap-format.txt @@ -67,6 +67,19 @@ MIDXs, both the bit-cache and rev-cache extensions are required. pack/MIDX. The format and meaning of the name-hash is described below. + ** {empty} + BITMAP_OPT_LOOKUP_TABLE (0x10): ::: + If present, the end of the bitmap file contains a table + containing a list of `N` + triplets. The format and meaning of the table is described + below. ++ +NOTE: This xor_offset is different from the bitmap's xor_offset. +Bitmap's xor_offset is relative i.e. it tells how many bitmaps we have +to go back from the current bitmap. Lookup table's xor_offset tells the +position of the triplet in the list whose bitmap the current commit's +bitmap have to xor with. + 4-byte entry count (network byte order) The total count of entries (bitmapped commits) in this bitmap index. @@ -205,3 +218,31 @@ Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag. If implementations want to choose a different hashing scheme, they are free to do so, but MUST allocate a new header flag (because comparing hashes made under two different schemes would be pointless). + +Commit lookup table +------------------- + +If the BITMAP_OPT_LOOKUP_TABLE flag is set, the last `N * (4 + 8 + 4)` +(preceding the name-hash cache and trailing hash) of the `.bitmap` file +contains a lookup table specifying the information needed to get the +desired bitmap from the entries without parsing previous unnecessary +bitmaps. + +For a `.bitmap` containing `nr_entries` reachability bitmaps, the table +contains a list of `nr_entries` triplets. +The content of i'th triplet is - + + * {empty} + commit pos (4 byte integer, network byte order): :: + It stores the object position of the commit (in the midx or pack index) + to which the i'th bitmap in the bitmap entries belongs. + + * {empty} + offset (8 byte integer, network byte order): :: + The offset from which that commit's bitmap can be read. + + * {empty} + xor offset (4 byte integer, network byte order): :: + It holds the position of the triplet with whose bitmap the + current bitmap need to xor. If the current triplet's bitmap + do not have any xor bitmap, it defaults to 0xffffffff. From patchwork Sun Jun 26 13:10:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895765 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32519C43334 for ; Sun, 26 Jun 2022 13:10:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234512AbiFZNK0 (ORCPT ); Sun, 26 Jun 2022 09:10:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234452AbiFZNKY (ORCPT ); Sun, 26 Jun 2022 09:10:24 -0400 Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3A4BA46A for ; Sun, 26 Jun 2022 06:10:23 -0700 (PDT) Received: by mail-wm1-x330.google.com with SMTP id t17-20020a1c7711000000b003a0434b0af7so1524844wmi.0 for ; Sun, 26 Jun 2022 06:10:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=CUil831QgYbwZEuou9/h47jJVdMvKqvGtYftNx2sG6U=; b=a/dGU2LK7hyIOZPgGm7QJvHxaaIg3k/EU2ra1ecPxbC2S9RmtPPKBbC1ghNFglaIpo 7QZpVm6Pn7mVCxkrEk6008dj9OCdLc4Qn5xD0pw2Eiunz72GOaoU4wPddXRlIBM9quzt wWBBrWcYYjSdaRt6MG7k00PHAnNsTNQIkOznXjAnexNFWITXplTOX+sBoLxObVhrpy5Z 7D3YN1fXWnLkbzNT9YRuZcVtFFe4LkV1nYJPlDU3l/cBJbV9PDQtDndYzjzZ93JN0Npl Vd2afyY6jSG9fysizhyYIxiQXq6ioRlNGZwpnN4xAQrD+nMaDxMK4nK7WQwq8HC3nm6u WW6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=CUil831QgYbwZEuou9/h47jJVdMvKqvGtYftNx2sG6U=; b=vZtw1Q1C9e4CkYC+CfiNy4Ud3xGgg8CIK38tDKCjSVbIMA7Vs3vw1Q0aPylq1fXg4+ isVyUvt6oNDdMCbRVwiWQ4LGVvXTTM60BDaoks5MiFVc1oAaTCKn7DIgfoa0YmNrnveR gfuFuLzW7SAAtqSHer/iPxVQy2rRQM6yr8b+w7OVJ3G5gu2nVXSQz5q6KgIUsQBdOUr6 Ougkgtx2gcCjM7+qGacqEB4huAteC67Ya54lejejCMklhNUKv5eiLRK2gMRHeJnxlQwC BLa72Rqk+Ylr+0HVcI4EnAwmgHIZgctol1iPxjry9L5U8CSljJ/LjiQAFPP0/mO7MxOK 3J+A== X-Gm-Message-State: AJIora9B0CkNv5mwGS3IQD6dlEgRuLYXfxyZ81WwXqkioxUIG0hvv4kH g5yeWkbFvGiYDdCmH9+WSy1bKZ3TEDO4EA== X-Google-Smtp-Source: AGRyM1sLME/Nv4iRDzTesxEK+QnhsEKUSkq8hL0DxVyvQqp8aDYAY1ghs2H227hhNR5pqcV/YILTvw== X-Received: by 2002:a05:600c:34c5:b0:3a0:4187:22d0 with SMTP id d5-20020a05600c34c500b003a0418722d0mr7890492wmq.54.1656249021996; Sun, 26 Jun 2022 06:10:21 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n17-20020a05600c501100b003975c7058bfsm9549259wmr.12.2022.06.26.06.10.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:21 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:13 +0000 Subject: [PATCH v2 2/6] pack-bitmap-write.c: write lookup table extension Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty The bitmap lookup table extension was documentated by an earlier change, but Git does not yet knowhow to write that extension. Teach git to write bitmap lookup table extension. The table contains the list of `N` ` triplets. These triplets are sorted according to their commit pos (ascending order). The meaning of each data in the i'th triplet is given below: - Commit pos is the position of the commit in the pack-index (or midx) to which the i'th bitmap belongs. It is a 4 byte network byte order integer. - offset is the position of the i'th bitmap. - xor offset denotes the position of the triplet with whose bitmap the current triplet's bitmap need to xor with. Co-authored-by: Taylor Blau Mentored-by: Taylor Blau Co-mentored-by: Kaartic Sivaraam Signed-off-by: Abhradeep Chakraborty --- pack-bitmap-write.c | 72 +++++++++++++++++++++++++++++++++++++++++++-- pack-bitmap.h | 5 ++-- 2 files changed, 73 insertions(+), 4 deletions(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index c43375bd344..899a4a941e1 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -650,7 +650,9 @@ static const struct object_id *oid_access(size_t pos, const void *table) static void write_selected_commits_v1(struct hashfile *f, struct pack_idx_entry **index, - uint32_t index_nr) + uint32_t index_nr, + uint64_t *offsets, + uint32_t *commit_positions) { int i; @@ -663,6 +665,11 @@ static void write_selected_commits_v1(struct hashfile *f, if (commit_pos < 0) BUG("trying to write commit not in index"); + if (offsets) + offsets[i] = hashfile_total(f); + if (commit_positions) + commit_positions[i] = commit_pos; + hashwrite_be32(f, commit_pos); hashwrite_u8(f, stored->xor_offset); hashwrite_u8(f, stored->flags); @@ -671,6 +678,55 @@ static void write_selected_commits_v1(struct hashfile *f, } } +static int table_cmp(const void *_va, const void *_vb, void *commit_positions) +{ + int8_t result = 0; + uint32_t *positions = (uint32_t *) commit_positions; + uint32_t a = positions[*(uint32_t *)_va]; + uint32_t b = positions[*(uint32_t *)_vb]; + + if (a > b) + result = 1; + else if (a < b) + result = -1; + else + result = 0; + + return result; +} + +static void write_lookup_table(struct hashfile *f, + uint64_t *offsets, + uint32_t *commit_positions) +{ + uint32_t i; + uint32_t *table, *table_inv; + + ALLOC_ARRAY(table, writer.selected_nr); + ALLOC_ARRAY(table_inv, writer.selected_nr); + + for (i = 0; i < writer.selected_nr; i++) + table[i] = i; + + QSORT_S(table, writer.selected_nr, table_cmp, commit_positions); + + for (i = 0; i < writer.selected_nr; i++) + table_inv[table[i]] = i; + + for (i = 0; i < writer.selected_nr; i++) { + struct bitmapped_commit *selected = &writer.selected[table[i]]; + uint32_t xor_offset = selected->xor_offset; + + hashwrite_be32(f, commit_positions[table[i]]); + hashwrite_be64(f, offsets[table[i]]); + hashwrite_be32(f, xor_offset ? + table_inv[table[i] - xor_offset]: 0xffffffff); + } + + free(table); + free(table_inv); +} + static void write_hash_cache(struct hashfile *f, struct pack_idx_entry **index, uint32_t index_nr) @@ -695,6 +751,8 @@ void bitmap_writer_finish(struct pack_idx_entry **index, { static uint16_t default_version = 1; static uint16_t flags = BITMAP_OPT_FULL_DAG; + uint64_t *offsets = NULL; + uint32_t *commit_positions = NULL; struct strbuf tmp_file = STRBUF_INIT; struct hashfile *f; @@ -715,8 +773,16 @@ void bitmap_writer_finish(struct pack_idx_entry **index, dump_bitmap(f, writer.trees); dump_bitmap(f, writer.blobs); dump_bitmap(f, writer.tags); - write_selected_commits_v1(f, index, index_nr); + if (options & BITMAP_OPT_LOOKUP_TABLE) { + CALLOC_ARRAY(offsets, index_nr); + CALLOC_ARRAY(commit_positions, index_nr); + } + + write_selected_commits_v1(f, index, index_nr, offsets, commit_positions); + + if (options & BITMAP_OPT_LOOKUP_TABLE) + write_lookup_table(f, offsets, commit_positions); if (options & BITMAP_OPT_HASH_CACHE) write_hash_cache(f, index, index_nr); @@ -730,4 +796,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index, die_errno("unable to rename temporary bitmap file to '%s'", filename); strbuf_release(&tmp_file); + free(offsets); + free(commit_positions); } diff --git a/pack-bitmap.h b/pack-bitmap.h index 3d3ddd77345..67a9d0fc303 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -24,8 +24,9 @@ struct bitmap_disk_header { #define NEEDS_BITMAP (1u<<22) enum pack_bitmap_opts { - BITMAP_OPT_FULL_DAG = 1, - BITMAP_OPT_HASH_CACHE = 4, + BITMAP_OPT_FULL_DAG = 0x1, + BITMAP_OPT_HASH_CACHE = 0x4, + BITMAP_OPT_LOOKUP_TABLE = 0x10, }; enum pack_bitmap_flags { From patchwork Sun Jun 26 13:10:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895766 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D7F8C433EF for ; Sun, 26 Jun 2022 13:10:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234553AbiFZNK2 (ORCPT ); Sun, 26 Jun 2022 09:10:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49774 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234280AbiFZNKZ (ORCPT ); Sun, 26 Jun 2022 09:10:25 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B932BB7C1 for ; Sun, 26 Jun 2022 06:10:23 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id i25so3882988wrc.13 for ; Sun, 26 Jun 2022 06:10:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=7z8VzqJ+VkTtT25Vm8lkEmhKRt12h3iNvlfPrqtDT44=; b=pYJ5sVNn7+yFDv25zMGOAzgsY8HAUOG93KoAZQ/8pcUvgs8eouUAMQHkw/4NWq64dY SpV3sz6F18aperBYyofSqEWUyrUXnahYMOQCYr7PZJrkdax4DWRc+UTKnAIZccgcYWKL wFVQWhbh6ZIJLYyT2EOwiocCup1MCX/OrSRmc1Sqk/XAr5mLDBUuxKulVgOEk8ZZLdrS O0UvwjruqtpaJWKq8+tFNVlGKVbIKnpH+fj4avKkSUBQD/H9+4YCD6I0S0kl0ZAk10fH 2KOitNHMFD9/I+aatAef3JuPUj05ZYBEKKKFTciiX8zZkJwRmAQKcT91dIAtmOYuHXYG 9r+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=7z8VzqJ+VkTtT25Vm8lkEmhKRt12h3iNvlfPrqtDT44=; b=wKSBUlu92spdoDiVsowgQ6L65FrbMMt2dFU2eyeH3aFroP1XK1nBDx8ua4mJvAUvp+ buDI9dLwkvvuesZ5lcn+dwIZc4pZe8hXWbXRxumtXdfoBghRgsBBmqkSzc9gX9PObR0R qUnbtXzzcEB//Ea3vkkLVEffRvVrBwcYP+ia5A8BG7/mTYP4crPM/roxTouQ6K75Wj9h r33N1f9a8QCDJtwCrMBtsSiSpe1D6Hk39L/b3OhzLzw0/mVPlbBZQRIqEdvRSaBbAodj AFZ2xIptARkbpxYqshFHM3vkE5wJ/DAICyOYCmcVNT5wtryE5x5RVtbnHCO7P1OVqnJz pkwQ== X-Gm-Message-State: AJIora+6SKtq/rY0WluD+31Kv5DTMNJqKhtm+enyuLuLiOMtUwLCniA9 CTAEfKmIV2cxO/Dm4YQp+hEO/0DEDuSWFQ== X-Google-Smtp-Source: AGRyM1vUGomFFeVK704SvTab0L8M5fUMbAJY+lCh3qzvfebCjt0cuhs+8zb8FALJ1sWBTAEP/UHMTA== X-Received: by 2002:adf:d084:0:b0:21b:8a7c:d260 with SMTP id y4-20020adfd084000000b0021b8a7cd260mr7725945wrh.68.1656249022952; Sun, 26 Jun 2022 06:10:22 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y21-20020a7bc195000000b0039c362311d2sm16136073wmi.9.2022.06.26.06.10.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:22 -0700 (PDT) Message-Id: <7786dc879f006c8316c33dd70e98888ceb50a014.1656249017.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:14 +0000 Subject: [PATCH v2 3/6] pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty Teach git to provide a way for users to enable/disable bitmap lookup table extension by providing a config option named 'writeBitmapLookupTable'. Default is true. Also add test to verify writting of lookup table. Co-Authored-by: Taylor Blau Signed-off-by: Abhradeep Chakraborty Mentored-by: Taylor Blau Co-Mentored-by: Kaartic Sivaraam --- Documentation/config/pack.txt | 7 +++++++ builtin/multi-pack-index.c | 8 ++++++++ builtin/pack-objects.c | 10 +++++++++- midx.c | 3 +++ midx.h | 1 + pack-bitmap-write.c | 2 ++ t/t5310-pack-bitmaps.sh | 3 ++- t/t5326-multi-pack-bitmaps.sh | 13 +++++++++++++ 8 files changed, 45 insertions(+), 2 deletions(-) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index ad7f73a1ead..6e1f454c4d6 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -164,6 +164,13 @@ When writing a multi-pack reachability bitmap, no new namehashes are computed; instead, any namehashes stored in an existing bitmap are permuted into their appropriate location when writing a new bitmap. +pack.writeBitmapLookupTable:: + When true, git will include a "lookup table" section in the + bitmap index (if one is written). This table is used to defer + loading individual bitmaps as late as possible. This can be + beneficial in repositories which have relatively large bitmap + indexes. Defaults to true. + pack.writeReverseIndex:: When true, git will write a corresponding .rev file (see: link:../technical/pack-format.html[Documentation/technical/pack-format.txt]) diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 5edbb7fe86e..3757616f09c 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -87,6 +87,13 @@ static int git_multi_pack_index_write_config(const char *var, const char *value, opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE; } + if (!strcmp(var, "pack.writebitmaplookuptable")) { + if (git_config_bool(var, value)) + opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; + else + opts.flags &= ~MIDX_WRITE_BITMAP_LOOKUP_TABLE; + } + /* * We should never make a fall-back call to 'git_default_config', since * this was already called in 'cmd_multi_pack_index()'. @@ -123,6 +130,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) }; opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; + opts.flags |= MIDX_WRITE_BITMAP_LOOKUP_TABLE; git_config(git_multi_pack_index_write_config, NULL); diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 39e28cfcafc..d6a33fd486c 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -228,7 +228,7 @@ static enum { WRITE_BITMAP_QUIET, WRITE_BITMAP_TRUE, } write_bitmap_index; -static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE; +static uint16_t write_bitmap_options = BITMAP_OPT_HASH_CACHE | BITMAP_OPT_LOOKUP_TABLE; static int exclude_promisor_objects; @@ -3148,6 +3148,14 @@ static int git_pack_config(const char *k, const char *v, void *cb) else write_bitmap_options &= ~BITMAP_OPT_HASH_CACHE; } + + if (!strcmp(k, "pack.writebitmaplookuptable")) { + if (git_config_bool(k, v)) + write_bitmap_options |= BITMAP_OPT_LOOKUP_TABLE; + else + write_bitmap_options &= ~BITMAP_OPT_LOOKUP_TABLE; + } + if (!strcmp(k, "pack.usebitmaps")) { use_bitmap_index_default = git_config_bool(k, v); return 0; diff --git a/midx.c b/midx.c index 5f0dd386b02..9c26d04bfde 100644 --- a/midx.c +++ b/midx.c @@ -1072,6 +1072,9 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, if (flags & MIDX_WRITE_BITMAP_HASH_CACHE) options |= BITMAP_OPT_HASH_CACHE; + if (flags & MIDX_WRITE_BITMAP_LOOKUP_TABLE) + options |= BITMAP_OPT_LOOKUP_TABLE; + prepare_midx_packing_data(&pdata, ctx); commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx); diff --git a/midx.h b/midx.h index 22e8e53288e..5578cd7b835 100644 --- a/midx.h +++ b/midx.h @@ -47,6 +47,7 @@ struct multi_pack_index { #define MIDX_WRITE_REV_INDEX (1 << 1) #define MIDX_WRITE_BITMAP (1 << 2) #define MIDX_WRITE_BITMAP_HASH_CACHE (1 << 3) +#define MIDX_WRITE_BITMAP_LOOKUP_TABLE (1 << 4) const unsigned char *get_midx_checksum(struct multi_pack_index *m); void get_midx_filename(struct strbuf *out, const char *object_dir); diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 899a4a941e1..79be0cf80e6 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -713,6 +713,7 @@ static void write_lookup_table(struct hashfile *f, for (i = 0; i < writer.selected_nr; i++) table_inv[table[i]] = i; + trace2_region_enter("pack-bitmap-write", "writing_lookup_table", the_repository); for (i = 0; i < writer.selected_nr; i++) { struct bitmapped_commit *selected = &writer.selected[table[i]]; uint32_t xor_offset = selected->xor_offset; @@ -725,6 +726,7 @@ static void write_lookup_table(struct hashfile *f, free(table); free(table_inv); + trace2_region_leave("pack-bitmap-write", "writing_lookup_table", the_repository); } static void write_hash_cache(struct hashfile *f, diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index f775fc1ce69..c669ed959e9 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -38,7 +38,8 @@ test_expect_success 'full repack creates bitmaps' ' ls .git/objects/pack/ | grep bitmap >output && test_line_count = 1 output && grep "\"key\":\"num_selected_commits\",\"value\":\"106\"" trace && - grep "\"key\":\"num_maximal_commits\",\"value\":\"107\"" trace + grep "\"key\":\"num_maximal_commits\",\"value\":\"107\"" trace && + grep "\"label\":\"writing_lookup_table\"" trace ' basic_bitmap_tests diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh index 4fe57414c13..43be49617b8 100755 --- a/t/t5326-multi-pack-bitmaps.sh +++ b/t/t5326-multi-pack-bitmaps.sh @@ -307,4 +307,17 @@ test_expect_success 'graceful fallback when missing reverse index' ' ) ' +test_expect_success 'multi-pack-index write writes lookup table if enabled' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + test_commit base && + git repack -ad && + GIT_TRACE2_EVENT="$(pwd)/trace" \ + git multi-pack-index write --bitmap && + grep "\"label\":\"writing_lookup_table\"" trace + ) +' test_done From patchwork Sun Jun 26 13:10:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 168FDC433EF for ; Sun, 26 Jun 2022 13:10:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234572AbiFZNKi (ORCPT ); Sun, 26 Jun 2022 09:10:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234550AbiFZNKa (ORCPT ); Sun, 26 Jun 2022 09:10:30 -0400 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1152B7C1 for ; Sun, 26 Jun 2022 06:10:25 -0700 (PDT) Received: by mail-wm1-x333.google.com with SMTP id h14-20020a1ccc0e000000b0039eff745c53so4041172wmb.5 for ; Sun, 26 Jun 2022 06:10:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=jcHZa6OxZmATA8PmZA8rXPGZ6qgI9ipXlfsvYD2N1zc=; b=Z2EtUCKtpu0ye+TROZOm6SBDv6P5UjGXY3rnpeCuEBqEY+vi1lflR5MJQEGrmiFLjm 2ADsgJdqq8fg4d8SUc9hal2HfOKmF79BezKZUpkmiKz5wd+xqdPfqEYl/kvgBqDvTSDu ezvNU9FaLbYVGd6dJudsHII9Zf7SQ1IySZcaenHHHywJKjLmGm6Dal7TDpwHE8S8VrKy mXjnPyFO0/C41Qhu/+49T2tgXAIGWYVVbPht+RhUApNtOvp2xBNmZEE1WENTDVvkBpAx zYk6WldYbQfvlt0AF3OVV4+monYFOhPf/7JfCbMGKJCVVT94a9nEM61pf4fq44H6Debt IDlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=jcHZa6OxZmATA8PmZA8rXPGZ6qgI9ipXlfsvYD2N1zc=; b=YwWgwRTfITV1EUH7PDQswtmlU4qsD1EljwIT3h/A5VZaZ4tABgqctlfA4Vf4v6qKEd MZuAmV9cPfVKYdQP5lbAydEGZyG4diY4KAprc3CmTK5N1LnrcUVRkGUedp9vqmZqPUFh L5jBezk/mKYLAXk1DcnV4rG8TiS3ksx4SpR+gfI1G4V94/apvJHYvFicgQUiNsBw5dTX H8D3T5gkwM5ZR+X0XJOeyHEW+J7xsNpX75qpfKZMs9fFAChstlnTlp1N84c7pXF56+yb WOf2pj0FLBi1ND6Hz1i6Kp5KA+LMuWbTKMsTc5Nm9Kvrs4kqzQIAU1G82yztLh5Hxfp1 g1Zw== X-Gm-Message-State: AJIora+EBIFfSKICbZXUVS+BQkhh7a1LgjqwqRngQdJPba+U8q5acikC tsRHLeS0kdNVSukehLtxCPfY9v8ZaRIY5Q== X-Google-Smtp-Source: AGRyM1uT1v9O3iqs1b1YyxLGle55JxAy1H85Yguwi0jwS3q9VgCUUALz0fzBq7YQ8sTXk/1Yypk01A== X-Received: by 2002:a1c:3b07:0:b0:3a0:333d:ae22 with SMTP id i7-20020a1c3b07000000b003a0333dae22mr9446816wma.1.1656249024049; Sun, 26 Jun 2022 06:10:24 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h5-20020a5d4305000000b00210bac248c8sm7695582wrq.11.2022.06.26.06.10.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:23 -0700 (PDT) Message-Id: <4fbfcff8a208798146cd561b0185e094a116cf0e.1656249017.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:15 +0000 Subject: [PATCH v2 4/6] pack-bitmap: prepare to read lookup table extension Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty Earlier change teaches Git to write bitmap lookup table. But Git does not know how to parse them. Teach Git to parse the existing bitmap lookup table. The older versions of git are not affected by it. Those versions ignore the lookup table. Signed-off-by: Abhradeep Chakraborty Mentored-by: Taylor Blau Co-Mentored-by: Kaartic Sivaraam --- pack-bitmap.c | 193 ++++++++++++++++++++++++++++++++-- t/t5310-pack-bitmaps.sh | 7 ++ t/t5326-multi-pack-bitmaps.sh | 1 + 3 files changed, 191 insertions(+), 10 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 36134222d7a..9e09c5824fc 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -82,6 +82,12 @@ struct bitmap_index { /* The checksum of the packfile or MIDX; points into map. */ const unsigned char *checksum; + /* + * If not NULL, this point into the commit table extension + * (within map). + */ + unsigned char *table_lookup; + /* * Extended index. * @@ -185,6 +191,22 @@ static int load_bitmap_header(struct bitmap_index *index) index->hashes = (void *)(index_end - cache_size); index_end -= cache_size; } + + if (flags & BITMAP_OPT_LOOKUP_TABLE && + git_env_bool("GIT_TEST_READ_COMMIT_TABLE", 1)) { + size_t table_size = 0; + size_t triplet_sz = st_add3(sizeof(uint32_t), /* commit position */ + sizeof(uint64_t), /* offset */ + sizeof(uint32_t)); /* xor offset */ + + table_size = st_add(table_size, + st_mult(ntohl(header->entry_count), + triplet_sz)); + if (table_size > index_end - index->map - header_size) + return error("corrupted bitmap index file (too short to fit lookup table)"); + index->table_lookup = (void *)(index_end - table_size); + index_end -= table_size; + } } index->entry_count = ntohl(header->entry_count); @@ -211,12 +233,20 @@ static struct stored_bitmap *store_bitmap(struct bitmap_index *index, hash_pos = kh_put_oid_map(index->bitmaps, stored->oid, &ret); - /* a 0 return code means the insertion succeeded with no changes, - * because the SHA1 already existed on the map. this is bad, there - * shouldn't be duplicated commits in the index */ + /* A 0 return code means the insertion succeeded with no changes, + * because the SHA1 already existed on the map. If lookup table + * is NULL, this is bad, there shouldn't be duplicated commits + * in the index. + * + * If table_lookup exists, that means the desired bitmap is already + * loaded. Either this bitmap has been stored directly or another + * bitmap has a direct or indirect xor relation with it. */ if (ret == 0) { - error("Duplicate entry in bitmap index: %s", oid_to_hex(oid)); - return NULL; + if (!index->table_lookup) { + error("Duplicate entry in bitmap index: %s", oid_to_hex(oid)); + return NULL; + } + return kh_value(index->bitmaps, hash_pos); } kh_value(index->bitmaps, hash_pos) = stored; @@ -470,7 +500,7 @@ static int load_bitmap(struct bitmap_index *bitmap_git) !(bitmap_git->tags = read_bitmap_1(bitmap_git))) goto failed; - if (load_bitmap_entries_v1(bitmap_git) < 0) + if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0) goto failed; return 0; @@ -557,13 +587,145 @@ struct include_data { struct bitmap *seen; }; +static inline const void *bitmap_get_triplet(struct bitmap_index *bitmap_git, uint32_t xor_pos) +{ + size_t triplet_sz = st_add3(sizeof(uint32_t), sizeof(uint64_t), sizeof(uint32_t)); + const void *p = bitmap_git->table_lookup + st_mult(xor_pos, triplet_sz); + return p; +} + +static uint64_t triplet_get_offset(const void *triplet) +{ + const void *p = (unsigned char*) triplet + sizeof(uint32_t); + return get_be64(p); +} + +static uint32_t triplet_get_xor_pos(const void *triplet) +{ + const void *p = (unsigned char*) triplet + st_add(sizeof(uint32_t), sizeof(uint64_t)); + return get_be32(p); +} + +static int triplet_cmp(const void *va, const void *vb) +{ + int result = 0; + uint32_t *a = (uint32_t *) va; + uint32_t b = get_be32(vb); + if (*a > b) + result = 1; + else if (*a < b) + result = -1; + else + result = 0; + + return result; +} + +static uint32_t bsearch_pos(struct bitmap_index *bitmap_git, struct object_id *oid, + uint32_t *result) +{ + int found; + + if (bitmap_git->midx) + found = bsearch_midx(oid, bitmap_git->midx, result); + else + found = bsearch_pack(oid, bitmap_git->pack, result); + + return found; +} + +static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_git, + struct commit *commit) +{ + uint32_t commit_pos, xor_pos; + uint64_t offset; + int flags; + const void *triplet = NULL; + struct object_id *oid = &commit->object.oid; + struct ewah_bitmap *bitmap; + struct stored_bitmap *xor_bitmap = NULL; + size_t triplet_sz = st_add3(sizeof(uint32_t), sizeof(uint64_t), sizeof(uint32_t)); + + int found = bsearch_pos(bitmap_git, oid, &commit_pos); + + if (!found) + return NULL; + + triplet = bsearch(&commit_pos, bitmap_git->table_lookup, bitmap_git->entry_count, + triplet_sz, triplet_cmp); + if (!triplet) + return NULL; + + offset = triplet_get_offset(triplet); + xor_pos = triplet_get_xor_pos(triplet); + + if (xor_pos != 0xffffffff) { + int xor_flags; + uint64_t offset_xor; + uint32_t *xor_positions; + struct object_id xor_oid; + size_t size = 0; + + ALLOC_ARRAY(xor_positions, bitmap_git->entry_count); + while (xor_pos != 0xffffffff) { + xor_positions[size++] = xor_pos; + triplet = bitmap_get_triplet(bitmap_git, xor_pos); + xor_pos = triplet_get_xor_pos(triplet); + } + + while (size){ + xor_pos = xor_positions[size - 1]; + triplet = bitmap_get_triplet(bitmap_git, xor_pos); + commit_pos = get_be32(triplet); + offset_xor = triplet_get_offset(triplet); + + if (nth_bitmap_object_oid(bitmap_git, &xor_oid, commit_pos) < 0) { + free(xor_positions); + return NULL; + } + + bitmap_git->map_pos = offset_xor + sizeof(uint32_t) + sizeof(uint8_t); + xor_flags = read_u8(bitmap_git->map, &bitmap_git->map_pos); + bitmap = read_bitmap_1(bitmap_git); + + if (!bitmap){ + free(xor_positions); + return NULL; + } + + xor_bitmap = store_bitmap(bitmap_git, bitmap, &xor_oid, xor_bitmap, xor_flags); + size--; + } + + free(xor_positions); + } + + bitmap_git->map_pos = offset + sizeof(uint32_t) + sizeof(uint8_t); + flags = read_u8(bitmap_git->map, &bitmap_git->map_pos); + bitmap = read_bitmap_1(bitmap_git); + + if (!bitmap) + return NULL; + + return store_bitmap(bitmap_git, bitmap, oid, xor_bitmap, flags); +} + struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git, struct commit *commit) { khiter_t hash_pos = kh_get_oid_map(bitmap_git->bitmaps, commit->object.oid); - if (hash_pos >= kh_end(bitmap_git->bitmaps)) - return NULL; + if (hash_pos >= kh_end(bitmap_git->bitmaps)) { + struct stored_bitmap *bitmap = NULL; + if (!bitmap_git->table_lookup) + return NULL; + + /* NEEDSWORK: cache misses aren't recorded */ + bitmap = lazy_bitmap_for_commit(bitmap_git, commit); + if(!bitmap) + return NULL; + return lookup_stored_bitmap(bitmap); + } return lookup_stored_bitmap(kh_value(bitmap_git->bitmaps, hash_pos)); } @@ -1699,9 +1861,13 @@ void test_bitmap_walk(struct rev_info *revs) if (revs->pending.nr != 1) die("you must specify exactly one commit to test"); - fprintf(stderr, "Bitmap v%d test (%d entries loaded)\n", + fprintf(stderr, "Bitmap v%d test (%d entries)\n", bitmap_git->version, bitmap_git->entry_count); + if (!bitmap_git->table_lookup) + fprintf(stderr, "Bitmap v%d test (%d entries loaded)\n", + bitmap_git->version, bitmap_git->entry_count); + root = revs->pending.objects[0].item; bm = bitmap_for_commit(bitmap_git, (struct commit *)root); @@ -1753,10 +1919,16 @@ void test_bitmap_walk(struct rev_info *revs) int test_bitmap_commits(struct repository *r) { - struct bitmap_index *bitmap_git = prepare_bitmap_git(r); + struct bitmap_index *bitmap_git = NULL; struct object_id oid; MAYBE_UNUSED void *value; + /* As this function is only used to print bitmap selected + * commits, we don't have to read the commit table. + */ + setenv("GIT_TEST_READ_COMMIT_TABLE", "0", 1); + + bitmap_git = prepare_bitmap_git(r); if (!bitmap_git) die("failed to load bitmap indexes"); @@ -1764,6 +1936,7 @@ int test_bitmap_commits(struct repository *r) printf("%s\n", oid_to_hex(&oid)); }); + setenv("GIT_TEST_READ_COMMIT_TABLE", "1", 1); free_bitmap_index(bitmap_git); return 0; diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index c669ed959e9..10d7691d973 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -42,6 +42,12 @@ test_expect_success 'full repack creates bitmaps' ' grep "\"label\":\"writing_lookup_table\"" trace ' +test_expect_success 'using lookup table loads only necessary bitmaps' ' + git rev-list --test-bitmap HEAD 2>out && + ! grep "Bitmap v1 test (106 entries loaded)" out && + grep "Found bitmap for" out +' + basic_bitmap_tests test_expect_success 'incremental repack fails when bitmaps are requested' ' @@ -255,6 +261,7 @@ test_expect_success 'pack reuse respects --incremental' ' test_expect_success 'truncated bitmap fails gracefully (ewah)' ' test_config pack.writebitmaphashcache false && + test_config pack.writebitmaplookuptable false && git repack -ad && git rev-list --use-bitmap-index --count --all >expect && bitmap=$(ls .git/objects/pack/*.bitmap) && diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh index 43be49617b8..7d36dbcf722 100755 --- a/t/t5326-multi-pack-bitmaps.sh +++ b/t/t5326-multi-pack-bitmaps.sh @@ -320,4 +320,5 @@ test_expect_success 'multi-pack-index write writes lookup table if enabled' ' grep "\"label\":\"writing_lookup_table\"" trace ) ' + test_done From patchwork Sun Jun 26 13:10:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895768 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DEBBC43334 for ; Sun, 26 Jun 2022 13:10:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234560AbiFZNKe (ORCPT ); Sun, 26 Jun 2022 09:10:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49808 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234541AbiFZNK2 (ORCPT ); Sun, 26 Jun 2022 09:10:28 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6CB9CE08 for ; Sun, 26 Jun 2022 06:10:26 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id l126-20020a1c2584000000b0039c1a10507fso4057275wml.1 for ; Sun, 26 Jun 2022 06:10:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=NACJRToHJP1PjjfvpsJUUgREmA7eTJdIN8BGQAEUa10=; b=dCt6I/RkhkTV3iJ5Dqz0oGZwVy6IlTXFe52Vf7C/H51nnMk0GSy6IDkAxnPjhq6bEH BWav9B6sj1tugXMXYkQChjm5ahpCit3uIh70Bo2ok3V0+f+drQMbvOA2SAnpiC1tW2hV HH+G9/JIwpthYjAcARBcQHlVMFK4LVzvqz3B8UogSml4VKgRPp0ZkIlkSuEXXFjbCyS7 iIFmL/WKtGUaNU8fcEs+V77cmadhsh5ZCR9LKAlP9mdFuEOYVPVj065TiJ/7I+blaVqa Nyrj5pU0AbrY254P/5LBiiLCVESHAMXdQbEiTAHczlF2GvIWp+Aw5BtXolNjCXUh6WAL +7Ng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=NACJRToHJP1PjjfvpsJUUgREmA7eTJdIN8BGQAEUa10=; b=ORF3NVJCfmQOFjj5SBtomqsxYhIrySwqnNV79kAlApbS/1f2Vj/FAGjOb/rGZXjqNM chPC3Oome9tlM5eQb3sMxvZGWn/3sP5J8X9lPGVZDranmbHim69TP5Ik8JwtUo9V8XkX 7Ho5D5dpndnRa7+9IrDYjNOK4TFo07Eygl3blX0IhDaLfNd/BJbPPxuBG159ZeUxOZ0i PMMXdYkHrcj1saaCaRkWVpVseo8VyDVp9jU+KtG8WvoOcBsC1FCh3gLPp3bjyjSnJ4Eb ZbFkVOuajwnEfDBuUwjv8ga4zwbUTS0eL0Y2q+8u73j+2tqv7PgDrVOH+IDMQTNYpy4N JSJQ== X-Gm-Message-State: AJIora8/VQCuSfM3IezB8vBWP/afFOr2j2A4KQ7omoUas+f6LY2AiuD1 6Lkn24vCVfHtRfPXFJeYx1h+rRA33GFEMQ== X-Google-Smtp-Source: AGRyM1v1yly+ybs3V5YKmx4mK2yjae5Qio8P4wEx44H96Xh97luZ0lFy/6NJjgiTzcWgikTAj4Uubg== X-Received: by 2002:a05:600c:1d17:b0:3a0:481b:f1e1 with SMTP id l23-20020a05600c1d1700b003a0481bf1e1mr2911622wms.136.1656249025059; Sun, 26 Jun 2022 06:10:25 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f18-20020adfb612000000b002185631adf0sm7738392wre.23.2022.06.26.06.10.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:24 -0700 (PDT) Message-Id: <96c0041688f6139c17611203f98274988ced25ab.1656249018.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:16 +0000 Subject: [PATCH v2 5/6] bitmap-lookup-table: add performance tests for lookup table Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty Add performance tests to verify the performance of lookup table. Lookup table makes Git run faster in most of the cases. Below is the result of `t/perf/p5310-pack-bitmaps.sh`.`perf/p5326-multi-pack-bitmaps.sh` gives similar result. The repository used in the test is linux kernel. Test this tree -------------------------------------------------------------------------- 5310.4: repack to disk (lookup=false) 295.94(250.45+15.24) 5310.5: simulated clone 12.52(5.07+1.40) 5310.6: simulated fetch 1.89(2.94+0.24) 5310.7: pack to file (bitmap) 41.39(20.33+7.20) 5310.8: rev-list (commits) 0.98(0.59+0.12) 5310.9: rev-list (objects) 3.40(3.27+0.10) 5310.10: rev-list with tag negated via --not 0.07(0.02+0.04) --all (objects) 5310.11: rev-list with negative tag (objects) 0.23(0.16+0.06) 5310.12: rev-list count with blob:none 0.26(0.18+0.07) 5310.13: rev-list count with blob:limit=1k 6.45(5.94+0.37) 5310.14: rev-list count with tree:0 0.26(0.18+0.07) 5310.15: simulated partial clone 4.99(3.19+0.45) 5310.19: repack to disk (lookup=true) 269.67(174.70+21.33) 5310.20: simulated clone 11.03(5.07+1.11) 5310.21: simulated fetch 0.79(0.79+0.17) 5310.22: pack to file (bitmap) 43.03(20.28+7.43) 5310.23: rev-list (commits) 0.86(0.54+0.09) 5310.24: rev-list (objects) 3.35(3.26+0.07) 5310.25: rev-list with tag negated via --not 0.05(0.00+0.03) --all (objects) 5310.26: rev-list with negative tag (objects) 0.22(0.16+0.05) 5310.27: rev-list count with blob:none 0.22(0.16+0.05) 5310.28: rev-list count with blob:limit=1k 6.45(5.87+0.31) 5310.29: rev-list count with tree:0 0.22(0.16+0.05) 5310.30: simulated partial clone 5.17(3.12+0.48) Test 4-15 are tested without using lookup table. Same tests are repeated in 16-30 (using lookup table). Signed-off-by: Abhradeep Chakraborty Mentored-by: Taylor Blau Co-Mentored-by: Kaartic Sivaraam --- t/perf/p5310-pack-bitmaps.sh | 77 ++++++++++++++----------- t/perf/p5326-multi-pack-bitmaps.sh | 93 ++++++++++++++++-------------- 2 files changed, 94 insertions(+), 76 deletions(-) diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 7ad4f237bc3..6ff42bdd391 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -16,39 +16,48 @@ test_expect_success 'setup bitmap config' ' git config pack.writebitmaps true ' -# we need to create the tag up front such that it is covered by the repack and -# thus by generated bitmaps. -test_expect_success 'create tags' ' - git tag --message="tag pointing to HEAD" perf-tag HEAD -' - -test_perf 'repack to disk' ' - git repack -ad -' - -test_full_bitmap - -test_expect_success 'create partial bitmap state' ' - # pick a commit to represent the repo tip in the past - cutoff=$(git rev-list HEAD~100 -1) && - orig_tip=$(git rev-parse HEAD) && - - # now kill off all of the refs and pretend we had - # just the one tip - rm -rf .git/logs .git/refs/* .git/packed-refs && - git update-ref HEAD $cutoff && - - # and then repack, which will leave us with a nice - # big bitmap pack of the "old" history, and all of - # the new history will be loose, as if it had been pushed - # up incrementally and exploded via unpack-objects - git repack -Ad && - - # and now restore our original tip, as if the pushes - # had happened - git update-ref HEAD $orig_tip -' - -test_partial_bitmap +test_bitmap () { + local enabled="$1" + + # we need to create the tag up front such that it is covered by the repack and + # thus by generated bitmaps. + test_expect_success 'create tags' ' + git tag --message="tag pointing to HEAD" perf-tag HEAD + ' + + test_expect_success "use lookup table: $enabled" ' + git config pack.writeBitmapLookupTable '"$enabled"' + ' + + test_perf "repack to disk (lookup=$enabled)" ' + git repack -ad + ' + + test_full_bitmap + + test_expect_success "create partial bitmap state (lookup=$enabled)" ' + # pick a commit to represent the repo tip in the past + cutoff=$(git rev-list HEAD~100 -1) && + orig_tip=$(git rev-parse HEAD) && + + # now kill off all of the refs and pretend we had + # just the one tip + rm -rf .git/logs .git/refs/* .git/packed-refs && + git update-ref HEAD $cutoff && + + # and then repack, which will leave us with a nice + # big bitmap pack of the "old" history, and all of + # the new history will be loose, as if it had been pushed + # up incrementally and exploded via unpack-objects + git repack -Ad && + + # and now restore our original tip, as if the pushes + # had happened + git update-ref HEAD $orig_tip + ' +} + +test_bitmap false +test_bitmap true test_done diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh index f2fa228f16a..d67e7437493 100755 --- a/t/perf/p5326-multi-pack-bitmaps.sh +++ b/t/perf/p5326-multi-pack-bitmaps.sh @@ -6,47 +6,56 @@ test_description='Tests performance using midx bitmaps' test_perf_large_repo -# we need to create the tag up front such that it is covered by the repack and -# thus by generated bitmaps. -test_expect_success 'create tags' ' - git tag --message="tag pointing to HEAD" perf-tag HEAD -' - -test_expect_success 'start with bitmapped pack' ' - git repack -adb -' - -test_perf 'setup multi-pack index' ' - git multi-pack-index write --bitmap -' - -test_expect_success 'drop pack bitmap' ' - rm -f .git/objects/pack/pack-*.bitmap -' - -test_full_bitmap - -test_expect_success 'create partial bitmap state' ' - # pick a commit to represent the repo tip in the past - cutoff=$(git rev-list HEAD~100 -1) && - orig_tip=$(git rev-parse HEAD) && - - # now pretend we have just one tip - rm -rf .git/logs .git/refs/* .git/packed-refs && - git update-ref HEAD $cutoff && - - # and then repack, which will leave us with a nice - # big bitmap pack of the "old" history, and all of - # the new history will be loose, as if it had been pushed - # up incrementally and exploded via unpack-objects - git repack -Ad && - git multi-pack-index write --bitmap && - - # and now restore our original tip, as if the pushes - # had happened - git update-ref HEAD $orig_tip -' - -test_partial_bitmap +test_bitmap () { + local enabled="$1" + + # we need to create the tag up front such that it is covered by the repack and + # thus by generated bitmaps. + test_expect_success 'create tags' ' + git tag --message="tag pointing to HEAD" perf-tag HEAD + ' + + test_expect_success "use lookup table: $enabled" ' + git config pack.writeBitmapLookupTable '"$enabled"' + ' + + test_expect_success "start with bitmapped pack (lookup=$enabled)" ' + git repack -adb + ' + + test_perf "setup multi-pack index (lookup=$enabled)" ' + git multi-pack-index write --bitmap + ' + + test_expect_success "drop pack bitmap (lookup=$enabled)" ' + rm -f .git/objects/pack/pack-*.bitmap + ' + + test_full_bitmap + + test_expect_success "create partial bitmap state (lookup=$enabled)" ' + # pick a commit to represent the repo tip in the past + cutoff=$(git rev-list HEAD~100 -1) && + orig_tip=$(git rev-parse HEAD) && + + # now pretend we have just one tip + rm -rf .git/logs .git/refs/* .git/packed-refs && + git update-ref HEAD $cutoff && + + # and then repack, which will leave us with a nice + # big bitmap pack of the "old" history, and all of + # the new history will be loose, as if it had been pushed + # up incrementally and exploded via unpack-objects + git repack -Ad && + git multi-pack-index write --bitmap && + + # and now restore our original tip, as if the pushes + # had happened + git update-ref HEAD $orig_tip + ' +} + +test_bitmap false +test_bitmap true test_done From patchwork Sun Jun 26 13:10:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhradeep Chakraborty X-Patchwork-Id: 12895769 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F4FACCA47C for ; Sun, 26 Jun 2022 13:10:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234545AbiFZNKg (ORCPT ); Sun, 26 Jun 2022 09:10:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49826 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234555AbiFZNK3 (ORCPT ); Sun, 26 Jun 2022 09:10:29 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E83FBDF11 for ; Sun, 26 Jun 2022 06:10:27 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id o16so9420825wra.4 for ; Sun, 26 Jun 2022 06:10:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Ttd+toG8GGhKdiiHSvU7Y18seRuo6ZMnslKhMPaaPCU=; b=PS8gu7UyjWDa+q2nGTF1RxIr542N22oCcUMWuI63ANItBbqNwjo2fU7HmI6WWokVX3 WuplGHzvgXBKABHYXx5/Paki6o049lfMrRX7LtAgj9sdTWDTCMgnKFZI3IswkFHkAobo NxHKMfTKCAMyfZSh9rg3jDkyKONnLMEKmJMGHHaeQF8AYVMnJObZKpxloY3M0oUrQsu3 VnsWopGjheMEizS8fwnsQmAzIRlRjTLlWnECikRhoU2K0FJkQETzyjt2aU1NoVbHlIFl liJ7yPm1oiPEpQDmQB78aTcW/XsfyPUxJ0sq4F8emiFVRlGn3VLRY8VjMghBrInDWMmd JfoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Ttd+toG8GGhKdiiHSvU7Y18seRuo6ZMnslKhMPaaPCU=; b=xA6Ha+RIu76ihRGrrA2e2HfqbDYzWAK+AUeigWgzWs7O8SuBqL+ocYNfZ18EMOzP/f 3Yh+5fRAxllDPoQkpJcNIuwiL0rIiMPNrBgYbeNOexzCLz2WDoOy/fi/P00ntMJM/q71 hSMFmq7HoBxm0UOtThqwErlJ1UQtOxihc7U7sB5que2UjUQVNHjauYuo3jHEcIDgjL3T gzZ2e80RJMTEr6Do2JuyZeZ/2Kt/uLE/oBlsSqGMGgaTQNxwvS6BNNyuOSf0/C1Pr8wW umvcyRZNn4ZIf79t55K5klOQzlJ/7fUZjSR7279H8fl0G7KPiUmqxSD8rm/eaEJsEOpJ q7Fw== X-Gm-Message-State: AJIora8wSWe5hBesdJljNa1fufpyguCGMYrEr2EoKJN7v/yoJKJWU8Mz A/PCehQVODkf7PB4i5TZBbdmH8fkdcGobA== X-Google-Smtp-Source: AGRyM1veQqo75pynb1iTtV/N3PwlD6xAYL9SGa6R1XRdQz88FXLLSAQr4V/1UxEAWsyU0JkRSY52+g== X-Received: by 2002:a5d:6d86:0:b0:21b:ce00:4163 with SMTP id l6-20020a5d6d86000000b0021bce004163mr978385wrs.511.1656249026247; Sun, 26 Jun 2022 06:10:26 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e29-20020a5d595d000000b0021bc663ed67sm2248962wri.56.2022.06.26.06.10.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Jun 2022 06:10:25 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Sun, 26 Jun 2022 13:10:17 +0000 Subject: [PATCH v2 6/6] p5310-pack-bitmaps.sh: enable pack.writeReverseIndex for testing Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Taylor Blau , Kaartic Sivaram , Derrick Stolee , Abhradeep Chakraborty , Abhradeep Chakraborty Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhradeep Chakraborty From: Abhradeep Chakraborty Enable pack.writeReverseIndex to true to see the effect of writing the reverse index in the existing bitmap tests (with and without lookup table). Below is the result of performance test. Output format is in seconds. Test this tree ------------------------------------------------------------------- 5310.4: repack to disk (lookup=false) 294.92(257.60+14.29) 5310.5: simulated clone 14.97(8.95+1.31) 5310.6: simulated fetch 1.64(2.77+0.20) 5310.7: pack to file (bitmap) 41.76(29.33+6.77) 5310.8: rev-list (commits) 0.71(0.49+0.09) 5310.9: rev-list (objects) 4.65(4.55+0.09) 5310.10: rev-list with tag negated via --not 0.08(0.02+0.05) --all (objects) 5310.11: rev-list with negative tag (objects) 0.06(0.01+0.04) 5310.12: rev-list count with blob:none 0.09(0.03+0.05) 5310.13: rev-list count with blob:limit=1k 7.58(7.06+0.33) 5310.14: rev-list count with tree:0 0.09(0.03+0.06) 5310.15: simulated partial clone 8.64(8.04+0.35) 5310.19: repack to disk (lookup=true) 249.86(191.57+19.50) 5310.20: simulated clone 13.67(8.83+1.06) 5310.21: simulated fetch 0.50(0.63+0.13) 5310.22: pack to file (bitmap) 41.24(28.99+6.67) 5310.23: rev-list (commits) 0.67(0.50+0.07) 5310.24: rev-list (objects) 4.88(4.79+0.08) 5310.25: rev-list with tag negated via --not 0.04(0.00+0.03) --all (objects) 5310.26: rev-list with negative tag (objects) 0.05(0.00+0.04) 5310.27: rev-list count with blob:none 0.05(0.01+0.03) 5310.28: rev-list count with blob:limit=1k 8.02(7.16+0.34) 5310.29: rev-list count with tree:0 0.05(0.01+0.04) 5310.30: simulated partial clone 8.57(8.16+0.32) Tests 4-15 are without the use of lookup table. The rests are repeatation of the previous tests but using lookup table. Signed-off-by: Abhradeep Chakraborty Mentored-by: Taylor Blau Co-Mentored-by: Kaartic Sivaraam --- t/perf/p5310-pack-bitmaps.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 6ff42bdd391..9848c5d5040 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -13,7 +13,8 @@ test_perf_large_repo # We intentionally use the deprecated pack.writebitmaps # config so that we can test against older versions of git. test_expect_success 'setup bitmap config' ' - git config pack.writebitmaps true + git config pack.writebitmaps true && + git config pack.writeReverseIndex true ' test_bitmap () {