From patchwork Wed Apr 12 22:20:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E10C9C77B6E for ; Wed, 12 Apr 2023 22:20:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230034AbjDLWUX (ORCPT ); Wed, 12 Apr 2023 18:20:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60732 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229990AbjDLWUV (ORCPT ); Wed, 12 Apr 2023 18:20:21 -0400 Received: from mail-yb1-xb35.google.com (mail-yb1-xb35.google.com [IPv6:2607:f8b0:4864:20::b35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DA334C32 for ; Wed, 12 Apr 2023 15:20:20 -0700 (PDT) Received: by mail-yb1-xb35.google.com with SMTP id j10so3288028ybj.1 for ; Wed, 12 Apr 2023 15:20:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338019; x=1683930019; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=pwy4KBmIx/l8Hxrn2NtHez0hMpYyvVY/Y94tNLdEaZA=; b=qz2FCpknAEl2C+dwlO28iAFgmOhpV3wBTtxurSGzf4up19/aagUkyJZFvp1VFL5b7H soXgQweYesnIZokpzQPxNcmtBcYA6zASFfW+BmeFE7mRrL4ZHa7Bvm9tDlXVhUX44MT4 jF3vQIlSxasKb3o84ONocqldo65bvNg1vqP74J3PWOKjxC0njOCzx3x3ke7wyglGdIFr 9O7/Rl4quytan2ohF+6G/MMbNvjmDFDxh5RST2wgnR44JntLcPBBBqds1MOYMGY3iURz WVzIUoxJISIyxnleXZshTc3knQMdTH0pa7heMZIKpjpuUe/gVcL5Y6zmnCKl20hNX7Aj 44Bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338019; x=1683930019; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=pwy4KBmIx/l8Hxrn2NtHez0hMpYyvVY/Y94tNLdEaZA=; b=Q72vIBGFFOsXVC8xrLLoVha+JNP4MvyiBbgg4PCKNa6rkoG6PGzuYrstAqEf5h8pkz zrXGmyR/LJoE29HH/McTV8qtNfPrQLaPBFU/MWxOb016IMJdJ8SUhAzeu2lDO9RuijpX 0ZAbDaoO4B6NYvSdRhXMFW365inJN/bfauX8OpA63nEb/IWrTb6xtDeiNhf8xTA4Psm2 qAaq+HWhsm6UEV+L8FsRPcAc2/6AvnyYkThQZDC/i5xbO/DMde/NGp9uVUUxGYwW6wXe RIVOrBf3F7lNAdGWCG76UX0Io1JNMtgLc0MFYZxnfXfQGIkm+ePQfMaoShOLX9uEx4rb eHMw== X-Gm-Message-State: AAQBX9dIBnwzZhemXhnX2lzIyFcJG2b3Gm61inanpnVSM2ieYWZ+1Ypu dgDSgrzMW+wQEmCIiVoFn7FhEuNG0012Nir3h7z+ww== X-Google-Smtp-Source: AKy350Y7//F8pyBX/OHnQL74FdlNYTjz7oL0yliGwwWG5cB1vQ+5cwME/uUtQVd26aloEEn4Tlj3/w== X-Received: by 2002:a25:b11e:0:b0:b8e:df9b:b2d4 with SMTP id g30-20020a25b11e000000b00b8edf9bb2d4mr111627ybj.11.1681338019168; Wed, 12 Apr 2023 15:20:19 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id bw26-20020a056902161a00b00b8c08669033sm4134232ybb.40.2023.04.12.15.20.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:18 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:17 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 1/7] pack-write.c: plug a leak in stage_tmp_packfiles() Message-ID: <18be29c3988295cd58521f8cc4a729897df074c8.1681338013.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The function `stage_tmp_packfiles()` generates a filename to use for staging the contents of what will become the pack's ".rev" file. The name is generated in `write_rev_file_order()` (via its caller `write_rev_file()`) in a string buffer, and the result is returned back to `stage_tmp_packfiles()` which uses it to rename the temporary file into place via `rename_tmp_packfiles()`. That name is not visible outside of `stage_tmp_packfiles()`, so it can (and should) be `free()`'d at the end of that function. We can't free it in `rename_tmp_packfile()` since not all of its `source` arguments are unreachable after calling it. Instead, simply free() `rev_tmp_name` at the end of `stage_tmp_packfiles()`. (Note that the same leak exists for `mtimes_tmp_name`, but we do not address it in this commit). Signed-off-by: Taylor Blau --- pack-write.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/pack-write.c b/pack-write.c index f1714054951..f27c1f7f281 100644 --- a/pack-write.c +++ b/pack-write.c @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); if (mtimes_tmp_name) rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes"); + + free((char *)rev_tmp_name); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) From patchwork Wed Apr 12 22:20:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209582 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F11B3C7619A for ; Wed, 12 Apr 2023 22:20:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230111AbjDLWUc (ORCPT ); Wed, 12 Apr 2023 18:20:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230016AbjDLWU0 (ORCPT ); Wed, 12 Apr 2023 18:20:26 -0400 Received: from mail-yw1-x112e.google.com (mail-yw1-x112e.google.com [IPv6:2607:f8b0:4864:20::112e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61F0E658E for ; Wed, 12 Apr 2023 15:20:23 -0700 (PDT) Received: by mail-yw1-x112e.google.com with SMTP id 00721157ae682-54c12009c30so354177847b3.9 for ; Wed, 12 Apr 2023 15:20:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338022; x=1683930022; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Op2dNgfz2LJJmlQ7oVycm1fitJJMjgLuH55iEowcxig=; b=q0NAle4Q5XTmU+VOLFf/Ml7NH/+y+MUu+ZU/ACjyeJ+TDZxcAZ7HPWrsImiugqriRJ VV2lfyNmMbkLwhMO4viHhSRIxyyoc2cy14SvLNpx/q9MRMOcZHV10t5MNU19LDwTKiD4 IkZxPUeDg36Y4dgIOey6uIFrcWW5+VMdpjnvvbqqI9ZiUgUOW4I6jgamzCb7j/qT1b9D 5jfAzA7hgfItmxHa1UoypKBMFyKTTVUgdegZRXmqbN2ZcuSaAFNp7sIO/QBSNzRb84hE UDXNLZ1adOQU/aRqq1lFB2qNILiWFhWVvC1z2n4/8PMLxx6udr1+Q+a+/XgBqMfTyWo3 lwVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338022; x=1683930022; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Op2dNgfz2LJJmlQ7oVycm1fitJJMjgLuH55iEowcxig=; b=IGUk5YpTXtf/Ut8D11wGmukdkZFFzu5u+U1yx9g1AFRzRm55rI1m147ciMQj21ss2m CLI/k0l7hKYt5wI5/v/FFRLRaTZN/HuryEBe7CmHV4ZH+SDJihOvwYp2Yvv9TuH0476E 3kXrpxXoJ5R6Z+kyluVFrRp7dBvcABvrNgu03JIINR6rTvVxEzQOB+fvCWYWuQ5AFr3b jVX3GpywU7CMduF+nApt/UtGDjygjzzBq6R17szodd5XQZcU/2CLdyG5gX31dbniFso3 zmiJzLZYeZRRC5iz29ciG7uOLW6lOI7BLvyaSB8A2LckBZgIh7a+Ot991aKhhAeFSYDg aOHA== X-Gm-Message-State: AAQBX9f5dmaMj1vLno8wIAV//E6YvL8hzwKtruxTzkWSq71CiL8laOpU kdd+KjYBEenWKN5pO4Dycj4vlivQf92PQ0CTiiKTMA== X-Google-Smtp-Source: AKy350aFRipSLOHlfJiHKVJlY4mmHXI587INuYu1dCdN1Z0Zb+igpcKk2TRkRZIfL0LFDtYypf8Now== X-Received: by 2002:a0d:d607:0:b0:54e:d7f3:2be0 with SMTP id y7-20020a0dd607000000b0054ed7f32be0mr147112ywd.12.1681338022461; Wed, 12 Apr 2023 15:20:22 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id f69-20020a81a848000000b00545a08184c9sm13879ywh.89.2023.04.12.15.20.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:22 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:21 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 2/7] t5325: mark as leak-free Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This test is leak-free as of the previous commit, so let's mark it as such to ensure we don't regress and introduce a leak in the future. Signed-off-by: Taylor Blau --- t/t5325-reverse-index.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index d042d26f2b3..48c14d85764 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -1,6 +1,8 @@ #!/bin/sh test_description='on-disk reverse index' + +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh # The below tests want control over the 'pack.writeReverseIndex' setting From patchwork Wed Apr 12 22:20:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F79BC77B6C for ; Wed, 12 Apr 2023 22:20:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230016AbjDLWUj (ORCPT ); Wed, 12 Apr 2023 18:20:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60732 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230114AbjDLWUd (ORCPT ); Wed, 12 Apr 2023 18:20:33 -0400 Received: from mail-yb1-xb30.google.com (mail-yb1-xb30.google.com [IPv6:2607:f8b0:4864:20::b30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A13896A60 for ; Wed, 12 Apr 2023 15:20:26 -0700 (PDT) Received: by mail-yb1-xb30.google.com with SMTP id e10so4443241ybp.4 for ; Wed, 12 Apr 2023 15:20:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338025; x=1683930025; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=/LfGro+h1F09xBK4RMyYjrQBvNLcVwWdie470x5g0AQ=; b=a+0BfR823aze/kPSRwhA9YFIJWXULMt8JSrJDNoQd129OUYedQf9GJLyX6u3B4iliY dpuPKzmEsSrYf9j7R8UzYvEBcpK3QBsH8HF3PxD8EUT56KvbntUNqED+137P2tu52+S7 diVpS0ovmyhbDId8zOIHd4aPoyiJ8+ff6NlbKyqriNNW0QMU2OoYLD8gXu4B+6RfXVI2 J73Z4N0cmx5N14oPtVVCgNPkTmyR29CjcUA9lC/gtvNE3nIbUuc0NMzgmXx3Fjee4OE3 4/+dWsBZw9oY3wOxYXRuhutdXoroyHQ3F13qidp2XrezEsJrLw9hb3ngCIzyRAr5cFnn qgVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338025; x=1683930025; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=/LfGro+h1F09xBK4RMyYjrQBvNLcVwWdie470x5g0AQ=; b=GJyGCeNdUxbIaDtRrjB0m7vkInPXupnbhQiSsjUYOilNWNwGw99DGLzSw5oq9iC6K+ cHMzG2ZtAIiNJqhKBP7lkOZ9wCHy4eGI0yekxixw0203XUY12TKQ9Ue436xDFY/urAs/ bwerXIS78wQD8l28wHMMfwhMtTOusb+1blSpi/E8RmOm+Gp9tRLuGgDrgeP4jjSikK3N MH/Ka5JcUCldX/9lwxfddVBEczk4SuCVAMvLTJb0sI1BWOIrYhqODQNlDQqC1jjgHIVJ l97pdP+QG1M2YzvsdUIYSS9E5UjWs7ddCwVlwGilD1Vg54OSsA06YnFKZ8uv5x+UYRB7 f3iA== X-Gm-Message-State: AAQBX9d+hBEY7SkZ/yk7ktJoniTXliBLVLJXZ0RHQckBV8ctkIu23P6o OfnCEsm9lkgq4FuGenuR3sAmUeBhoMfKjRVyh6Bogw== X-Google-Smtp-Source: AKy350Z8vcQ9+JGV4IEcuf1B/Nq7ckF5Z2Bez1hkK3/53l7vEJXizlt18ZKmy7FFg+pGQNvFYWGzlw== X-Received: by 2002:a25:65d7:0:b0:b8f:545e:3428 with SMTP id z206-20020a2565d7000000b00b8f545e3428mr130752ybb.16.1681338025427; Wed, 12 Apr 2023 15:20:25 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s5-20020a25ba85000000b00b8bf2d8a5c5sm1572879ybg.56.2023.04.12.15.20.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:25 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:24 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 3/7] pack-revindex: make `load_pack_revindex` take a repository Message-ID: <687a9a589249a327251b90f558917aa0f4ae79a4.1681338013.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a future commit, we will introduce a `pack.readReverseIndex` configuration, which forces Git to generate the reverse index from scratch instead of loading it from disk. In order to avoid reading this configuration value more than once, we'll use the `repo_settings` struct to lazily load this value. In order to access the `struct repo_settings`, add a repository argument to `load_pack_revindex`, and update all callers to pass the correct instance (in all cases, `the_repository`). In certain instances, a new function-local variable is introduced to take the place of a `struct repository *` argument to the function itself to avoid propagating the new parameter even further throughout the tree. Co-authored-by: Derrick Stolee Signed-off-by: Derrick Stolee Signed-off-by: Taylor Blau --- pack-bitmap.c | 23 +++++++++++++---------- pack-revindex.c | 4 ++-- pack-revindex.h | 3 ++- packfile.c | 2 +- 4 files changed, 18 insertions(+), 14 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index b2e7d06d604..38b35c48237 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -463,7 +463,7 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return 0; } -static int load_reverse_index(struct bitmap_index *bitmap_git) +static int load_reverse_index(struct repository *r, struct bitmap_index *bitmap_git) { if (bitmap_is_midx(bitmap_git)) { uint32_t i; @@ -477,23 +477,23 @@ static int load_reverse_index(struct bitmap_index *bitmap_git) * since we will need to make use of them in pack-objects. */ for (i = 0; i < bitmap_git->midx->num_packs; i++) { - ret = load_pack_revindex(bitmap_git->midx->packs[i]); + ret = load_pack_revindex(r, bitmap_git->midx->packs[i]); if (ret) return ret; } return 0; } - return load_pack_revindex(bitmap_git->pack); + return load_pack_revindex(r, bitmap_git->pack); } -static int load_bitmap(struct bitmap_index *bitmap_git) +static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git) { assert(bitmap_git->map); bitmap_git->bitmaps = kh_init_oid_map(); bitmap_git->ext_index.positions = kh_init_oid_pos(); - if (load_reverse_index(bitmap_git)) + if (load_reverse_index(r, bitmap_git)) goto failed; if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) || @@ -580,7 +580,7 @@ struct bitmap_index *prepare_bitmap_git(struct repository *r) { struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); - if (!open_bitmap(r, bitmap_git) && !load_bitmap(bitmap_git)) + if (!open_bitmap(r, bitmap_git) && !load_bitmap(r, bitmap_git)) return bitmap_git; free_bitmap_index(bitmap_git); @@ -589,9 +589,10 @@ struct bitmap_index *prepare_bitmap_git(struct repository *r) struct bitmap_index *prepare_midx_bitmap_git(struct multi_pack_index *midx) { + struct repository *r = the_repository; struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); - if (!open_midx_bitmap_1(bitmap_git, midx) && !load_bitmap(bitmap_git)) + if (!open_midx_bitmap_1(bitmap_git, midx) && !load_bitmap(r, bitmap_git)) return bitmap_git; free_bitmap_index(bitmap_git); @@ -1592,7 +1593,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * from disk. this is the point of no return; after this the rev_list * becomes invalidated and we must perform the revwalk through bitmaps */ - if (load_bitmap(bitmap_git) < 0) + if (load_bitmap(revs->repo, bitmap_git) < 0) goto cleanup; object_array_clear(&revs->pending); @@ -1742,6 +1743,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, uint32_t *entries, struct bitmap **reuse_out) { + struct repository *r = the_repository; struct packed_git *pack; struct bitmap *result = bitmap_git->result; struct bitmap *reuse; @@ -1752,7 +1754,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, assert(result); - load_reverse_index(bitmap_git); + load_reverse_index(r, bitmap_git); if (bitmap_is_midx(bitmap_git)) pack = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)]; @@ -2132,11 +2134,12 @@ int rebuild_bitmap(const uint32_t *reposition, uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct packing_data *mapping) { + struct repository *r = the_repository; uint32_t i, num_objects; uint32_t *reposition; if (!bitmap_is_midx(bitmap_git)) - load_reverse_index(bitmap_git); + load_reverse_index(r, bitmap_git); else if (load_midx_revindex(bitmap_git->midx) < 0) BUG("rebuild_existing_bitmaps: missing required rev-cache " "extension"); diff --git a/pack-revindex.c b/pack-revindex.c index 03c7e81f9da..e3d69cc0f7a 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -283,7 +283,7 @@ static int load_pack_revindex_from_disk(struct packed_git *p) return ret; } -int load_pack_revindex(struct packed_git *p) +int load_pack_revindex(struct repository *r, struct packed_git *p) { if (p->revindex || p->revindex_data) return 0; @@ -356,7 +356,7 @@ int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { unsigned lo, hi; - if (load_pack_revindex(p) < 0) + if (load_pack_revindex(the_repository, p) < 0) return -1; lo = 0; diff --git a/pack-revindex.h b/pack-revindex.h index 4974e75eb4d..3d1969ce8b9 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -39,6 +39,7 @@ struct packed_git; struct multi_pack_index; +struct repository; /* * load_pack_revindex populates the revindex's internal data-structures for the @@ -47,7 +48,7 @@ struct multi_pack_index; * If a '.rev' file is present it is mmap'd, and pointers are assigned into it * (instead of using the in-memory variant). */ -int load_pack_revindex(struct packed_git *p); +int load_pack_revindex(struct repository *r, struct packed_git *p); /* * load_midx_revindex loads the '.rev' file corresponding to the given diff --git a/packfile.c b/packfile.c index b120405ccc8..717fd685c97 100644 --- a/packfile.c +++ b/packfile.c @@ -2151,7 +2151,7 @@ int for_each_object_in_pack(struct packed_git *p, int r = 0; if (flags & FOR_EACH_OBJECT_PACK_ORDER) { - if (load_pack_revindex(p)) + if (load_pack_revindex(the_repository, p)) return -1; } From patchwork Wed Apr 12 22:20:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209584 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AFA7C77B6C for ; Wed, 12 Apr 2023 22:20:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230098AbjDLWUo (ORCPT ); Wed, 12 Apr 2023 18:20:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230097AbjDLWUj (ORCPT ); Wed, 12 Apr 2023 18:20:39 -0400 Received: from mail-yw1-x1132.google.com (mail-yw1-x1132.google.com [IPv6:2607:f8b0:4864:20::1132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7069383C2 for ; Wed, 12 Apr 2023 15:20:29 -0700 (PDT) Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-54ee0b73e08so249763567b3.0 for ; Wed, 12 Apr 2023 15:20:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338028; x=1683930028; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=veA3cT8nwAjRstGR+3jsN+Mqq+aC6f8PYllE4p98wu8=; b=bs7mjJhGTJTUSeLj9iGmXDSD9TiBdBolpFq9IxFGBweMxLd/APDFykVF/up7PVFMqB lqraEJwUdeXLiKGiRebM3j+3o8W+wPTmC8ovp/+vAuO4H/6hoTuhUbydyytTNcmAvp9q xBX2ti4GuLCaxibb9xai1DVuKXsrVm6KQFkht7Yx21bG/2P+0WP+FQBOHv9Bf92fgyzS KNeB1rFV0BsGRk1H92rJm8kXs1QxSBhHP/X35UNfT0pyYoCmPqqD6oB+ahQFEkgaqBKI 5ZL5dMnJLEQK+bfvKJkWaLv6mjOz670tb3S3amc/m6KLnkflnvyY4VbagZFDo+aCtdka jKvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338028; x=1683930028; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=veA3cT8nwAjRstGR+3jsN+Mqq+aC6f8PYllE4p98wu8=; b=CJlQb+dCtgGI1uAr+0AhXwbwDMtXdTdBSlZT79KlQINVL+52pyHkY+O7fst84FWb0v GKi2g9XCb4o27h9WROSmosLX4UfTFFPg9Fms3Kb4IevSYklY/cNcrWFiTvOQyI/9uUUx q1P2Ly9Uqcd5cKR1B5Qax8C9jJy7vYeXgt3XqW07kNajQdSZFaL0MctufGuTyMfTfZ1D JSODwg7Q+cj7+CDGuHVDurRkHGk17aRfQi4xbOP9SNN6iib0UQJeXqa0ymLGPIJvT86K H296SsdTUmefABGLTs7i24r95m95nsJy0FII1hECT5SLA0uQUEP+TBxTlZprB3jq/slW 4+pw== X-Gm-Message-State: AAQBX9eAfEULmBI3HNnGj6GfIX1COKXdSA3Y/AOt245pEyCYkfJl25We a4470mdQIQzUKhJImyxLybSLlDbeNr7IXJytWnhgDQ== X-Google-Smtp-Source: AKy350Zq8MB1AguzcFlZjph64ig5N9a73tQzvZy7GdRMGGaLB8Xmm3I9ZE0a3wlVl7gXPIxGubuS+Q== X-Received: by 2002:a81:6a07:0:b0:541:876d:ae50 with SMTP id f7-20020a816a07000000b00541876dae50mr72550ywc.44.1681338028441; Wed, 12 Apr 2023 15:20:28 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id d123-20020a811d81000000b0054f856bdc4dsm23824ywd.38.2023.04.12.15.20.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:28 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:27 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 4/7] pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK Message-ID: <8eec5bacd3a39d4d1586459f4a23422aff903139.1681338013.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In ec8e7760ac (pack-revindex: ensure that on-disk reverse indexes are given precedence, 2021-01-25), we introduced GIT_TEST_REV_INDEX_DIE_IN_MEMORY to abort the process when Git generated a reverse index from scratch. ec8e7760ac was about ensuring that Git prefers a .rev file when available over generating the same information in memory from scratch. In a subsequent patch, we'll introduce `pack.readReverseIndex`, which may be used to disable reading ".rev" files when available. In order to ensure that those files are indeed being ignored, introduce an analogous option to abort the process when Git reads a ".rev" file from disk. Signed-off-by: Taylor Blau --- pack-revindex.c | 3 +++ pack-revindex.h | 1 + 2 files changed, 4 insertions(+) diff --git a/pack-revindex.c b/pack-revindex.c index e3d69cc0f7a..44e1b3fed95 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -205,6 +205,9 @@ static int load_revindex_from_disk(char *revindex_name, size_t revindex_size; struct revindex_header *hdr; + if (git_env_bool(GIT_TEST_REV_INDEX_DIE_ON_DISK, 0)) + die("dying as requested by '%s'", GIT_TEST_REV_INDEX_DIE_ON_DISK); + fd = git_open(revindex_name); if (fd < 0) { diff --git a/pack-revindex.h b/pack-revindex.h index 3d1969ce8b9..ef8afee88b0 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -36,6 +36,7 @@ #define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" #define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" +#define GIT_TEST_REV_INDEX_DIE_ON_DISK "GIT_TEST_REV_INDEX_DIE_ON_DISK" struct packed_git; struct multi_pack_index; From patchwork Wed Apr 12 22:20:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7B98C77B6C for ; Wed, 12 Apr 2023 22:20:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230163AbjDLWUy (ORCPT ); Wed, 12 Apr 2023 18:20:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33074 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230088AbjDLWUm (ORCPT ); Wed, 12 Apr 2023 18:20:42 -0400 Received: from mail-yw1-x112f.google.com (mail-yw1-x112f.google.com [IPv6:2607:f8b0:4864:20::112f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1F3F4C1F for ; Wed, 12 Apr 2023 15:20:32 -0700 (PDT) Received: by mail-yw1-x112f.google.com with SMTP id 00721157ae682-54f64b29207so138893007b3.8 for ; Wed, 12 Apr 2023 15:20:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338032; x=1683930032; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=PZjs8syBbXW9mbGiXsnmClVcRhYNSEc8yV+7PpNIur0=; b=3sChyL6NSX9ECuGX8k7s6CeUgdAGgeb5ZmSUx6O1NYWuXQ3Qrx/nYx9ULC3qBUceZ/ muakWfnL83/n7b6B+QUFoxDZpEdBVp2kH8WBP2f45gpEczDW9KdWE7qfa4uxnnqzKsy5 ebrhP9nk45mczF/9kxrEkXvZTpwf01RuaYKC2+g7brzUdMoCkfNgAabejtIwSibyvOyq yBwt2RQtdzUGbJuZRYw3TKu6cSJv+ACk39NDFblWCCETqAoJfv/fMUhQUIRtwNrgWg3r hEs0XdeUIzorY4Cjca8ewQKew21HxGr/SqyIoaQLkfdnl9E5B50RiYFGKSRtQ8khw9vc LbZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338032; x=1683930032; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PZjs8syBbXW9mbGiXsnmClVcRhYNSEc8yV+7PpNIur0=; b=WmEC5UEtd/JwAczr35q1OXLBILKXY9TXJl/lEhNLm1VYC1x7OKpwUttSFJtAZ3fyU4 967KPoVE61r89LovLtLFolhISwPAoBBCcsv2r8qkFh4vgi+cR6IFdfPdKGcWTTmppfz6 IXdWxn1RiunLd2YZi7mSyrL5YSimos3JZ7ETRK3vzNh7J2nqC7Zssy0l9zMIsRlyTsBt u2x5jVSrnkRMCMPalicR0tf7wz6+zEfPjCxPXw5g0DXS5Ujg9S0MhSz7kbAaDSjWDN/J dyob+e+0N+72924S5MxnvhGNGCGEvnJJJHTwc/11wkLsZyBunOCcvQd/9eeg9hmEUoH4 c2SA== X-Gm-Message-State: AAQBX9de5olhAtiqYcZ6dVTQ4GrGfI+RZhKdffleZBm8EGwxxhY+wgAA uzXl6OfOL22LIwFSx2iQqkD4ctE2hM7dzN5xn+cEBw== X-Google-Smtp-Source: AKy350bGFN7FvOYSEk21Bssj4IpN+66IIzEf0DzN0shR4bzxpXygaz8Gv0qhKFnmHQ1PFFZQbgx1rg== X-Received: by 2002:a0d:e843:0:b0:544:77f1:a0f3 with SMTP id r64-20020a0de843000000b0054477f1a0f3mr67986ywe.37.1681338031692; Wed, 12 Apr 2023 15:20:31 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id ce1-20020a05690c098100b0054fa9e39be0sm18637ywb.56.2023.04.12.15.20.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:31 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:30 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 5/7] pack-revindex: introduce `pack.readReverseIndex` Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Since 1615c567b8 (Documentation/config/pack.txt: advertise 'pack.writeReverseIndex', 2021-01-25), we have had the `pack.writeReverseIndex` configuration option, which tells Git whether or not it is allowed to write a ".rev" file when indexing a pack. Introduce a complementary configuration knob, `pack.readReverseIndex` to control whether or not Git will read any ".rev" file(s) that may be available on disk. This option is useful for debugging, as well as disabling the effect of ".rev" files in certain instances. This is useful because of the trade-off[^1] between the time it takes to generate a reverse index (slow from scratch, fast when reading an existing ".rev" file), and the time it takes to access a record (the opposite). For example, even though it is faster to use the on-disk reverse index when computing the on-disk size of a packed object, it is slower to enumerate the same value for all objects. Here are a couple of examples from linux.git. When computing the above for a single object, using the on-disk reverse index is significantly faster: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" --- Documentation/config/pack.txt | 6 ++++++ pack-revindex.c | 5 ++++- repo-settings.c | 1 + repository.h | 1 + t/t5325-reverse-index.sh | 11 +++++++++++ 5 files changed, 23 insertions(+), 1 deletion(-) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 53093d99969..7db7fed4667 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -171,6 +171,12 @@ pack.writeBitmapLookupTable:: beneficial in repositories that have relatively large bitmap indexes. Defaults to false. +pack.readReverseIndex:: + When true, git will read any .rev file(s) that may be available + (see: linkgit:gitformat-pack[5]). When false, the reverse index + will be generated from scratch and stored in memory. Defaults to + true. + pack.writeReverseIndex:: When true, git will write a corresponding .rev file (see: linkgit:gitformat-pack[5]) diff --git a/pack-revindex.c b/pack-revindex.c index 44e1b3fed95..29f5358b256 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -291,7 +291,10 @@ int load_pack_revindex(struct repository *r, struct packed_git *p) if (p->revindex || p->revindex_data) return 0; - if (!load_pack_revindex_from_disk(p)) + prepare_repo_settings(r); + + if (r->settings.pack_read_reverse_index && + !load_pack_revindex_from_disk(p)) return 0; else if (!create_pack_revindex_in_memory(p)) return 0; diff --git a/repo-settings.c b/repo-settings.c index 0a6c0b381fe..bdd7640ab0b 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -63,6 +63,7 @@ void prepare_repo_settings(struct repository *r) repo_cfg_bool(r, "core.multipackindex", &r->settings.core_multi_pack_index, 1); repo_cfg_bool(r, "index.sparse", &r->settings.sparse_index, 0); repo_cfg_bool(r, "index.skiphash", &r->settings.index_skip_hash, r->settings.index_skip_hash); + repo_cfg_bool(r, "pack.readreverseindex", &r->settings.pack_read_reverse_index, 1); /* * The GIT_TEST_MULTI_PACK_INDEX variable is special in that diff --git a/repository.h b/repository.h index 15a8afc5fb5..ed73e799b61 100644 --- a/repository.h +++ b/repository.h @@ -37,6 +37,7 @@ struct repo_settings { int fetch_write_commit_graph; int command_requires_full_index; int sparse_index; + int pack_read_reverse_index; struct fsmonitor_settings *fsmonitor; /* lazily loaded */ diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 48c14d85764..66171c1d67b 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -96,6 +96,17 @@ test_expect_success 'reverse index is not generated when available on disk' ' --batch-check="%(objectsize:disk)" tip && + GIT_TEST_REV_INDEX_DIE_ON_DISK=1 git cat-file \ + --batch-check="%(objectsize:disk)" X-Patchwork-Id: 13209587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4BBAC77B73 for ; Wed, 12 Apr 2023 22:21:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230032AbjDLWVA (ORCPT ); Wed, 12 Apr 2023 18:21:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229990AbjDLWUp (ORCPT ); Wed, 12 Apr 2023 18:20:45 -0400 Received: from mail-yw1-x1129.google.com (mail-yw1-x1129.google.com [IPv6:2607:f8b0:4864:20::1129]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A5FB8690 for ; Wed, 12 Apr 2023 15:20:36 -0700 (PDT) Received: by mail-yw1-x1129.google.com with SMTP id 00721157ae682-54f6a796bd0so132440797b3.12 for ; Wed, 12 Apr 2023 15:20:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338035; x=1683930035; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=+JdHP8LScD2A/vTVh7knRBPQRib4QCXu2Acp21kkKYY=; b=gajXfhraIOxS7OY9PUrtb3X81WY7kPMg2UTtF6bnxRd/koPoU/hUjcAO56qEOm/LAP XZB2Vl1mmze8LyfaeMc4phyvCEJY8flkMLklkRAp+pb30i6A6jwFVz8DpMHXu+Zi/uxE SnwTvjVlmruU40pohFyPpyjRojFuPcGIJIMGTWbI3Gl3Aelae3pDgs7/ybavfumA5Ze0 dcP5q04VrqxTL29dikiyIgRGAa7eeXvr47nrHfpvClVUiK17Oz3BMtST+4ykUI6PVy6n TVOjbaPy1OBNdFGmqM8HmapPW4hxxAghVW8mA7eqr7oEBigMpqrnnpGSLfxdFtrsXcAn s85w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338035; x=1683930035; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=+JdHP8LScD2A/vTVh7knRBPQRib4QCXu2Acp21kkKYY=; b=ist4EgwkMoCl/N+p9tG34D2RW6T5vjPAdI+jQ1ECnvr4uDcQb4gw0jwrnby1lzxtKE MODQ0xPCSQ+ciGBDsEY7w2UqVvDqZra5uh2ugUSYIQQOadzN5psyK3qBC2ftaCCWlWwG oVKoBpmhPKtjNc12f8JGqTWCcgk4bz5kIihjI+0lhFB1hoGfjHkxe9rx3AlorYN2dTzL mGN2CF+fcdPXqFtcHr3HD4yVVUAqO70AUHh+8RbzsPSZ1Dy7YblkhIGCP/DfSiZRupqX NX2rCEhqx4WHC0E8HkhglceK4szCMACKlNh0kzXGlXxKP8qDrxgn3cXk+K+WxWWHNN7u x6jQ== X-Gm-Message-State: AAQBX9chJanZc9GbBWISxA/eZdieepKBPCyLkyDkG3PkG1EXSRjmGGek izwNrCKLvRdzH+fFBidFuMVTGCLB84rZb57Cb3FfQA== X-Google-Smtp-Source: AKy350bmoyDQf8WHAR2WwFNHcYgNU8Iru6/RqQUHb2xsFbuAhlGLaePjC5edN9X1VyIBvUwhip3vlA== X-Received: by 2002:a0d:d6d0:0:b0:54f:895e:70f7 with SMTP id y199-20020a0dd6d0000000b0054f895e70f7mr151999ywd.9.1681338034835; Wed, 12 Apr 2023 15:20:34 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id by7-20020a05690c082700b00545ef25cec6sm7916ywb.105.2023.04.12.15.20.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:34 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:33 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 6/7] config: enable `pack.writeReverseIndex` by default Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Back in e37d0b8730 (builtin/index-pack.c: write reverse indexes, 2021-01-25), Git learned how to read and write a pack's reverse index from a file instead of in-memory. A pack's reverse index is a mapping from pack position (that is, the order that objects appear together in a ".pack") to their position in lexical order (that is, the order that objects are listed in an ".idx" file). Reverse indexes are consulted often during pack-objects, as well as during auxiliary operations that require mapping between pack offsets, pack order, and index index. They are useful in GitHub's infrastructure, where we have seen a dramatic increase in performance when writing ".rev" files[1]. In particular: - an ~80% reduction in the time it takes to serve fetches on a popular repository, Homebrew/homebrew-core. - a ~60% reduction in the peak memory usage to serve fetches on that same repository. - a collective savings of ~35% in CPU time across all pack-objects invocations serving fetches across all repositories in a single datacenter. Reverse indexes are also beneficial to end-users as well as forges. For example, the time it takes to generate a pack containing the objects for the 10 most recent commits in linux.git (representing a typical push) is significantly faster when on-disk reverse indexes are available: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout /dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout /dev/null Time (mean ± σ): 543.0 ms ± 20.3 ms [User: 616.2 ms, System: 58.8 ms] Range (min … max): 521.0 ms … 577.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout /dev/null Time (mean ± σ): 245.0 ms ± 11.4 ms [User: 335.6 ms, System: 31.3 ms] Range (min … max): 226.0 ms … 259.6 ms 13 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout /dev/null' ran 2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout /dev/null' The same is true of writing a pack containing the objects for the 30 most-recent commits: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout /dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout /dev/null Time (mean ± σ): 866.5 ms ± 16.2 ms [User: 1414.5 ms, System: 97.0 ms] Range (min … max): 839.3 ms … 886.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout /dev/null Time (mean ± σ): 581.6 ms ± 10.2 ms [User: 1181.7 ms, System: 62.6 ms] Range (min … max): 567.5 ms … 599.3 ms 10 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout /dev/null' ran 1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout /dev/null' ...and savings on trivial operations like computing the on-disk size of a single (packed) object are even more dramatic: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" --- Documentation/config/pack.txt | 2 +- builtin/index-pack.c | 1 + builtin/pack-objects.c | 1 + t/perf/p5312-pack-bitmaps-revs.sh | 3 +-- t/t5325-reverse-index.sh | 1 + 5 files changed, 5 insertions(+), 3 deletions(-) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 7db7fed4667..d4c7c9d4e4e 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -182,4 +182,4 @@ pack.writeReverseIndex:: linkgit:gitformat-pack[5]) for each new packfile that it writes in all places except for linkgit:git-fast-import[1] and in the bulk checkin mechanism. - Defaults to false. + Defaults to true. diff --git a/builtin/index-pack.c b/builtin/index-pack.c index b17e79cd40f..323c063f9db 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1753,6 +1753,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) fsck_options.walk = mark_link; reset_pack_idx_option(&opts); + opts.flags |= WRITE_REV; git_config(git_index_pack_config, &opts); if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 77d88f85b04..dbaa04482fd 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4293,6 +4293,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) } reset_pack_idx_option(&pack_idx_opts); + pack_idx_opts.flags |= WRITE_REV; git_config(git_pack_config, NULL); if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) pack_idx_opts.flags |= WRITE_REV; diff --git a/t/perf/p5312-pack-bitmaps-revs.sh b/t/perf/p5312-pack-bitmaps-revs.sh index 0684b690af0..ceec60656b5 100755 --- a/t/perf/p5312-pack-bitmaps-revs.sh +++ b/t/perf/p5312-pack-bitmaps-revs.sh @@ -12,8 +12,7 @@ test_lookup_pack_bitmap () { test_perf_large_repo test_expect_success 'setup bitmap config' ' - git config pack.writebitmaps true && - git config pack.writeReverseIndex true + git config pack.writebitmaps true ' # we need to create the tag up front such that it is covered by the repack and diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 66171c1d67b..149dcf5193b 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -14,6 +14,7 @@ packdir=.git/objects/pack test_expect_success 'setup' ' test_commit base && + test_config pack.writeReverseIndex false && pack=$(git pack-objects --all $packdir/pack) && rev=$packdir/pack-$pack.rev && From patchwork Wed Apr 12 22:20:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13209586 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5243EC7619A for ; Wed, 12 Apr 2023 22:21:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230188AbjDLWU6 (ORCPT ); Wed, 12 Apr 2023 18:20:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229805AbjDLWUq (ORCPT ); Wed, 12 Apr 2023 18:20:46 -0400 Received: from mail-yb1-xb29.google.com (mail-yb1-xb29.google.com [IPv6:2607:f8b0:4864:20::b29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E39F3659E for ; Wed, 12 Apr 2023 15:20:38 -0700 (PDT) Received: by mail-yb1-xb29.google.com with SMTP id n203so1684915ybg.6 for ; Wed, 12 Apr 2023 15:20:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20221208.gappssmtp.com; s=20221208; t=1681338038; x=1683930038; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=pjcohNR2HZuxa6NkSdaeejPxXtnVK+o6h1kxpBF0hQ0=; b=mwSKHbStrzTshrKBCaHStQWBxEmwloCTcjFWX+sd57SvZV606ULBqiCtvqBZprfr8j V955KUiSiAJRc3tMctE9R9wqsuCa+M3kcQBlo849us3UC9D9jERpPST3Q1jQHJZ91w2N UnnEUdaH0zgV8fQpdU9MIejqdfw1MlZO9+h1gcl07WV8rzodPmqsfIYV0FYJBRUgIKj4 4Yb2Y6OaSPL3EFjJkIng3A2XcIhF3cDj9xLRFl3TtZRBb3SID30oWrid8xKnxO4Uc9do O6EwLEWxu+qh9TbjhkrjTsZzE9qfbXVAT6ubWFcd8GhkcSM7ybySgohpTW5R3NhDA4ap eLWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681338038; x=1683930038; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=pjcohNR2HZuxa6NkSdaeejPxXtnVK+o6h1kxpBF0hQ0=; b=FbSInH+9g/WYLihBqWJ16V34w7ulw2QBJxcdKzBjw6h87h2GSE7XQ4shv/o+px0gVx 2ckn7TB3mQXeWEQmCV8IbQfO0pZjI+oD3qPTJpEgMeXOj1cluagUAO670qjUG8xvxyjY KhAxNYv1/0Nf3oMXVEC0krOiUFL9d+vdfRmek5a1vFHUELHJKIeywguiZfe6lfKAWf9N m/Yx1Zdic7mL1OdIyb2Gix/pgZAp//prfV0tTTm/JwJ0DFzpPymrRRwRTyZEe2QjXGly NfPsaLhVP+Ql8Rtyvp/OVyJW+ThHqNCPMb0VlJpiegWndR0ZUWSd1W0IKbYsVt5Zob7X aaDA== X-Gm-Message-State: AAQBX9clUBX3ep6pxsHHDa37eqQWRRowvjqnYfu+qqN84jb8WtPVBdZp LHuuHc9H9frAhvV25uzKk58LfioXrshrDer899qikQ== X-Google-Smtp-Source: AKy350bPtz26w9NJ8sBB23wLlgXDAsQs4/nSZ9eGTLvBxnNxSdKAA9vhQ3TAHXajmKpImayBcQsRfQ== X-Received: by 2002:a25:65c5:0:b0:b8b:d5d7:69cb with SMTP id z188-20020a2565c5000000b00b8bd5d769cbmr32963ybb.47.1681338037941; Wed, 12 Apr 2023 15:20:37 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id j19-20020a25d213000000b00b7767ca7487sm4621713ybg.36.2023.04.12.15.20.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 12 Apr 2023 15:20:37 -0700 (PDT) Date: Wed, 12 Apr 2023 18:20:36 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Derrick Stolee , Jeff King , Junio C Hamano Subject: [PATCH v2 7/7] t: invert `GIT_TEST_WRITE_REV_INDEX` Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Back in e8c58f894b (t: support GIT_TEST_WRITE_REV_INDEX, 2021-01-25), we added a test knob to conditionally enable writing a ".rev" file when indexing a pack. At the time, this was used to ensure that the test suite worked even when ".rev" files were written, which served as a stress-test for the on-disk reverse index implementation. Now that reading from on-disk ".rev" files is enabled by default, the test knob `GIT_TEST_WRITE_REV_INDEX` no longer has any meaning. We could get rid of the option entirely, but there would be no convenient way to test Git when ".rev" files *aren't* in place. Instead of getting rid of the option, invert its meaning to instead disable writing ".rev" files, thereby running the test suite in a mode where the reverse index is generated from scratch. This ensures that, when GIT_TEST_NO_WRITE_REV_INDEX is set to some spelling of "true", we are still running and exercising Git's behavior when forced to generate reverse indexes from scratch. Do so by setting it in the linux-TEST-vars CI run to ensure that we are maintaining good coverage of this now-legacy code. Signed-off-by: Taylor Blau --- builtin/index-pack.c | 4 ++-- builtin/pack-objects.c | 4 ++-- ci/run-build-and-tests.sh | 2 +- pack-revindex.h | 2 +- t/README | 2 +- t/t5325-reverse-index.sh | 2 +- 6 files changed, 8 insertions(+), 8 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 323c063f9db..9e36c985cf1 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1758,8 +1758,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - rev_index = 1; + if (git_env_bool(GIT_TEST_NO_WRITE_REV_INDEX, 0)) + rev_index = 0; else rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index dbaa04482fd..1797871ce90 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4295,8 +4295,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); pack_idx_opts.flags |= WRITE_REV; git_config(git_pack_config, NULL); - if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) - pack_idx_opts.flags |= WRITE_REV; + if (git_env_bool(GIT_TEST_NO_WRITE_REV_INDEX, 0)) + pack_idx_opts.flags &= ~WRITE_REV; progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index b098e10f52a..a18b13a41dd 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -27,7 +27,7 @@ linux-TEST-vars) export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master - export GIT_TEST_WRITE_REV_INDEX=1 + export GIT_TEST_NO_WRITE_REV_INDEX=1 export GIT_TEST_CHECKOUT_WORKERS=2 ;; linux-clang) diff --git a/pack-revindex.h b/pack-revindex.h index ef8afee88b0..46e834064e1 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -34,7 +34,7 @@ #define RIDX_SIGNATURE 0x52494458 /* "RIDX" */ #define RIDX_VERSION 1 -#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" +#define GIT_TEST_NO_WRITE_REV_INDEX "GIT_TEST_NO_WRITE_REV_INDEX" #define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" #define GIT_TEST_REV_INDEX_DIE_ON_DISK "GIT_TEST_REV_INDEX_DIE_ON_DISK" diff --git a/t/README b/t/README index 29576c37488..bdfac4cceb2 100644 --- a/t/README +++ b/t/README @@ -475,7 +475,7 @@ GIT_TEST_DEFAULT_HASH= specifies which hash algorithm to use in the test scripts. Recognized values for are "sha1" and "sha256". -GIT_TEST_WRITE_REV_INDEX=, when true enables the +GIT_TEST_NO_WRITE_REV_INDEX=, when true disables the 'pack.writeReverseIndex' setting. GIT_TEST_SPARSE_INDEX=, when true enables index writes to use the diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 149dcf5193b..0548fce1aa6 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -7,7 +7,7 @@ TEST_PASSES_SANITIZE_LEAK=true # The below tests want control over the 'pack.writeReverseIndex' setting # themselves to assert various combinations of it with other options. -sane_unset GIT_TEST_WRITE_REV_INDEX +sane_unset GIT_TEST_NO_WRITE_REV_INDEX packdir=.git/objects/pack