From patchwork Fri Jan 8 18:19:57 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12007255
Date: Fri, 8 Jan 2021 13:19:57 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, jrnieder@gmail.com
Subject: [PATCH 1/8] packfile: prepare for the existence of '*.rev' files

Specify the format of the on-disk reverse index 'pack-*.rev' file, as well as prepare the code for the existence of such files.
The reverse index maps from pack-relative positions (i.e., an index into the array of objects, which is sorted by their offsets within the packfile) to their positions within the 'pack-*.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted by offset.

This has two major drawbacks:

First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16 bytes.

To observe this, force Git to load the reverse index by, e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory.

Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process.

As an example, it takes ~60x longer to print the _size_ of an object than it does to print that entire object's _contents_:

  Benchmark #1: git.compile cat-file --batch /dev/null
    Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms]
    Range (min … max): 21.4 ms … 23.5 ms 41 runs

  Benchmark #2: git.compile cat-file --batch /dev/null
    Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms]
    Range (min … max): 15.6 ms … 18.2 ms 45 runs

(Numbers taken in the kernel after cheating and using the next patch to generate a reverse index.) A couple of approaches to improving cold-cache performance are not pursued here:

  - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.)

    On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads.

  - We could store the mapping from pack to index position in a more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table).

So, neither of these approaches is taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But cold-cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git often performs many operations on the revindex. The trade-off is worth it: we avoid the vast majority of the cost of generating the revindex, so the extra pointer chase will look like noise in the following patch's benchmarks.

This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin using the writer in the patches that follow.
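
The O(1) lookup in 'pack_pos_to_index()' mentioned above can be illustrated with a minimal sketch. It assumes the layout this patch documents: a 12-byte header (three 4-byte big-endian words) followed by one 4-byte big-endian index position per object, sorted by pack offset. The names 'rev_get_be32' and 'rev_pack_pos_to_index' are hypothetical stand-ins chosen for this illustration, not Git's own helpers, and the caller is assumed to already hold the file contents in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the format description: magic, version, hash identifier. */
#define REV_HEADER_SIZE 12

/* Hypothetical helper: decode a 4-byte big-endian integer. */
static uint32_t rev_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical stand-in for the on-disk branch of pack_pos_to_index().
 * 'rev' points at the start of the .rev file contents, 'num_objects' is
 * the number of objects in the pack, and 'pos' is a pack-relative
 * position. The table is indexed directly, so the lookup is O(1).
 */
static uint32_t rev_pack_pos_to_index(const unsigned char *rev,
				      uint32_t num_objects, uint32_t pos)
{
	assert(pos < num_objects);
	return rev_get_be32(rev + REV_HEADER_SIZE +
			    (size_t)pos * sizeof(uint32_t));
}
```

For a three-object pack whose table holds the values 2, 0, 1, pack position 0 maps to index position 2; that is, the object stored first in the pack is the third entry in the .idx file. Turning that index position into an offset still requires a second lookup in the .idx, which is the extra pointer chase discussed above.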
Signed-off-by: Taylor Blau --- Documentation/technical/pack-format.txt | 17 ++++ builtin/repack.c | 1 + object-store.h | 3 + pack-revindex.c | 112 +++++++++++++++++++++--- packfile.c | 13 ++- packfile.h | 1 + tmp-objdir.c | 4 +- 7 files changed, 139 insertions(+), 12 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index f96b2e605f..9593f8bc68 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -259,6 +259,23 @@ Pack file entry: <+ Index checksum of all of the above. +== pack-*.rev files have the format: + + - A 4-byte magic number '0x52494458' ('RIDX'). + + - A 4-byte version identifier (= 1) + + - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256) + + - A table of index positions, sorted by their corresponding offsets in the + packfile. + + - A trailer, containing a: + + checksum of the corresponding packfile, and + + a checksum of all of the above. + == multi-pack-index (MIDX) files have the following format: The multi-pack-index files refer to multiple pack-files and loose objects. 
diff --git a/builtin/repack.c b/builtin/repack.c index 279be11a16..8d643ddcb9 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,6 +208,7 @@ static struct { } exts[] = { {".pack"}, {".idx"}, + {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, }; diff --git a/object-store.h b/object-store.h index c4fc9dd74e..3fbf11280f 100644 --- a/object-store.h +++ b/object-store.h @@ -85,6 +85,9 @@ struct packed_git { multi_pack_index:1; unsigned char hash[GIT_MAX_RAWSZ]; struct revindex_entry *revindex; + const void *revindex_data; + const void *revindex_map; + size_t revindex_size; /* something like ".git/objects/pack/xxxxx.pack" */ char pack_name[FLEX_ARRAY]; /* more */ }; diff --git a/pack-revindex.c b/pack-revindex.c index 2cd9d632f1..1baaf2c42a 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -164,16 +164,98 @@ static void create_pack_revindex(struct packed_git *p) sort_revindex(p->revindex, num_ent, p->pack_size); } -int load_pack_revindex(struct packed_git *p) +static int load_pack_revindex_from_memory(struct packed_git *p) { - if (!p->revindex) { - if (open_pack_index(p)) - return -1; - create_pack_revindex(p); - } + if (open_pack_index(p)) + return -1; + create_pack_revindex(p); return 0; } +static char *pack_revindex_filename(struct packed_git *p) +{ + size_t len; + if (!strip_suffix(p->pack_name, ".pack", &len)) + BUG("pack_name does not end in .pack"); + return xstrfmt("%.*s.rev", (int)len, p->pack_name); +} + +#define RIDX_MIN_SIZE (12 + (2 * the_hash_algo->rawsz)) + +static int load_revindex_from_disk(char *revindex_name, + uint32_t num_objects, + const void **data, size_t *len) +{ + int fd, ret = 0; + struct stat st; + size_t revindex_size; + + fd = git_open(revindex_name); + + if (fd < 0) { + ret = -1; + goto cleanup; + } + if (fstat(fd, &st)) { + ret = error_errno(_("failed to read %s"), revindex_name); + goto cleanup; + } + + revindex_size = xsize_t(st.st_size); + + if (revindex_size < RIDX_MIN_SIZE) { + ret = error(_("reverse-index file %s is too 
small"), revindex_name); + goto cleanup; + } + + if (revindex_size - RIDX_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) { + ret = error(_("reverse-index file %s is corrupt"), revindex_name); + goto cleanup; + } + + *len = revindex_size; + *data = xmmap(NULL, revindex_size, PROT_READ, MAP_PRIVATE, fd, 0); + +cleanup: + close(fd); + return ret; +} + +static int load_pack_revindex_from_disk(struct packed_git *p) +{ + char *revindex_name; + int ret; + if (open_pack_index(p)) + return -1; + + revindex_name = pack_revindex_filename(p); + + ret = load_revindex_from_disk(revindex_name, + p->num_objects, + &p->revindex_map, + &p->revindex_size); + if (ret) + goto cleanup; + + p->revindex_data = (char *)p->revindex_map + 12; + +cleanup: + free(revindex_name); + return ret; +} + +int load_pack_revindex(struct packed_git *p) +{ + if (p->revindex || p->revindex_data) + return 0; + + if (!load_pack_revindex_from_disk(p)) + return 0; + else if (!load_pack_revindex_from_memory(p)) + return 0; + return -1; +} + int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { int lo = 0; @@ -203,18 +285,28 @@ int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (pos >= p->num_objects) BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].nr; + + if (p->revindex) + return p->revindex[pos].nr; + else + return get_be32((char *)p->revindex_data + (pos * sizeof(uint32_t))); } off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (pos > p->num_objects) BUG("pack_pos_to_offset: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].offset; + + if (p->revindex) + return p->revindex[pos].offset; + else 
if (pos == p->num_objects) + return p->pack_size - the_hash_algo->rawsz; + else + return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } diff --git a/packfile.c b/packfile.c index 46c9c7ea3c..e636e5ca17 100644 --- a/packfile.c +++ b/packfile.c @@ -324,11 +324,21 @@ void close_pack_index(struct packed_git *p) } } +void close_pack_revindex(struct packed_git *p) { + if (!p->revindex_map) + return; + + munmap((void *)p->revindex_map, p->revindex_size); + p->revindex_map = NULL; + p->revindex_data = NULL; +} + void close_pack(struct packed_git *p) { close_pack_windows(p); close_pack_fd(p); close_pack_index(p); + close_pack_revindex(p); } void close_object_store(struct raw_object_store *o) @@ -351,7 +361,7 @@ void close_object_store(struct raw_object_store *o) void unlink_pack_path(const char *pack_name, int force_delete) { - static const char *exts[] = {".pack", ".idx", ".keep", ".bitmap", ".promisor"}; + static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"}; int i; struct strbuf buf = STRBUF_INIT; size_t plen; @@ -853,6 +863,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (ends_with(file_name, ".idx") || + ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || ends_with(file_name, ".bitmap") || ends_with(file_name, ".keep") || diff --git a/packfile.h b/packfile.h index a58fc738e0..4cfec9e8d3 100644 --- a/packfile.h +++ b/packfile.h @@ -90,6 +90,7 @@ uint32_t get_pack_fanout(struct packed_git *p, uint32_t value); unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); void close_pack_windows(struct packed_git *); +void close_pack_revindex(struct packed_git *); void close_pack(struct packed_git *); void close_object_store(struct raw_object_store *o); void unuse_pack(struct pack_window **); diff --git a/tmp-objdir.c b/tmp-objdir.c index 42ed4db5d3..da414df14f 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c 
@@ -187,7 +187,9 @@ static int pack_copy_priority(const char *name) return 2; if (ends_with(name, ".idx")) return 3; - return 4; + if (ends_with(name, ".rev")) + return 4; + return 5; } static int pack_copy_cmp(const char *a, const char *b)

From patchwork Fri Jan 8 18:20:02 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12007249
Date: Fri, 8 Jan 2021 13:20:02 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, jrnieder@gmail.com
Subject: [PATCH 2/8] pack-write.c: prepare to write 'pack-*.rev' files
Message-ID: <88393e266292b022fba3ff41f6623486cb41f66d.1610129989.git.me@ttaylorr.com>

This
patch prepares for callers to be able to write reverse index files to disk. It adds the necessary machinery to write a format-compliant .rev file from within 'write_rev_file()', which is called from 'finish_tmp_packfile()'. Signed-off-by: Taylor Blau --- pack-write.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++- pack.h | 4 ++ 2 files changed, 126 insertions(+), 1 deletion(-) diff --git a/pack-write.c b/pack-write.c index 3513665e1e..68db5a9edf 100644 --- a/pack-write.c +++ b/pack-write.c @@ -166,6 +166,116 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec return index_name; } +static int pack_order_cmp(const void *va, const void *vb, void *ctx) +{ + struct pack_idx_entry **objects = ctx; + + off_t oa = objects[*(uint32_t*)va]->offset; + off_t ob = objects[*(uint32_t*)vb]->offset; + + if (oa < ob) + return -1; + if (oa > ob) + return 1; + return 0; +} + +#define RIDX_SIGNATURE 0x52494458 /* "RIDX" */ +#define RIDX_VERSION 1 + +static void write_rev_header(struct hashfile *f) +{ + uint32_t oid_version; + switch (hash_algo_by_ptr(the_hash_algo)) { + case GIT_HASH_SHA1: + oid_version = 1; + break; + case GIT_HASH_SHA256: + oid_version = 2; + break; + default: + die("write_rev_header: unknown hash version"); + } + + hashwrite_be32(f, RIDX_SIGNATURE); + hashwrite_be32(f, RIDX_VERSION); + hashwrite_be32(f, oid_version); +} + +static void write_rev_index_positions(struct hashfile *f, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t *pack_order; + uint32_t i; + + ALLOC_ARRAY(pack_order, nr_objects); + for (i = 0; i < nr_objects; i++) + pack_order[i] = i; + QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); + + for (i = 0; i < nr_objects; i++) + hashwrite_be32(f, pack_order[i]); + + free(pack_order); +} + +static void write_rev_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +const char *write_rev_file(const char *rev_name, + struct 
pack_idx_entry **objects, + uint32_t nr_objects, + const unsigned char *hash, + unsigned flags) +{ + struct hashfile *f; + int fd; + + if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY)) + die(_("cannot both write and verify reverse index")); + + if (flags & WRITE_REV) { + if (!rev_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX"); + rev_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(rev_name); + fd = open(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + if (fd < 0) + die_errno("unable to create '%s'", rev_name); + } + f = hashfd(fd, rev_name); + } else if (flags & WRITE_REV_VERIFY) { + struct stat statbuf; + if (stat(rev_name, &statbuf)) { + if (errno == ENOENT) { + /* .rev files are optional */ + return NULL; + } else + die_errno(_("could not stat: %s"), rev_name); + } + f = hashfd_check(rev_name); + } else + return NULL; + + write_rev_header(f); + + write_rev_index_positions(f, objects, nr_objects); + write_rev_trailer(f, hash); + + if (rev_name && adjust_shared_perm(rev_name) < 0) + die(_("failed to make %s readable"), rev_name); + + finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | + ((flags & WRITE_IDX_VERIFY) ? 
0 : CSUM_FSYNC)); + + return rev_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -341,7 +451,7 @@ void finish_tmp_packfile(struct strbuf *name_buffer, struct pack_idx_option *pack_idx_opts, unsigned char hash[]) { - const char *idx_tmp_name; + const char *idx_tmp_name, *rev_tmp_name = NULL; int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) @@ -352,6 +462,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, if (adjust_shared_perm(idx_tmp_name)) die_errno("unable to make temporary index file readable"); + rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, + pack_idx_opts->flags); + strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); if (rename(pack_tmp_name, name_buffer->buf)) @@ -365,5 +478,13 @@ void finish_tmp_packfile(struct strbuf *name_buffer, strbuf_setlen(name_buffer, basename_len); + if (rev_tmp_name) { + strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); + if (rename(rev_tmp_name, name_buffer->buf)) + die_errno("unable to rename temporary reverse-index file"); + } + + strbuf_setlen(name_buffer, basename_len); + free((void *)idx_tmp_name); } diff --git a/pack.h b/pack.h index 9fc0945ac9..30439e0784 100644 --- a/pack.h +++ b/pack.h @@ -42,6 +42,8 @@ struct pack_idx_option { /* flag bits */ #define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */ #define WRITE_IDX_STRICT 02 +#define WRITE_REV 04 +#define WRITE_REV_VERIFY 010 uint32_t version; uint32_t off32_limit; @@ -87,6 +89,8 @@ off_t write_pack_header(struct hashfile *f, uint32_t); void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t); char *index_pack_lockfile(int fd); +const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags); + /* * The "hdr" output buffer should be at least this big, which will handle sizes * up to 2^67. 
From patchwork Fri Jan 8 18:20:07 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12007253
Date: Fri, 8 Jan 2021 13:20:07 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, jrnieder@gmail.com
Subject: [PATCH 3/8] builtin/index-pack.c: write reverse indexes

Teach 'git index-pack' to optionally write and verify reverse index with '--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex' configuration option.
Signed-off-by: Taylor Blau --- Documentation/git-index-pack.txt | 20 ++++++--- builtin/index-pack.c | 64 ++++++++++++++++++++++++----- t/t5325-reverse-index.sh | 69 ++++++++++++++++++++++++++++++++ 3 files changed, 137 insertions(+), 16 deletions(-) create mode 100755 t/t5325-reverse-index.sh diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index af0c26232c..b65f380269 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive SYNOPSIS -------- [verse] -'git index-pack' [-v] [-o ] +'git index-pack' [-v] [-o ] [--[no-]rev-index] 'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o ] - [] + [--[no-]rev-index] [] DESCRIPTION ----------- Reads a packed archive (.pack) from the specified file, and -builds a pack index file (.idx) for it. The packed archive -together with the pack index can then be placed in the -objects/pack/ directory of a Git repository. +builds a pack index file (.idx) for it. Optionally writes a +reverse-index (.rev) for the specified pack. The packed +archive together with the pack index can then be placed in +the objects/pack/ directory of a Git repository. OPTIONS @@ -33,7 +34,14 @@ OPTIONS file is constructed from the name of packed archive file by replacing .pack with .idx (and the program fails if the name of packed archive does not end - with .pack). + with .pack). Incompatible with `--rev-index`. + +--[no-]rev-index:: + When this flag is provided, generate a reverse index + (a `.rev` file) corresponding to the given pack. If + `--verify` is given, ensure that the existing + reverse index is correct. Takes precedence over + `pack.writeReverseIndex`. 
--stdin:: When this flag is provided, the pack is read from stdin diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 4b8d86e0ad..03408250b1 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -17,7 +17,7 @@ #include "promisor-remote.h" static const char index_pack_usage[] = -"git index-pack [-v] [-o ] [--keep | --keep=] [--verify] [--strict] ( | --stdin [--fix-thin] [])"; +"git index-pack [-v] [-o ] [--keep | --keep=] [--[no-]rev-index] [--verify] [--strict] ( | --stdin [--fix-thin] [])"; struct object_entry { struct pack_idx_entry idx; @@ -1436,13 +1436,13 @@ static void fix_unresolved_deltas(struct hashfile *f) free(sorted_by_pos); } -static const char *derive_filename(const char *pack_name, const char *suffix, - struct strbuf *buf) +static const char *derive_filename(const char *pack_name, const char *strip, + const char *suffix, struct strbuf *buf) { size_t len; - if (!strip_suffix(pack_name, ".pack", &len)) - die(_("packfile name '%s' does not end with '.pack'"), - pack_name); + if (!strip_suffix(pack_name, strip, &len)) + die(_("packfile name '%s' does not end with '%s'"), + pack_name, strip); strbuf_add(buf, pack_name, len); strbuf_addch(buf, '.'); strbuf_addstr(buf, suffix); @@ -1459,7 +1459,7 @@ static void write_special_file(const char *suffix, const char *msg, int msg_len = strlen(msg); if (pack_name) - filename = derive_filename(pack_name, suffix, &name_buf); + filename = derive_filename(pack_name, ".pack", suffix, &name_buf); else filename = odb_pack_name(&name_buf, hash, suffix); @@ -1484,12 +1484,14 @@ static void write_special_file(const char *suffix, const char *msg, static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, + const char *final_rev_index_name, const char *curr_rev_index_name, const char *keep_msg, const char *promisor_msg, unsigned char *hash) { const char *report = "pack"; struct strbuf pack_name = STRBUF_INIT; struct strbuf 
index_name = STRBUF_INIT; + struct strbuf rev_index_name = STRBUF_INIT; int err; if (!from_stdin) { @@ -1524,6 +1526,16 @@ static void final(const char *final_pack_name, const char *curr_pack_name, } else chmod(final_index_name, 0444); + if (curr_rev_index_name) { + if (final_rev_index_name != curr_rev_index_name) { + if (!final_rev_index_name) + final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev"); + if (finalize_object_file(curr_rev_index_name, final_rev_index_name)) + die(_("cannot store reverse index file")); + } else + chmod(final_rev_index_name, 0444); + } + if (do_fsck_object) { struct packed_git *p; p = add_packed_git(final_index_name, strlen(final_index_name), 0); @@ -1553,6 +1565,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name, } } + strbuf_release(&rev_index_name); strbuf_release(&index_name); strbuf_release(&pack_name); } @@ -1578,6 +1591,12 @@ static int git_index_pack_config(const char *k, const char *v, void *cb) } return 0; } + if (!strcmp(k, "pack.writereverseindex")) { + if (git_config_bool(k, v)) + opts->flags |= WRITE_REV; + else + opts->flags &= ~WRITE_REV; + } return git_default_config(k, v, cb); } @@ -1695,12 +1714,14 @@ static void show_pack_info(int stat_only) int cmd_index_pack(int argc, const char **argv, const char *prefix) { - int i, fix_thin_pack = 0, verify = 0, stat_only = 0; + int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; const char *curr_index; - const char *index_name = NULL, *pack_name = NULL; + const char *curr_rev_index = NULL; + const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL; const char *keep_msg = NULL; const char *promisor_msg = NULL; struct strbuf index_name_buf = STRBUF_INIT; + struct strbuf rev_index_name_buf = STRBUF_INIT; struct pack_idx_entry **idx_objects; struct pack_idx_option opts; unsigned char pack_hash[GIT_MAX_RAWSZ]; @@ -1727,6 +1748,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && 
chdir(prefix)) die(_("Cannot come back to cwd")); + rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + for (i = 1; i < argc; i++) { const char *arg = argv[i]; @@ -1805,6 +1828,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (hash_algo == GIT_HASH_UNKNOWN) die(_("unknown hash algorithm '%s'"), arg); repo_set_hash_algo(the_repository, hash_algo); + } else if (!strcmp(arg, "--rev-index")) { + rev_index = 1; + } else if (!strcmp(arg, "--no-rev-index")) { + rev_index = 0; } else usage(index_pack_usage); continue; @@ -1824,7 +1851,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (from_stdin && hash_algo) die(_("--object-format cannot be used with --stdin")); if (!index_name && pack_name) - index_name = derive_filename(pack_name, "idx", &index_name_buf); + index_name = derive_filename(pack_name, ".pack", "idx", &index_name_buf); + + opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY); + if (rev_index) { + opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV; + if (index_name) + rev_index_name = derive_filename(index_name, + ".idx", "rev", + &rev_index_name_buf); + } if (verify) { if (!index_name) @@ -1878,11 +1914,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) for (i = 0; i < nr_objects; i++) idx_objects[i] = &objects[i].idx; curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash); + if (rev_index) + curr_rev_index = write_rev_file(rev_index_name, idx_objects, + nr_objects, pack_hash, + opts.flags); free(idx_objects); if (!verify) final(pack_name, curr_pack, index_name, curr_index, + rev_index_name, curr_rev_index, keep_msg, promisor_msg, pack_hash); else @@ -1893,10 +1934,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) free(objects); strbuf_release(&index_name_buf); + strbuf_release(&rev_index_name_buf); if (pack_name == NULL) free((void *) curr_pack); if (index_name == NULL) free((void *) curr_index); + if (rev_index_name == NULL) 
+ free((void *) curr_rev_index); /* * Let the caller know this pack is not self contained diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh new file mode 100755 index 0000000000..f9afa698dc --- /dev/null +++ b/t/t5325-reverse-index.sh @@ -0,0 +1,69 @@ +#!/bin/sh + +test_description='on-disk reverse index' +. ./test-lib.sh + +packdir=.git/objects/pack + +test_expect_success 'setup' ' + test_commit base && + + pack=$(git pack-objects --all $packdir/pack) && + rev=$packdir/pack-$pack.rev && + + test_path_is_missing $rev +' + +test_index_pack () { + rm -f $rev && + conf=$1 && + shift && + git -c pack.writeReverseIndex=$conf index-pack "$@" \ + $packdir/pack-$pack.pack +} + +test_expect_success 'index-pack with pack.writeReverseIndex' ' + test_index_pack "" && + test_path_is_missing $rev && + + test_index_pack false && + test_path_is_missing $rev && + + test_index_pack true && + test_path_is_file $rev +' + +test_expect_success 'index-pack with --[no-]rev-index' ' + for conf in "" true false + do + test_index_pack "$conf" --rev-index && + test_path_exists $rev && + + test_index_pack "$conf" --no-rev-index && + test_path_is_missing $rev + done +' + +test_expect_success 'index-pack can verify reverse indexes' ' + test_when_finished "rm -f $rev" && + test_index_pack true && + + test_path_is_file $rev && + git index-pack --rev-index --verify $packdir/pack-$pack.pack && + + # Intentionally corrupt the reverse index. 
+ chmod u+w $rev && + printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc && + + test_must_fail git index-pack --rev-index --verify \ + $packdir/pack-$pack.pack 2>err && + grep "validation error" err +' + +test_expect_success 'index-pack infers reverse index name with -o' ' + git index-pack --rev-index -o other.idx $packdir/pack-$pack.pack && + test_path_is_file other.idx && + test_path_is_file other.rev +' + +test_done From patchwork Fri Jan 8 18:20:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12007257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 072A2C433DB for ; Fri, 8 Jan 2021 18:21:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B83E823A7D for ; Fri, 8 Jan 2021 18:21:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728797AbhAHSVC (ORCPT ); Fri, 8 Jan 2021 13:21:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34478 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728505AbhAHSVB (ORCPT ); Fri, 8 Jan 2021 13:21:01 -0500 Received: from mail-qv1-xf32.google.com (mail-qv1-xf32.google.com [IPv6:2607:f8b0:4864:20::f32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1618C0612FE for ; Fri, 8 Jan 2021 10:20:15 -0800 (PST) Received: by mail-qv1-xf32.google.com with SMTP id et9so4703810qvb.10 for ; Fri, 08 Jan 2021 10:20:15 -0800 (PST) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=C4E7i9lQf/Y1JGFO8SDD0QiIPuB2ZBV9sZCrcfvUxxM=; b=NfFBwB4O6/okA/QVKccIih1zdBgf5OuCwJpS4DnRGqHetoElkwPd6Pueq5y6Mry1wh hDuE1ODXluL0q7L3fFXrgUYy48pMPFfOaVe2KPMlmOZIapd1/IpMcqNrQeRk0dH3BoQ4 irOspkHgxW9XplWcwmF/ZpL1W+Bwvq1l5hY8nQkOLvNyFE4+FV7I94Zpzb32bzdd8CzR ks/gbKDvHTa/65+rwX33S6L6LbeAbrRE6DR1r0eyp94/sLEp3IYuxG/rMQ0JbfKy8aW1 kgmJSI4RiEj6FvmcFktCMkuqga6/HcZrB3WBqvvVfCBwV8IkQFsm8jFhd+ogMM8wI3/W pXnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=C4E7i9lQf/Y1JGFO8SDD0QiIPuB2ZBV9sZCrcfvUxxM=; b=EbcSNge/fEcL2CB8i0WYNUupZIzoUxUUGYGClHPMPb0wwbj6lWNkNxnqZsZ9cfvWUA ziRU1KRcbBzVeJVuNyUfmmm4jpyWW3wxlcdE36eVyt3Zj8tgHw2idRci2XQbbSRvfvhd g4zKcqJAJfpgB1yOXdUJVxPXzJ0GipH+Wf0rqlCSKm1onElR6UfOJlNXDuOWpUsN0RPO ExqjL0o/9ZnOwGDEtgzIuNqkyry1N8tBNQ9hDWxF0jGvY5vCO1KEWCQSSp/2Wi76i4FP 2sUzWGRVOAg8KBbfIHhaS1UiVx2m2pjR73fFIW5jOAQ/ywOHW6WFYKSotrVVYqZxmRAC EYcA== X-Gm-Message-State: AOAM533CfrNElKzZdtMB5G6HT4Ea+289KkT2Z1MxtQYTerNsKMndwKUB BBwjpw3Y4u0PQPo46iXWi9LBSRxdkU2ZZQ== X-Google-Smtp-Source: ABdhPJzIghdJQZ08PANbjK6xljVkUD/kHRwwjd0s15ReJNp5bB85gD/ZDdSGZRQtl+CfOzABhaM9Zw== X-Received: by 2002:ad4:4643:: with SMTP id y3mr4816619qvv.3.1610130014279; Fri, 08 Jan 2021 10:20:14 -0800 (PST) Received: from localhost ([8.9.92.205]) by smtp.gmail.com with ESMTPSA id p75sm5290221qka.72.2021.01.08.10.20.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Jan 2021 10:20:13 -0800 (PST) Date: Fri, 8 Jan 2021 13:20:11 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, jrnieder@gmail.com Subject: [PATCH 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Message-ID: References: MIME-Version: 1.0 
Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that we have an implementation that can write the new reverse index format, enable writing a .rev file in 'git pack-objects' by consulting the pack.writeReverseIndex configuration variable. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 7 +++++++ t/t5325-reverse-index.sh | 13 +++++++++++++ 2 files changed, 20 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index a193cdaf2f..80adce154a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2951,6 +2951,13 @@ static int git_pack_config(const char *k, const char *v, void *cb) pack_idx_opts.version); return 0; } + if (!strcmp(k, "pack.writereverseindex")) { + if (git_config_bool(k, v)) + pack_idx_opts.flags |= WRITE_REV; + else + pack_idx_opts.flags &= ~WRITE_REV; + return 0; + } if (!strcmp(k, "uploadpack.blobpackfileuri")) { struct configured_exclusion *ex = xmalloc(sizeof(*ex)); const char *oid_end, *pack_end; diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index f9afa698dc..431699ade2 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -66,4 +66,17 @@ test_expect_success 'index-pack infers reverse index name with -o' ' test_path_is_file other.rev ' +test_expect_success 'pack-objects respects pack.writeReverseIndex' ' + test_when_finished "rm -fr pack-1-*" && + + git -c pack.writeReverseIndex= pack-objects --all pack-1 && + test_path_is_missing pack-1-*.rev && + + git -c pack.writeReverseIndex=false pack-objects --all pack-1 && + test_path_is_missing pack-1-*.rev && + + git -c pack.writeReverseIndex=true pack-objects --all pack-1 && + test_path_is_file pack-1-*.rev +' + test_done From patchwork Fri Jan 8 18:20:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12007263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDBDFC43381 for ; Fri, 8 Jan 2021 18:21:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9163323A7C for ; Fri, 8 Jan 2021 18:21:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728792AbhAHSVA (ORCPT ); Fri, 8 Jan 2021 13:21:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728505AbhAHSU7 (ORCPT ); Fri, 8 Jan 2021 13:20:59 -0500 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8F56C06129C for ; Fri, 8 Jan 2021 10:20:19 -0800 (PST) Received: by mail-qk1-x731.google.com with SMTP id h4so9288235qkk.4 for ; Fri, 08 Jan 2021 10:20:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=qlHVhEYX5cqJqldP2f83BYJJb/dO87xV8oTauLhAP6A=; b=ZcF//ZZA5fjFktTgf0tnnAAAakvIJ651RdZoOf84lgUSAnSS4O2GIs5fmDkJFBiRL8 tniL8oUNAmOGDUlZDYVac7W7VecEgrluoIW8s3exBXApnoQN6+fPo0FfjembM5xKQK7e gpg9UATxaNvti9OrGqelqBGN3Pe6bbNxlJGk6nlF3BfvWzSSWUJ/rxne9YltoYyF2v4h mYvUZI1akxHGKSwE9BC32gDFDoXSLBfmo0p0evtC/etvf9EEr4vD7vCDGUJu5beIdbxa EjM1mwelNnhWSSPqKFnicot1dC3o1bvXkietkvlP8Q5QyDAimxWQvMmnXKVyLE3cGg0e s2mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=qlHVhEYX5cqJqldP2f83BYJJb/dO87xV8oTauLhAP6A=; b=Qhp4d0uEVChSqIpba6vJXnt9/X+m01zac5Ae2eokdgKKx69wy0EAwE25tCIpXmjWG+ sJvHVOuelujH6kNoF13jnwZ+KDB50zmAgPG3SL3L+zBGnDF+/gqJ36dX6xNuAszJMN7S sspGKu6Kb46m6DIzUZMdStGqB6jB2FK6m/ONR4xB8jNdvnu3QsKWnsWBKQAc42ZXHzjf FkjCYT5DZPLqfL58WbnUv+h0GoFqZtGiArfjUM7XQrUpCfJI5HtnpHPt7FGSeBiSyX9c rD5SvvD2GvzBMqUOAS/+58MWfKOUSBIYDIPUoImuDlRU0CBmTZlGDfQDxypTHJ97FoD1 yblA== X-Gm-Message-State: AOAM530UAZIQVMSAXOFa2LN0PAmMSWC6L6NSGOKC11evsX6xeUMBoRl3 DwtB9i6JH+rbJZf5L11G5Utg8/DxyvKUuA== X-Google-Smtp-Source: ABdhPJzY83d7yH4VJPNm3mzVn63pw/iF37LhByVkUtiVXf2tU12/W+MHvd2ltd/6PcY2P+Sc78D52w== X-Received: by 2002:a37:b983:: with SMTP id j125mr5033571qkf.418.1610130018726; Fri, 08 Jan 2021 10:20:18 -0800 (PST) Received: from localhost ([8.9.92.205]) by smtp.gmail.com with ESMTPSA id b78sm5041544qkg.29.2021.01.08.10.20.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Jan 2021 10:20:18 -0800 (PST) Date: Fri, 8 Jan 2021 13:20:16 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, jrnieder@gmail.com Subject: [PATCH 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Message-ID: <5d3e96a498527959ae2f77476cd00474e36fbc97.1610129989.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that the pack.writeReverseIndex configuration is respected in both 'git index-pack' and 'git pack-objects' (and therefore, all of their callers), we can safely advertise it for use in the git-config manual. 
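As a minimal sketch of what opting in to the setting could look like (not part of the patch): the snippet below writes the variable into a scratch config file, which is a hypothetical stand-in for a repository's real config, and reads it back as a boolean. It assumes only that a `git` binary is on the PATH.

```shell
# Hypothetical scratch config file; a real setup would write to the
# repository's own config (or use --global) instead.
cfg=$(mktemp)

# Opt in to writing a .rev file alongside each newly written pack.
git config --file "$cfg" pack.writeReverseIndex true

# Read the value back, normalized as a boolean.
value=$(git config --file "$cfg" --bool pack.writeReverseIndex)
echo "pack.writeReverseIndex=$value"
```

For a one-off invocation, the tests in this series pass the same variable on the command line instead, in the form `git -c pack.writeReverseIndex=true index-pack ...`.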
Signed-off-by: Taylor Blau --- Documentation/config/pack.txt | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 837f1b1679..3da4ea98e2 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -133,3 +133,10 @@ pack.writeBitmapHashCache:: between an older, bitmapped pack and objects that have been pushed since the last gc). The downside is that it consumes 4 bytes per object of disk space. Defaults to true. + +pack.writeReverseIndex:: + When true, git will write a corresponding .rev file (see: + link:../technical/pack-format.html[Documentation/technical/pack-format.txt]) + for each new packfile that it writes in all places except for + linkgit:git-fast-import[1] and in the bulk checkin mechanism. + Defaults to false. From patchwork Fri Jan 8 18:20:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12007261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23F42C433E9 for ; Fri, 8 Jan 2021 18:21:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DEF7223A7B for ; Fri, 8 Jan 2021 18:21:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728798AbhAHSVG (ORCPT ); Fri, 8 Jan 2021 13:21:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S1728505AbhAHSVE (ORCPT ); Fri, 8 Jan 2021 13:21:04 -0500 Received: from mail-il1-x12d.google.com (mail-il1-x12d.google.com [IPv6:2607:f8b0:4864:20::12d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 974B7C06129D for ; Fri, 8 Jan 2021 10:20:24 -0800 (PST) Received: by mail-il1-x12d.google.com with SMTP id x15so11206062ilq.1 for ; Fri, 08 Jan 2021 10:20:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=IQlZK0jd6JeMSvh7FeMG2U8oZmsTY8vnZxn+QdqX9BU=; b=QxURiWeo5xOV+7CXk9w+QHINGPvIMt7OJEKv+++LO2tE47PLIFIYRiJDZF8o+Fx06K ApuPph5scpX0SJnK+SWa6kHNjXVtuCGlysAB/tui6+ppk1gt785FDJxxSSpqHDmcUmzd L8MgtENnrIhBF8UjnAL+Xqt8bKugIACLCiB5fqG+yEcVtNzN5nhTw2yTDePuUJI8sWmZ KStn3VFJxLbJjW79N+d1E+QPZtDMRadVxVU1RiPD7EGeZYaDO3DF0fpbLbLKI981iy7D +9xBRtM8lTn/aGP4aCjYwNEbJu85ydSzQOGM7ZaZgCamqXoBv6GTWuqOPKue60vIYwoV sLCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=IQlZK0jd6JeMSvh7FeMG2U8oZmsTY8vnZxn+QdqX9BU=; b=ACZCVTNZrBQF0G8MBMRETJCHooxoBbugaR+YiNBh/KF6ypjeB0YXfY/2ECVj8ngQaT UYyE5mh6nk2fNKnrfzFSXP+OFmZDdiu27pta81Zzq1g6OlIkowpZxJid1g8/IjCK88Qy D4//+QCE/QFUcz10SAGuASC/q7T2eNqvzt27dzwy5oXdbyHFGbRvKX9sQV/JCYACLfUI ZYCMdjknt8STl+8DM2DDFSO5S5J3wI0/jl7+QKxs4B39qG3rrIIjnGnmjxcRbgf2/80U THO5u5TSQMvU6f43SEKupQx0yRYNIEZSGN5wBcgeN5TCqMvqR2hs0vYZLl/JKe1p9UX7 F1Bg== X-Gm-Message-State: AOAM531J3wvhuiCs29gbP8RhIINhZESw+ffDdlOihfpLc3DzoSrNAzo4 TIrxXkNKKF4sOevqXF++nF4aokOIOvdH+g== X-Google-Smtp-Source: ABdhPJz9jks+MmKMPf4redwIAOVINsaJmV9/j4Dm0+2swuY7TJd0ltoW2ru8eOk0mNa7anTY48hM5g== X-Received: by 2002:a92:1517:: with SMTP id v23mr5216982ilk.280.1610130023212; Fri, 08 Jan 2021 10:20:23 -0800 (PST) Received: from localhost ([8.9.92.205]) by 
smtp.gmail.com with ESMTPSA id m8sm7445555ild.18.2021.01.08.10.20.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Jan 2021 10:20:22 -0800 (PST) Date: Fri, 8 Jan 2021 13:20:20 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, jrnieder@gmail.com Subject: [PATCH 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX Message-ID: <2288571fbeb83d6bd98917f4788c3221aa973c81.1610129989.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In the next patch, we'll add support for unconditionally enabling the 'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX environment variable. This causes a little bit of fallout in tests that, for example, compare the list of files in the pack directory against an expected list and are unprepared to see .rev files in that output. For now, sprinkle these locations with a 'grep -v "\.rev$"' to ignore them. Once the pack.writeReverseIndex option has been thoroughly tested, we will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX, and making it possible to revert this patch. At that time, we'll have to adjust the expected output to contain the relevant .rev files, but for now this will do just fine.
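The filtering described above can be sketched in isolation as follows. This is an illustration only: the scratch directory and the `pack-1234.*` file names are hypothetical stand-ins for a real `.git/objects/pack` directory, and no Git command is involved.

```shell
# Scratch directory standing in for .git/objects/pack (hypothetical names).
packdir=$(mktemp -d)
filelist=$(mktemp)
touch "$packdir/pack-1234.pack" "$packdir/pack-1234.idx" "$packdir/pack-1234.rev"

# List the directory, dropping reverse index files before counting.
ls "$packdir" | grep -v "\.rev$" >"$filelist"

# Strip the padding that some wc implementations add around the number.
count=$(wc -l <"$filelist" | tr -d ' ')
echo "$count"   # prints 2 (the .pack and the .idx)
```

Quoting the pattern keeps the backslash, so the dot is matched literally. With an unquoted `\.rev$` the shell removes the backslash first and grep sees `.rev$`, where the dot matches any character.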
Signed-off-by: Taylor Blau --- t/t5319-multi-pack-index.sh | 2 +- t/t5325-reverse-index.sh | 4 ++++ t/t5604-clone-reference.sh | 2 +- t/t5702-protocol-v2.sh | 4 ++-- t/t6500-gc.sh | 4 ++-- t/t9300-fast-import.sh | 2 +- 6 files changed, 11 insertions(+), 7 deletions(-) diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 297de502a9..9696f88c2f 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -710,7 +710,7 @@ test_expect_success 'expire respects .keep files' ' PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) && touch $PACKA.keep && git multi-pack-index expire && - ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files && + ls -S .git/objects/pack/a-pack* | grep $PACKA | grep -v "\.rev$" >a-pack-files && test_line_count = 3 a-pack-files && test-tool read-midx .git/objects | grep idx >midx-list && test_line_count = 2 midx-list diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 431699ade2..5a59cc71e4 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -3,6 +3,10 @@ test_description='on-disk reverse index' . ./test-lib.sh +# The below tests want control over the 'pack.writeReverseIndex' setting +# themselves to assert various combinations of it with other options. +sane_unset GIT_TEST_WRITE_REV_INDEX + packdir=.git/objects/pack test_expect_success 'setup' ' diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh index 2f7be23044..d064426d03 100755 --- a/t/t5604-clone-reference.sh +++ b/t/t5604-clone-reference.sh @@ -318,7 +318,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje test_cmp T.objects T$option.objects && ( cd T$option/.git/objects && - find . -type f | sort >../../../T$option.objects-files.raw && + find . -type f | grep -v \.rev$ | sort >../../../T$option.objects-files.raw && find . 
-type l | sort >../../../T$option.objects-symlinks.raw ) done && diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 7d5b17909b..9ebf045739 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -848,7 +848,7 @@ test_expect_success 'part of packfile response provided as URI' ' test -f h2found && # Ensure that there are exactly 6 files (3 .pack and 3 .idx). - ls http_child/.git/objects/pack/* >filelist && + ls http_child/.git/objects/pack/* | grep -v \.rev$ >filelist && test_line_count = 6 filelist ' @@ -902,7 +902,7 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' ' clone "$HTTPD_URL/smart/http_parent" http_child && # Ensure that there are exactly 4 files (2 .pack and 2 .idx). - ls http_child/.git/objects/pack/* >filelist && + ls http_child/.git/objects/pack/* | grep -v \.rev$ >filelist && test_line_count = 4 filelist ' diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 4a3b8f48ac..d52f92f5a1 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -106,13 +106,13 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre test_commit "$(test_oid obj2)" && # Our first gc will create a pack; our second will create a second pack git gc --auto && - ls .git/objects/pack | sort >existing_packs && + ls .git/objects/pack | grep -v \.rev$ | sort >existing_packs && test_commit "$(test_oid obj3)" && test_commit "$(test_oid obj4)" && git gc --auto 2>err && test_i18ngrep ! 
"^warning:" err && - ls .git/objects/pack/ | sort >post_packs && + ls .git/objects/pack/ | grep -v \.rev$ | sort >post_packs && comm -1 -3 existing_packs post_packs >new && comm -2 -3 existing_packs post_packs >del && test_line_count = 0 del && # No packs are deleted diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 308c1ef42c..100df52a71 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -1629,7 +1629,7 @@ test_expect_success 'O: blank lines not necessary after other commands' ' INPUT_END git fast-import actual && test_cmp expect actual From patchwork Fri Jan 8 18:20:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12007259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 463A3C433E6 for ; Fri, 8 Jan 2021 18:21:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1560C23A7E for ; Fri, 8 Jan 2021 18:21:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728806AbhAHSVJ (ORCPT ); Fri, 8 Jan 2021 13:21:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728505AbhAHSVI (ORCPT ); Fri, 8 Jan 2021 13:21:08 -0500 Received: from mail-io1-xd2a.google.com (mail-io1-xd2a.google.com [IPv6:2607:f8b0:4864:20::d2a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78589C061381 for ; 
Fri, 8 Jan 2021 10:20:28 -0800 (PST) Received: by mail-io1-xd2a.google.com with SMTP id o6so10611945iob.10 for ; Fri, 08 Jan 2021 10:20:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=BobO7G9NRKAb9dtVec1L1u9TJeFc1s0PLLQlW9/BBA8=; b=dtvfjqYnQnqtpZVSGl5sEyBzc0cR322GReyh0E2DDfYeL0JltNuorwlEVC5mdML/K9 ohc8iTminVxbrZleGa/j9xxpziXrx6NgIonuPaKSds7TfyUc9c+k7SMDvRnxvyktFX5Y itS4a4SpPeER0mIoZi4Vy82olecQKBcIa4BQThWsMv5dA/AgpFvcaD1aGU1GvNmAE+59 ymZU90xJUT3JRN0gBe7TbSNwmVWMQzw2NRy2koZLzuPiMDlBEKGk4sOBN8yrrTTwM3jk NkQtp6u+NG0lKm1DL6TIYlLQHw6rxLKdcpV6PEJ3sBTo7l33MfNmSdBO3XRNSLVma+Xl 0Gkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=BobO7G9NRKAb9dtVec1L1u9TJeFc1s0PLLQlW9/BBA8=; b=i0Y01tDayeyA1teNJV0KMYxJoDmStHrEh5syOqCcG6aXyMxwvcY//0PPRWm9u45jwv TUCxDkn7GL2H33b3Q1tRT8qcP/08VUK9wUDrYvxEIZnuI+OIkFp+vVF9I77B8xXTD2XT C9PP4U9BDhDdc7zQVTwHWSEBKHJOeiL3S+pkejl1GuUk69Rz7kk5HGjr7TvfE+aasrvB S7JPXZnZPB0RHMZ9/SC1m0cIOQ7MJh9+prUBtNjUT+QOjdrDbMvvdn9Iy3XvI2GYQ+Gt pO1u3dx/X9m70zwZzO8LSH+Z4n1h0jWpQEObjtwkVUDTwKUm5591BLUIqstiqXGZ1/yF 9IYA== X-Gm-Message-State: AOAM532Yka8IRLBY3pHai/8Z63AfOxa0raW2KOP2MTNKt8VhMAFce34I lMO1RB4/EMqnAOijWHJ+do+GxEp3+RhLwQ== X-Google-Smtp-Source: ABdhPJwppcKq6MFD1rWHl0cgfsfD1goYCyAqkSvRO74oZ/MqPmXTxgg7Z6N8s9lrRaZAHOs5VbfKBg== X-Received: by 2002:a05:6638:1243:: with SMTP id o3mr4475338jas.129.1610130027662; Fri, 08 Jan 2021 10:20:27 -0800 (PST) Received: from localhost ([8.9.92.205]) by smtp.gmail.com with ESMTPSA id m8sm5562896ioh.16.2021.01.08.10.20.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Jan 2021 10:20:27 -0800 (PST) Date: Fri, 8 Jan 2021 13:20:25 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: 
peff@peff.net, jrnieder@gmail.com Subject: [PATCH 7/8] t: support GIT_TEST_WRITE_REV_INDEX Message-ID: <3525c4d114c8655953aca44d8effa1652ccc93d3.1610129989.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add a new option that unconditionally enables the pack.writeReverseIndex setting in order to run the whole test suite in a mode that generates on-disk reverse indexes. Once on-disk reverse indexes are proven out over several releases, we can change the default value of that configuration to 'true', and drop this patch. Signed-off-by: Taylor Blau --- builtin/index-pack.c | 5 ++++- builtin/pack-objects.c | 2 ++ pack-revindex.h | 2 ++ t/README | 3 +++ 4 files changed, 11 insertions(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 03408250b1..0bde325a8b 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1748,7 +1748,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + rev_index = 1; + else + rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); for (i = 1; i < argc; i++) { const char *arg = argv[i]; diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 80adce154a..5b3395bd7a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3597,6 +3597,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); git_config(git_pack_config, NULL); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + pack_idx_opts.flags |= WRITE_REV; progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/pack-revindex.h b/pack-revindex.h index b501a7cd62..d2d466e298 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -1,6 +1,8 
@@ #ifndef PACK_REVINDEX_H #define PACK_REVINDEX_H +#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" + struct packed_git; int load_pack_revindex(struct packed_git *p); diff --git a/t/README b/t/README index c730a70770..0f97a51640 100644 --- a/t/README +++ b/t/README @@ -439,6 +439,9 @@ GIT_TEST_DEFAULT_HASH= specifies which hash algorithm to use in the test scripts. Recognized values for are "sha1" and "sha256". +GIT_TEST_WRITE_REV_INDEX=, when true enables the +'pack.writeReverseIndex' setting. + Naming Tests ------------ From patchwork Fri Jan 8 18:20:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12007251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0EF3C433E0 for ; Fri, 8 Jan 2021 18:20:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8648523A69 for ; Fri, 8 Jan 2021 18:20:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728482AbhAHSUs (ORCPT ); Fri, 8 Jan 2021 13:20:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34374 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726666AbhAHSUs (ORCPT ); Fri, 8 Jan 2021 13:20:48 -0500 Received: from mail-qt1-x82d.google.com (mail-qt1-x82d.google.com [IPv6:2607:f8b0:4864:20::82d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3001EC0612FD for ; Fri, 8 Jan 2021 10:20:33 -0800 (PST) 
Received: by mail-qt1-x82d.google.com with SMTP id h19so7131523qtq.13 for ; Fri, 08 Jan 2021 10:20:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=5Mf8gKDjj3XIhPZjv3SqrlBeRDwCXz/5ajOwuex1ig4=; b=mCPvWDDrg8cxkH9+ZrDeD1Pbj2L8VadTjl0+swqltUVCbgBW5aohXVFtnExIIgK8Yy ILolIrJZDwY8lJAgnfbcqHpW8uyN4dBKyah6umU0K9n4FbD280wDoihCiQLNTPTQc02D wg+RoDjPfhSA3hvg3sqlSOBhnhlEYNx0M+GhulmxMh15XZ9u/P9pgkN4LabhySltgm1+ 7a0EI/PuL+dbv1wQjol4AZwK22sMK/vkHf1mOR8w4SSN7Qbs6bRTGFNl+nnrQDKq0Iu0 LPXJgBLqSAowHZXCp+enY14PyyB2WNYzP2zdbmH16dOWcv6i4hNPubZ5G1DoG5D8akdK VXmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=5Mf8gKDjj3XIhPZjv3SqrlBeRDwCXz/5ajOwuex1ig4=; b=dE/2Fnc8E/hD/nvoI/6+ZsEwPzxqzL3bGeItycK48tVa6/NlDd/3zzfPHfn+8Zb4/9 t4DnTA/1+Ac4XVoXfAGQ3nE/tsWx8gsGDQ3fEVWc4GAkgvGB9Z70GIsltuPjL6yEh7s9 f4aemXTOaQ6TkJdbNXwiEdwpuI+2ItLopVGjAQV+Y9Dg5lA3lDItPQMYZJp6v+a85rkb QPVCq1ck9mtfY1Tdi8E+g81rhXYfPNwFElCR13vYeYyqPoXQ4OiD7kGOoICK30oNgd1o QbujCYdYzvA8dWTKKvoDYCTWPmOgWlQ+Scn2iyVArGlN9CbKEjKO8EqTzac9NrLUsN9V XV8w== X-Gm-Message-State: AOAM532X3j1BCyCh2a13Ku47V0fQIR4Zwp2msneU8ak/nDDLGi6GjsIn RKXxKIC3ra58Wn1Xr9ay7yxelhoCUfdu/w== X-Google-Smtp-Source: ABdhPJx+UWi3p4VW0nfk5pvSwrA2d8ZKgFkW9cTF6+ZXRwdI3qd3yDSDzUiuMQI8ZHrsm2cmAo/oBA== X-Received: by 2002:aed:3b24:: with SMTP id p33mr4643778qte.299.1610130032115; Fri, 08 Jan 2021 10:20:32 -0800 (PST) Received: from localhost ([8.9.92.205]) by smtp.gmail.com with ESMTPSA id q3sm5144409qkb.73.2021.01.08.10.20.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 08 Jan 2021 10:20:31 -0800 (PST) Date: Fri, 8 Jan 2021 13:20:29 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, jrnieder@gmail.com Subject: [PATCH 
8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence Message-ID: <6e580d43d1d1e45bfe58606b98c5034a342d8241.1610129989.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When an on-disk reverse index exists, there is no need to generate one in memory. In fact, doing so can be slow and can require a large amount of heap memory. Let's make sure that we treat the on-disk reverse index with precedence (i.e., that when it exists, we don't bother trying to generate an equivalent one in memory) by teaching Git how to conditionally die() when generating a reverse index in memory. Then, add a test to ensure that when (a) an on-disk reverse index exists and (b) GIT_TEST_REV_INDEX_DIE_IN_MEMORY is set, we do not die, implying that we read from the on-disk one. Signed-off-by: Taylor Blau --- pack-revindex.c | 4 ++++ pack-revindex.h | 1 + t/t5325-reverse-index.sh | 8 ++++++++ 3 files changed, 13 insertions(+) diff --git a/pack-revindex.c b/pack-revindex.c index 1baaf2c42a..683d22e898 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -2,6 +2,7 @@ #include "pack-revindex.h" #include "object-store.h" #include "packfile.h" +#include "config.h" struct revindex_entry { off_t offset; @@ -166,6 +167,9 @@ static void create_pack_revindex(struct packed_git *p) static int load_pack_revindex_from_memory(struct packed_git *p) { + if (git_env_bool(GIT_TEST_REV_INDEX_DIE_IN_MEMORY, 0)) + die("dying as requested by '%s'", + GIT_TEST_REV_INDEX_DIE_IN_MEMORY); if (open_pack_index(p)) return -1; create_pack_revindex(p); diff --git a/pack-revindex.h b/pack-revindex.h index d2d466e298..e271da871a 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -2,6 +2,7 @@ #define PACK_REVINDEX_H #define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" +#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" struct packed_git; diff --git
a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 5a59cc71e4..9d4eecccc9 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -83,4 +83,12 @@ test_expect_success 'pack-objects respects pack.writeReverseIndex' ' test_path_is_file pack-1-*.rev ' +test_expect_success 'reverse index is not generated when available on disk' ' + git index-pack --rev-index $packdir/pack-$pack.pack && + + git rev-parse HEAD >tip && + GIT_TEST_REV_INDEX_DIE_IN_MEMORY=1 git cat-file \ + --batch-check="%(objectsize:disk)"