From patchwork Wed Jan 13 22:28:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12018299 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1106C4332E for ; Thu, 14 Jan 2021 02:06:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AF63123447 for ; Thu, 14 Jan 2021 02:06:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729484AbhANCGH (ORCPT ); Wed, 13 Jan 2021 21:06:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727721AbhAMW3E (ORCPT ); Wed, 13 Jan 2021 17:29:04 -0500 Received: from mail-qt1-x831.google.com (mail-qt1-x831.google.com [IPv6:2607:f8b0:4864:20::831]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2BAE3C061794 for ; Wed, 13 Jan 2021 14:28:11 -0800 (PST) Received: by mail-qt1-x831.google.com with SMTP id r9so2302721qtp.11 for ; Wed, 13 Jan 2021 14:28:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=f1hm3vJlGY6epOAWRkBuswJR30nELgtokm54YWms9BI=; b=i0rf3mTJa/Tr9djSRNhz6h4J2F6XyLf/NwBEbJckcOABUY8prN0pelUovj1VYIgwmX qNs1B8rM1QWwX0A0CcEgQCiHmax+NoXZblRI1pfE4v6MXeeKNsP4T55gP+Zf9MlHOVVF 
77jjoY+gJylahUcHlTinQBW/vl9UWLYhjf4LzKEfO0CJI2kKQqyg4qLtSAF8Ln9rkLZi iMeGl8/ES6NVJz2N3Yu/8Yp9axqRwNh69UVlejo8A6bz4xTl8vK9ZXCcQKvkmGhrjT7a No0XvLpPs2w62/w4A9nCjGU7w0DXDYm2a15JiMuO3ktWvyBdBnTBkyzFTg+cyNd3lDZ/ z5mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=f1hm3vJlGY6epOAWRkBuswJR30nELgtokm54YWms9BI=; b=ePnBPo4di9ZNK3gLIYfVdgEeNaJ3Y05JOhNGAyeFgFinolKHOmmlaRmT1+qxbzIwvs PM6+NzkLbpMQOEWzug0iv8OqVMyOU/gE0bKOvwXJbMiDc+c4AoHRjx/zy0wRxm6oYetO 5u9Luw/G2KnDmyn+R9F+i+j1qRvau8gk/GsCAUiqYqRWHYBHZrQ9kOU8lxSUQx8gcWI3 5fC4TTN9hrKaXKOwNduiZcX/aq/AJovpoBOLwBQrPVdG0Ow/yZJby0fddbKdcHZjbScj FBFuHxYUA9MK/Pgng/4kT8D3b23T7VNbHs6RkdS5G0nfckwCLnfOzkovDGw7sqa7/wl7 8RiQ== X-Gm-Message-State: AOAM5313+0dCcFqvDgnp8LhYjzPJDHb68gECSBvRdrCbSmT33MQH4ei6 O3zX8PQKqaKMJrRASDnxu313lgXzsHJkBQ== X-Google-Smtp-Source: ABdhPJymLkDEmHht7QOmI63IoWhY+IlNcc22f43bx4aj5v106OLW/FuGZQZokw6UcDRSqt/awEz93g== X-Received: by 2002:ac8:5159:: with SMTP id h25mr4502598qtn.199.1610576889907; Wed, 13 Jan 2021 14:28:09 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:b172:2e4c:efe4:db53]) by smtp.gmail.com with ESMTPSA id b67sm1902165qkc.44.2021.01.13.14.28.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Jan 2021 14:28:09 -0800 (PST) Date: Wed, 13 Jan 2021 17:28:06 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net Subject: [PATCH v2 1/8] packfile: prepare for the existence of '*.rev' files Message-ID: <6742c15c84bafbcc1c06e2633de51dcda63e3314.1610576805.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Specify the format of the on-disk reverse index 'pack-*.rev' file, as well as prepare the code for the existence of such 
files.

The reverse index maps from pack-relative positions (i.e., an index into
the array of objects sorted by their offsets within the packfile) to
their positions within the 'pack-*.idx' file. Today, this is done by
building up a list of (off_t, uint32_t) tuples, one for each object (the
off_t corresponding to that object's offset, and the uint32_t
corresponding to its position in the index). To convert between pack and
index position quickly, this array of tuples is radix sorted by offset.

This has two major drawbacks:

First, the in-memory cost scales linearly with the number of objects in
a pack. Each 'struct revindex_entry' is sizeof(off_t) +
sizeof(uint32_t) + padding bytes for a total of 16 bytes.

To observe this, force Git to load the reverse index by, e.g., running
'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a
single object in a fresh clone of the kernel, Git needs to allocate 120+
MB of memory to hold the reverse index.

Second, the cost to sort also scales with the size of the pack.
Luckily, this is a linear function since 'load_pack_revindex()' uses a
radix sort, but this cost still must be paid once per pack per process.

As an example, it takes ~60x longer to print the _size_ of an object
than it does to print that entire object's _contents_:

  Benchmark #1: git.compile cat-file --batch /dev/null
    Time (mean ± σ):     22.6 ms ± 0.5 ms    [User: 2.4 ms, System: 7.9 ms]
    Range (min … max):   21.4 ms … 23.5 ms    41 runs

  Benchmark #2: git.compile cat-file --batch /dev/null
    Time (mean ± σ):     17.2 ms ± 0.7 ms    [User: 2.8 ms, System: 5.5 ms]
    Range (min … max):   15.6 ms … 18.2 ms    45 runs

(Numbers taken in the kernel after cheating and using the next patch to
generate a reverse index.)

There are a couple of approaches to improve cold cache performance not
pursued here:

  - We could include the object offsets in the reverse index format.
    Predictably, this does result in fewer page faults, but it triples
    the size of the file, while simultaneously duplicating a ton of data
    already available in the .idx file. (This was the original way I
    implemented the format, and it did show
    `--batch-check='%(objectsize:disk)'` winning out against `--batch`.)

    On the other hand, this increase in size also results in a large
    block-cache footprint, which could potentially hurt other workloads.

  - We could store the mapping from pack to index position in a more
    cache-friendly way, like constructing a binary search tree from the
    table and writing the values in breadth-first order. This would
    result in much better locality, but the price you pay is trading
    O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you
    can no longer directly index the table).

So, neither of these approaches is taken here. (Thankfully, the format
is versioned, so we are free to pursue these in the future.) But, cold
cache performance likely isn't interesting outside of one-off cases like
asking for the size of an object directly. In real-world usage, Git is
often performing many operations in the revindex. The trade-off is worth
it, since we avoid the vast majority of the cost of generating the
revindex, and the extra pointer chase will look like noise in the
following patch's benchmarks.

This patch describes the format and prepares callers (like in
pack-revindex.c) to be able to read *.rev files once they exist. An
implementation of the writer will appear in the next patch, and callers
will gradually begin using the writer in the patches that follow.
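
As a rough illustration of the format described in this message (a
standalone sketch, not code from this series), the constant-time lookup
that the on-disk table allows might look like the following. The helper
names 'rev_expected_size()', 'read_be32()' and 'rev_pos_to_index()' are
hypothetical; the 12-byte header, the 4-byte big-endian table entries
and the two trailing checksums are assumptions taken from the format
text and the reader code in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* magic + version + hash function id, 4 bytes each */
#define REV_HEADER_SIZE 12

/* Hypothetical helper: size in bytes of a well-formed .rev file. */
static size_t rev_expected_size(uint32_t num_objects, size_t hash_size)
{
	/* header, one 4-byte entry per object, then two checksums */
	return REV_HEADER_SIZE +
	       (size_t)num_objects * sizeof(uint32_t) +
	       2 * hash_size;
}

/* Read a big-endian (network order) 32-bit value from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: map a pack position to an index position by
 * directly indexing the table that follows the header.
 */
static uint32_t rev_pos_to_index(const unsigned char *rev,
				 uint32_t num_objects, uint32_t pos)
{
	assert(pos < num_objects);
	return read_be32(rev + REV_HEADER_SIZE +
			 (size_t)pos * sizeof(uint32_t));
}
```

The size computed by 'rev_expected_size()' mirrors the consistency check
the reader performs before mapping the file, and the lookup itself is a
single indexed read, which is what keeps it O(1).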
Signed-off-by: Taylor Blau --- Documentation/technical/pack-format.txt | 17 ++++ builtin/repack.c | 1 + object-store.h | 3 + pack-revindex.c | 112 +++++++++++++++++++++--- pack-revindex.h | 7 +- packfile.c | 13 ++- packfile.h | 1 + tmp-objdir.c | 4 +- 8 files changed, 145 insertions(+), 13 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index f96b2e605f..9593f8bc68 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -259,6 +259,23 @@ Pack file entry: <+ Index checksum of all of the above. +== pack-*.rev files have the format: + + - A 4-byte magic number '0x52494458' ('RIDX'). + + - A 4-byte version identifier (= 1) + + - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256) + + - A table of index positions, sorted by their corresponding offsets in the + packfile. + + - A trailer, containing a: + + checksum of the corresponding packfile, and + + a checksum of all of the above. + == multi-pack-index (MIDX) files have the following format: The multi-pack-index files refer to multiple pack-files and loose objects. 
diff --git a/builtin/repack.c b/builtin/repack.c index 279be11a16..8d643ddcb9 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -208,6 +208,7 @@ static struct { } exts[] = { {".pack"}, {".idx"}, + {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, }; diff --git a/object-store.h b/object-store.h index c4fc9dd74e..3fbf11280f 100644 --- a/object-store.h +++ b/object-store.h @@ -85,6 +85,9 @@ struct packed_git { multi_pack_index:1; unsigned char hash[GIT_MAX_RAWSZ]; struct revindex_entry *revindex; + const void *revindex_data; + const void *revindex_map; + size_t revindex_size; /* something like ".git/objects/pack/xxxxx.pack" */ char pack_name[FLEX_ARRAY]; /* more */ }; diff --git a/pack-revindex.c b/pack-revindex.c index 5e69bc7372..369812dd21 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -164,16 +164,98 @@ static void create_pack_revindex(struct packed_git *p) sort_revindex(p->revindex, num_ent, p->pack_size); } -int load_pack_revindex(struct packed_git *p) +static int load_pack_revindex_from_memory(struct packed_git *p) { - if (!p->revindex) { - if (open_pack_index(p)) - return -1; - create_pack_revindex(p); - } + if (open_pack_index(p)) + return -1; + create_pack_revindex(p); return 0; } +static char *pack_revindex_filename(struct packed_git *p) +{ + size_t len; + if (!strip_suffix(p->pack_name, ".pack", &len)) + BUG("pack_name does not end in .pack"); + return xstrfmt("%.*s.rev", (int)len, p->pack_name); +} + +#define RIDX_MIN_SIZE (12 + (2 * the_hash_algo->rawsz)) + +static int load_revindex_from_disk(char *revindex_name, + uint32_t num_objects, + const void **data, size_t *len) +{ + int fd, ret = 0; + struct stat st; + size_t revindex_size; + + fd = git_open(revindex_name); + + if (fd < 0) { + ret = -1; + goto cleanup; + } + if (fstat(fd, &st)) { + ret = error_errno(_("failed to read %s"), revindex_name); + goto cleanup; + } + + revindex_size = xsize_t(st.st_size); + + if (revindex_size < RIDX_MIN_SIZE) { + ret = error(_("reverse-index file %s is too 
small"), revindex_name); + goto cleanup; + } + + if (revindex_size - RIDX_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) { + ret = error(_("reverse-index file %s is corrupt"), revindex_name); + goto cleanup; + } + + *len = revindex_size; + *data = xmmap(NULL, revindex_size, PROT_READ, MAP_PRIVATE, fd, 0); + +cleanup: + close(fd); + return ret; +} + +static int load_pack_revindex_from_disk(struct packed_git *p) +{ + char *revindex_name; + int ret; + if (open_pack_index(p)) + return -1; + + revindex_name = pack_revindex_filename(p); + + ret = load_revindex_from_disk(revindex_name, + p->num_objects, + &p->revindex_map, + &p->revindex_size); + if (ret) + goto cleanup; + + p->revindex_data = (char *)p->revindex_map + 12; + +cleanup: + free(revindex_name); + return ret; +} + +int load_pack_revindex(struct packed_git *p) +{ + if (p->revindex || p->revindex_data) + return 0; + + if (!load_pack_revindex_from_disk(p)) + return 0; + else if (!load_pack_revindex_from_memory(p)) + return 0; + return -1; +} + int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { unsigned lo, hi; @@ -203,18 +285,28 @@ int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (p->num_objects <= pos) BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].nr; + + if (p->revindex) + return p->revindex[pos].nr; + else + return get_be32((char *)p->revindex_data + (pos * sizeof(uint32_t))); } off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (p->num_objects < pos) BUG("pack_pos_to_offset: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].offset; + + if (p->revindex) + return p->revindex[pos].offset; + 
else if (pos == p->num_objects) + return p->pack_size - the_hash_algo->rawsz; + else + return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } diff --git a/pack-revindex.h b/pack-revindex.h index 6e0320b08b..01622cf21a 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -21,6 +21,9 @@ struct packed_git; /* * load_pack_revindex populates the revindex's internal data-structures for the * given pack, returning zero on success and a negative value otherwise. + * + * If a '.rev' file is present, it is checked for consistency, mmap'd, and + * pointers are assigned into it (instead of using the in-memory variant). */ int load_pack_revindex(struct packed_git *p); @@ -55,7 +58,9 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos); * If the reverse index has not yet been loaded, or the position is out of * bounds, this function aborts. * - * This function runs in constant time. + * This function runs in constant time under both in-memory and on-disk reverse + * indexes, but an additional step is taken to consult the corresponding .idx + * file when using the on-disk format. 
*/ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos); diff --git a/packfile.c b/packfile.c index 7bb1750934..b04eac9286 100644 --- a/packfile.c +++ b/packfile.c @@ -324,11 +324,21 @@ void close_pack_index(struct packed_git *p) } } +void close_pack_revindex(struct packed_git *p) { + if (!p->revindex_map) + return; + + munmap((void *)p->revindex_map, p->revindex_size); + p->revindex_map = NULL; + p->revindex_data = NULL; +} + void close_pack(struct packed_git *p) { close_pack_windows(p); close_pack_fd(p); close_pack_index(p); + close_pack_revindex(p); } void close_object_store(struct raw_object_store *o) @@ -351,7 +361,7 @@ void close_object_store(struct raw_object_store *o) void unlink_pack_path(const char *pack_name, int force_delete) { - static const char *exts[] = {".pack", ".idx", ".keep", ".bitmap", ".promisor"}; + static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"}; int i; struct strbuf buf = STRBUF_INIT; size_t plen; @@ -853,6 +863,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (ends_with(file_name, ".idx") || + ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || ends_with(file_name, ".bitmap") || ends_with(file_name, ".keep") || diff --git a/packfile.h b/packfile.h index a58fc738e0..4cfec9e8d3 100644 --- a/packfile.h +++ b/packfile.h @@ -90,6 +90,7 @@ uint32_t get_pack_fanout(struct packed_git *p, uint32_t value); unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); void close_pack_windows(struct packed_git *); +void close_pack_revindex(struct packed_git *); void close_pack(struct packed_git *); void close_object_store(struct raw_object_store *o); void unuse_pack(struct pack_window **); diff --git a/tmp-objdir.c b/tmp-objdir.c index 42ed4db5d3..da414df14f 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -187,7 +187,9 @@ static int pack_copy_priority(const char *name) return 2; if 
(ends_with(name, ".idx")) return 3; - return 4; + if (ends_with(name, ".rev")) + return 4; + return 5; } static int pack_copy_cmp(const char *a, const char *b) From patchwork Wed Jan 13 22:28:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12018317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5417EC4332B for ; Thu, 14 Jan 2021 02:09:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1A51E2242A for ; Thu, 14 Jan 2021 02:09:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730061AbhANCFa (ORCPT ); Wed, 13 Jan 2021 21:05:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729271AbhAMW3E (ORCPT ); Wed, 13 Jan 2021 17:29:04 -0500 Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33785C061795 for ; Wed, 13 Jan 2021 14:28:15 -0800 (PST) Received: by mail-qk1-x72b.google.com with SMTP id z11so4451415qkj.7 for ; Wed, 13 Jan 2021 14:28:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=WRMJaTWVSXgyiGEnGhhwjp1kz6Bn3xbvxoqFvfWwets=; 
b=tNZugcyC5Y2VbLPmhEtzu9M6qVjfty6ISnG57loi6fq7hq0wqYoOs7Q3eQK6hyIb/+ 1kNkkCg5lJujvKZC1oxq1qWXvRULAdWL7MuXyXpja9HKepmItXvdMVPoQtnlhFaiwrod XlB3zbMRmJm0zfX2fTwPxMU2sRL8LRUXBtFiciLx+JIruMfux11oXtYO063p3vBx54ai mkcqjp5f95lhBw/zHj64WSqLk4XhgroEzcUsU7Y4rHh2DCHw+eDnS3r8xDFp5dS5Yssi njqzGZEcTZjoYG1oOM5v3GnP+ml9z4sKpVabcrAwpfZ2GCQIqYuPuxAB4t/3vGFTv8qI Bopg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=WRMJaTWVSXgyiGEnGhhwjp1kz6Bn3xbvxoqFvfWwets=; b=l1PHIOiPw/1ObXT+lIjCDYiNZO+4KsZrVZz+89U1t7BiZ22zGw86stUz7+Jl971He2 OD8ZORpJP3XgYPZtn7dX2yxuDjZJTUgj+WwA5G7qVoB49pPjCMN6cgazwlwXJIXkO1fT fWpN8h1k8RQvgLGbpjdAZY9bjtBzJsaO1X39O+SA+7IoKw/k1Xo7zUXydmB5rqlupi/h gVYwD8eYhl0ZDrmpFTTfnJzfDo0VP09dzeMJqxTavRy7gQ5pmLurQksayqvw/vXdZeuO WHL66JPc3eCRpXbbccv1kj6fxhZCYTfk9VuQqDO/N2cyE4mMoRikHOWKoR2gjAdAWgwe b/2Q== X-Gm-Message-State: AOAM531YQfGq0q8nmmUgx8P5UP/9ngU41ZK3lKpEHBkO+p0Od7+wnX+Q hzBD6s115jpy97XohN1g9RW4rq2+U5xWWg== X-Google-Smtp-Source: ABdhPJysg2Swa8VgV0rMoNuxC2oej+RGE9DSDZuFt+glQt7RbsAxJxlZdvIgGiL7YbM5dep60iGieA== X-Received: by 2002:a05:620a:13b0:: with SMTP id m16mr4474050qki.58.1610576894166; Wed, 13 Jan 2021 14:28:14 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:b172:2e4c:efe4:db53]) by smtp.gmail.com with ESMTPSA id y10sm1954595qkb.115.2021.01.13.14.28.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Jan 2021 14:28:13 -0800 (PST) Date: Wed, 13 Jan 2021 17:28:11 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net Subject: [PATCH v2 2/8] pack-write.c: prepare to write 'pack-*.rev' files Message-ID: <8648c87fa71aec427dc11afcba6548fb66a1413b.1610576805.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This patch 
prepares for callers to be able to write reverse index files to disk. It adds the necessary machinery to write a format-compliant .rev file from within 'write_rev_file()', which is called from 'finish_tmp_packfile()'. Signed-off-by: Taylor Blau --- pack-write.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++- pack.h | 4 ++ 2 files changed, 126 insertions(+), 1 deletion(-) diff --git a/pack-write.c b/pack-write.c index 3513665e1e..68db5a9edf 100644 --- a/pack-write.c +++ b/pack-write.c @@ -166,6 +166,116 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec return index_name; } +static int pack_order_cmp(const void *va, const void *vb, void *ctx) +{ + struct pack_idx_entry **objects = ctx; + + off_t oa = objects[*(uint32_t*)va]->offset; + off_t ob = objects[*(uint32_t*)vb]->offset; + + if (oa < ob) + return -1; + if (oa > ob) + return 1; + return 0; +} + +#define RIDX_SIGNATURE 0x52494458 /* "RIDX" */ +#define RIDX_VERSION 1 + +static void write_rev_header(struct hashfile *f) +{ + uint32_t oid_version; + switch (hash_algo_by_ptr(the_hash_algo)) { + case GIT_HASH_SHA1: + oid_version = 1; + break; + case GIT_HASH_SHA256: + oid_version = 2; + break; + default: + die("write_rev_header: unknown hash version"); + } + + hashwrite_be32(f, RIDX_SIGNATURE); + hashwrite_be32(f, RIDX_VERSION); + hashwrite_be32(f, oid_version); +} + +static void write_rev_index_positions(struct hashfile *f, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t *pack_order; + uint32_t i; + + ALLOC_ARRAY(pack_order, nr_objects); + for (i = 0; i < nr_objects; i++) + pack_order[i] = i; + QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); + + for (i = 0; i < nr_objects; i++) + hashwrite_be32(f, pack_order[i]); + + free(pack_order); +} + +static void write_rev_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +const char *write_rev_file(const char *rev_name, + struct pack_idx_entry 
**objects, + uint32_t nr_objects, + const unsigned char *hash, + unsigned flags) +{ + struct hashfile *f; + int fd; + + if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY)) + die(_("cannot both write and verify reverse index")); + + if (flags & WRITE_REV) { + if (!rev_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX"); + rev_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(rev_name); + fd = open(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + if (fd < 0) + die_errno("unable to create '%s'", rev_name); + } + f = hashfd(fd, rev_name); + } else if (flags & WRITE_REV_VERIFY) { + struct stat statbuf; + if (stat(rev_name, &statbuf)) { + if (errno == ENOENT) { + /* .rev files are optional */ + return NULL; + } else + die_errno(_("could not stat: %s"), rev_name); + } + f = hashfd_check(rev_name); + } else + return NULL; + + write_rev_header(f); + + write_rev_index_positions(f, objects, nr_objects); + write_rev_trailer(f, hash); + + if (rev_name && adjust_shared_perm(rev_name) < 0) + die(_("failed to make %s readable"), rev_name); + + finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | + ((flags & WRITE_IDX_VERIFY) ? 
0 : CSUM_FSYNC)); + + return rev_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -341,7 +451,7 @@ void finish_tmp_packfile(struct strbuf *name_buffer, struct pack_idx_option *pack_idx_opts, unsigned char hash[]) { - const char *idx_tmp_name; + const char *idx_tmp_name, *rev_tmp_name = NULL; int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) @@ -352,6 +462,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, if (adjust_shared_perm(idx_tmp_name)) die_errno("unable to make temporary index file readable"); + rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, + pack_idx_opts->flags); + strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); if (rename(pack_tmp_name, name_buffer->buf)) @@ -365,5 +478,13 @@ void finish_tmp_packfile(struct strbuf *name_buffer, strbuf_setlen(name_buffer, basename_len); + if (rev_tmp_name) { + strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); + if (rename(rev_tmp_name, name_buffer->buf)) + die_errno("unable to rename temporary reverse-index file"); + } + + strbuf_setlen(name_buffer, basename_len); + free((void *)idx_tmp_name); } diff --git a/pack.h b/pack.h index 9fc0945ac9..30439e0784 100644 --- a/pack.h +++ b/pack.h @@ -42,6 +42,8 @@ struct pack_idx_option { /* flag bits */ #define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */ #define WRITE_IDX_STRICT 02 +#define WRITE_REV 04 +#define WRITE_REV_VERIFY 010 uint32_t version; uint32_t off32_limit; @@ -87,6 +89,8 @@ off_t write_pack_header(struct hashfile *f, uint32_t); void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t); char *index_pack_lockfile(int fd); +const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags); + /* * The "hdr" output buffer should be at least this big, which will handle sizes * up to 2^67. 
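
For readers following along, the table that 'write_rev_index_positions()'
emits can be illustrated outside of Git with a small standalone sketch.
This is not the code from the patch: the patch sorts an array of index
positions with 'QSORT_S()' and a context pointer, whereas the sketch
below swaps in plain 'qsort()' over (offset, index position) pairs. The
names 'struct pos_entry', 'put_be32()' and 'build_rev_table()' are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One (offset, index position) pair; used only for sorting here. */
struct pos_entry {
	uint64_t offset;
	uint32_t index_pos;
};

static int cmp_by_offset(const void *va, const void *vb)
{
	const struct pos_entry *a = va, *b = vb;

	if (a->offset < b->offset)
		return -1;
	if (a->offset > b->offset)
		return 1;
	return 0;
}

/* Store a 32-bit value in big-endian (network) byte order. */
static void put_be32(unsigned char *out, uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: offsets[i] is the pack offset of the object at
 * index position i. Fill 'out' (4 * nr bytes) with the index positions
 * ordered by pack offset. Returns 0 on success, -1 if allocation fails.
 */
static int build_rev_table(const uint64_t *offsets, uint32_t nr,
			   unsigned char *out)
{
	struct pos_entry *entries;
	uint32_t i;

	if (!nr)
		return 0;
	entries = malloc(nr * sizeof(*entries));
	if (!entries)
		return -1;

	for (i = 0; i < nr; i++) {
		entries[i].offset = offsets[i];
		entries[i].index_pos = i;
	}
	qsort(entries, nr, sizeof(*entries), cmp_by_offset);

	for (i = 0; i < nr; i++)
		put_be32(out + 4 * (size_t)i, entries[i].index_pos);

	free(entries);
	return 0;
}
```

Object offsets within a pack are distinct, so the order produced by the
comparison does not depend on the sort being stable.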
From patchwork Wed Jan 13 22:28:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12018321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 707EAC43331 for ; Thu, 14 Jan 2021 02:09:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3BD13236F9 for ; Thu, 14 Jan 2021 02:09:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730068AbhANCFe (ORCPT ); Wed, 13 Jan 2021 21:05:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56070 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729301AbhAMW3E (ORCPT ); Wed, 13 Jan 2021 17:29:04 -0500 Received: from mail-qv1-xf2a.google.com (mail-qv1-xf2a.google.com [IPv6:2607:f8b0:4864:20::f2a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 700FBC06179F for ; Wed, 13 Jan 2021 14:28:19 -0800 (PST) Received: by mail-qv1-xf2a.google.com with SMTP id l14so1572095qvh.2 for ; Wed, 13 Jan 2021 14:28:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=lLG+N2p2KpxAaNbICcrwZ6WiOzLWw9UinWhedVeYc9M=; b=ztWIcQQfZn4P80w6i7IH47QI2qzSLKp4lt4aQCFlVE1aAa/OR63wayGdDFDxGL3A3q UfTtWlYRPz7izNe4kEHyJ8CRlVuOlfybUVYILo4QOZqRuyTM0SZ4BY1jBs0Hjprq6uCt 
W9vvQWb1a1k0Dgk8IAHO3PC7DZx2Kr6JrvUvA38OYxqO3muf1G2I5EIFUzV9urEKRYF+ Pbj79aPpVrSil3bSvn41heq0yP73fjbvTkCnU9ru9NfC8S9KfaCWq4ywZgcqTOHhoAcq k/Odxo4ba893wuLm/OjRp+cCB8vXlMO99ZMoXpFMgvXbAiGehH6MXkrr8tRJH/VwbwNe uR3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=lLG+N2p2KpxAaNbICcrwZ6WiOzLWw9UinWhedVeYc9M=; b=PBIHITJo2lYDsIX3/I/cVQU6Y5pF57uGC51ZVns+HSMjomg3hy4YnecqyncqEBUiLb bQb/LBrCvcSPEFP+1ONGHNIeeoPnOH23X6fs9nqTs9RDjK0Aqo/A/mLE2xwIDk0Bd4x2 iXKlecSa1IAfr9qeERdqZJ36dKalyzW8NPHD2aX5fJHD3L1bpIkZcnWoTGqFiPB7Napy LVbY780/nxXassamm4wgXGub2VdI5w7IHzZE8SDMUYFgnI/HjZt5r0HZQnbfpmxok8m1 Ne8Pe13Vd8jxIm/uUvJc1CFJ99DNUqYGxbnaVOtCuV3g27VVZM13ze0Oklgb8HFQ+KBb +xXQ== X-Gm-Message-State: AOAM532aQ3/5BcXSZ20b56ttJT8Sf024efrbTp8eYEJLc9LhpIVBT5hX S3UgeLymSJMofXDDpmDG1wMQcDgHjjUsng== X-Google-Smtp-Source: ABdhPJyDDJzU/wqgnxxatE+hrLaUv7SPZsg3Lm0CO/gwnrPTDfA8WjplrIX3/V3VWxhbRkftsvvpRA== X-Received: by 2002:a05:6214:1511:: with SMTP id e17mr4829376qvy.4.1610576898303; Wed, 13 Jan 2021 14:28:18 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:b172:2e4c:efe4:db53]) by smtp.gmail.com with ESMTPSA id q20sm1969340qkj.49.2021.01.13.14.28.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Jan 2021 14:28:17 -0800 (PST) Date: Wed, 13 Jan 2021 17:28:15 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net Subject: [PATCH v2 3/8] builtin/index-pack.c: write reverse indexes Message-ID: <5b18ada61113faa9dc1de584366cb39b6a449ec6.1610576805.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach 'git index-pack' to optionally write and verify reverse index with '--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex' configuration 
option. Signed-off-by: Taylor Blau --- Documentation/git-index-pack.txt | 20 ++++++--- builtin/index-pack.c | 64 +++++++++++++++++++++++----- t/t5325-reverse-index.sh | 71 ++++++++++++++++++++++++++++++++ 3 files changed, 139 insertions(+), 16 deletions(-) create mode 100755 t/t5325-reverse-index.sh diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index af0c26232c..b65f380269 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive SYNOPSIS -------- [verse] -'git index-pack' [-v] [-o ] +'git index-pack' [-v] [-o ] [--[no-]rev-index] 'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o ] - [] + [--[no-]rev-index] [] DESCRIPTION ----------- Reads a packed archive (.pack) from the specified file, and -builds a pack index file (.idx) for it. The packed archive -together with the pack index can then be placed in the -objects/pack/ directory of a Git repository. +builds a pack index file (.idx) for it. Optionally writes a +reverse-index (.rev) for the specified pack. The packed +archive together with the pack index can then be placed in +the objects/pack/ directory of a Git repository. OPTIONS @@ -33,7 +34,14 @@ OPTIONS file is constructed from the name of packed archive file by replacing .pack with .idx (and the program fails if the name of packed archive does not end - with .pack). + with .pack). Incompatible with `--rev-index`. + +--[no-]rev-index:: + When this flag is provided, generate a reverse index + (a `.rev` file) corresponding to the given pack. If + `--verify` is given, ensure that the existing + reverse index is correct. Takes precedence over + `pack.writeReverseIndex`. 
 
 --stdin::
 	When this flag is provided, the pack is read from stdin
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4b8d86e0ad..03408250b1 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -17,7 +17,7 @@
 #include "promisor-remote.h"
 
 static const char index_pack_usage[] =
-"git index-pack [-v] [-o <index-file>] [--keep | --keep=<msg>] [--verify] [--strict] (<pack-file> | --stdin [--fix-thin] [<pack-file>])";
+"git index-pack [-v] [-o <index-file>] [--keep | --keep=<msg>] [--[no-]rev-index] [--verify] [--strict] (<pack-file> | --stdin [--fix-thin] [<pack-file>])";
 
 struct object_entry {
 	struct pack_idx_entry idx;
@@ -1436,13 +1436,13 @@ static void fix_unresolved_deltas(struct hashfile *f)
 	free(sorted_by_pos);
 }
 
-static const char *derive_filename(const char *pack_name, const char *suffix,
-				   struct strbuf *buf)
+static const char *derive_filename(const char *pack_name, const char *strip,
+				   const char *suffix, struct strbuf *buf)
 {
 	size_t len;
-	if (!strip_suffix(pack_name, ".pack", &len))
-		die(_("packfile name '%s' does not end with '.pack'"),
-		    pack_name);
+	if (!strip_suffix(pack_name, strip, &len))
+		die(_("packfile name '%s' does not end with '%s'"),
+		    pack_name, strip);
 	strbuf_add(buf, pack_name, len);
 	strbuf_addch(buf, '.');
 	strbuf_addstr(buf, suffix);
@@ -1459,7 +1459,7 @@ static void write_special_file(const char *suffix, const char *msg,
 	int msg_len = strlen(msg);
 
 	if (pack_name)
-		filename = derive_filename(pack_name, suffix, &name_buf);
+		filename = derive_filename(pack_name, ".pack", suffix, &name_buf);
 	else
 		filename = odb_pack_name(&name_buf, hash, suffix);
 
@@ -1484,12 +1484,14 @@ static void write_special_file(const char *suffix, const char *msg,
 
 static void final(const char *final_pack_name, const char *curr_pack_name,
 		  const char *final_index_name, const char *curr_index_name,
+		  const char *final_rev_index_name, const char *curr_rev_index_name,
 		  const char *keep_msg, const char *promisor_msg,
 		  unsigned char *hash)
 {
 	const char *report = "pack";
 	struct strbuf pack_name = STRBUF_INIT;
 	struct strbuf index_name = STRBUF_INIT;
+	struct strbuf rev_index_name = STRBUF_INIT;
 	int err;
 
 	if (!from_stdin) {
@@ -1524,6 +1526,16 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 	} else
 		chmod(final_index_name, 0444);
 
+	if (curr_rev_index_name) {
+		if (final_rev_index_name != curr_rev_index_name) {
+			if (!final_rev_index_name)
+				final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev");
+			if (finalize_object_file(curr_rev_index_name, final_rev_index_name))
+				die(_("cannot store reverse index file"));
+		} else
+			chmod(final_rev_index_name, 0444);
+	}
+
 	if (do_fsck_object) {
 		struct packed_git *p;
 		p = add_packed_git(final_index_name, strlen(final_index_name), 0);
@@ -1553,6 +1565,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 		}
 	}
 
+	strbuf_release(&rev_index_name);
 	strbuf_release(&index_name);
 	strbuf_release(&pack_name);
 }
@@ -1578,6 +1591,12 @@ static int git_index_pack_config(const char *k, const char *v, void *cb)
 		}
 		return 0;
 	}
+	if (!strcmp(k, "pack.writereverseindex")) {
+		if (git_config_bool(k, v))
+			opts->flags |= WRITE_REV;
+		else
+			opts->flags &= ~WRITE_REV;
+	}
 	return git_default_config(k, v, cb);
 }
 
@@ -1695,12 +1714,14 @@ static void show_pack_info(int stat_only)
 
 int cmd_index_pack(int argc, const char **argv, const char *prefix)
 {
-	int i, fix_thin_pack = 0, verify = 0, stat_only = 0;
+	int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index;
 	const char *curr_index;
-	const char *index_name = NULL, *pack_name = NULL;
+	const char *curr_rev_index = NULL;
+	const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL;
 	const char *keep_msg = NULL;
 	const char *promisor_msg = NULL;
 	struct strbuf index_name_buf = STRBUF_INIT;
+	struct strbuf rev_index_name_buf = STRBUF_INIT;
 	struct pack_idx_entry **idx_objects;
 	struct pack_idx_option opts;
 	unsigned char pack_hash[GIT_MAX_RAWSZ];
@@ -1727,6 +1748,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (prefix && chdir(prefix))
 		die(_("Cannot come back to cwd"));
 
+	rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV));
+
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
 
@@ -1805,6 +1828,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 				if (hash_algo == GIT_HASH_UNKNOWN)
 					die(_("unknown hash algorithm '%s'"), arg);
 				repo_set_hash_algo(the_repository, hash_algo);
+			} else if (!strcmp(arg, "--rev-index")) {
+				rev_index = 1;
+			} else if (!strcmp(arg, "--no-rev-index")) {
+				rev_index = 0;
 			} else
 				usage(index_pack_usage);
 			continue;
@@ -1824,7 +1851,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	if (from_stdin && hash_algo)
 		die(_("--object-format cannot be used with --stdin"));
 	if (!index_name && pack_name)
-		index_name = derive_filename(pack_name, "idx", &index_name_buf);
+		index_name = derive_filename(pack_name, ".pack", "idx", &index_name_buf);
+
+	opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY);
+	if (rev_index) {
+		opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV;
+		if (index_name)
+			rev_index_name = derive_filename(index_name,
+							 ".idx", "rev",
+							 &rev_index_name_buf);
+	}
 
 	if (verify) {
 		if (!index_name)
@@ -1878,11 +1914,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
 	curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash);
+	if (rev_index)
+		curr_rev_index = write_rev_file(rev_index_name, idx_objects,
+						nr_objects, pack_hash,
+						opts.flags);
 	free(idx_objects);
 
 	if (!verify)
 		final(pack_name, curr_pack,
 		      index_name, curr_index,
+		      rev_index_name, curr_rev_index,
 		      keep_msg, promisor_msg,
 		      pack_hash);
 	else
@@ -1893,10 +1934,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 
 	free(objects);
 	strbuf_release(&index_name_buf);
+	strbuf_release(&rev_index_name_buf);
 	if (pack_name == NULL)
 		free((void *) curr_pack);
 	if (index_name == NULL)
 		free((void *) curr_index);
+	if (rev_index_name == NULL)
+		free((void *) curr_rev_index);
 
 	/*
 	 * Let the caller know this pack is not self contained
diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh
new file mode 100755
index 0000000000..2dae213126
--- /dev/null
+++ b/t/t5325-reverse-index.sh
@@ -0,0 +1,71 @@
+#!/bin/sh
+
+test_description='on-disk reverse index'
+. ./test-lib.sh
+
+packdir=.git/objects/pack
+
+test_expect_success 'setup' '
+	test_commit base &&
+
+	pack=$(git pack-objects --all $packdir/pack) &&
+	rev=$packdir/pack-$pack.rev &&
+
+	test_path_is_missing $rev
+'
+
+test_index_pack () {
+	rm -f $rev &&
+	conf=$1 &&
+	shift &&
+	# remove the index since Windows won't overwrite an existing file
+	rm $packdir/pack-$pack.idx &&
+	git -c pack.writeReverseIndex=$conf index-pack "$@" \
+		$packdir/pack-$pack.pack
+}
+
+test_expect_success 'index-pack with pack.writeReverseIndex' '
+	test_index_pack "" &&
+	test_path_is_missing $rev &&
+
+	test_index_pack false &&
+	test_path_is_missing $rev &&
+
+	test_index_pack true &&
+	test_path_is_file $rev
+'
+
+test_expect_success 'index-pack with --[no-]rev-index' '
+	for conf in "" true false
+	do
+		test_index_pack "$conf" --rev-index &&
+		test_path_exists $rev &&
+
+		test_index_pack "$conf" --no-rev-index &&
+		test_path_is_missing $rev
+	done
+'
+
+test_expect_success 'index-pack can verify reverse indexes' '
+	test_when_finished "rm -f $rev" &&
+	test_index_pack true &&
+
+	test_path_is_file $rev &&
+	git index-pack --rev-index --verify $packdir/pack-$pack.pack &&
+
+	# Intentionally corrupt the reverse index.
+	chmod u+w $rev &&
+	printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc &&
+
+	test_must_fail git index-pack --rev-index --verify \
+		$packdir/pack-$pack.pack 2>err &&
+	grep "validation error" err
+'
+
+test_expect_success 'index-pack infers reverse index name with -o' '
+	git index-pack --rev-index -o other.idx $packdir/pack-$pack.pack &&
+	test_path_is_file other.idx &&
+	test_path_is_file other.rev
+'
+
+test_done

From patchwork Wed Jan 13 22:28:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12018313
Date: Wed, 13 Jan 2021 17:28:19 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net
Subject: [PATCH v2 4/8] builtin/pack-objects.c: respect 'pack.writeReverseIndex'
Message-ID: <68bde3ea972f5b3753d7e9063d0490c67c74709b.1610576805.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Now that we have an implementation that can write the new reverse index
format, enable writing a .rev file in 'git pack-objects' by consulting
the pack.writeReverseIndex configuration variable.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c   |  7 +++++++
 t/t5325-reverse-index.sh | 13 +++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5b0c4489e2..d784569200 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2955,6 +2955,13 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 			    pack_idx_opts.version);
 		return 0;
 	}
+	if (!strcmp(k, "pack.writereverseindex")) {
+		if (git_config_bool(k, v))
+			pack_idx_opts.flags |= WRITE_REV;
+		else
+			pack_idx_opts.flags &= ~WRITE_REV;
+		return 0;
+	}
 	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
 		struct configured_exclusion *ex = xmalloc(sizeof(*ex));
 		const char *oid_end, *pack_end;
diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh
index 2dae213126..87040263b7 100755
--- a/t/t5325-reverse-index.sh
+++ b/t/t5325-reverse-index.sh
@@ -68,4 +68,17 @@ test_expect_success 'index-pack infers reverse index name with -o' '
 	test_path_is_file other.rev
 '
 
+test_expect_success 'pack-objects respects pack.writeReverseIndex' '
+	test_when_finished "rm -fr pack-1-*" &&
+
+	git -c pack.writeReverseIndex= pack-objects --all pack-1 &&
+	test_path_is_missing pack-1-*.rev &&
+
+	git -c pack.writeReverseIndex=false pack-objects --all pack-1 &&
+	test_path_is_missing pack-1-*.rev &&
+
+	git -c pack.writeReverseIndex=true pack-objects --all pack-1 &&
+	test_path_is_file pack-1-*.rev
+'
+
 test_done

From patchwork Wed Jan 13 22:28:24 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12018315
Date: Wed, 13 Jan 2021 17:28:24 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net
Subject: [PATCH v2 5/8] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
Message-ID: <38a253d0ce29868000d7659158b1a1d5b841cdc7.1610576805.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Now that the pack.writeReverseIndex configuration is respected in both
'git index-pack' and 'git pack-objects' (and therefore, all of their
callers), we can safely advertise it for use in the git-config manual.
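
As a rough illustration only (it is not part of this series): the setting
applies to packs that Git writes from now on, so packs that already exist
may still lack a '.rev' file. The sketch below lists such packs. The helper
name `packs_missing_rev` is hypothetical, and it assumes the
`pack-*.pack` / `pack-*.rev` naming that the tests in this series use.

```shell
# packs_missing_rev DIR
# Print every pack-*.pack in DIR that has no pack-*.rev next to it.
# Hypothetical helper for illustration; it is not provided by Git.
packs_missing_rev () {
	for pack in "$1"/pack-*.pack
	do
		# An unmatched glob stays a literal pattern; skip it.
		test -f "$pack" || continue
		rev=${pack%.pack}.rev
		test -f "$rev" || echo "$pack"
	done
}
```

After `git config pack.writeReverseIndex true` and a full repack, running
`packs_missing_rev .git/objects/pack` would be expected to print nothing,
assuming the repack writes its packs through 'git pack-objects'.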
Signed-off-by: Taylor Blau
---
 Documentation/config/pack.txt | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt
index 837f1b1679..3da4ea98e2 100644
--- a/Documentation/config/pack.txt
+++ b/Documentation/config/pack.txt
@@ -133,3 +133,10 @@ pack.writeBitmapHashCache::
 	between an older, bitmapped pack and objects that have been
 	pushed since the last gc). The downside is that it consumes 4
 	bytes per object of disk space. Defaults to true.
+
+pack.writeReverseIndex::
+	When true, git will write a corresponding .rev file (see:
+	link:../technical/pack-format.html[Documentation/technical/pack-format.txt])
+	for each new packfile that it writes in all places except for
+	linkgit:git-fast-import[1] and in the bulk checkin mechanism.
+	Defaults to false.

From patchwork Wed Jan 13 22:28:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12018273
Date: Wed, 13 Jan 2021 17:28:28 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net
Subject: [PATCH v2 6/8] t: prepare for GIT_TEST_WRITE_REV_INDEX
Message-ID: <12cdf2d67a30e3d082bbe0597e3ae97bb753f8bb.1610576805.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

In the next patch, we'll add support for unconditionally enabling the
'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX
environment variable.

This causes a little bit of fallout with tests that, for example,
compare the list of files in the pack directory being unprepared to see
.rev files in its output.

Those locations can be cleaned up to look for specific file extensions,
rather than take everything in the pack directory (for instance) and
then grep out unwanted items.

Once the pack.writeReverseIndex option has been thoroughly tested, we
will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX, and making
it possible to revert this patch.
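
To make the cleanup idea concrete, here is a minimal sketch of counting
files by extension instead of listing the whole pack directory. The helper
name `count_by_ext` is hypothetical (it is not part of test-lib.sh); the
point is only that a count taken this way does not change when an
unrelated '.rev' file appears next to a pack.

```shell
# count_by_ext DIR EXT
# Print how many regular files in DIR have a name ending in ".EXT".
# Hypothetical helper for illustration; not part of Git's test suite.
count_by_ext () {
	n=0
	for f in "$1"/*."$2"
	do
		# An unmatched glob stays a literal pattern, so check for a file.
		if test -f "$f"
		then
			n=$((n + 1))
		fi
	done
	echo "$n"
}
```

A test written in this style would assert on `count_by_ext $packdir pack`
and `count_by_ext $packdir idx` separately, rather than on the total number
of entries in the directory.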
Signed-off-by: Taylor Blau
---
 t/t5319-multi-pack-index.sh |  5 +++--
 t/t5325-reverse-index.sh    |  4 ++++
 t/t5604-clone-reference.sh  |  2 +-
 t/t5702-protocol-v2.sh      | 12 ++++++++----
 t/t6500-gc.sh               |  6 +++---
 t/t9300-fast-import.sh      |  5 ++++-
 6 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 297de502a9..2fc3aadbd1 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -710,8 +710,9 @@ test_expect_success 'expire respects .keep files' '
 		PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) &&
 		touch $PACKA.keep &&
 		git multi-pack-index expire &&
-		ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files &&
-		test_line_count = 3 a-pack-files &&
+		test_path_is_file $PACKA.idx &&
+		test_path_is_file $PACKA.keep &&
+		test_path_is_file $PACKA.pack &&
 		test-tool read-midx .git/objects | grep idx >midx-list &&
 		test_line_count = 2 midx-list
 	)
diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh
index 87040263b7..be452bb343 100755
--- a/t/t5325-reverse-index.sh
+++ b/t/t5325-reverse-index.sh
@@ -3,6 +3,10 @@
 test_description='on-disk reverse index'
 . ./test-lib.sh
 
+# The below tests want control over the 'pack.writeReverseIndex' setting
+# themselves to assert various combinations of it with other options.
+sane_unset GIT_TEST_WRITE_REV_INDEX
+
 packdir=.git/objects/pack
 
 test_expect_success 'setup' '
diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh
index 2f7be23044..7d93588982 100755
--- a/t/t5604-clone-reference.sh
+++ b/t/t5604-clone-reference.sh
@@ -326,7 +326,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje
 	for raw in $(ls T*.raw)
 	do
 		sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \
-		    -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 &&
+		    -e "/multi-pack-index/d" -e "/rev/d" <$raw >$raw.de-sha-1 &&
 		sort $raw.de-sha-1 >$raw.de-sha || return 1
 	done &&
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..73cd9e3ff6 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -848,8 +848,10 @@ test_expect_success 'part of packfile response provided as URI' '
 	test -f h2found &&
 
 	# Ensure that there are exactly 6 files (3 .pack and 3 .idx).
-	ls http_child/.git/objects/pack/* >filelist &&
-	test_line_count = 6 filelist
+	ls http_child/.git/objects/pack/*.pack >packlist &&
+	ls http_child/.git/objects/pack/*.idx >idxlist &&
+	test_line_count = 3 idxlist &&
+	test_line_count = 3 packlist
 '
 
 test_expect_success 'fetching with valid packfile URI but invalid hash fails' '
@@ -902,8 +904,10 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' '
 		clone "$HTTPD_URL/smart/http_parent" http_child &&
 
 	# Ensure that there are exactly 4 files (2 .pack and 2 .idx).
-	ls http_child/.git/objects/pack/* >filelist &&
-	test_line_count = 4 filelist
+	ls http_child/.git/objects/pack/*.pack >packlist &&
+	ls http_child/.git/objects/pack/*.idx >idxlist &&
+	test_line_count = 2 idxlist &&
+	test_line_count = 2 packlist
 '
 
 test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object' '
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 4a3b8f48ac..f76586f808 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -106,17 +106,17 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre
 	test_commit "$(test_oid obj2)" &&
 	# Our first gc will create a pack; our second will create a second pack
 	git gc --auto &&
-	ls .git/objects/pack | sort >existing_packs &&
+	ls .git/objects/pack/pack-*.pack | sort >existing_packs &&
 	test_commit "$(test_oid obj3)" &&
 	test_commit "$(test_oid obj4)" &&
 
 	git gc --auto 2>err &&
 	test_i18ngrep ! "^warning:" err &&
-	ls .git/objects/pack/ | sort >post_packs &&
+	ls .git/objects/pack/pack-*.pack | sort >post_packs &&
 	comm -1 -3 existing_packs post_packs >new &&
 	comm -2 -3 existing_packs post_packs >del &&
 	test_line_count = 0 del && # No packs are deleted
-	test_line_count = 2 new # There is one new pack and its .idx
+	test_line_count = 1 new # There is one new pack
 '
 
 test_expect_success 'gc --no-quiet' '
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 308c1ef42c..2cc1f43c1b 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1629,7 +1629,10 @@ test_expect_success 'O: blank lines not necessary after other commands' '
 	INPUT_END
 
 	git fast-import <input &&
+	ls -la .git/objects/pack/pack-*.pack >packlist &&
+	ls -la .git/objects/pack/pack-*.pack >idxlist &&
+	test_line_count = 4 idxlist &&
+	test_line_count = 4 packlist &&
 	test $(git rev-parse refs/tags/O3-2nd) = $(git rev-parse O3^) &&
 	git log --reverse --pretty=oneline O3 | sed s/^.*z// >actual &&
 	test_cmp expect actual

From patchwork Wed Jan 13 22:28:33 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12018319
Date: Wed, 13 Jan 2021 17:28:33 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net
Subject: [PATCH v2 7/8] t: support GIT_TEST_WRITE_REV_INDEX
Message-ID: <6b647d9775fd97238f1170dc998770c196a92cc4.1610576805.git.me@ttaylorr.com>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Add a new option that unconditionally enables the pack.writeReverseIndex
setting in order to run the whole test suite in a mode that generates
on-disk reverse indexes. Additionally, enable this mode in the second
run of tests under linux-gcc in 'ci/run-build-and-tests.sh'.

Once on-disk reverse indexes are proven out over several releases, we
can change the default value of that configuration to 'true', and drop
this patch.
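
As a hedged sketch of how the pieces combine in 'git index-pack' after this
patch, as read from the diffs in this series: the environment variable wins
over the configuration, and an explicit `--rev-index` or `--no-rev-index`
on the command line wins over both. The function name `want_rev_index` is
hypothetical, and the boolean handling is simplified on purpose: only "1"
and "true" count as true here, whereas Git's own boolean parsing accepts
more spellings.

```shell
# want_rev_index ENV_VALUE CONFIG_VALUE [ARG...]
# Print 1 if this simplified model would write a .rev file, else 0.
# Hypothetical model of the precedence in index-pack; not Git code.
want_rev_index () {
	env_val=$1
	conf_val=$2
	shift 2
	rev_index=0
	case "$env_val" in
	1|true)
		# GIT_TEST_WRITE_REV_INDEX forces the default on.
		rev_index=1
		;;
	*)
		# Otherwise the default comes from pack.writeReverseIndex.
		case "$conf_val" in
		true) rev_index=1 ;;
		esac
		;;
	esac
	# Command-line flags are parsed last, so the last one seen wins.
	for arg in "$@"
	do
		case "$arg" in
		--rev-index) rev_index=1 ;;
		--no-rev-index) rev_index=0 ;;
		esac
	done
	echo "$rev_index"
}
```

For example, `want_rev_index 1 false --no-rev-index` prints 0, which
matches the reason t5325 unsets the variable: those tests need the
configuration and flags alone to decide the outcome.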
Signed-off-by: Taylor Blau --- builtin/index-pack.c | 5 ++++- builtin/pack-objects.c | 2 ++ ci/run-build-and-tests.sh | 1 + pack-revindex.h | 2 ++ t/README | 3 +++ 5 files changed, 12 insertions(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 03408250b1..0bde325a8b 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1748,7 +1748,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + rev_index = 1; + else + rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); for (i = 1; i < argc; i++) { const char *arg = argv[i]; diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d784569200..24df0c98f7 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3601,6 +3601,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); git_config(git_pack_config, NULL); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + pack_idx_opts.flags |= WRITE_REV; progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 6c27b886b8..d1cbf330a1 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -22,6 +22,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 + export GIT_TEST_WRITE_REV_INDEX=1 make test ;; linux-clang) diff --git a/pack-revindex.h b/pack-revindex.h index 01622cf21a..7237b2b6f8 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -16,6 +16,8 @@ * can be found */ +#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" + struct packed_git; /* diff --git a/t/README b/t/README index c730a70770..0f97a51640 100644 --- a/t/README +++ b/t/README @@ -439,6 +439,9 
@@ GIT_TEST_DEFAULT_HASH= specifies which hash algorithm to use in the test scripts. Recognized values for are "sha1" and "sha256". +GIT_TEST_WRITE_REV_INDEX=, when true enables the +'pack.writeReverseIndex' setting. + Naming Tests ------------ From patchwork Wed Jan 13 22:28:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12018267 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23D53C432C3 for ; Thu, 14 Jan 2021 02:04:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE2A123442 for ; Thu, 14 Jan 2021 02:04:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727383AbhANCEh (ORCPT ); Wed, 13 Jan 2021 21:04:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729368AbhAMWbj (ORCPT ); Wed, 13 Jan 2021 17:31:39 -0500 Received: from mail-qv1-xf36.google.com (mail-qv1-xf36.google.com [IPv6:2607:f8b0:4864:20::f36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67534C0617AB for ; Wed, 13 Jan 2021 14:28:41 -0800 (PST) Received: by mail-qv1-xf36.google.com with SMTP id h13so1581247qvo.1 for ; Wed, 13 Jan 2021 14:28:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version 
         :content-disposition:in-reply-to;
        bh=F+IzFX+eJQuZjoTzcoY0Nicg1W0SyqzlGAWjYAPf4iQ=;
        b=QHKNbPBpfrN4IarkRf0SGVkcnOUVB8oBckJL6aH+YFGjHNGGwOtD/+Jj/IOFqLLHz3
         LuBeuAA7l1cz0JHKfnh+J1WteDjrRk6Rbd6BaI7nOqVRTCrrxjp8ak1rskDeg5QIQJIY
         w1D/PiQPF98eGx/T76H0dbH2Fp80rMJWfSwZBuE6joMmUGyFArRQf43IEM7XEJejT0Us
         9EkESTpF0Cf2fLQvs5p6VL2aALsMbCbdPo2266y71bEE9JmZ520mFzBZaRgJk3UQ8iQ6
         E3XnZJnhpzuvidqWSYT9k0OtFIMe6I4tKaMxgAeklLJ7TIFG+TT3KiZm7xMJrSONWpIl
         PD4w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:from:to:cc:subject:message-id:references
         :mime-version:content-disposition:in-reply-to;
        bh=F+IzFX+eJQuZjoTzcoY0Nicg1W0SyqzlGAWjYAPf4iQ=;
        b=ZVkx9sJs/oZQ8l0K+Ct3juYxmRD1ks1UjfXgwJAirO+4SokWS+Zb20HdSngg8YcYRT
         einLMVaDRE22VkV+lWOkRuJTYCQ02FYk9lF6FCH6LR1S0IaENRdrNMwFywp3S7a0fZNd
         qsZu8HyF/WhZjRXMPbP3m/3hdk9r2EsYqUJ8FciOOVU8vN9UOS4+MseMb8Ny0RJ6OIjU
         9NP7QQADX5CuYzQ0fLhGUQhntdFOhQXbkLObMLT1kGGGfFwzKSCdiSThQ+FLBnZUcicq
         yGWmiyIYH0SxaIB0nS7OGCEGyL42XOj8P17UFUk9YrBPQ1cnS2eYIpP2Vq4lUDkEUwJn
         eYRg==
X-Gm-Message-State: AOAM533ASN3nukmpunjb2LssWQY1cexEKCucyH5NRTA8W/eXD2KZUmeO
        8kyQ9LntE10WYei0N2iZ5GdTruCRbeo15Q==
X-Google-Smtp-Source: ABdhPJzGNSkT/PrbAgwnFWJRJiX6Bi3ay+16WRDPNNHqRl96F+5AOAH5HhNrXKJ0WIonJ4SSab6NnA==
X-Received: by 2002:a0c:b4a8:: with SMTP id c40mr4807079qve.60.1610576920452;
        Wed, 13 Jan 2021 14:28:40 -0800 (PST)
Received: from localhost ([2605:9480:22e:ff10:b172:2e4c:efe4:db53])
        by smtp.gmail.com with ESMTPSA id c7sm1791307qtw.70.2021.01.13.14.28.39
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 13 Jan 2021 14:28:39 -0800 (PST)
Date: Wed, 13 Jan 2021 17:28:38 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: dstolee@microsoft.com, gitster@pobox.com, jrnieder@gmail.com, peff@peff.net
Subject: [PATCH v2 8/8] pack-revindex: ensure that on-disk reverse indexes are given precedence
Message-ID: <48926ae1821f026c00c6237b771e0f9150b8b267.1610576805.git.me@ttaylorr.com>
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

When an on-disk reverse index exists, there is no need to generate one
in memory. In fact, doing so can be slow and can require a large amount
of heap memory.

Let's make sure that we give the on-disk reverse index precedence (i.e.,
that when it exists, we don't bother trying to generate an equivalent
one in memory) by teaching Git how to conditionally die() when
generating a reverse index in memory.

Then, add a test to ensure that when (a) an on-disk reverse index
exists and (b) GIT_TEST_REV_INDEX_DIE_IN_MEMORY is set, we do not die,
implying that we read from the on-disk one.

Signed-off-by: Taylor Blau
---
 pack-revindex.c          | 4 ++++
 pack-revindex.h          | 1 +
 t/t5325-reverse-index.sh | 9 +++++++++
 3 files changed, 14 insertions(+)

diff --git a/pack-revindex.c b/pack-revindex.c
index 369812dd21..f264319f34 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -2,6 +2,7 @@
 #include "pack-revindex.h"
 #include "object-store.h"
 #include "packfile.h"
+#include "config.h"
 
 struct revindex_entry {
 	off_t offset;
@@ -166,6 +167,9 @@ static void create_pack_revindex(struct packed_git *p)
 
 static int load_pack_revindex_from_memory(struct packed_git *p)
 {
+	if (git_env_bool(GIT_TEST_REV_INDEX_DIE_IN_MEMORY, 0))
+		die("dying as requested by '%s'",
+		    GIT_TEST_REV_INDEX_DIE_IN_MEMORY);
 	if (open_pack_index(p))
 		return -1;
 	create_pack_revindex(p);
diff --git a/pack-revindex.h b/pack-revindex.h
index 7237b2b6f8..97f5893d3a 100644
--- a/pack-revindex.h
+++ b/pack-revindex.h
@@ -17,6 +17,7 @@
  */
 
 #define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX"
+#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY"
 
 struct packed_git;
 
diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh
index be452bb343..a344b18d7e 100755
--- a/t/t5325-reverse-index.sh
+++ b/t/t5325-reverse-index.sh
@@ -85,4 +85,13 @@ test_expect_success 'pack-objects respects pack.writeReverseIndex' '
 	test_path_is_file pack-1-*.rev
 '
 
+test_expect_success 'reverse index is not generated when available on disk' '
+	test_index_pack true &&
+	test_path_is_file $rev &&
+
+	git rev-parse HEAD >tip &&
+	GIT_TEST_REV_INDEX_DIE_IN_MEMORY=1 git cat-file \
+		--batch-check="%(objectsize:disk)"