From patchwork Mon Nov 29 22:25:05 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12645909
Date: Mon, 29 Nov 2021 17:25:05 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu
Subject: [PATCH 01/17] Documentation/technical: add cruft-packs.txt

Create a technical document to explain cruft packs. It contains a brief
overview of the problem, some background, details on the implementation,
and a couple of alternative approaches not considered here.

Signed-off-by: Taylor Blau
---
 Documentation/Makefile                  |  1 +
 Documentation/technical/cruft-packs.txt | 95 +++++++++++++++++++++++++
 2 files changed, 96 insertions(+)
 create mode 100644 Documentation/technical/cruft-packs.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index ed656db2ae..0b01c9408e 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -91,6 +91,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += technical/bundle-format
+TECH_DOCS += technical/cruft-packs
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/cruft-packs.txt b/Documentation/technical/cruft-packs.txt
new file mode 100644
index 0000000000..bb54cce1b1
--- /dev/null
+++ b/Documentation/technical/cruft-packs.txt
@@ -0,0 +1,95 @@
+= Cruft packs
+
+Cruft packs offer an alternative to Git's traditional mechanism of removing
+unreachable objects. This document provides an overview of Git's pruning
+mechanism, and how cruft packs can be used instead to accomplish the same.
+
+== Background
+
+To remove unreachable objects from your repository, Git offers `git repack -Ad`
+(see linkgit:git-repack[1]). Quoting from the documentation:
+
+[quote]
+[...] unreachable objects in a previous pack become loose, unpacked objects,
+instead of being left in the old pack. [...] loose unreachable objects will be
+pruned according to normal expiry rules with the next 'git gc' invocation.
+
+Unreachable objects aren't removed immediately, since doing so could race with
+an incoming push which may reference an object which is about to be deleted.
+Instead, those unreachable objects are stored as loose objects and stay that way
+until they are older than the expiration window, at which point they are removed
+by linkgit:git-prune[1].
+
+Git must store these unreachable objects loose in order to keep track of their
+per-object mtimes. If these unreachable objects were written into one big pack,
+then either freshening that pack (because an object contained within it was
+re-written) or creating a new pack of unreachable objects would cause the pack's
+mtime to get updated, and the objects within it would never leave the expiration
+window. Instead, objects are stored loose in order to keep track of the
+individual object mtimes and avoid a situation where all cruft objects are
+freshened at once.
+
+This can lead to undesirable situations when a repository contains many
+unreachable objects which have not yet left the grace period. Having large
+directories in the shards of `.git/objects` can lead to decreased performance in
+the repository. Given enough unreachable objects, it can even lead to inode
+starvation and degrade the performance of the whole system. Because those
+objects can never be packed, these repositories often take up a large amount of
+disk space: each object can only be zlib-compressed individually, not stored in
+delta chains.
+
+== Cruft packs
+
+Cruft packs are designed to eliminate the need for storing unreachable objects
+in a loose state by including the per-object mtimes in a separate file alongside
+a single pack containing all loose objects.
+
+A cruft pack is written by `git repack --cruft` when generating a new pack,
+using linkgit:git-pack-objects[1]'s `--cruft` option. Note that `git repack --cruft`
+is a classic all-into-one repack, meaning that everything in the resulting pack is
+reachable, and everything else is unreachable. Once written, the `--cruft`
+option instructs `git repack` to generate another pack containing only objects
+not packed in the previous step (which equates to packing all unreachable
+objects together). This progresses as follows:
+
+  1. Enumerate every object, marking as a traversal tip any object which (a) is
+     not contained in a kept-pack, and (b) has an mtime within the grace
+     period.
+
+  2. Perform a reachability traversal based on the tips gathered in the previous
+     step, adding every object along the way to the pack.
+
+  3. Write the pack out, along with a `.mtimes` file that records the per-object
+     timestamps.
+
+This mode is invoked internally by linkgit:git-repack[1] when instructed to
+write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
+of packs which will not be deleted by the repack; in other words, they contain
+all of the repository's reachable objects.
+
+When a repository already has a cruft pack, `git repack --cruft` typically only
+adds objects to it. An exception to this is when `git repack` is given the
+`--cruft-expiration` option, which allows the generated cruft pack to omit
+expired objects instead of waiting for linkgit:git-gc[1] to expire those objects
+later on.
+
+It is linkgit:git-gc[1] that is typically responsible for removing expired
+unreachable objects.
+
+== Alternatives
+
+Notable alternatives to this design include:
+
+  - The location of the per-object mtime data, and
+  - Whether cruft packs should be incremental or not.
+
+On the location of mtime data, a new auxiliary file tied to the pack was chosen
+to avoid complicating the `.idx` format. If the `.idx` format were ever to gain
+support for optional chunks of data, it may make sense to consolidate the
+`.mtimes` format into the `.idx` itself.
+
+Incremental cruft packs (i.e., where each time a repository is repacked a new
+cruft pack is generated containing only the unreachable objects introduced since
+the last time a cruft pack was written) are significantly more complicated to
+construct, and so aren't pursued here. The obvious drawback to the current
+implementation is that the entire cruft pack must be re-written from scratch.
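
Reviewer's illustration (not part of the patch): step 1 of the cruft-pack
procedure described in cruft-packs.txt can be sketched roughly as follows.
This is a minimal sketch, not Git's implementation. `struct sketch_object`,
`sketch_is_tip()` and `sketch_count_tips()` are hypothetical names, and
reading "within the grace period" as "mtime at or after the expiration
cutoff" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified object record; Git's real structures differ. */
struct sketch_object {
	uint32_t mtime;   /* last-modified time, in epoch seconds */
	int in_kept_pack; /* non-zero if the object lives in a kept pack */
};

/*
 * Step 1: an object is a traversal tip when it is (a) not in a kept
 * pack and (b) still within the grace period. "Within the grace period"
 * is assumed here to mean mtime >= cutoff.
 */
static int sketch_is_tip(const struct sketch_object *obj, uint32_t cutoff)
{
	assert(obj != NULL);
	return !obj->in_kept_pack && obj->mtime >= cutoff;
}

/* Count how many objects would seed the reachability traversal. */
static size_t sketch_count_tips(const struct sketch_object *objs, size_t nr,
				uint32_t cutoff)
{
	size_t i, tips = 0;

	for (i = 0; i < nr; i++)
		if (sketch_is_tip(&objs[i], cutoff))
			tips++;
	return tips;
}
```

Steps 2 and 3 (the reachability walk and writing the pack with its `.mtimes`
file) are left out, since they depend on Git's object database.
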
From patchwork Mon Nov 29 22:25:08 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12645911
Date: Mon, 29 Nov 2021 17:25:08 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu
Subject: [PATCH 02/17] pack-mtimes: support reading .mtimes files
Message-ID: <7d4ae7bd3e28e2ec904abb37b6f26505e37531c5.1638224692.git.me@ttaylorr.com>

To store the individual mtimes of objects in a cruft pack, introduce a
new `.mtimes` format that can optionally accompany a single pack in the
repository.

The format is defined in Documentation/technical/pack-format.txt, and
stores a 4-byte network order timestamp for each object in name (index)
order.

This patch prepares for cruft packs by defining the `.mtimes` format,
and introducing a basic API that callers can use to read out individual
mtimes.

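Reviewer's illustration (not part of the patch): the layout this patch adds
to pack-format.txt (a 12-byte header, one 4-byte big-endian mtime per object
in index order, then two checksums) can be read from an in-memory buffer
roughly as below. This is a minimal sketch under stated assumptions, not the
code in pack-mtimes.c. All `sketch_*` names are hypothetical, the buffer is
assumed to hold the whole file, and the trailer checksums are not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MTIMES_SIGNATURE 0x4d544d45u /* "MTME" */
#define SKETCH_MTIMES_HEADER_SIZE 12 /* signature + version + hash id */

/* Read a 4-byte network-order (big-endian) integer. */
static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Check the header and size of an in-memory .mtimes image.
 * hash_rawsz is the raw hash size (20 for SHA-1, 32 for SHA-256); the
 * trailer holds two such checksums, which this sketch does not verify.
 * Returns 0 if the image looks well-formed, -1 otherwise.
 */
static int sketch_check_mtimes(const unsigned char *buf, size_t len,
			       uint32_t num_objects, size_t hash_rawsz)
{
	size_t min_size = SKETCH_MTIMES_HEADER_SIZE + 2 * hash_rawsz;
	size_t table_size;
	uint32_t hash_id;

	assert(buf != NULL);
	if (len < min_size)
		return -1; /* too small for header plus trailer */
	table_size = len - min_size;
	if (table_size % 4 || table_size / 4 != num_objects)
		return -1; /* not exactly one 4-byte entry per object */
	if (sketch_get_be32(buf) != SKETCH_MTIMES_SIGNATURE)
		return -1;
	if (sketch_get_be32(buf + 4) != 1)
		return -1; /* only version 1 is described */
	hash_id = sketch_get_be32(buf + 8);
	if (hash_id != 1 && hash_id != 2)
		return -1; /* 1 = SHA-1, 2 = SHA-256 */
	return 0;
}

/* Return the mtime of the object at index position pos. */
static uint32_t sketch_nth_mtime(const unsigned char *buf, uint32_t pos)
{
	return sketch_get_be32(buf + SKETCH_MTIMES_HEADER_SIZE + 4 * (size_t)pos);
}
```

Skipping the 12-byte header here corresponds to the `+ 3` offset (three
`uint32_t` words) that `nth_packed_mtime()` applies to the mapped file.
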
Signed-off-by: Taylor Blau
---
 Documentation/technical/pack-format.txt |  22 ++++
 Makefile                                |   1 +
 builtin/repack.c                        |   1 +
 object-store.h                          |   5 +-
 pack-mtimes.c                           | 139 ++++++++++++++++++++++++
 pack-mtimes.h                           |  16 +++
 packfile.c                              |  18 ++-
 packfile.h                              |   1 +
 8 files changed, 200 insertions(+), 3 deletions(-)
 create mode 100644 pack-mtimes.c
 create mode 100644 pack-mtimes.h

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 8d2f42f29e..61d8d960e7 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -294,6 +294,28 @@ Pack file entry: <+
 
 All 4-byte numbers are in network order.
 
+== pack-*.mtimes files have the format:
+
+  - A 4-byte magic number '0x4d544d45' ('MTME').
+
+  - A 4-byte version identifier (= 1).
+
+  - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
+
+  - A table of mtimes (one per packed object, num_objects in total, each
+    a 4-byte unsigned integer in network order), in the same order as
+    objects appear in the index file (e.g., the first entry in the mtime
+    table corresponds to the object with the lowest lexically-sorted
+    oid). The mtimes count standard epoch seconds.
+
+  - A trailer, containing a:
+
+    checksum of the corresponding packfile, and
+
+    a checksum of all of the above.
+
+All 4-byte numbers are in network order.
+
 == multi-pack-index (MIDX) files have the following format:
 
 The multi-pack-index files refer to multiple pack-files and loose objects.
diff --git a/Makefile b/Makefile
index 12be39ac49..efd5e00717 100644
--- a/Makefile
+++ b/Makefile
@@ -949,6 +949,7 @@ LIB_OBJS += oidtree.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-check.o
+LIB_OBJS += pack-mtimes.o
 LIB_OBJS += pack-objects.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
diff --git a/builtin/repack.c b/builtin/repack.c
index 0b2d1e5d82..acbb7b8c3b 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -212,6 +212,7 @@ static struct {
 } exts[] = {
 	{".pack"},
 	{".rev", 1},
+	{".mtimes", 1},
 	{".bitmap", 1},
 	{".promisor", 1},
 	{".idx"},
diff --git a/object-store.h b/object-store.h
index 952efb6a4b..d87481f101 100644
--- a/object-store.h
+++ b/object-store.h
@@ -89,12 +89,15 @@ struct packed_git {
 		 freshened:1,
 		 do_not_close:1,
 		 pack_promisor:1,
-		 multi_pack_index:1;
+		 multi_pack_index:1,
+		 is_cruft:1;
 	unsigned char hash[GIT_MAX_RAWSZ];
 	struct revindex_entry *revindex;
 	const uint32_t *revindex_data;
 	const uint32_t *revindex_map;
 	size_t revindex_size;
+	const uint32_t *mtimes_map;
+	size_t mtimes_size;
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
 };
diff --git a/pack-mtimes.c b/pack-mtimes.c
new file mode 100644
index 0000000000..4c7c00fa67
--- /dev/null
+++ b/pack-mtimes.c
@@ -0,0 +1,139 @@
+#include "pack-mtimes.h"
+#include "object-store.h"
+#include "packfile.h"
+
+static char *pack_mtimes_filename(struct packed_git *p)
+{
+	size_t len;
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		BUG("pack_name does not end in .pack");
+	/* NEEDSWORK: this could reuse code from pack-revindex.c. */
+	return xstrfmt("%.*s.mtimes", (int)len, p->pack_name);
+}
+
+int pack_has_mtimes(struct packed_git *p)
+{
+	struct stat st;
+	char *fname = pack_mtimes_filename(p);
+	int ret = 1;
+
+	if (stat(fname, &st) < 0) {
+		if (errno != ENOENT)
+			die_errno(_("could not stat %s"), fname);
+		ret = 0; /* no .mtimes file; still free fname below */
+	}
+	free(fname);
+	return ret;
+}
+
+#define MTIMES_HEADER_SIZE (12)
+#define MTIMES_MIN_SIZE (MTIMES_HEADER_SIZE + (2 * the_hash_algo->rawsz))
+
+struct mtimes_header {
+	uint32_t signature;
+	uint32_t version;
+	uint32_t hash_id;
+};
+
+static int load_pack_mtimes_file(char *mtimes_file,
+				 uint32_t num_objects,
+				 const uint32_t **data_p, size_t *len_p)
+{
+	int fd, ret = 0;
+	struct stat st;
+	void *data = NULL;
+	size_t mtimes_size;
+	uint32_t *hdr;
+
+	fd = git_open(mtimes_file);
+
+	if (fd < 0) {
+		ret = -1;
+		goto cleanup;
+	}
+	if (fstat(fd, &st)) {
+		ret = error_errno(_("failed to read %s"), mtimes_file);
+		goto cleanup;
+	}
+
+	mtimes_size = xsize_t(st.st_size);
+
+	if (mtimes_size < MTIMES_MIN_SIZE) {
+		ret = error(_("mtimes file %s is too small"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (mtimes_size - MTIMES_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) {
+		ret = error(_("mtimes file %s is corrupt"), mtimes_file);
+		goto cleanup;
+	}
+
+	data = hdr = xmmap(NULL, mtimes_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+	if (ntohl(*hdr) != MTIMES_SIGNATURE) {
+		ret = error(_("mtimes file %s has unknown signature"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (ntohl(*++hdr) != MTIMES_VERSION) {
+		ret = error(_("mtimes file %s has unsupported version %"PRIu32),
+			    mtimes_file, ntohl(*hdr));
+		goto cleanup;
+	}
+	hdr++;
+	if (!(ntohl(*hdr) == 1 || ntohl(*hdr) == 2)) {
+		ret = error(_("mtimes file %s has unsupported hash id %"PRIu32),
+			    mtimes_file, ntohl(*hdr));
+		goto cleanup;
+	}
+
+cleanup:
+	if (ret) {
+		if (data)
+			munmap(data, mtimes_size);
+	} else {
+		*len_p = mtimes_size;
+		*data_p = (const uint32_t *)data;
+	}
+	if (fd >= 0)
+		close(fd);
+	return ret;
+}
+
+int load_pack_mtimes(struct packed_git *p)
+{
+	char *mtimes_name = NULL;
+	int ret = 0;
+
+	if (!p->is_cruft)
+		return ret; /* not a cruft pack */
+	if (p->mtimes_map)
+		return ret; /* already loaded */
+
+	ret = open_pack_index(p);
+	if (ret < 0)
+		goto cleanup;
+
+	mtimes_name = pack_mtimes_filename(p);
+	ret = load_pack_mtimes_file(mtimes_name,
+				    p->num_objects,
+				    &p->mtimes_map,
+				    &p->mtimes_size);
+	if (ret)
+		goto cleanup;
+
+cleanup:
+	free(mtimes_name);
+	return ret;
+}
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos)
+{
+	if (!p->mtimes_map)
+		BUG("pack .mtimes file not loaded for %s", p->pack_name);
+	if (p->num_objects <= pos)
+		BUG("pack .mtimes out-of-bounds (%"PRIu32" vs %"PRIu32")",
+		    pos, p->num_objects);
+
+	return get_be32(p->mtimes_map + pos + 3);
+}
diff --git a/pack-mtimes.h b/pack-mtimes.h
new file mode 100644
index 0000000000..ac4247bb5e
--- /dev/null
+++ b/pack-mtimes.h
@@ -0,0 +1,16 @@
+#ifndef PACK_MTIMES_H
+#define PACK_MTIMES_H
+
+#include "git-compat-util.h"
+
+#define MTIMES_SIGNATURE 0x4d544d45 /* "MTME" */
+#define MTIMES_VERSION 1
+
+struct packed_git;
+
+int pack_has_mtimes(struct packed_git *p);
+int load_pack_mtimes(struct packed_git *p);
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos);
+
+#endif
diff --git a/packfile.c b/packfile.c
index 89402cfc69..ae79ac644e 100644
--- a/packfile.c
+++ b/packfile.c
@@ -333,12 +333,21 @@ void close_pack_revindex(struct packed_git *p) {
 	p->revindex_data = NULL;
 }
 
+void close_pack_mtimes(struct packed_git *p) {
+	if (!p->mtimes_map)
+		return;
+
+	munmap((void *)p->mtimes_map, p->mtimes_size);
+	p->mtimes_map = NULL;
+}
+
 void close_pack(struct packed_git *p)
 {
 	close_pack_windows(p);
 	close_pack_fd(p);
 	close_pack_index(p);
 	close_pack_revindex(p);
+	close_pack_mtimes(p);
 	oidset_clear(&p->bad_objects);
 }
 
@@ -362,7 +371,7 @@ void close_object_store(struct raw_object_store *o)
 
 void unlink_pack_path(const char *pack_name, int force_delete)
 {
-	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"};
+	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor", ".mtimes"};
 	int i;
 	struct strbuf buf = STRBUF_INIT;
 	size_t plen;
@@ -717,6 +726,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	if (!access(p->pack_name, F_OK))
 		p->pack_promisor = 1;
 
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".mtimes");
+	if (!access(p->pack_name, F_OK))
+		p->is_cruft = 1;
+
 	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
@@ -868,7 +881,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 	    ends_with(file_name, ".pack") ||
 	    ends_with(file_name, ".bitmap") ||
 	    ends_with(file_name, ".keep") ||
-	    ends_with(file_name, ".promisor"))
+	    ends_with(file_name, ".promisor") ||
+	    ends_with(file_name, ".mtimes"))
 		string_list_append(data->garbage, full_name);
 	else
 		report_garbage(PACKDIR_FILE_GARBAGE, full_name);
diff --git a/packfile.h b/packfile.h
index 186146779d..32201d8af7 100644
--- a/packfile.h
+++ b/packfile.h
@@ -91,6 +91,7 @@ uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
 unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
 void close_pack_windows(struct packed_git *);
 void close_pack_revindex(struct packed_git *);
+void close_pack_mtimes(struct packed_git *p);
 void close_pack(struct packed_git *);
 void close_object_store(struct raw_object_store *o);
 void unuse_pack(struct pack_window **);

From patchwork Mon Nov 29 22:25:10 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12645915
Date: Mon, 29 Nov 2021 17:25:10 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu
Subject: [PATCH 03/17] pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
Message-ID: <7f4612e859dd9923014c4e8a28bf5caea84d971a.1638224692.git.me@ttaylorr.com>

This structure will be used to communicate the per-object mtimes when
writing a cruft pack. Here, we need the full packing_data structure
because the mtime information is stored in an array there, not on the
individual object_entry's themselves (to avoid paying the overhead in
structure width for operations which do not generate a cruft pack).

We haven't passed this information down before because one of the two
callers (in bulk-checkin.c) does not have a packing_data structure at
all. In that case (where no cruft pack will be generated), NULL is
passed instead.

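Reviewer's illustration (not part of the patch): the trade-off described in
this commit message, keeping mtimes in an array on the packing data rather
than widening every object entry, might look roughly like this. It is a
sketch under stated assumptions: `struct sketch_entry`,
`struct sketch_packing_data`, the `cruft_mtime` field and
`sketch_entry_mtime()` are hypothetical stand-ins, not the definitions this
series adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; Git's object_entry and packing_data differ. */
struct sketch_entry {
	uint32_t size; /* placeholder for the per-object fields */
};

struct sketch_packing_data {
	struct sketch_entry *objects;
	uint32_t nr_objects;
	/* parallel to objects; NULL unless a cruft pack is being written */
	uint32_t *cruft_mtime;
};

/*
 * Look up an entry's mtime in the side array. A NULL pack (as a caller
 * with no packing data would pass) or a NULL array yields 0.
 */
static uint32_t sketch_entry_mtime(const struct sketch_packing_data *pack,
				   const struct sketch_entry *e)
{
	if (!pack || !pack->cruft_mtime)
		return 0;
	assert(e >= pack->objects && e < pack->objects + pack->nr_objects);
	return pack->cruft_mtime[e - pack->objects];
}
```

With this shape, operations that never write a cruft pack carry only one
extra pointer per packing run instead of four extra bytes per object.
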
Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c | 3 ++-
 bulk-checkin.c         | 2 +-
 pack-write.c           | 1 +
 pack.h                 | 3 +++
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1a3dd445f8..bf45ffbc57 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1254,7 +1254,8 @@ static void write_pack_file(void)
 
 			stage_tmp_packfiles(&tmpname, pack_tmp_name,
 					    written_list, nr_written,
-					    &pack_idx_opts, hash, &idx_tmp_name);
+					    &to_pack, &pack_idx_opts, hash,
+					    &idx_tmp_name);
 
 			if (write_bitmap_index) {
 				size_t tmpname_len = tmpname.len;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8785b2ac80..99f7596c4e 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -33,7 +33,7 @@ static void finish_tmp_packfile(struct strbuf *basename,
 	char *idx_tmp_name = NULL;
 
 	stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written,
-			    pack_idx_opts, hash, &idx_tmp_name);
+			    NULL, pack_idx_opts, hash, &idx_tmp_name);
 	rename_tmp_packfile_idx(basename, &idx_tmp_name);
 
 	free(idx_tmp_name);
diff --git a/pack-write.c b/pack-write.c
index a5846f3a34..d594e3008e 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -483,6 +483,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name)
diff --git a/pack.h b/pack.h
index b22bfc4a18..fd27cfdfd7 100644
--- a/pack.h
+++ b/pack.h
@@ -109,11 +109,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
 #define PH_ERROR_PROTOCOL	(-3)
 int read_pack_header(int fd, struct pack_header *);
 
+struct packing_data;
+
 struct hashfile *create_tmp_packfile(char **pack_tmp_name);
 void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name);

From patchwork Mon Nov 29 22:25:13 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12645913
Date: Mon, 29 Nov 2021 17:25:13 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu
Subject: [PATCH 04/17] chunk-format.h: extract oid_version()

There are three definitions of an identical function which converts
`the_hash_algo` into either 1 (for SHA-1) or 2 (for SHA-256). There is
one copy each for writing the commit-graph and multi-pack-index files,
and another inline definition used to write the .rev header.

Consolidate these into a single definition in chunk-format.h. It's not
clear that this is the best header to define this function in, but it
should do for now.

(Worth noting, the .rev caller expects a 4-byte unsigned, but the other
two callers work with a single unsigned byte.
The consolidated version uses the latter type, and lets the compiler
widen it when required).

Another caller will be added in a subsequent patch.

Signed-off-by: Taylor Blau
---
 chunk-format.c | 12 ++++++++++++
 chunk-format.h |  3 +++
 commit-graph.c | 18 +++---------------
 midx.c         | 18 +++---------------
 pack-write.c   | 15 ++-------------
 5 files changed, 23 insertions(+), 43 deletions(-)

diff --git a/chunk-format.c b/chunk-format.c
index 1c3dca62e2..0275b74a89 100644
--- a/chunk-format.c
+++ b/chunk-format.c
@@ -181,3 +181,15 @@ int read_chunk(struct chunkfile *cf,
 
 	return CHUNK_NOT_FOUND;
 }
+
+uint8_t oid_version(const struct git_hash_algo *algop)
+{
+	switch (hash_algo_by_ptr(algop)) {
+	case GIT_HASH_SHA1:
+		return 1;
+	case GIT_HASH_SHA256:
+		return 2;
+	default:
+		die(_("invalid hash version"));
+	}
+}
diff --git a/chunk-format.h b/chunk-format.h
index 9ccbe00377..7885aa0848 100644
--- a/chunk-format.h
+++ b/chunk-format.h
@@ -2,6 +2,7 @@
 #define CHUNK_FORMAT_H
 
 #include "git-compat-util.h"
+#include "hash.h"
 
 struct hashfile;
 struct chunkfile;
@@ -65,4 +66,6 @@ int read_chunk(struct chunkfile *cf,
 	       chunk_read_fn fn,
 	       void *data);
 
+uint8_t oid_version(const struct git_hash_algo *algop);
+
 #endif
diff --git a/commit-graph.c b/commit-graph.c
index 2706683acf..1f08152a35 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -193,18 +193,6 @@ char *get_commit_graph_chain_filename(struct object_directory *odb)
 	return xstrfmt("%s/info/commit-graphs/commit-graph-chain", odb->path);
 }
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 static struct commit_graph *alloc_commit_graph(void)
 {
 	struct commit_graph *g = xcalloc(1, sizeof(*g));
@@ -365,9 +353,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
 	}
 
 	hash_version = *(unsigned char*)(data + 5);
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("commit-graph hash version %X does not match version %X"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		return NULL;
 	}
 
@@ -1908,7 +1896,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 	hashwrite_be32(f, GRAPH_SIGNATURE);
 
 	hashwrite_u8(f, GRAPH_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, get_num_chunks(cf));
 	hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
 
diff --git a/midx.c b/midx.c
index 8433086ac1..756ae6a206 100644
--- a/midx.c
+++ b/midx.c
@@ -40,18 +40,6 @@
 
 #define PACK_EXPIRED UINT_MAX
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 const unsigned char *get_midx_checksum(struct multi_pack_index *m)
 {
 	return m->data + m->data_len - the_hash_algo->rawsz;
@@ -131,9 +119,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 		      m->version);
 
 	hash_version = m->data[MIDX_BYTE_HASH_VERSION];
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("multi-pack-index hash version %u does not match version %u"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		goto cleanup_fail;
 	}
 	m->hash_len = the_hash_algo->rawsz;
@@ -413,7 +401,7 @@ static size_t write_midx_header(struct hashfile *f,
 {
 	hashwrite_be32(f, MIDX_SIGNATURE);
 	hashwrite_u8(f, MIDX_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, num_chunks);
 	hashwrite_u8(f, 0); /* unused */
 	hashwrite_be32(f, num_packs);
diff --git a/pack-write.c b/pack-write.c
index d594e3008e..ff305b404c 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -2,6 +2,7 @@
 #include "pack.h"
 #include "csum-file.h"
 #include "remote.h"
+#include "chunk-format.h"
 
 void reset_pack_idx_option(struct pack_idx_option *opts)
 {
@@ -181,21 +182,9 @@ static int pack_order_cmp(const void *va, const void *vb, void *ctx)
 
 static void write_rev_header(struct hashfile *f)
 {
-	uint32_t oid_version;
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		oid_version = 1;
-		break;
-	case GIT_HASH_SHA256:
-		oid_version = 2;
-		break;
-	default:
-		die("write_rev_header: unknown hash version");
-	}
-
 	hashwrite_be32(f, RIDX_SIGNATURE);
 	hashwrite_be32(f, RIDX_VERSION);
-	hashwrite_be32(f, oid_version);
+	hashwrite_be32(f, oid_version(the_hash_algo));
 }
 
 static void write_rev_index_positions(struct hashfile *f,

From patchwork Mon Nov 29 22:25:15 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12645917
bh=l9odbvWx1JTYqAjkhRRCS7ssyYnU3S9crqggWqJVCnU=; b=bLVTrX/c9517M8f/QuZxATrTnlRrY9S3y8Ch0SEAWVz75l0YXpEqbmeIzoyznewPNq 7B9oPxZ70f+inltUAowDIq5Pbgg+qewSYOHbFb57ZSsaFxpT607l7RNmmRF5A7Yrq4Np iwnyWwuQL/U+SyzKj5XpG21yIZ64fstdAzZNurFxxO12pPz5JVcMhwr+wRpinX8IX+J+ DGP8M38BxcdK982QpEd3/BOl9qdhaVQLvHTBuKPZvYB8Ns3+88cA7cDCMsDZn7Rak5pk RWVVAJkawcGyko09ZBr+NfkOaUeBxW8RahYsC13WmzOoTi/TWBcXPE1szcwi16R/FLJ6 06tA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=l9odbvWx1JTYqAjkhRRCS7ssyYnU3S9crqggWqJVCnU=; b=Kl9YjNd1WMSo6uLZND5GMb5a3B+0fmQwGpoMfSF91aKJ15ozLub7Q8X+itgx+yTDrT QVUrm3vZLciyS12+BPBkJZL3+7ho3SuDdvxSf+0dGR1iZvWPjrJG4QO+aFFfFrJNwmkK R1JE3u9+looUoAWdvqfkh6+EUCjVpNQsIQEI8+eU6XIjzEQhJ6oFNPAi7lF4gZlXgK0X wl5AG6dlfSxxrQqHLz+jqRi8KWwmxE+d2yzmUvmbYI9rsYYntpDdBryiYqppVPs6GHB5 4aJlVCXp//SlA2wjDRQoQUoK8WT1RWiF+HAkyk6DB7sukt8JLcTgiaCmwvoB/RVejiSu VFrQ== X-Gm-Message-State: AOAM533TmHzlX9DWVQIxxMP1v+fL/UUvmtzvSWV2wuIgLan8Rzzo5PG4 53Zb61lWIfDDggYSuNuTg+uNukcEW1ESUxYh X-Google-Smtp-Source: ABdhPJyBsGChtD/LHuIr/utn0qvzXZYyM4QJ3tBKIGZT8zC7cP9ytmZjHmLw40AJGUEp47FTBSp5YA== X-Received: by 2002:a05:6e02:1ca1:: with SMTP id x1mr29988218ill.72.1638224716320; Mon, 29 Nov 2021 14:25:16 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id x13sm9099394ilp.43.2021.11.29.14.25.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Nov 2021 14:25:16 -0800 (PST) Date: Mon, 29 Nov 2021 17:25:15 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu Subject: [PATCH 05/17] pack-mtimes: support writing pack .mtimes files Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that the `.mtimes` format is defined, supplement the pack-write API to be able to conditionally write an `.mtimes` file along with a pack by setting an additional flag and passing an oidmap that contains the timestamps corresponding to each object in the pack. Signed-off-by: Taylor Blau --- pack-objects.c | 6 ++++ pack-objects.h | 20 ++++++++++++++ pack-write.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++ pack.h | 1 + 4 files changed, 101 insertions(+) diff --git a/pack-objects.c b/pack-objects.c index fe2a4eace9..272e8d4517 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -170,6 +170,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) REALLOC_ARRAY(pdata->layer, pdata->nr_alloc); + + if (pdata->cruft_mtime) + REALLOC_ARRAY(pdata->cruft_mtime, pdata->nr_alloc); } new_entry = pdata->objects + pdata->nr_objects++; @@ -198,6 +201,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) pdata->layer[pdata->nr_objects - 1] = 0; + if (pdata->cruft_mtime) + pdata->cruft_mtime[pdata->nr_objects - 1] = 0; + return new_entry; } diff --git a/pack-objects.h b/pack-objects.h index dca2351ef9..f17119de26 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -168,6 +168,9 @@ struct packing_data { /* delta islands */ unsigned int *tree_depth; unsigned char *layer; + + /* cruft packs */ + uint32_t *cruft_mtime; }; void prepare_packing_data(struct 
repository *r, struct packing_data *pdata); @@ -289,4 +292,21 @@ static inline void oe_set_layer(struct packing_data *pack, pack->layer[e - pack->objects] = layer; } +static inline uint32_t oe_cruft_mtime(struct packing_data *pack, + struct object_entry *e) +{ + if (!pack->cruft_mtime) + return 0; + return pack->cruft_mtime[e - pack->objects]; +} + +static inline void oe_set_cruft_mtime(struct packing_data *pack, + struct object_entry *e, + uint32_t mtime) +{ + if (!pack->cruft_mtime) + CALLOC_ARRAY(pack->cruft_mtime, pack->nr_alloc); + pack->cruft_mtime[e - pack->objects] = mtime; +} + #endif diff --git a/pack-write.c b/pack-write.c index ff305b404c..8c3efda2c3 100644 --- a/pack-write.c +++ b/pack-write.c @@ -3,6 +3,10 @@ #include "csum-file.h" #include "remote.h" #include "chunk-format.h" +#include "pack-mtimes.h" +#include "oidmap.h" +#include "chunk-format.h" +#include "pack-objects.h" void reset_pack_idx_option(struct pack_idx_option *opts) { @@ -276,6 +280,65 @@ const char *write_rev_file_order(const char *rev_name, return rev_name; } +static void write_mtimes_header(struct hashfile *f) +{ + hashwrite_be32(f, MTIMES_SIGNATURE); + hashwrite_be32(f, MTIMES_VERSION); + hashwrite_be32(f, oid_version(the_hash_algo)); +} + +static void write_mtimes_objects(struct hashfile *f, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t i; + for (i = 0; i < nr_objects; i++) { + struct object_entry *e = (struct object_entry*)objects[i]; + hashwrite_be32(f, oe_cruft_mtime(to_pack, e)); + } +} + +static void write_mtimes_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +static const char *write_mtimes_file(const char *mtimes_name, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects, + const unsigned char *hash) +{ + struct hashfile *f; + int fd; + + if (!to_pack) + BUG("cannot call write_mtimes_file with NULL packing_data"); + + if 
(!mtimes_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_mtimes_XXXXXX"); + mtimes_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(mtimes_name); + fd = xopen(mtimes_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + } + f = hashfd(fd, mtimes_name); + + write_mtimes_header(f); + write_mtimes_objects(f, to_pack, objects, nr_objects); + write_mtimes_trailer(f, hash); + + if (mtimes_name && adjust_shared_perm(mtimes_name) < 0) + die(_("failed to make %s readable"), mtimes_name); + + finalize_hashfile(f, NULL, + CSUM_HASH_IN_STREAM | CSUM_CLOSE | CSUM_FSYNC); + + return mtimes_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -478,6 +541,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, char **idx_tmp_name) { const char *rev_tmp_name = NULL; + const char *mtimes_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -490,9 +554,19 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); + if (pack_idx_opts->flags & WRITE_MTIMES) { + mtimes_tmp_name = write_mtimes_file(NULL, to_pack, written_list, + nr_written, + hash); + if (adjust_shared_perm(mtimes_tmp_name)) + die_errno("unable to make temporary mtimes file readable"); + } + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); + if (mtimes_tmp_name) + rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes"); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index fd27cfdfd7..01d385903a 100644 --- a/pack.h +++ b/pack.h @@ -44,6 +44,7 @@ struct pack_idx_option { #define WRITE_IDX_STRICT 02 #define WRITE_REV 04 #define WRITE_REV_VERIFY 010 +#define WRITE_MTIMES 020 uint32_t version; uint32_t off32_limit; From patchwork Mon 
Nov 29 22:25:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12645923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E97AC43217 for ; Mon, 29 Nov 2021 22:26:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231237AbhK2W3X (ORCPT ); Mon, 29 Nov 2021 17:29:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59788 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230520AbhK2W24 (ORCPT ); Mon, 29 Nov 2021 17:28:56 -0500 Received: from mail-io1-xd2c.google.com (mail-io1-xd2c.google.com [IPv6:2607:f8b0:4864:20::d2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71B17C09676C for ; Mon, 29 Nov 2021 14:25:19 -0800 (PST) Received: by mail-io1-xd2c.google.com with SMTP id x10so23586903ioj.9 for ; Mon, 29 Nov 2021 14:25:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=SFn3xr3SlTKELMLo6Xxs6aGRma5/b1r9A0LHLj8ntcs=; b=WtFqUWQ3/yPImamBnHYxXvvD2E7QRqSfU5dG8tMW33bKmgRDuQFEyEesl+e9MthSwJ RLSqrSllNtEjmtvHZLIGf7GvY9QEv8hRz+mqVg1CDN+UiiNPpSmSO7WDwPQU/JyT5lwx bAkZcylle3geKZ6836xMBuA0My+JZ6vuz4hACHLC6Wf3XfTv9xDa4nLWlDIS2n+LW/zV MjwmPumYinxCg8O0eLOeUkj2tVsEr+dm+Xy3ZEoAEqWQCewIMn6lqUv4Ddnlr78oN5Et GOmT6MWTTvRX0XuA5ylelT29KvZyS3K4Q70FJzrWlIlABL8aAlKp9JiKyjRtya6iEzbd JkvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=SFn3xr3SlTKELMLo6Xxs6aGRma5/b1r9A0LHLj8ntcs=; 
b=AmNBvi4McFSL1p4fHSa/rwKw7jICappQejhBySACe7/CLau3nazr+U9RSUlFyT6M/d qzVRmwTCMrDmHeBkXpvcsgJHiSOlcyZ1lxxEG7du1bbOcpVmbeRfjUTC/IsG+f8DxUZ5 iqY6NBwGFGRk9M72efOKdoiwD2VoyIT+TmWr0AkC5MqJ363a76AinVMKDYe1YKywXmE/ HURtboybh3K7/Cb/3P6NwRA0swEknU42l1YeyM7T3syJHN4EjUCc379DQcba7/rBUmzO ipWQi7tgbzUtMoLvSH2irZIhsKdTrqzgJsL8G5soMicWjuW+F3Ig7EQAqeFfhz6aCBlv S/Yw== X-Gm-Message-State: AOAM533JZDSXqIE1nzcKvrlgqzoJXLd18bJWQOGiJV20u7Wfoqd6qQhH ZFgyEYtUu7I1kKLvalXSU561x8CyHedkl9o9 X-Google-Smtp-Source: ABdhPJzsH5vZKVijMOd1kqJ62E4GPvvPITvDvahDOwHYMvRG4US0aoGBA3RrTXDivcTV0LpSJ07Wvg== X-Received: by 2002:a6b:2b12:: with SMTP id r18mr1723264ior.66.1638224718775; Mon, 29 Nov 2021 14:25:18 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id y21sm9055532ioj.41.2021.11.29.14.25.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Nov 2021 14:25:18 -0800 (PST) Date: Mon, 29 Nov 2021 17:25:17 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu Subject: [PATCH 06/17] t/helper: add 'pack-mtimes' test-tool Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In the next patch, we will implement and test support for writing a cruft pack via a special mode of `git pack-objects`. To make sure that objects are written with the correct timestamps, add a new test-tool that can dump the object names and corresponding timestamps from a given `.mtimes` file. 
Signed-off-by: Taylor Blau --- Makefile | 1 + t/helper/test-pack-mtimes.c | 53 +++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 56 insertions(+) create mode 100644 t/helper/test-pack-mtimes.c diff --git a/Makefile b/Makefile index efd5e00717..a7382cbfc1 100644 --- a/Makefile +++ b/Makefile @@ -721,6 +721,7 @@ TEST_BUILTINS_OBJS += test-oid-array.o TEST_BUILTINS_OBJS += test-oidmap.o TEST_BUILTINS_OBJS += test-oidtree.o TEST_BUILTINS_OBJS += test-online-cpus.o +TEST_BUILTINS_OBJS += test-pack-mtimes.o TEST_BUILTINS_OBJS += test-parse-options.o TEST_BUILTINS_OBJS += test-parse-pathspec-file.o TEST_BUILTINS_OBJS += test-partial-clone.o diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c new file mode 100644 index 0000000000..b143f62520 --- /dev/null +++ b/t/helper/test-pack-mtimes.c @@ -0,0 +1,53 @@ +#include "git-compat-util.h" +#include "test-tool.h" +#include "strbuf.h" +#include "object-store.h" +#include "packfile.h" +#include "pack-mtimes.h" + +static int dump_mtimes(struct packed_git *p) +{ + uint32_t i; + if (load_pack_mtimes(p) < 0) + die("could not load pack .mtimes"); + + for (i = 0; i < p->num_objects; i++) { + struct object_id oid; + if (nth_packed_object_id(&oid, p, i) < 0) + die("could not load object id at position %"PRIu32, i); + + printf("%s %"PRIu32"\n", + oid_to_hex(&oid), nth_packed_mtime(p, i)); + } + + return 0; +} + +static const char *pack_mtimes_usage = "\n" +" test-tool pack-mtimes "; + +int cmd__pack_mtimes(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + struct packed_git *p; + + setup_git_directory(); + + if (argc != 2) + usage(pack_mtimes_usage); + + for (p = get_all_packs(the_repository); p; p = p->next) { + strbuf_addstr(&buf, basename(p->pack_name)); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".mtimes"); + + if (!strcmp(buf.buf, argv[1])) + break; + + strbuf_reset(&buf); + } + + strbuf_release(&buf); + + return p ? 
dump_mtimes(p) : 1; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index 3ce5585e53..1bb1c4b562 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -46,6 +46,7 @@ static struct test_cmd cmds[] = { { "oidmap", cmd__oidmap }, { "oidtree", cmd__oidtree }, { "online-cpus", cmd__online_cpus }, + { "pack-mtimes", cmd__pack_mtimes }, { "parse-options", cmd__parse_options }, { "parse-pathspec-file", cmd__parse_pathspec_file }, { "partial-clone", cmd__partial_clone }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 9f0f522850..07a2d3f94e 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -35,6 +35,7 @@ int cmd__mktemp(int argc, const char **argv); int cmd__oidmap(int argc, const char **argv); int cmd__oidtree(int argc, const char **argv); int cmd__online_cpus(int argc, const char **argv); +int cmd__pack_mtimes(int argc, const char **argv); int cmd__parse_options(int argc, const char **argv); int cmd__parse_pathspec_file(int argc, const char** argv); int cmd__partial_clone(int argc, const char **argv); From patchwork Mon Nov 29 22:25:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12645929 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D656C433EF for ; Mon, 29 Nov 2021 22:26:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232203AbhK2W3j (ORCPT ); Mon, 29 Nov 2021 17:29:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231248AbhK2W24 (ORCPT ); Mon, 29 Nov 2021 17:28:56 -0500 Received: from mail-io1-xd33.google.com (mail-io1-xd33.google.com [IPv6:2607:f8b0:4864:20::d33]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A4BEC09676E for ; Mon, 29 Nov 2021 14:25:21 -0800 (PST) Received: by mail-io1-xd33.google.com with SMTP id w22so23622779ioa.1 for ; Mon, 29 Nov 2021 14:25:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=hkGzyKHnc9wQr7h7oh2hHBRxLGoqQj5I5JLLE+Haq2U=; b=mtaIwZqnKuJ+HldfRppHdlcjIFTY1N7ny1aVAb5Pc9MTtjuS/UsMIBZyVzrffGyaLE V8IRJrrxTqnO2Mi1lkwcVEUZPnFL71NntCYwu2GydIirAoWf1ap+GsT3pxwcvM45JEuW RSzkL/0RzU5c7gjlOr4d5CF/ny35Ri5TF0JAw7ThMLG06xZB6BsZS3yO5TNkO34Lhruq QZJC7CMDaaiIDfjPOCtUTVhlkAU3IPuLQ/GE3X6s8BODcNYPnC11shFm3g2CBK3ub6w8 Q6cVNDqbzAQ3R506xw7FYa8yx6PlyJwzBY4SXfrfrzr/mlgXXY62hWijLlxfj06rDbFT BAjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=hkGzyKHnc9wQr7h7oh2hHBRxLGoqQj5I5JLLE+Haq2U=; b=u1n5ejM6wABSTZ/ZzFoZb5vcbd9WJajnzdSXws4J6OYuWYnybysaJfRPW0d6zlWu9s iYpnMj6ToaFMFKrIG2yH4zn08agSqyntVUS2j07cPRCKxeeihdZZmDXC1a21g+EVpPaQ zLyeZ1Wc9e8PitHF9KBz2d/fBgA4Q6wWot/pauYsa/JmIiPEXKqf4NserBmCy+BjsBF1 nw+PVuQiBQcYHCbC2DQB4+etia4fVVayMO4zerXbHOs99+BH/Mpdp8eHfhlY/Aqgq80F SzPFh9+SSOgCz4kf4smg5XoY3/f3SJNMl1NVYV4gPtgzUWBn2D1VYI0X3TFPuHcqaQVn n08Q== X-Gm-Message-State: AOAM530LY/YOllAQoRoGunhQzIsaowtKyE7MM6Ke0fPjWN0sOwy748n/ RU+Udwm973TA0IF3xAIR8tiHDKkrAo3FCRW8 X-Google-Smtp-Source: ABdhPJxmpHAj/3JEaWFq/8QjTemgWH0Op1DZPF7azUHz8cMLoZyC9DhIZYAe0BFWHmUQQbJYjhMgcg== X-Received: by 2002:a05:6602:2d51:: with SMTP id d17mr61482095iow.47.1638224721332; Mon, 29 Nov 2021 14:25:21 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id 1sm8772217ill.57.2021.11.29.14.25.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Nov 2021 14:25:21 -0800 (PST) Date: Mon, 29 Nov 2021 17:25:20 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu Subject: [PATCH 07/17] builtin/pack-objects.c: return from create_object_entry() Message-ID: <5710933127b01125ebcfe232868abbe87fce0d87.1638224692.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new caller in the next commit will want to immediately modify the object_entry structure created by create_object_entry(). Instead of forcing that caller to wastefully look up the entry we just created, return it from create_object_entry(). Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index bf45ffbc57..3fb10529ba 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1508,13 +1508,13 @@ static int want_object_in_pack(const struct object_id *oid, return 1; } -static void create_object_entry(const struct object_id *oid, - enum object_type type, - uint32_t hash, - int exclude, - int no_try_delta, - struct packed_git *found_pack, - off_t found_offset) +static struct object_entry *create_object_entry(const struct object_id *oid, + enum object_type type, + uint32_t hash, + int exclude, + int no_try_delta, + struct packed_git *found_pack, + off_t found_offset) { struct object_entry *entry; @@ -1531,6 +1531,8 @@ static void create_object_entry(const struct object_id *oid, } entry->no_try_delta = no_try_delta; + + return entry; } static const char no_closure_warning[] = N_( From patchwork Mon Nov 29 22:25:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12645931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3083BC433F5 for ; Mon, 29 Nov 2021 22:26:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232268AbhK2W3k (ORCPT ); Mon, 29 Nov 2021 17:29:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231230AbhK2W24 (ORCPT ); Mon, 29 Nov 2021 17:28:56 -0500 Received: from mail-il1-x12b.google.com (mail-il1-x12b.google.com [IPv6:2607:f8b0:4864:20::12b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD421C09676F for ; Mon, 29 Nov 2021 14:25:24 -0800 (PST) Received: by mail-il1-x12b.google.com with SMTP id e8so19081155ilu.9 for ; Mon, 29 Nov 2021 14:25:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=lDsFe1jBg/drSQVHnwBdv3PUnKButsbHCmsHsdEdiYY=; b=uPPEYTFG85cmOw+Z+mVFxWhKcFoJVsk+HeIeK2QyBDWvewpyG475lIJThZBiqQQgq0 PQGv8auzKOucoz7CTecBzmM7PU8B3PCbVDB8VGuFFYySMQ+T3evTQqFTrYg74DCEEqbl +rPSK+rd6kAiUFguQReHpWQ0RUhIH0gVnY8vcxmppr4o4bKT877VogEla6ltTkM9KbtZ HRtSYMnz3gxxlAPItiNCAQcUiixJ9BEVmmQOqt1Fp7YbS8fLQmutrHKVyJ8MfCWnHi0s 8qbjcy0q+bYxq6ipwfV9VQGO76s/pBmBPlooTIRTa5xIGo2vVDzT0f5leBoOSkUwu/Cq orTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=lDsFe1jBg/drSQVHnwBdv3PUnKButsbHCmsHsdEdiYY=; b=P8eKN7vDqE36V6bBLBR1FMePC5JOvpmol/aTr9IbLiqhU4Clvwuh4OXfEZPQZZWBWi 
v+672+qhtUZZ4enlEJqXM3/JTQYHBI3uDSSvb71RI+CJTYLwrpr4Ux5f9TR7QgzdBdIG UnoHdEw1tZGt4QcC3lerjxct8Ds3cxVGR0jBShyWzeG5oLLPBd6iscs33wEcrBedPZpA vn6XkNkC7u+yFmRHaQuSx2VhN3MZEL81J9Ag3zqdUcDOqokb4lK/dbPrkEk53ZCSMnIf OP66DN6v7RJdXhg9j/66UpVzJPvmdXNCbY6Zf7IlCByDvWbbIMTr7F685uwD59up854w i1hA== X-Gm-Message-State: AOAM531l3ZU+fbcCLvOpao4Qv4XqifL9bNZDnTXH+GOL63UHoGZpVQiG H8c5UoaURjRlfYNVKAr9o/PveUI1kPVJOmI1 X-Google-Smtp-Source: ABdhPJxuRZ2TBP38R4O2jw0SCIYIg14B69EucRAeY69qggoLtmcx8tdP/gHDsqhlMUFyhPRA32HiPA== X-Received: by 2002:a92:2811:: with SMTP id l17mr14527321ilf.149.1638224723869; Mon, 29 Nov 2021 14:25:23 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id g7sm7469205iln.67.2021.11.29.14.25.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Nov 2021 14:25:23 -0800 (PST) Date: Mon, 29 Nov 2021 17:25:22 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, larsxschneider@gmail.com, peff@peff.net, tytso@mit.edu Subject: [PATCH 08/17] builtin/pack-objects.c: --cruft without expiration Message-ID: <66165917a4660f63ce60b820d178d52a51304d20.1638224692.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach `pack-objects` how to generate a cruft pack when no objects are dropped (i.e., `--cruft-expiration=never`). Later patches will teach `pack-objects` how to generate a cruft pack that prunes objects. When generating a cruft pack which does not prune objects, we want to collect all unreachable objects into a single pack (noting and updating their mtimes as we accumulate them). Ordinary use will pass the result of a `git repack -A` as a kept pack, so when this patch says "kept pack", readers should think "reachable objects". 
Generating a non-expiring cruft pack works as follows: - Callers provide a list of every pack they know about, and indicate which packs are about to be removed. - All packs which are going to be removed (we'll call these the redundant ones) are marked as kept in-core, as well as any packs that `pack-objects` found but the caller did not specify. These packs are presumed to have entered the repository between the caller collecting packs and invoking `pack-objects`. Since we do not want to include objects in these packs (because we don't know which of their objects are or aren't reachable), these are also marked as kept in-core. - Then, we enumerate all objects in the repository, and add them to our packing list if they do not appear in an in-core kept pack. This results in a new cruft pack which contains all known objects that aren't included in the kept packs. When the kept pack is the result of `git repack -A`, the resulting pack contains all unreachable objects. Signed-off-by: Taylor Blau --- Documentation/git-pack-objects.txt | 23 +++ builtin/pack-objects.c | 203 ++++++++++++++++++++++++++- object-file.c | 2 +- object-store.h | 2 + t/t5327-pack-objects-cruft.sh | 218 +++++++++++++++++++++++++++++ 5 files changed, 442 insertions(+), 6 deletions(-) create mode 100755 t/t5327-pack-objects-cruft.sh diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index dbfd1f9017..573c18afcd 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -13,6 +13,7 @@ SYNOPSIS [--no-reuse-delta] [--delta-base-offset] [--non-empty] [--local] [--incremental] [--window=] [--depth=] [--revs [--unpacked | --all]] [--keep-pack=] + [--cruft] [--cruft-expiration=