From patchwork Wed Mar 2 00:58:00 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12765340
Date: Tue, 1 Mar 2022 19:58:00 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v2 01/17] Documentation/technical: add cruft-packs.txt
Message-ID: <784ee7e0eec9ba520ebaaa27de2de810e2f6798a.1646182671.git.me@ttaylorr.com>

Create a technical document to explain cruft packs. It contains a brief
overview of the problem, some background, details on the implementation,
and a couple of alternative approaches not considered here.

Signed-off-by: Taylor Blau
---
 Documentation/Makefile                  |  1 +
 Documentation/technical/cruft-packs.txt | 97 +++++++++++++++++++++++++
 2 files changed, 98 insertions(+)
 create mode 100644 Documentation/technical/cruft-packs.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index ed656db2ae..0b01c9408e 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -91,6 +91,7 @@ TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
 TECH_DOCS += technical/bundle-format
+TECH_DOCS += technical/cruft-packs
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/cruft-packs.txt b/Documentation/technical/cruft-packs.txt
new file mode 100644
index 0000000000..2c3c5d93f8
--- /dev/null
+++ b/Documentation/technical/cruft-packs.txt
@@ -0,0 +1,97 @@
+= Cruft packs
+
+The cruft packs feature offers an alternative to Git's traditional mechanism of
+removing unreachable objects. This document provides an overview of Git's
+pruning mechanism, and how a cruft pack can be used instead to accomplish the
+same.
+
+== Background
+
+To remove unreachable objects from your repository, Git offers `git repack -Ad`
+(see linkgit:git-repack[1]). Quoting from the documentation:
+
+[quote]
+[...] unreachable objects in a previous pack become loose, unpacked objects,
+instead of being left in the old pack. [...] loose unreachable objects will be
+pruned according to normal expiry rules with the next 'git gc' invocation.
+
+Unreachable objects aren't removed immediately, since doing so could race with
+an incoming push which may reference an object which is about to be deleted.
+Instead, those unreachable objects are stored as loose objects and stay that way
+until they are older than the expiration window, at which point they are removed
+by linkgit:git-prune[1].
+
+Git must store these unreachable objects loose in order to keep track of their
+per-object mtimes. If these unreachable objects were written into one big pack,
+then either freshening that pack (because an object contained within it was
+re-written) or creating a new pack of unreachable objects would cause the pack's
+mtime to get updated, and the objects within it would never leave the expiration
+window. Instead, objects are stored loose in order to keep track of the
+individual object mtimes and avoid a situation where all cruft objects are
+freshened at once.
+
+This can lead to undesirable situations when a repository contains many
+unreachable objects which have not yet left the grace period. Having large
+directories in the shards of `.git/objects` can lead to decreased performance in
+the repository. But given enough unreachable objects, this can lead to inode
+starvation and degrade the performance of the whole system. Because we
+can never pack those objects, these repositories often take up a large amount of
+disk space, since we can only zlib compress them, but not store them in delta
+chains.
+
+== Cruft packs
+
+A cruft pack eliminates the need for storing unreachable objects in a loose
+state by including the per-object mtimes in a separate file alongside a single
+pack containing all loose objects.
+
+A cruft pack is written by `git repack --cruft` when generating a new pack, using
+linkgit:git-pack-objects[1]'s `--cruft` option. Note that `git repack --cruft`
+is a classic all-into-one repack, meaning that everything in the resulting pack is
+reachable, and everything else is unreachable. Once written, the `--cruft`
+option instructs `git repack` to generate another pack containing only objects
+not packed in the previous step (which equates to packing all unreachable
+objects together). This progresses as follows:
+
+  1. Enumerate every object, marking any object which is (a) not contained in a
+     kept-pack, and (b) whose mtime is within the grace period as a traversal
+     tip.
+
+  2. Perform a reachability traversal based on the tips gathered in the previous
+     step, adding every object along the way to the pack.
+
+  3. Write the pack out, along with a `.mtimes` file that records the per-object
+     timestamps.
+
+This mode is invoked internally by linkgit:git-repack[1] when instructed to
+write a cruft pack. Crucially, the set of in-core kept packs is exactly the set
+of packs which will not be deleted by the repack; in other words, they contain
+all of the repository's reachable objects.
+
+When a repository already has a cruft pack, `git repack --cruft` typically only
+adds objects to it. An exception to this is when `git repack` is given the
+`--cruft-expiration` option, which allows the generated cruft pack to omit
+expired objects instead of waiting for linkgit:git-gc[1] to expire those objects
+later on.
+
+It is linkgit:git-gc[1] that is typically responsible for removing expired
+unreachable objects.
+
+== Alternatives
+
+Notable alternatives to this design include:
+
+  - The location of the per-object mtime data, and
+  - Storing unreachable objects in multiple cruft packs.
+
+On the location of mtime data, a new auxiliary file tied to the pack was chosen
+to avoid complicating the `.idx` format. If the `.idx` format were ever to gain
+support for optional chunks of data, it may make sense to consolidate the
+`.mtimes` format into the `.idx` itself.
+
+Storing unreachable objects among multiple cruft packs (e.g., creating a new
+cruft pack during each repacking operation including only unreachable objects
+which aren't already stored in an earlier cruft pack) is significantly more
+complicated to construct, and so isn't pursued here. The obvious drawback to
+the current implementation is that the entire cruft pack must be re-written from
+scratch.
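
As an illustration of why per-object mtimes matter, the following is a minimal
sketch in C, not Git's implementation. It assumes the timestamps have already
been read into an array of epoch seconds; the function name `count_expired` and
its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the objects whose recorded mtime is
 * strictly older than `cutoff` (both in epoch seconds). Each object
 * is judged on its own timestamp, so freshening one object does not
 * rescue the others from expiration.
 */
static size_t count_expired(const uint32_t *mtimes, size_t nr, uint32_t cutoff)
{
	size_t i, expired = 0;

	assert(mtimes || !nr);
	for (i = 0; i < nr; i++)
		if (mtimes[i] < cutoff)
			expired++;
	return expired;
}
```

With timestamps {100, 200, 300} and a cutoff of 250, the first two objects
count as expired while the third stays inside the grace period. A single
pack-wide mtime cannot express that distinction.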

From patchwork Wed Mar 2 00:58:02 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12765342
Date: Tue, 1 Mar 2022 19:58:02 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v2 02/17] pack-mtimes: support reading .mtimes files
Message-ID: <101b34660c0c5028ba591d052dc587bb8918ccb2.1646182671.git.me@ttaylorr.com>

To store the individual mtimes of objects in a cruft pack, introduce a
new `.mtimes` format that can optionally accompany a single pack in the
repository.

The format is defined in Documentation/technical/pack-format.txt, and
stores a 4-byte network order timestamp for each object in name (index)
order.

This patch prepares for cruft packs by defining the `.mtimes` format,
and introducing a basic API that callers can use to read out individual
mtimes.

Signed-off-by: Taylor Blau
---
 Documentation/technical/pack-format.txt |  19 ++++
 Makefile                                |   1 +
 builtin/repack.c                        |   1 +
 object-store.h                          |   5 +-
 pack-mtimes.c                           | 129 ++++++++++++++++++++++++
 pack-mtimes.h                           |  15 +++
 packfile.c                              |  19 +++-
 7 files changed, 186 insertions(+), 3 deletions(-)
 create mode 100644 pack-mtimes.c
 create mode 100644 pack-mtimes.h

diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index 6d3efb7d16..c443dbb526 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -294,6 +294,25 @@ Pack file entry: <+
 
 All 4-byte numbers are in network order.
 
+== pack-*.mtimes files have the format:
+
+  - A 4-byte magic number '0x4d544d45' ('MTME').
+
+  - A 4-byte version identifier (= 1).
+
+  - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256).
+
+  - A table of 4-byte unsigned integers in network order. The ith
+    value is the modification time (mtime) of the ith object in the
+    corresponding pack by lexicographic (index) order. The mtimes
+    count standard epoch seconds.
+
+  - A trailer, containing a checksum of the corresponding packfile,
+    and a checksum of all of the above (each having length according
+    to the specified hash function).
+
+All 4-byte numbers are in network order.
+
 == multi-pack-index (MIDX) files have the following format:
 
 The multi-pack-index files refer to multiple pack-files and loose objects.
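
The layout described above could be read with something like the following
sketch. It is self-contained C that operates on an in-memory buffer and is not
the reader added by this patch; `read_be32`, `mtimes_entry_count`, and
`mtimes_nth` are hypothetical names, and the checksum lengths of 20 and 32
bytes assume SHA-1 and SHA-256 respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTIMES_MAGIC 0x4d544d45u /* "MTME" */
#define MTIMES_HDR_LEN 12u       /* magic + version + hash id */

/* Read a 4-byte big-endian (network order) integer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical validator: check the header of an in-memory .mtimes
 * file and return the number of entries in its table, or -1 if the
 * buffer does not look like a version 1 file.
 */
static long mtimes_entry_count(const unsigned char *buf, size_t len)
{
	uint32_t hash_id;
	size_t hash_len, fixed;

	if (len < MTIMES_HDR_LEN)
		return -1;
	if (read_be32(buf) != MTIMES_MAGIC || read_be32(buf + 4) != 1)
		return -1;

	hash_id = read_be32(buf + 8);
	if (hash_id == 1)
		hash_len = 20; /* SHA-1 */
	else if (hash_id == 2)
		hash_len = 32; /* SHA-256 */
	else
		return -1;

	/* Header plus the two trailing checksums. */
	fixed = MTIMES_HDR_LEN + 2 * hash_len;
	if (len < fixed || (len - fixed) % 4)
		return -1;
	return (long)((len - fixed) / 4);
}

/*
 * Return the mtime of the object at index position `pos`. The caller
 * is expected to have checked `pos` against mtimes_entry_count().
 */
static uint32_t mtimes_nth(const unsigned char *buf, size_t pos)
{
	assert(buf);
	return read_be32(buf + MTIMES_HDR_LEN + 4 * pos);
}
```

The table starts right after the 12-byte header, which is the same offset the
patch expresses as `pos + 3` when indexing an array of 4-byte words.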
diff --git a/Makefile b/Makefile
index 6f0b4b775f..1b186f4fd7 100644
--- a/Makefile
+++ b/Makefile
@@ -959,6 +959,7 @@ LIB_OBJS += oidtree.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-check.o
+LIB_OBJS += pack-mtimes.o
 LIB_OBJS += pack-objects.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
diff --git a/builtin/repack.c b/builtin/repack.c
index da1e364a75..f908f7d5dd 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -212,6 +212,7 @@ static struct {
 } exts[] = {
 	{".pack"},
 	{".rev", 1},
+	{".mtimes", 1},
 	{".bitmap", 1},
 	{".promisor", 1},
 	{".idx"},
diff --git a/object-store.h b/object-store.h
index 6f89482df0..9b227661f2 100644
--- a/object-store.h
+++ b/object-store.h
@@ -115,12 +115,15 @@ struct packed_git {
 		 freshened:1,
 		 do_not_close:1,
 		 pack_promisor:1,
-		 multi_pack_index:1;
+		 multi_pack_index:1,
+		 is_cruft:1;
 	unsigned char hash[GIT_MAX_RAWSZ];
 	struct revindex_entry *revindex;
 	const uint32_t *revindex_data;
 	const uint32_t *revindex_map;
 	size_t revindex_size;
+	const uint32_t *mtimes_map;
+	size_t mtimes_size;
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
 };
diff --git a/pack-mtimes.c b/pack-mtimes.c
new file mode 100644
index 0000000000..50caa34381
--- /dev/null
+++ b/pack-mtimes.c
@@ -0,0 +1,129 @@
+#include "pack-mtimes.h"
+#include "object-store.h"
+#include "packfile.h"
+
+static char *pack_mtimes_filename(struct packed_git *p)
+{
+	size_t len;
+	if (!strip_suffix(p->pack_name, ".pack", &len))
+		BUG("pack_name does not end in .pack");
+	/* NEEDSWORK: this could reuse code from pack-revindex.c. */
+	return xstrfmt("%.*s.mtimes", (int)len, p->pack_name);
+}
+
+#define MTIMES_HEADER_SIZE (12)
+#define MTIMES_MIN_SIZE (MTIMES_HEADER_SIZE + (2 * the_hash_algo->rawsz))
+
+struct mtimes_header {
+	uint32_t signature;
+	uint32_t version;
+	uint32_t hash_id;
+};
+
+static int load_pack_mtimes_file(char *mtimes_file,
+				 uint32_t num_objects,
+				 const uint32_t **data_p, size_t *len_p)
+{
+	int fd, ret = 0;
+	struct stat st;
+	void *data = NULL;
+	size_t mtimes_size;
+	struct mtimes_header header;
+	uint32_t *hdr;
+
+	fd = git_open(mtimes_file);
+
+	if (fd < 0) {
+		ret = -1;
+		goto cleanup;
+	}
+	if (fstat(fd, &st)) {
+		ret = error_errno(_("failed to read %s"), mtimes_file);
+		goto cleanup;
+	}
+
+	mtimes_size = xsize_t(st.st_size);
+
+	if (mtimes_size < MTIMES_MIN_SIZE) {
+		ret = error(_("mtimes file %s is too small"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (mtimes_size - MTIMES_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) {
+		ret = error(_("mtimes file %s is corrupt"), mtimes_file);
+		goto cleanup;
+	}
+
+	data = hdr = xmmap(NULL, mtimes_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+	header.signature = ntohl(hdr[0]);
+	header.version = ntohl(hdr[1]);
+	header.hash_id = ntohl(hdr[2]);
+
+	if (header.signature != MTIMES_SIGNATURE) {
+		ret = error(_("mtimes file %s has unknown signature"), mtimes_file);
+		goto cleanup;
+	}
+
+	if (header.version != 1) {
+		ret = error(_("mtimes file %s has unsupported version %"PRIu32),
+			    mtimes_file, header.version);
+		goto cleanup;
+	}
+
+	if (!(header.hash_id == 1 || header.hash_id == 2)) {
+		ret = error(_("mtimes file %s has unsupported hash id %"PRIu32),
+			    mtimes_file, header.hash_id);
+		goto cleanup;
+	}
+
+cleanup:
+	if (ret) {
+		if (data)
+			munmap(data, mtimes_size);
+	} else {
+		*len_p = mtimes_size;
+		*data_p = (const uint32_t *)data;
+	}
+
+	close(fd);
+	return ret;
+}
+
+int load_pack_mtimes(struct packed_git *p)
+{
+	char *mtimes_name = NULL;
+	int ret = 0;
+
+	if (!p->is_cruft)
+		return ret; /* not a cruft pack */
+	if (p->mtimes_map)
+		return ret; /* already loaded */
+
+	ret = open_pack_index(p);
+	if (ret < 0)
+		goto cleanup;
+
+	mtimes_name = pack_mtimes_filename(p);
+	ret = load_pack_mtimes_file(mtimes_name,
+				    p->num_objects,
+				    &p->mtimes_map,
+				    &p->mtimes_size);
+	if (ret)
+		goto cleanup;
+
+cleanup:
+	free(mtimes_name);
+	return ret;
+}
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos)
+{
+	if (!p->mtimes_map)
+		BUG("pack .mtimes file not loaded for %s", p->pack_name);
+	if (p->num_objects <= pos)
+		BUG("pack .mtimes out-of-bounds (%"PRIu32" vs %"PRIu32")",
+		    pos, p->num_objects);
+
+	return get_be32(p->mtimes_map + pos + 3);
+}
diff --git a/pack-mtimes.h b/pack-mtimes.h
new file mode 100644
index 0000000000..38ddb9f893
--- /dev/null
+++ b/pack-mtimes.h
@@ -0,0 +1,15 @@
+#ifndef PACK_MTIMES_H
+#define PACK_MTIMES_H
+
+#include "git-compat-util.h"
+
+#define MTIMES_SIGNATURE 0x4d544d45 /* "MTME" */
+#define MTIMES_VERSION 1
+
+struct packed_git;
+
+int load_pack_mtimes(struct packed_git *p);
+
+uint32_t nth_packed_mtime(struct packed_git *p, uint32_t pos);
+
+#endif
diff --git a/packfile.c b/packfile.c
index 835b2d2716..fc0245fbab 100644
--- a/packfile.c
+++ b/packfile.c
@@ -334,12 +334,22 @@ static void close_pack_revindex(struct packed_git *p)
 	p->revindex_data = NULL;
 }
 
+static void close_pack_mtimes(struct packed_git *p)
+{
+	if (!p->mtimes_map)
+		return;
+
+	munmap((void *)p->mtimes_map, p->mtimes_size);
+	p->mtimes_map = NULL;
+}
+
 void close_pack(struct packed_git *p)
 {
 	close_pack_windows(p);
 	close_pack_fd(p);
 	close_pack_index(p);
 	close_pack_revindex(p);
+	close_pack_mtimes(p);
 	oidset_clear(&p->bad_objects);
 }
 
@@ -363,7 +373,7 @@ void close_object_store(struct raw_object_store *o)
 
 void unlink_pack_path(const char *pack_name, int force_delete)
 {
-	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"};
+	static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor", ".mtimes"};
 	int i;
 	struct strbuf buf = STRBUF_INIT;
 	size_t plen;
@@ -718,6 +728,10 @@ struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 	if (!access(p->pack_name, F_OK))
 		p->pack_promisor = 1;
 
+	xsnprintf(p->pack_name + path_len, alloc - path_len, ".mtimes");
+	if (!access(p->pack_name, F_OK))
+		p->is_cruft = 1;
+
 	xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
@@ -869,7 +883,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 	    ends_with(file_name, ".pack") ||
 	    ends_with(file_name, ".bitmap") ||
 	    ends_with(file_name, ".keep") ||
-	    ends_with(file_name, ".promisor"))
+	    ends_with(file_name, ".promisor") ||
+	    ends_with(file_name, ".mtimes"))
 		string_list_append(data->garbage, full_name);
 	else
 		report_garbage(PACKDIR_FILE_GARBAGE, full_name);

From patchwork Wed Mar 2 00:58:05 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12765341
Date: Tue, 1 Mar 2022 19:58:05 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v2 03/17] pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'

This structure will be used to communicate the per-object mtimes when
writing a cruft pack. Here, we need the full packing_data structure
because the mtime information is stored in an array there, not on the
individual object_entry's themselves (to avoid paying the overhead in
structure width for operations which do not generate a cruft pack).

We haven't passed this information down before because one of the two
callers (in bulk-checkin.c) does not have a packing_data structure at
all. In that case (where no cruft pack will be generated), NULL is
passed instead.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c | 3 ++-
 bulk-checkin.c         | 2 +-
 pack-write.c           | 1 +
 pack.h                 | 3 +++
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 178e611f09..385970cb7b 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1254,7 +1254,8 @@ static void write_pack_file(void)
 
 			stage_tmp_packfiles(&tmpname, pack_tmp_name,
 					    written_list, nr_written,
-					    &pack_idx_opts, hash, &idx_tmp_name);
+					    &to_pack, &pack_idx_opts, hash,
+					    &idx_tmp_name);
 
 			if (write_bitmap_index) {
 				size_t tmpname_len = tmpname.len;
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8785b2ac80..99f7596c4e 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -33,7 +33,7 @@ static void finish_tmp_packfile(struct strbuf *basename,
 	char *idx_tmp_name = NULL;
 
 	stage_tmp_packfiles(basename, pack_tmp_name, written_list, nr_written,
-			    pack_idx_opts, hash, &idx_tmp_name);
+			    NULL, pack_idx_opts, hash, &idx_tmp_name);
 	rename_tmp_packfile_idx(basename, &idx_tmp_name);
 
 	free(idx_tmp_name);
diff --git a/pack-write.c b/pack-write.c
index a5846f3a34..d594e3008e 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -483,6 +483,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name)
diff --git a/pack.h b/pack.h
index b22bfc4a18..fd27cfdfd7 100644
--- a/pack.h
+++ b/pack.h
@@ -109,11 +109,14 @@ int encode_in_pack_object_header(unsigned char *hdr, int hdr_len,
 #define PH_ERROR_PROTOCOL	(-3)
 int read_pack_header(int fd, struct pack_header *);
 
+struct packing_data;
+
 struct hashfile *create_tmp_packfile(char **pack_tmp_name);
 void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 const char *pack_tmp_name,
 			 struct pack_idx_entry **written_list,
 			 uint32_t nr_written,
+			 struct packing_data *to_pack,
 			 struct pack_idx_option *pack_idx_opts,
 			 unsigned char hash[],
 			 char **idx_tmp_name);

From patchwork Wed Mar 2 00:58:07 2022
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12765343
Date: Tue, 1 Mar 2022 19:58:07 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v2 04/17] chunk-format.h: extract oid_version()
Message-ID: <1e0ed363ae93099444b6626ff0a2043e8d88771d.1646182671.git.me@ttaylorr.com>

There are three definitions of an identical function which converts
`the_hash_algo` into either 1 (for SHA-1) or 2 (for SHA-256). There is a
copy of this function for writing both the commit-graph and
multi-pack-index file, and another inline definition used to write the
.rev header.

Consolidate these into a single definition in chunk-format.h. It's not
clear that this is the best header to define this function in, but it
should do for now.

(Worth noting, the .rev caller expects a 4-byte unsigned, but the other
two callers work with a single unsigned byte. The consolidated version
uses the latter type, and lets the compiler widen it when required).

Another caller will be added in a subsequent patch.

Signed-off-by: Taylor Blau
---
 chunk-format.c | 12 ++++++++++++
 chunk-format.h |  3 +++
 commit-graph.c | 18 +++---------------
 midx.c         | 18 +++---------------
 pack-write.c   | 15 ++-------------
 5 files changed, 23 insertions(+), 43 deletions(-)

diff --git a/chunk-format.c b/chunk-format.c
index 1c3dca62e2..0275b74a89 100644
--- a/chunk-format.c
+++ b/chunk-format.c
@@ -181,3 +181,15 @@ int read_chunk(struct chunkfile *cf,
 
 	return CHUNK_NOT_FOUND;
 }
+
+uint8_t oid_version(const struct git_hash_algo *algop)
+{
+	switch (hash_algo_by_ptr(algop)) {
+	case GIT_HASH_SHA1:
+		return 1;
+	case GIT_HASH_SHA256:
+		return 2;
+	default:
+		die(_("invalid hash version"));
+	}
+}
diff --git a/chunk-format.h b/chunk-format.h
index 9ccbe00377..7885aa0848 100644
--- a/chunk-format.h
+++ b/chunk-format.h
@@ -2,6 +2,7 @@
 #define CHUNK_FORMAT_H
 
 #include "git-compat-util.h"
+#include "hash.h"
 
 struct hashfile;
 struct chunkfile;
@@ -65,4 +66,6 @@ int read_chunk(struct chunkfile *cf,
 	       chunk_read_fn fn,
 	       void *data);
 
+uint8_t oid_version(const struct git_hash_algo *algop);
+
 #endif
diff --git a/commit-graph.c b/commit-graph.c
index 265c010122..f678d2c4a1 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -193,18 +193,6 @@ char *get_commit_graph_chain_filename(struct object_directory *odb)
 	return xstrfmt("%s/info/commit-graphs/commit-graph-chain", odb->path);
 }
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 static struct commit_graph *alloc_commit_graph(void)
 {
 	struct commit_graph *g = xcalloc(1, sizeof(*g));
@@ -365,9 +353,9 @@ struct commit_graph *parse_commit_graph(struct repository *r,
 	}
 
 	hash_version = *(unsigned char*)(data + 5);
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("commit-graph hash version %X does not match version %X"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		return NULL;
 	}
 
@@ -1911,7 +1899,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 	hashwrite_be32(f, GRAPH_SIGNATURE);
 
 	hashwrite_u8(f, GRAPH_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, get_num_chunks(cf));
 	hashwrite_u8(f, ctx->num_commit_graphs_after - 1);
 
diff --git a/midx.c b/midx.c
index 865170bad0..65e670c5e2 100644
--- a/midx.c
+++ b/midx.c
@@ -41,18 +41,6 @@
 
 #define PACK_EXPIRED UINT_MAX
 
-static uint8_t oid_version(void)
-{
-	switch (hash_algo_by_ptr(the_hash_algo)) {
-	case GIT_HASH_SHA1:
-		return 1;
-	case GIT_HASH_SHA256:
-		return 2;
-	default:
-		die(_("invalid hash version"));
-	}
-}
-
 const unsigned char *get_midx_checksum(struct multi_pack_index *m)
 {
 	return m->data + m->data_len - the_hash_algo->rawsz;
@@ -134,9 +122,9 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 		      m->version);
 
 	hash_version = m->data[MIDX_BYTE_HASH_VERSION];
-	if (hash_version != oid_version()) {
+	if (hash_version != oid_version(the_hash_algo)) {
 		error(_("multi-pack-index hash version %u does not match version %u"),
-		      hash_version, oid_version());
+		      hash_version, oid_version(the_hash_algo));
 		goto cleanup_fail;
 	}
 	m->hash_len = the_hash_algo->rawsz;
@@ -420,7 +408,7 @@ static size_t write_midx_header(struct hashfile *f,
 {
 	hashwrite_be32(f, MIDX_SIGNATURE);
 	hashwrite_u8(f, MIDX_VERSION);
-	hashwrite_u8(f, oid_version());
+	hashwrite_u8(f, oid_version(the_hash_algo));
 	hashwrite_u8(f, num_chunks);
 	hashwrite_u8(f, 0); /* unused */
 	hashwrite_be32(f, num_packs);
diff --git a/pack-write.c b/pack-write.c
index d594e3008e..ff305b404c 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -2,6 +2,7 @@ #include "pack.h" #include "csum-file.h" #include "remote.h" +#include "chunk-format.h" void reset_pack_idx_option(struct pack_idx_option *opts) { @@ -181,21 +182,9 @@ static int pack_order_cmp(const void *va, const void *vb, void *ctx) static void write_rev_header(struct hashfile *f) { - uint32_t oid_version; - switch (hash_algo_by_ptr(the_hash_algo)) { - case GIT_HASH_SHA1: - oid_version = 1; - break; - case GIT_HASH_SHA256: - oid_version = 2; - break; - default: - die("write_rev_header: unknown hash version"); - } - hashwrite_be32(f, RIDX_SIGNATURE); hashwrite_be32(f, RIDX_VERSION); - hashwrite_be32(f, oid_version); + hashwrite_be32(f, oid_version(the_hash_algo)); } static void write_rev_index_positions(struct hashfile *f, From patchwork Wed Mar 2 00:58:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12765345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 694F2C433FE for ; Wed, 2 Mar 2022 00:58:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238840AbiCBA67 (ORCPT ); Tue, 1 Mar 2022 19:58:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238830AbiCBA6y (ORCPT ); Tue, 1 Mar 2022 19:58:54 -0500 Received: from mail-io1-xd35.google.com (mail-io1-xd35.google.com [IPv6:2607:f8b0:4864:20::d35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E485E41329 for ; Tue, 1 Mar 2022 16:58:11 -0800 (PST) Received: by mail-io1-xd35.google.com with SMTP id t11so143080ioi.7 for ; Tue, 01 Mar 2022 16:58:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; 
h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=P5GgWkN9cl8Xn1q1KDYUKB5PiHhKBc7Ki0Cn957yFXs=; b=4sJl2vzTwjblEivw+fzaLk+ffbmIxEdtzISxDfrqY4tlKykCtItp/u59y8QDPiYqon KksRvl2e0Qcv3oyupRVlcGBMZye0o5J2xlPTIHM/JZ3mHGlILeHrkFx4XXahFZku9H5w haaQZkpMVXB9204V663SWZOKSe3vD1LMyUpO95aw3ubmBoqBxnQ+GVJ3IguC0yE4sIVQ 9+mRe58VB7dBV3ZRJ0T9ZTC3EYDBzMcB7GtqU7LIx9XtWPtIEi7CN3Nj8typ9FgqcrK3 Y+n6KFNcyd7IJN6dEj68NC+HejY3DF9WfJgxm9XJ2IBd8esgfRFhg2ta3Dg1FzdORMbN ZTiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=P5GgWkN9cl8Xn1q1KDYUKB5PiHhKBc7Ki0Cn957yFXs=; b=1UXgeLXzXEEljlla3Zduv7Hm8wpP5Az7tQymjQzzkEL3oUlkNmy3Xl8aHTWlc+p1/a CltftyFuzIZ9sH8JD7XlW1QSm2ORVFpNc1hGltzEZK81kcd85x6HiDo1zMQHH5691I/h 8wLXWv1QR6tgGG4v0Py+cEqqHRo0EkB2jP6Uq7/LVbvmHicu4YMij1Rl5NN/3upZF4+m cbxkLHAIDcczGJrgdhl5vMPZPzmIN7NVgYZoEFA8QXgzmFJ6hUthP8XyZEQhLlWZt979 z8J+MtjhoUh0wtgMj8pSARNSOavHqEDL6ileYjFGn0+oU8BxD95ddQjWFa5YJqv0g2WQ 4LkQ== X-Gm-Message-State: AOAM530FP3TlExorm4fxD6Yw9pLdZ0K7bSnDS/QoeJwzya91PPka4aqP JTXWOHfQURJlapSJJrfOLvlBDtT4y9Qe1S4Z X-Google-Smtp-Source: ABdhPJxLimPd2rlJ3aNvn7fkY5728+qC5ntWz8kaaTzdIFx1fKggu40obtSFFvo3zSQySCLOF1gvuA== X-Received: by 2002:a05:6638:3f0a:b0:315:1190:ae9b with SMTP id ck10-20020a0566383f0a00b003151190ae9bmr24331238jab.210.1646182691186; Tue, 01 Mar 2022 16:58:11 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id p12-20020a92d28c000000b002c29f97824dsm8606957ilp.48.2022.03.01.16.58.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Mar 2022 16:58:10 -0800 (PST) Date: Tue, 1 Mar 2022 19:58:10 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v2 05/17] pack-mtimes: support writing pack .mtimes files Message-ID: <5236490688213ff350b38f618ccb27f055300464.1646182671.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that the `.mtimes` format is defined, supplement the pack-write API to be able to conditionally write an `.mtimes` file along with a pack by setting an additional flag and passing an oidmap that contains the timestamps corresponding to each object in the pack. Signed-off-by: Taylor Blau --- pack-objects.c | 6 ++++ pack-objects.h | 25 ++++++++++++++++ pack-write.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++ pack.h | 1 + 4 files changed, 109 insertions(+) diff --git a/pack-objects.c b/pack-objects.c index fe2a4eace9..272e8d4517 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -170,6 +170,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) REALLOC_ARRAY(pdata->layer, pdata->nr_alloc); + + if (pdata->cruft_mtime) + REALLOC_ARRAY(pdata->cruft_mtime, pdata->nr_alloc); } new_entry = pdata->objects + pdata->nr_objects++; @@ -198,6 +201,9 @@ struct object_entry *packlist_alloc(struct packing_data *pdata, if (pdata->layer) pdata->layer[pdata->nr_objects - 1] = 0; + if (pdata->cruft_mtime) + pdata->cruft_mtime[pdata->nr_objects - 1] = 0; + return new_entry; } diff --git a/pack-objects.h b/pack-objects.h index dca2351ef9..393b9db546 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -168,6 +168,14 @@ struct packing_data { /* delta islands */ unsigned 
int *tree_depth; unsigned char *layer; + + /* + * Used when writing cruft packs. + * + * Object mtimes are stored in pack order when writing, but + * written out in lexicographic (index) order. + */ + uint32_t *cruft_mtime; }; void prepare_packing_data(struct repository *r, struct packing_data *pdata); @@ -289,4 +297,21 @@ static inline void oe_set_layer(struct packing_data *pack, pack->layer[e - pack->objects] = layer; } +static inline uint32_t oe_cruft_mtime(struct packing_data *pack, + struct object_entry *e) +{ + if (!pack->cruft_mtime) + return 0; + return pack->cruft_mtime[e - pack->objects]; +} + +static inline void oe_set_cruft_mtime(struct packing_data *pack, + struct object_entry *e, + uint32_t mtime) +{ + if (!pack->cruft_mtime) + CALLOC_ARRAY(pack->cruft_mtime, pack->nr_alloc); + pack->cruft_mtime[e - pack->objects] = mtime; +} + #endif diff --git a/pack-write.c b/pack-write.c index ff305b404c..270280c4df 100644 --- a/pack-write.c +++ b/pack-write.c @@ -3,6 +3,10 @@ #include "csum-file.h" #include "remote.h" #include "chunk-format.h" +#include "pack-mtimes.h" +#include "oidmap.h" +#include "chunk-format.h" +#include "pack-objects.h" void reset_pack_idx_option(struct pack_idx_option *opts) { @@ -276,6 +280,70 @@ const char *write_rev_file_order(const char *rev_name, return rev_name; } +static void write_mtimes_header(struct hashfile *f) +{ + hashwrite_be32(f, MTIMES_SIGNATURE); + hashwrite_be32(f, MTIMES_VERSION); + hashwrite_be32(f, oid_version(the_hash_algo)); +} + +/* + * Writes the object mtimes of "objects" for use in a .mtimes file. + * Note that objects must be in lexicographic (index) order, which is + * the expected ordering of these values in the .mtimes file. 
+ */ +static void write_mtimes_objects(struct hashfile *f, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t i; + for (i = 0; i < nr_objects; i++) { + struct object_entry *e = (struct object_entry*)objects[i]; + hashwrite_be32(f, oe_cruft_mtime(to_pack, e)); + } +} + +static void write_mtimes_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +static const char *write_mtimes_file(const char *mtimes_name, + struct packing_data *to_pack, + struct pack_idx_entry **objects, + uint32_t nr_objects, + const unsigned char *hash) +{ + struct hashfile *f; + int fd; + + if (!to_pack) + BUG("cannot call write_mtimes_file with NULL packing_data"); + + if (!mtimes_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_mtimes_XXXXXX"); + mtimes_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(mtimes_name); + fd = xopen(mtimes_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + } + f = hashfd(fd, mtimes_name); + + write_mtimes_header(f); + write_mtimes_objects(f, to_pack, objects, nr_objects); + write_mtimes_trailer(f, hash); + + if (adjust_shared_perm(mtimes_name) < 0) + die(_("failed to make %s readable"), mtimes_name); + + finalize_hashfile(f, NULL, + CSUM_HASH_IN_STREAM | CSUM_CLOSE | CSUM_FSYNC); + + return mtimes_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -478,6 +546,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, char **idx_tmp_name) { const char *rev_tmp_name = NULL; + const char *mtimes_tmp_name = NULL; if (adjust_shared_perm(pack_tmp_name)) die_errno("unable to make temporary pack file readable"); @@ -490,9 +559,17 @@ void stage_tmp_packfiles(struct strbuf *name_buffer, rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, pack_idx_opts->flags); + if (pack_idx_opts->flags & WRITE_MTIMES) { + mtimes_tmp_name = write_mtimes_file(NULL, to_pack, 
written_list, + nr_written, + hash); + } + rename_tmp_packfile(name_buffer, pack_tmp_name, "pack"); if (rev_tmp_name) rename_tmp_packfile(name_buffer, rev_tmp_name, "rev"); + if (mtimes_tmp_name) + rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes"); } void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought) diff --git a/pack.h b/pack.h index fd27cfdfd7..01d385903a 100644 --- a/pack.h +++ b/pack.h @@ -44,6 +44,7 @@ struct pack_idx_option { #define WRITE_IDX_STRICT 02 #define WRITE_REV 04 #define WRITE_REV_VERIFY 010 +#define WRITE_MTIMES 020 uint32_t version; uint32_t off32_limit; From patchwork Wed Mar 2 00:58:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12765344 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE9FEC433EF for ; Wed, 2 Mar 2022 00:58:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238829AbiCBA66 (ORCPT ); Tue, 1 Mar 2022 19:58:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238838AbiCBA64 (ORCPT ); Tue, 1 Mar 2022 19:58:56 -0500 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F4EA5DA78 for ; Tue, 1 Mar 2022 16:58:14 -0800 (PST) Received: by mail-io1-xd2b.google.com with SMTP id q8so165664iod.2 for ; Tue, 01 Mar 2022 16:58:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; 
bh=rzAK6KV5383vhhOfHxdpZCKzzhHayX+JyBviHEbbJ8c=; b=4jfaKPjF5SRaEu3ahAQ8qEqA9V13VqdZ6q187NCWLBAK9DE8JRnk+10aC3wAxNq5WR Bk8bHIehTwjy8Xjwb0w8IBNBMhfutoWoDCzaJNoFU6nrwjacyiO/2A7ZmQfhtjCJoWUP dYj3Nx28tpg5vkY0eT1Kp4n/j8eQZ6j5blZBZu/UxrXSNY8qt8UeGDuewku5f8iH3hor AMw4ZAeVWRige6uV0kTVYMn6fE6QSaWoqqCPC21Uf+cvetJnBbnBPR8cq/z5HZA6nxAu POX1E2PX9bYaSqQPVeKBjyXGERFmR4sm26j1whbTXg6qMv1RIVXkkpv6k8obFV1TljKy Va1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=rzAK6KV5383vhhOfHxdpZCKzzhHayX+JyBviHEbbJ8c=; b=dlgfy/Rn4OsK2it0fWTvtjfKV9JhlOBB9GnMngI8so4/nBgQTtyOUtujDwov4VusJK BcIvTlC3R7Odg2x/PPHWCqGHiA97K+w7aNG0CZ+M+f9SMjpu+MBydfW8eVCDCDH8eCl3 pRlHvsnzv+2P2A0bu4zpoU1duefLvbST/fj4V18GgqFiDbIHLeGaTNzXGakKPG0pLRLu bE/iQL0Pp7O2vjtFsivd8aUA6Uy7/ss9OdnV5PN3VNGtvM95I5LOD5/j3sIfmSHJonIB ZOJ+XXZgGxpcmLxSU80x4IiHfGdFLz+zzPsQ7jA5R9Tsf6BsZv6ZOepcFNhC6m3mlZkx UzvQ== X-Gm-Message-State: AOAM5304Z5uevB2/W+oCPaC64mDcjVlD+R316ENJ94nyw2S9FSF1luHT Hd5oOUbWojJKWsB0BhKkALag+FiuP7xMhE6h X-Google-Smtp-Source: ABdhPJxHlb+pg2mldkkshSKNKUgVpUODGCMWMjHhoJATDI+qvwSIIeVzxgBmlsJ967a1hWc1xOQmZg== X-Received: by 2002:a05:6638:69d:b0:314:ac09:428d with SMTP id i29-20020a056638069d00b00314ac09428dmr12589000jab.0.1646182693685; Tue, 01 Mar 2022 16:58:13 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id m1-20020a056e021c2100b002c2ec1c8012sm4355762ilh.53.2022.03.01.16.58.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Mar 2022 16:58:13 -0800 (PST) Date: Tue, 1 Mar 2022 19:58:12 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v2 06/17] t/helper: add 'pack-mtimes' test-tool Message-ID: <78313bc4412fd480c91cc36d6032914ee79368c3.1646182671.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In the next patch, we will implement and test support for writing a cruft pack via a special mode of `git pack-objects`. To make sure that objects are written with the correct timestamps, add a new test-tool that can dump the object names and corresponding timestamps from a given `.mtimes` file. Signed-off-by: Taylor Blau --- Makefile | 1 + t/helper/test-pack-mtimes.c | 56 +++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 59 insertions(+) create mode 100644 t/helper/test-pack-mtimes.c diff --git a/Makefile b/Makefile index 1b186f4fd7..5c0ed1ade7 100644 --- a/Makefile +++ b/Makefile @@ -727,6 +727,7 @@ TEST_BUILTINS_OBJS += test-oid-array.o TEST_BUILTINS_OBJS += test-oidmap.o TEST_BUILTINS_OBJS += test-oidtree.o TEST_BUILTINS_OBJS += test-online-cpus.o +TEST_BUILTINS_OBJS += test-pack-mtimes.o TEST_BUILTINS_OBJS += test-parse-options.o TEST_BUILTINS_OBJS += test-parse-pathspec-file.o TEST_BUILTINS_OBJS += test-partial-clone.o diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c new file mode 100644 index 0000000000..f7b79daf4c --- /dev/null +++ b/t/helper/test-pack-mtimes.c @@ -0,0 +1,56 @@ +#include "git-compat-util.h" +#include "test-tool.h" +#include "strbuf.h" +#include "object-store.h" +#include "packfile.h" +#include
"pack-mtimes.h" + +static void dump_mtimes(struct packed_git *p) +{ + uint32_t i; + if (load_pack_mtimes(p) < 0) + die("could not load pack .mtimes"); + + for (i = 0; i < p->num_objects; i++) { + struct object_id oid; + if (nth_packed_object_id(&oid, p, i) < 0) + die("could not load object id at position %"PRIu32, i); + + printf("%s %"PRIu32"\n", + oid_to_hex(&oid), nth_packed_mtime(p, i)); + } +} + +static const char *pack_mtimes_usage = "\n" +" test-tool pack-mtimes "; + +int cmd__pack_mtimes(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + struct packed_git *p; + + setup_git_directory(); + + if (argc != 2) + usage(pack_mtimes_usage); + + for (p = get_all_packs(the_repository); p; p = p->next) { + strbuf_addstr(&buf, basename(p->pack_name)); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".mtimes"); + + if (!strcmp(buf.buf, argv[1])) + break; + + strbuf_reset(&buf); + } + + strbuf_release(&buf); + + if (!p) + die("could not find pack '%s'", argv[1]); + + dump_mtimes(p); + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index e6ec69cf32..7d472b31fd 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -47,6 +47,7 @@ static struct test_cmd cmds[] = { { "oidmap", cmd__oidmap }, { "oidtree", cmd__oidtree }, { "online-cpus", cmd__online_cpus }, + { "pack-mtimes", cmd__pack_mtimes }, { "parse-options", cmd__parse_options }, { "parse-pathspec-file", cmd__parse_pathspec_file }, { "partial-clone", cmd__partial_clone }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 20756eefdd..0ac4f32955 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -37,6 +37,7 @@ int cmd__mktemp(int argc, const char **argv); int cmd__oidmap(int argc, const char **argv); int cmd__oidtree(int argc, const char **argv); int cmd__online_cpus(int argc, const char **argv); +int cmd__pack_mtimes(int argc, const char **argv); int cmd__parse_options(int argc, const char **argv); int cmd__parse_pathspec_file(int 
argc, const char** argv); int cmd__partial_clone(int argc, const char **argv); From patchwork Wed Mar 2 00:58:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12765346 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A891CC433F5 for ; Wed, 2 Mar 2022 00:58:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238837AbiCBA7I (ORCPT ); Tue, 1 Mar 2022 19:59:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238843AbiCBA7C (ORCPT ); Tue, 1 Mar 2022 19:59:02 -0500 Received: from mail-il1-x129.google.com (mail-il1-x129.google.com [IPv6:2607:f8b0:4864:20::129]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C41BA90260 for ; Tue, 1 Mar 2022 16:58:16 -0800 (PST) Received: by mail-il1-x129.google.com with SMTP id k7so233690ilo.8 for ; Tue, 01 Mar 2022 16:58:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=CIKN/CB83u6sQ1/KE0vEIeyTEwD5VnDa2rxArL8gDJ0=; b=FpGiuIdQ1YjcSSRXKNUicfiySgW9MgPXdXtdzBIIOudGQUUQUeLigf1Qq5Gb+PBZ5F x00dCwVlWj5MyAyeCa09bao1PUD/wg9lAj/LI8APchITNqEzaJzwkhA7GJ8DlSkCbvWJ 8YX8IkA/JL5bhvAPt0DERLDbHBua424BcK3bH9KAFHmwyT8vuI93y6jRH8SU7fBKX2hO vlED7lQYWLTSdoBbwNAJ2QTgHQqbokAhJhD7Zm7cBwvvIurKytE5x9jZdew1XrSAyOXh 8VLgwujIpFjnZWegqpBtUKBpyNRXTVQ092usNj6o8z7X0Nq5W9pDNv9vr1eIYHcclYoO QBgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references 
:mime-version:content-disposition:in-reply-to; bh=CIKN/CB83u6sQ1/KE0vEIeyTEwD5VnDa2rxArL8gDJ0=; b=u73RHJG0LbdG2Zw5YQmWlxeBV82VIZ85tJ5DzireJ8XL0/aDuGK/7r4+YipAAiZstv 9mG754SnMaeyoGPVypnpVbyyRr0sJr0rFegNYbt9GBRzKi7SbsICC4RiGR+3JlFPpIkw Dx508h4RsS1YTBhdrM3Oz4JP81Ssv3OLMAYRskd6sX6vVSgrWFuF0b4H8gcW0mcHz7vX OmFNC0aXFiyv4W7OLbeza2/0IMMZ1IG9w4raNytsV4MVQg67e3rInLsOy9bnajnqDheC fhbbzE7BvMqMIvNK3XzMGa73wRHP+CCb4WRCFfLedpYOTH8fA+p9trIE3GLFIDsKQFpf EkPw== X-Gm-Message-State: AOAM533QGczG/DE9pcb9RR7wPwQeClHNX7301/Rv71/XMsdc8ija8GCf wi4KI6FpN+bTNfES5RmCjIgq2lAFM/fI/YKf X-Google-Smtp-Source: ABdhPJyvy/m6ARLnccTsf+Q8VPCSIRPNLNFWhnOVJokscUI7mcDGvHM98Zkdw28UOe7kV0oaNpoUcw== X-Received: by 2002:a92:c888:0:b0:2c2:fb23:1cf with SMTP id w8-20020a92c888000000b002c2fb2301cfmr8065416ilo.301.1646182696066; Tue, 01 Mar 2022 16:58:16 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id a4-20020a5d9544000000b00640a6eb6e1esm8445326ios.53.2022.03.01.16.58.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Mar 2022 16:58:15 -0800 (PST) Date: Tue, 1 Mar 2022 19:58:15 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v2 07/17] builtin/pack-objects.c: return from create_object_entry() Message-ID: <142098668d1ff6feea69be328ccc55119d14bf13.1646182671.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new caller in the next commit will want to immediately modify the object_entry structure created by create_object_entry(). Instead of forcing that caller to wastefully look up the entry we just created, return it from create_object_entry().
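As a rough illustration of the motivation above, the following standalone sketch contrasts a creation function that returns the new entry with the lookup a caller would otherwise need. Nothing here is Git code: the `sketch_` types and functions, the fixed-size array, and the `mtime` field are all hypothetical and chosen only to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 8

/* Hypothetical, simplified stand-in for an object entry. */
struct sketch_entry {
	uint32_t id;
	int no_try_delta;
	uint32_t mtime;
};

struct sketch_list {
	struct sketch_entry entries[SKETCH_MAX_ENTRIES];
	size_t nr;
};

/*
 * Create an entry and hand it back, so that the caller can modify it
 * right away. Returns NULL when the list is full.
 */
struct sketch_entry *sketch_create_entry(struct sketch_list *list,
					 uint32_t id, int no_try_delta)
{
	struct sketch_entry *entry;

	if (list->nr >= SKETCH_MAX_ENTRIES)
		return NULL;

	entry = &list->entries[list->nr++];
	entry->id = id;
	entry->no_try_delta = no_try_delta;
	entry->mtime = 0;

	return entry;
}

/*
 * The extra linear lookup a caller would need if the creation
 * function returned nothing.
 */
struct sketch_entry *sketch_find_entry(struct sketch_list *list, uint32_t id)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		if (list->entries[i].id == id)
			return &list->entries[i];
	return NULL;
}
```

With the returned pointer, a caller can write `entry->mtime = ...` immediately after creation instead of calling `sketch_find_entry()` for an entry it has just added.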
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 385970cb7b..3f08a3c63a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1508,13 +1508,13 @@ static int want_object_in_pack(const struct object_id *oid, return 1; } -static void create_object_entry(const struct object_id *oid, - enum object_type type, - uint32_t hash, - int exclude, - int no_try_delta, - struct packed_git *found_pack, - off_t found_offset) +static struct object_entry *create_object_entry(const struct object_id *oid, + enum object_type type, + uint32_t hash, + int exclude, + int no_try_delta, + struct packed_git *found_pack, + off_t found_offset) { struct object_entry *entry; @@ -1531,6 +1531,8 @@ static void create_object_entry(const struct object_id *oid, } entry->no_try_delta = no_try_delta; + + return entry; } static const char no_closure_warning[] = N_( From patchwork Wed Mar 2 00:58:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12765347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12D44C433F5 for ; Wed, 2 Mar 2022 00:58:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238854AbiCBA7Q (ORCPT ); Tue, 1 Mar 2022 19:59:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238848AbiCBA7G (ORCPT ); Tue, 1 Mar 2022 19:59:06 -0500 Received: from mail-io1-xd2d.google.com (mail-io1-xd2d.google.com [IPv6:2607:f8b0:4864:20::d2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 96F5C9859B for ; Tue, 1 Mar 2022 
16:58:19 -0800 (PST) Received: by mail-io1-xd2d.google.com with SMTP id c18so146637ioc.6 for ; Tue, 01 Mar 2022 16:58:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=gakm3IQnaKxIxbYR90n9EPvtJzAQTt057eQCWRTllSE=; b=1WLmyfrIFEVPBadX8gFqABNie7HaxRiD+nxQtTyso5YnHVt/PSdmGaF381YYHdnHrA 3CMN/9do3ejCDcVFB6fFXRWb6z39boP5H2KeJwrkeSpzl/Ap1QVVPrmh50uZ0SxqJIdN LefO1PZRrhHyFw4K600Ub9vCqubrOhbKt6s0PlTozZg8YdwPMS0+E6bCqY87ZdldyxLM SawYHhwm9HliIL6wL3qfV+t3Gtje2i2G6QPvq+5zaFa5PNQn+0N1hfrM6M0ht/7sUBQ8 /olcAZCuCdIASfHiDXjhes66aXHZi7bYfKdUZJbPmAltRc4vqTRA9bo/njJ4xyVK7TGc ZwVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=gakm3IQnaKxIxbYR90n9EPvtJzAQTt057eQCWRTllSE=; b=fgOk7nM1XEz5jSP/dwrTu9KHgybJdQuiqK7jHvDNd3xwhNhz56M2cnE+3moMp9JGeB /O/5A23A7bn2si/mluGakDAinhmvQIUvLN8cfmLNZVsE7Z4e7CUudqXOMSvGaKhL4CwM cHFbxDRaq1FbaNFylIc3rqxSeqHBIE7T6qa1DgaccIf4ScvpYlSnd5oYL3uizL4pdwip 0b4kSiQAJeNMAdIIXVjQGt3X6U2Mlx5HP1GPu6PYcIFqPV/5ZUNRFciSl3UtCSvEsV14 hBhbgSMII3YbEg5xUJN4VWhwdHZdeN3jrtAYk7iS12/x4qtM3jDQrabHpqOTIsZPrP79 bCeA== X-Gm-Message-State: AOAM533yHPXPDWGSCVO6C39p8SzBcpx1EAJzLyRuoY7zZvCf707x1XV1 k+sMo0iAJXHSe8ew0Jrj9P18Mhxf3nUooff9 X-Google-Smtp-Source: ABdhPJySJrlChtONCS0JhJqMNoCxJ2URle36SuWsKAune7gaLMAjhQkAxTjtRBbvq6tlmrbbnLgUSw== X-Received: by 2002:a6b:e403:0:b0:640:6b4b:1b41 with SMTP id u3-20020a6be403000000b006406b4b1b41mr21035865iog.9.1646182698507; Tue, 01 Mar 2022 16:58:18 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id o11-20020a92a80b000000b002c1ec0ca545sm8608360ilh.18.2022.03.01.16.58.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Mar 2022 16:58:18 -0800 (PST) Date: Tue, 1 Mar 2022 19:58:17 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com Subject: [PATCH v2 08/17] builtin/pack-objects.c: --cruft without expiration Message-ID: <2517a6be3d48a721dee6b5aa54f73b64e6abd1d6.1646182671.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Teach `pack-objects` how to generate a cruft pack when no objects are dropped (i.e., `--cruft-expiration=never`). Later patches will teach `pack-objects` how to generate a cruft pack that prunes objects. When generating a cruft pack which does not prune objects, we want to collect all unreachable objects into a single pack (noting and updating their mtimes as we accumulate them). Ordinary use will pass the result of a `git repack -A` as a kept pack, so when this patch says "kept pack", readers should think "reachable objects". Generating a non-expiring cruft pack works as follows: - Callers provide a list of every pack they know about, and indicate which packs are about to be removed. - All packs which are not going to be removed are marked as kept in-core; the packs which are going to be removed (we'll call these the redundant ones) are not. Any packs the caller did not mention (but are known to the `pack-objects` process) are also marked as kept in-core. Packs not mentioned by the caller are assumed to be unknown to them, i.e., they entered the repository after the caller decided which packs should be kept and which should be discarded. Since we do not want to include objects in these "unknown" packs (because we don't know which of their objects are or aren't reachable), these are also marked as kept in-core.
- Then, we enumerate all objects in the repository, and add them to our packing list if they do not appear in an in-core kept pack. This results in a new cruft pack which contains all known objects that aren't included in the kept packs. When the kept pack is the result of `git repack -A`, the resulting pack contains all unreachable objects. Signed-off-by: Taylor Blau --- Documentation/git-pack-objects.txt | 30 ++++ builtin/pack-objects.c | 201 +++++++++++++++++++++++++- object-file.c | 2 +- object-store.h | 2 + t/t5328-pack-objects-cruft.sh | 218 +++++++++++++++++++++++++++++ 5 files changed, 448 insertions(+), 5 deletions(-) create mode 100755 t/t5328-pack-objects-cruft.sh diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index f8344e1e5b..a9995a932c 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -13,6 +13,7 @@ SYNOPSIS [--no-reuse-delta] [--delta-base-offset] [--non-empty] [--local] [--incremental] [--window=] [--depth=] [--revs [--unpacked | --all]] [--keep-pack=] + [--cruft] [--cruft-expiration=