From patchwork Thu Jan 24 21:51:54 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget
X-Patchwork-Id: 10780183
Date: Thu, 24 Jan 2019 13:51:54 -0800 (PST)
Message-Id: <62b393b816d65a578c7a0ff9b40f01e0637e4d67.1548366713.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v4 01/10] repack: refactor pack deletion for future use
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano, Derrick Stolee
Sender: git-owner@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

The repack builtin deletes redundant pack-files and their associated
.idx, .promisor, .bitmap, and .keep files. We will want to re-use this
logic in the future for other types of repack, so pull the logic into
'unlink_pack_path()' in packfile.c.

The 'force_delete' parameter is enabled for the use in repack, but
will be important for a future caller.

Signed-off-by: Derrick Stolee
---
 builtin/repack.c | 14 ++------------
 packfile.c       | 28 ++++++++++++++++++++++++++++
 packfile.h       |  7 +++++++
 3 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 45583683ee..3d445b34b4 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -129,19 +129,9 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list,
 
 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
-	const char *exts[] = {".pack", ".idx", ".keep", ".bitmap", ".promisor"};
-	int i;
 	struct strbuf buf = STRBUF_INIT;
-	size_t plen;
-
-	strbuf_addf(&buf, "%s/%s", dir_name, base_name);
-	plen = buf.len;
-
-	for (i = 0; i < ARRAY_SIZE(exts); i++) {
-		strbuf_setlen(&buf, plen);
-		strbuf_addstr(&buf, exts[i]);
-		unlink(buf.buf);
-	}
+	strbuf_addf(&buf, "%s/%s.pack", dir_name, base_name);
+	unlink_pack_path(buf.buf, 1);
 	strbuf_release(&buf);
 }
 
diff --git a/packfile.c b/packfile.c
index d1e6683ffe..bacecb4d0d 100644
--- a/packfile.c
+++ b/packfile.c
@@ -352,6 +352,34 @@ void close_all_packs(struct raw_object_store *o)
 	}
 }
 
+void unlink_pack_path(const char *pack_name, int force_delete)
+{
+	static const char *exts[] = {".pack", ".idx", ".keep", ".bitmap", ".promisor"};
+	int i;
+	struct strbuf buf = STRBUF_INIT;
+	size_t plen;
+
+	strbuf_addstr(&buf, pack_name);
+	strip_suffix_mem(buf.buf, &buf.len, ".pack");
+	plen = buf.len;
+
+	if (!force_delete) {
+		strbuf_addstr(&buf, ".keep");
+		if (!access(buf.buf, F_OK)) {
+			strbuf_release(&buf);
+			return;
+		}
+	}
+
+	for (i = 0; i < ARRAY_SIZE(exts); i++) {
+		strbuf_setlen(&buf, plen);
+		strbuf_addstr(&buf, exts[i]);
+		unlink(buf.buf);
+	}
+
+	strbuf_release(&buf);
+}
+
 /*
  * The LRU pack is the one with the oldest MRU window, preferring packs
  * with no used windows, or the oldest mtime if it has no windows allocated.
diff --git a/packfile.h b/packfile.h
index 6c4037605d..5b7bcdb1dd 100644
--- a/packfile.h
+++ b/packfile.h
@@ -86,6 +86,13 @@ extern void unuse_pack(struct pack_window **);
 extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local);
 
+/*
+ * Unlink the .pack and associated extension files.
+ * Does not unlink if 'force_delete' is false and the pack-file is
+ * marked as ".keep".
+ */
+extern void unlink_pack_path(const char *pack_name, int force_delete);
+
 /*
  * Make sure that a pointer access into an mmap'd index file is within bounds,
  * and can provide at least 8 bytes of data.
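
As a reading aid for the patch above, here is a minimal standalone sketch of the same deletion logic in plain POSIX C, without Git's strbuf API. The name sketch_unlink_pack(), the fixed-size path buffer, and the return value are assumptions made for illustration only; the real unlink_pack_path() returns void and does not reject names that lack a ".pack" suffix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Extensions that may accompany a pack-file, as listed in the patch. */
static const char *const pack_exts[] = {
	".pack", ".idx", ".keep", ".bitmap", ".promisor"
};

/*
 * Hypothetical stand-in for unlink_pack_path(): remove "<base>.pack" and
 * its sibling files. Returns the number of files removed, 0 if a ".keep"
 * file blocked the deletion, or -1 if the name is unusable.
 */
static int sketch_unlink_pack(const char *pack_name, int force_delete)
{
	char path[4096];
	const size_t suffix = strlen(".pack");
	size_t len, plen, i;
	int removed = 0;

	assert(pack_name != NULL);
	len = strlen(pack_name);

	if (len < suffix || strcmp(pack_name + len - suffix, ".pack"))
		return -1;
	plen = len - suffix;
	/* Leave room for the longest extension (".promisor") and the NUL. */
	if (plen + strlen(".promisor") + 1 > sizeof(path))
		return -1;
	memcpy(path, pack_name, plen);

	if (!force_delete) {
		strcpy(path + plen, ".keep");
		if (!access(path, F_OK))
			return 0; /* pack is marked as kept; leave it alone */
	}

	for (i = 0; i < sizeof(pack_exts) / sizeof(pack_exts[0]); i++) {
		strcpy(path + plen, pack_exts[i]);
		if (!unlink(path))
			removed++;
	}
	return removed;
}
```

With force_delete set to 0, a pack that has a ".keep" sibling is left untouched; with force_delete set to 1, as in the repack caller, every sibling file that exists is removed.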
From patchwork Thu Jan 24 21:51:55 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget
X-Patchwork-Id: 10780185
Date: Thu, 24 Jan 2019 13:51:55 -0800 (PST)
Message-Id: <78867859042eea6318da65e236f903d6102ed93b.1548366713.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v4 02/10] Docs: rearrange subcommands for multi-pack-index
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano, Derrick Stolee
Sender: git-owner@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

We will add new subcommands to the multi-pack-index, and that will
make the documentation a bit messier. Clean up the 'verb' descriptions
by renaming the concept to 'subcommand' and removing the reference
to the object directory.

Helped-by: Stefan Beller
Helped-by: Szeder Gábor
Signed-off-by: Derrick Stolee
---
 Documentation/git-multi-pack-index.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index f7778a2c85..1af406aca2 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -9,7 +9,7 @@ git-multi-pack-index - Write and verify multi-pack-indexes
 SYNOPSIS
 --------
 [verse]
-'git multi-pack-index' [--object-dir=]
+'git multi-pack-index' [--object-dir=]
 
 DESCRIPTION
 -----------
@@ -23,13 +23,13 @@ OPTIONS
 	`/packs/multi-pack-index` for the current MIDX file, and
 	`/packs` for the pack-files to index.
 
+The following subcommands are available:
+
 write::
-	When given as the verb, write a new MIDX file to
-	`/packs/multi-pack-index`.
+	Write a new MIDX file.
 
 verify::
-	When given as the verb, verify the contents of the MIDX file
-	at `/packs/multi-pack-index`.
+	Verify the contents of the MIDX file.
 
 
 EXAMPLES

From patchwork Thu Jan 24 21:51:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget
X-Patchwork-Id: 10780195
Date: Thu, 24 Jan 2019 13:51:56 -0800 (PST)
Message-Id: <628ca4603690b1239d722df8560dcd0b3790738d.1548366713.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v4 03/10] multi-pack-index: prepare for 'expire' subcommand
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano, Derrick Stolee
Sender: git-owner@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

The multi-pack-index tracks objects in a collection of pack-files.
Only one copy of each object is indexed, using the modified time
of the pack-files to determine tie-breakers. It is possible to
have a pack-file with no referenced objects because all objects
have a duplicate in a newer pack-file.

Introduce a new 'expire' subcommand to the multi-pack-index builtin.
This subcommand will delete these unused pack-files and rewrite the
multi-pack-index to no longer refer to those files. More details
about the specifics will follow as the method is implemented.

Add a test that verifies the 'expire' subcommand is correctly wired,
but will still be valid when the verb is implemented. Specifically,
create a set of packs that should all have referenced objects and
should not be removed during an 'expire' operation. The packs are
created carefully to ensure they have a specific order when sorted
by size. This will be important in a later test.

Signed-off-by: Derrick Stolee
---
 Documentation/git-multi-pack-index.txt |  5 +++
 builtin/multi-pack-index.c             |  4 ++-
 midx.c                                 |  5 +++
 midx.h                                 |  1 +
 t/t5319-multi-pack-index.sh            | 49 ++++++++++++++++++++++++++
 5 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index 1af406aca2..6186c4c936 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -31,6 +31,11 @@ write::
 verify::
 	Verify the contents of the MIDX file.
 
+expire::
+	Delete the pack-files that are tracked by the MIDX file, but
+	have no objects referenced by the MIDX. Rewrite the MIDX file
+	afterward to remove all references to these pack-files.
+
 
 EXAMPLES
 --------
diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index fca70f8e4f..145de3a46c 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -5,7 +5,7 @@
 #include "midx.h"
 
 static char const * const builtin_multi_pack_index_usage[] = {
-	N_("git multi-pack-index [--object-dir=] (write|verify)"),
+	N_("git multi-pack-index [--object-dir=] (write|verify|expire)"),
 	NULL
 };
 
@@ -44,6 +44,8 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		return write_midx_file(opts.object_dir);
 	if (!strcmp(argv[0], "verify"))
 		return verify_midx_file(opts.object_dir);
+	if (!strcmp(argv[0], "expire"))
+		return expire_midx_packs(opts.object_dir);
 
 	die(_("unrecognized verb: %s"), argv[0]);
 }
diff --git a/midx.c b/midx.c
index 730ff84dff..bb825ef816 100644
--- a/midx.c
+++ b/midx.c
@@ -1025,3 +1025,8 @@ int verify_midx_file(const char *object_dir)
 
 	return verify_midx_error;
 }
+
+int expire_midx_packs(const char *object_dir)
+{
+	return 0;
+}
diff --git a/midx.h b/midx.h
index 774f652530..e3a2b740b5 100644
--- a/midx.h
+++ b/midx.h
@@ -49,6 +49,7 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
 int write_midx_file(const char *object_dir);
 void clear_midx_file(struct repository *r);
 int verify_midx_file(const char *object_dir);
+int expire_midx_packs(const char *object_dir);
 
 void close_midx(struct multi_pack_index *m);
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 70926b5bc0..a8528f7da0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -348,4 +348,53 @@ test_expect_success 'verify incorrect 64-bit offset' '
 		"incorrect object offset"
 '
 
+test_expect_success 'setup expire tests' '
+	mkdir dup &&
+	(
+		cd dup &&
+		git init &&
+		test-tool genrandom "data" 4096 >large_file.txt &&
+		git update-index --add large_file.txt &&
+		for i in $(test_seq 1 20)
+		do
+			test_commit $i
+		done &&
+		git branch A HEAD &&
+		git branch B HEAD~8 &&
+		git branch C HEAD~13 &&
+		git branch D HEAD~16 &&
+		git branch E HEAD~18 &&
+		git pack-objects --revs .git/objects/pack/pack-A <<-EOF &&
+		refs/heads/A
+		^refs/heads/B
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-B <<-EOF &&
+		refs/heads/B
+		^refs/heads/C
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-C <<-EOF &&
+		refs/heads/C
+		^refs/heads/D
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-D <<-EOF &&
+		refs/heads/D
+		^refs/heads/E
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-E <<-EOF &&
+		refs/heads/E
+		EOF
+		git multi-pack-index write
+	)
+'
+
+test_expect_success 'expire does not remove any packs' '
+	(
+		cd dup &&
+		ls .git/objects/pack >expect &&
+		git multi-pack-index expire &&
+		ls .git/objects/pack >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done

From patchwork Thu Jan 24 21:51:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget
X-Patchwork-Id: 10780187
Date: Thu, 24 Jan 2019 13:51:57 -0800 (PST)
Message-Id:
In-Reply-To:
References:
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v4 04/10] midx: simplify computation of pack name lengths
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano, Derrick Stolee
Sender: git-owner@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

Before writing the multi-pack-index, we compute the length of the
pack-index names concatenated together. This forms the data in the
pack name chunk, and we precompute it to compute chunk offsets.
The value is also modified to fit alignment needs.

Previously, this computation was coupled with adding packs from
the existing multi-pack-index and the remaining packs in the object
dir not already covered by the multi-pack-index.

In anticipation of this becoming more complicated with the 'expire'
subcommand, simplify the computation by centralizing it to a single
loop before writing the file.
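
The computation described above could be sketched as follows. This is a standalone illustration, not the code from the patch: the helper name sketch_pack_name_chunk_len() is hypothetical, the alignment of 4 is an assumed value for MIDX_CHUNK_ALIGNMENT, and size_t is used where the patch uses int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value; the patch only refers to MIDX_CHUNK_ALIGNMENT by name. */
#define SKETCH_CHUNK_ALIGNMENT 4

/*
 * Length of the pack name chunk: every name plus its terminating NUL,
 * rounded up to the next multiple of the chunk alignment.
 */
static size_t sketch_pack_name_chunk_len(const char *const *names, size_t nr)
{
	size_t i, len = 0;

	assert(nr == 0 || names != NULL);

	/* One pass over all names, as in the single loop the patch adds. */
	for (i = 0; i < nr; i++)
		len += strlen(names[i]) + 1;

	/* Pad only when the total is not already aligned. */
	if (len % SKETCH_CHUNK_ALIGNMENT)
		len += SKETCH_CHUNK_ALIGNMENT - (len % SKETCH_CHUNK_ALIGNMENT);

	return len;
}
```

Doing the sum in one place means the result depends only on the final list of names, so names dropped from the list later never need to be subtracted again.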
Signed-off-by: Derrick Stolee
---
 midx.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/midx.c b/midx.c
index bb825ef816..f087bbbe82 100644
--- a/midx.c
+++ b/midx.c
@@ -383,7 +383,6 @@ struct pack_list {
 	uint32_t nr;
 	uint32_t alloc_list;
 	uint32_t alloc_names;
-	size_t pack_name_concat_len;
 	struct multi_pack_index *m;
 };
 
@@ -418,7 +417,6 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len,
 		}
 
 		packs->names[packs->nr] = xstrdup(file_name);
-		packs->pack_name_concat_len += strlen(file_name) + 1;
 		packs->nr++;
 	}
 }
@@ -762,6 +760,7 @@ int write_midx_file(const char *object_dir)
 	uint32_t nr_entries, num_large_offsets = 0;
 	struct pack_midx_entry *entries = NULL;
 	int large_offsets_needed = 0;
+	int pack_name_concat_len = 0;
 
 	midx_name = get_midx_filename(object_dir);
 	if (safe_create_leading_directories(midx_name)) {
@@ -777,7 +776,6 @@ int write_midx_file(const char *object_dir)
 	packs.alloc_names = packs.alloc_list;
 	packs.list = NULL;
 	packs.names = NULL;
-	packs.pack_name_concat_len = 0;
 	ALLOC_ARRAY(packs.list, packs.alloc_list);
 	ALLOC_ARRAY(packs.names, packs.alloc_names);
 
@@ -788,7 +786,6 @@ int write_midx_file(const char *object_dir)
 
 			packs.list[packs.nr] = NULL;
 			packs.names[packs.nr] = xstrdup(packs.m->pack_names[i]);
-			packs.pack_name_concat_len += strlen(packs.names[packs.nr]) + 1;
 			packs.nr++;
 		}
 	}
@@ -798,10 +795,6 @@ int write_midx_file(const char *object_dir)
 	if (packs.m && packs.nr == packs.m->num_packs)
 		goto cleanup;
 
-	if (packs.pack_name_concat_len % MIDX_CHUNK_ALIGNMENT)
-		packs.pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
-					      (packs.pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
-
 	ALLOC_ARRAY(pack_perm, packs.nr);
 	sort_packs_by_name(packs.names, packs.nr, pack_perm);
 
@@ -814,6 +807,13 @@ int write_midx_file(const char *object_dir)
 			large_offsets_needed = 1;
 	}
 
+	for (i = 0; i < packs.nr; i++)
+		pack_name_concat_len += strlen(packs.names[i]) + 1;
+
+	if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT)
+		pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
+					(pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
+
 	hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
 	f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf);
 	FREE_AND_NULL(midx_name);
@@ -831,7 +831,7 @@ int write_midx_file(const char *object_dir)
 	cur_chunk++;
 
 	chunk_ids[cur_chunk] = MIDX_CHUNKID_OIDFANOUT;
-	chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] + packs.pack_name_concat_len;
+	chunk_offsets[cur_chunk] = chunk_offsets[cur_chunk - 1] + pack_name_concat_len;
 
 	cur_chunk++;
 	chunk_ids[cur_chunk] = MIDX_CHUNKID_OIDLOOKUP;

From patchwork Thu Jan 24 21:51:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget
X-Patchwork-Id: 10780191
Date: Thu, 24 Jan 2019 13:51:57 -0800 (PST)
Message-Id: <3950743b963049f502f1e5caefcfe477ebbcdd8b.1548366713.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v4 05/10] midx: refactor permutation logic and pack sorting
To: git@vger.kernel.org
Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano, Derrick Stolee
Sender: git-owner@vger.kernel.org
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

In anticipation of the expire subcommand, refactor the way we sort
the packfiles by name. This will greatly simplify our approach to
dropping expired packs from the list.

First, create 'struct pack_info' to replace 'struct pack_pair'. This
struct contains the necessary information about a pack, including
its name, a pointer to its packfile struct (if not already in the
multi-pack-index), and the original pack-int-id.

Second, track the pack information using an array of pack_info
structs in the pack_list struct. This simplifies the logic around
the multiple arrays we were tracking in that struct.

Finally, update get_sorted_entries() to not permute the pack-int-id
and instead supply the permutation to write_midx_object_offsets().
This requires sorting the packs after get_sorted_entries().
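
The sorting step described above might look roughly like this in isolation. The names sketch_pack_info and sketch_sort_packs() are hypothetical, the struct omits the packed_git pointer, and plain qsort() stands in for Git's QSORT macro. The perm[] array is filled the same way the removed sort_packs_by_name() filled it, mapping each original pack-int-id to its position after sorting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reduced form of the patch's struct pack_info (no packed_git pointer). */
struct sketch_pack_info {
	uint32_t orig_pack_int_id;
	const char *pack_name;
};

static int sketch_pack_info_compare(const void *_a, const void *_b)
{
	const struct sketch_pack_info *a = _a;
	const struct sketch_pack_info *b = _b;
	return strcmp(a->pack_name, b->pack_name);
}

/*
 * Sort info[] by pack name and fill perm[] so that
 * perm[original id] == position after sorting.
 * perm must hold nr entries and the original ids must be 0..nr-1.
 */
static void sketch_sort_packs(struct sketch_pack_info *info, uint32_t nr,
			      uint32_t *perm)
{
	uint32_t i;

	if (!nr)
		return;

	qsort(info, nr, sizeof(*info), sketch_pack_info_compare);

	for (i = 0; i < nr; i++) {
		assert(info[i].orig_pack_int_id < nr);
		perm[info[i].orig_pack_int_id] = i;
	}
}
```

Because each element carries its own original id, the array can be sorted in place and the permutation recovered afterwards, which is what lets the entries keep their unpermuted pack-int-id until the offsets are written.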
Signed-off-by: Derrick Stolee --- midx.c | 156 +++++++++++++++++++++++++-------------------------------- 1 file changed, 69 insertions(+), 87 deletions(-) diff --git a/midx.c b/midx.c index f087bbbe82..95c39106b2 100644 --- a/midx.c +++ b/midx.c @@ -377,12 +377,23 @@ static size_t write_midx_header(struct hashfile *f, return MIDX_HEADER_SIZE; } +struct pack_info { + uint32_t orig_pack_int_id; + char *pack_name; + struct packed_git *p; +}; + +static int pack_info_compare(const void *_a, const void *_b) +{ + struct pack_info *a = (struct pack_info *)_a; + struct pack_info *b = (struct pack_info *)_b; + return strcmp(a->pack_name, b->pack_name); +} + struct pack_list { - struct packed_git **list; - char **names; + struct pack_info *info; uint32_t nr; - uint32_t alloc_list; - uint32_t alloc_names; + uint32_t alloc; struct multi_pack_index *m; }; @@ -395,66 +406,32 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, if (packs->m && midx_contains_pack(packs->m, file_name)) return; - ALLOC_GROW(packs->list, packs->nr + 1, packs->alloc_list); - ALLOC_GROW(packs->names, packs->nr + 1, packs->alloc_names); + ALLOC_GROW(packs->info, packs->nr + 1, packs->alloc); - packs->list[packs->nr] = add_packed_git(full_path, - full_path_len, - 0); + packs->info[packs->nr].p = add_packed_git(full_path, + full_path_len, + 0); - if (!packs->list[packs->nr]) { + if (!packs->info[packs->nr].p) { warning(_("failed to add packfile '%s'"), full_path); return; } - if (open_pack_index(packs->list[packs->nr])) { + if (open_pack_index(packs->info[packs->nr].p)) { warning(_("failed to open pack-index '%s'"), full_path); - close_pack(packs->list[packs->nr]); - FREE_AND_NULL(packs->list[packs->nr]); + close_pack(packs->info[packs->nr].p); + FREE_AND_NULL(packs->info[packs->nr].p); return; } - packs->names[packs->nr] = xstrdup(file_name); + packs->info[packs->nr].pack_name = xstrdup(file_name); + packs->info[packs->nr].orig_pack_int_id = packs->nr; packs->nr++; } } -struct 
pack_pair { - uint32_t pack_int_id; - char *pack_name; -}; - -static int pack_pair_compare(const void *_a, const void *_b) -{ - struct pack_pair *a = (struct pack_pair *)_a; - struct pack_pair *b = (struct pack_pair *)_b; - return strcmp(a->pack_name, b->pack_name); -} - -static void sort_packs_by_name(char **pack_names, uint32_t nr_packs, uint32_t *perm) -{ - uint32_t i; - struct pack_pair *pairs; - - ALLOC_ARRAY(pairs, nr_packs); - - for (i = 0; i < nr_packs; i++) { - pairs[i].pack_int_id = i; - pairs[i].pack_name = pack_names[i]; - } - - QSORT(pairs, nr_packs, pack_pair_compare); - - for (i = 0; i < nr_packs; i++) { - pack_names[i] = pairs[i].pack_name; - perm[pairs[i].pack_int_id] = i; - } - - free(pairs); -} - struct pack_midx_entry { struct object_id oid; uint32_t pack_int_id; @@ -480,7 +457,6 @@ static int midx_oid_compare(const void *_a, const void *_b) } static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, - uint32_t *pack_perm, struct pack_midx_entry *e, uint32_t pos) { @@ -488,7 +464,7 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, return 1; nth_midxed_object_oid(&e->oid, m, pos); - e->pack_int_id = pack_perm[nth_midxed_pack_int_id(m, pos)]; + e->pack_int_id = nth_midxed_pack_int_id(m, pos); e->offset = nth_midxed_offset(m, pos); /* consider objects in midx to be from "old" packs */ @@ -522,8 +498,7 @@ static void fill_pack_entry(uint32_t pack_int_id, * of a packfile containing the object). */ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, - struct packed_git **p, - uint32_t *perm, + struct pack_info *info, uint32_t nr_packs, uint32_t *nr_objects) { @@ -534,7 +509,7 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, uint32_t start_pack = m ? 
m->num_packs : 0; for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) - total_objects += p[cur_pack]->num_objects; + total_objects += info[cur_pack].p->num_objects; /* * As we de-duplicate by fanout value, we expect the fanout @@ -559,7 +534,7 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, for (cur_object = start; cur_object < end; cur_object++) { ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout); - nth_midxed_pack_midx_entry(m, perm, + nth_midxed_pack_midx_entry(m, &entries_by_fanout[nr_fanout], cur_object); nr_fanout++; @@ -570,12 +545,12 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, uint32_t start = 0, end; if (cur_fanout) - start = get_pack_fanout(p[cur_pack], cur_fanout - 1); - end = get_pack_fanout(p[cur_pack], cur_fanout); + start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1); + end = get_pack_fanout(info[cur_pack].p, cur_fanout); for (cur_object = start; cur_object < end; cur_object++) { ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout); - fill_pack_entry(perm[cur_pack], p[cur_pack], cur_object, &entries_by_fanout[nr_fanout]); + fill_pack_entry(cur_pack, info[cur_pack].p, cur_object, &entries_by_fanout[nr_fanout]); nr_fanout++; } } @@ -604,7 +579,7 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, } static size_t write_midx_pack_names(struct hashfile *f, - char **pack_names, + struct pack_info *info, uint32_t num_packs) { uint32_t i; @@ -612,14 +587,14 @@ static size_t write_midx_pack_names(struct hashfile *f, size_t written = 0; for (i = 0; i < num_packs; i++) { - size_t writelen = strlen(pack_names[i]) + 1; + size_t writelen = strlen(info[i].pack_name) + 1; - if (i && strcmp(pack_names[i], pack_names[i - 1]) <= 0) + if (i && strcmp(info[i].pack_name, info[i - 1].pack_name) <= 0) BUG("incorrect pack-file order: %s before %s", - pack_names[i - 1], - pack_names[i]); + info[i - 1].pack_name, + info[i].pack_name); - hashwrite(f, 
pack_names[i], writelen); + hashwrite(f, info[i].pack_name, writelen); written += writelen; } @@ -690,6 +665,7 @@ static size_t write_midx_oid_lookup(struct hashfile *f, unsigned char hash_len, } static size_t write_midx_object_offsets(struct hashfile *f, int large_offset_needed, + uint32_t *perm, struct pack_midx_entry *objects, uint32_t nr_objects) { struct pack_midx_entry *list = objects; @@ -699,7 +675,7 @@ static size_t write_midx_object_offsets(struct hashfile *f, int large_offset_nee for (i = 0; i < nr_objects; i++) { struct pack_midx_entry *obj = list++; - hashwrite_be32(f, obj->pack_int_id); + hashwrite_be32(f, perm[obj->pack_int_id]); if (large_offset_needed && obj->offset >> 31) hashwrite_be32(f, MIDX_LARGE_OFFSET_NEEDED | nr_large_offset++); @@ -772,20 +748,17 @@ int write_midx_file(const char *object_dir) packs.m = load_multi_pack_index(object_dir, 1); packs.nr = 0; - packs.alloc_list = packs.m ? packs.m->num_packs : 16; - packs.alloc_names = packs.alloc_list; - packs.list = NULL; - packs.names = NULL; - ALLOC_ARRAY(packs.list, packs.alloc_list); - ALLOC_ARRAY(packs.names, packs.alloc_names); + packs.alloc = packs.m ? 
packs.m->num_packs : 16; + packs.info = NULL; + ALLOC_ARRAY(packs.info, packs.alloc); if (packs.m) { for (i = 0; i < packs.m->num_packs; i++) { - ALLOC_GROW(packs.list, packs.nr + 1, packs.alloc_list); - ALLOC_GROW(packs.names, packs.nr + 1, packs.alloc_names); + ALLOC_GROW(packs.info, packs.nr + 1, packs.alloc); - packs.list[packs.nr] = NULL; - packs.names[packs.nr] = xstrdup(packs.m->pack_names[i]); + packs.info[packs.nr].orig_pack_int_id = i; + packs.info[packs.nr].pack_name = xstrdup(packs.m->pack_names[i]); + packs.info[packs.nr].p = NULL; packs.nr++; } } @@ -795,10 +768,7 @@ int write_midx_file(const char *object_dir) if (packs.m && packs.nr == packs.m->num_packs) goto cleanup; - ALLOC_ARRAY(pack_perm, packs.nr); - sort_packs_by_name(packs.names, packs.nr, pack_perm); - - entries = get_sorted_entries(packs.m, packs.list, pack_perm, packs.nr, &nr_entries); + entries = get_sorted_entries(packs.m, packs.info, packs.nr, &nr_entries); for (i = 0; i < nr_entries; i++) { if (entries[i].offset > 0x7fffffff) @@ -807,8 +777,21 @@ int write_midx_file(const char *object_dir) large_offsets_needed = 1; } + QSORT(packs.info, packs.nr, pack_info_compare); + + /* + * pack_perm stores a permutation between pack-int-ids from the + * previous multi-pack-index to the new one we are writing: + * + * pack_perm[old_id] = new_id + */ + ALLOC_ARRAY(pack_perm, packs.nr); + for (i = 0; i < packs.nr; i++) { + pack_perm[packs.info[i].orig_pack_int_id] = i; + } + for (i = 0; i < packs.nr; i++) - pack_name_concat_len += strlen(packs.names[i]) + 1; + pack_name_concat_len += strlen(packs.info[i].pack_name) + 1; if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - @@ -879,7 +862,7 @@ int write_midx_file(const char *object_dir) switch (chunk_ids[i]) { case MIDX_CHUNKID_PACKNAMES: - written += write_midx_pack_names(f, packs.names, packs.nr); + written += write_midx_pack_names(f, packs.info, packs.nr); break; case MIDX_CHUNKID_OIDFANOUT: @@ -891,7 +874,7 
@@ int write_midx_file(const char *object_dir) break; case MIDX_CHUNKID_OBJECTOFFSETS: - written += write_midx_object_offsets(f, large_offsets_needed, entries, nr_entries); + written += write_midx_object_offsets(f, large_offsets_needed, pack_perm, entries, nr_entries); break; case MIDX_CHUNKID_LARGEOFFSETS: @@ -914,15 +897,14 @@ int write_midx_file(const char *object_dir) cleanup: for (i = 0; i < packs.nr; i++) { - if (packs.list[i]) { - close_pack(packs.list[i]); - free(packs.list[i]); + if (packs.info[i].p) { + close_pack(packs.info[i].p); + free(packs.info[i].p); } - free(packs.names[i]); + free(packs.info[i].pack_name); } - free(packs.list); - free(packs.names); + free(packs.info); free(entries); free(pack_perm); free(midx_name); From patchwork Thu Jan 24 21:51:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget X-Patchwork-Id: 10780193 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A191413B4 for ; Thu, 24 Jan 2019 21:52:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 909872FECC for ; Thu, 24 Jan 2019 21:52:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8EAD12FEDA; Thu, 24 Jan 2019 21:52:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C77C02FEEC for ; Thu, 24 Jan 2019 21:52:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org 
via listexpand id S1728245AbfAXVwD (ORCPT ); Thu, 24 Jan 2019 16:52:03 -0500 Received: from mail-ed1-f67.google.com ([209.85.208.67]:36362 "EHLO mail-ed1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728180AbfAXVwB (ORCPT ); Thu, 24 Jan 2019 16:52:01 -0500 Received: by mail-ed1-f67.google.com with SMTP id f23so5831271edb.3 for ; Thu, 24 Jan 2019 13:52:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=7vOePW10TehWuekX3AHCQpH6KR8rxnd2tOqRyX9geZI=; b=nEFn75os4qhz0uTwh35kdVerj3uLgj+IbznUhVFFDayHBP2VA+t/AhuzsKXscFxnEG /5gqNWr31ZmmIXOHbzrVTVhZEEPdOA1hzM+MPfXmnEsBM0nclcGs15imIJEtfFRu+9jE /6MzJ37QKnXEB1dWc1RYWa1vWUA63uxTmxjlzNPzVmPd+8uxvymk29pQhrtzoIpnFCXz t1sM+uwiL2yknlx5bjHsMj8OhmetPuOMyJ2fxIUiUMilYnv7e/M/wqGMXkX+P3tHs7LY vJdXOJrEQZsJ0rGkBoPtPMELpIzcle5IIc0OHr91+z67IbxtkLyyfdVBIuuFO9pXBpnw O73A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=7vOePW10TehWuekX3AHCQpH6KR8rxnd2tOqRyX9geZI=; b=A8y/dDceRy2UA5Didsq9h2WLjFVOym7mzL1VWBRkdlm7v/4L/V3eNJ6cJfbiNoS7oU /JQdpgC3vqwCv5P8N4k6vK9DHD27xPrOrlXRNQOLNvWOvopXZ8d7OVKfGjke26E39MZR JXkN1nn5n/EL4eMPBMn0vYVGvNcOrTGIx7RV3qypAf4aFVxlXG3Gh0UVDx1xzHat17J0 Qou0/3CD0fml8ZpNaLk2bA4PeQHoyUS9sWA/nBx5yYeHApYXaLjFLPVoGF/Mdo8B39gH T5Zg76XM04HvQuyHN3/s6JG1dwE5n6+1T41uy2CvmQAqGwEpDGt7mExUtZEaj17dvqDp R1dQ== X-Gm-Message-State: AJcUukdv/IMSa3DtdS7WfxKbTleUbDpwE/RyhtTcIFdWAnmwdNVruVia hQzV0Mzn+r2+kQV7gww/pj6kV9IG X-Google-Smtp-Source: ALg8bN4qtNJNJb0Q4cbVuVwzMs0Vi4ujMTpffyTPU2B0u4uKb7u7MHNBo5amMkbapuhWE8DJ2lXfcw== X-Received: by 2002:a50:ba5c:: with SMTP id 28mr8083606eds.91.1548366719462; Thu, 24 Jan 2019 13:51:59 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 
t11sm779468ejl.69.2019.01.24.13.51.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 13:51:59 -0800 (PST) Date: Thu, 24 Jan 2019 13:51:59 -0800 (PST) X-Google-Original-Date: Thu, 24 Jan 2019 21:51:49 GMT Message-Id: <6691d9790229d308bc4071bfe1d116da13e3a868.1548366713.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v4 06/10] multi-pack-index: implement 'expire' subcommand Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Derrick Stolee The 'git multi-pack-index expire' subcommand looks at the existing multi-pack-index, counts the number of objects referenced in each pack-file, deletes the pack-files with no referenced objects, and rewrites the multi-pack-index to no longer reference those packs. Refactor the write_midx_file() method to call write_midx_internal(), which now takes an existing 'struct multi_pack_index' and a list of pack-files to drop (as specified by the names of their pack-indexes). As we write the new multi-pack-index, we drop those file names from the list of known pack-files. The expire_midx_packs() method removes the unreferenced pack-files after carefully closing the packs to avoid open handles. Test that a new pack-file that covers the contents of two other pack-files leads to those pack-files being deleted during the expire subcommand. Be sure to read the multi-pack-index to ensure it no longer references those packs. 
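The counting and renumbering described above can be sketched as follows. This is a simplified illustration, not the code in midx.c: mark_unreferenced() and build_perm_with_drops() are hypothetical helper names, the object-to-pack mapping is passed in as a plain array instead of being read through nth_midxed_pack_int_id(), and the sketch ignores '.keep' files and the actual deletion of pack-files.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define PACK_EXPIRED UINT_MAX

struct pack_info {
	uint32_t orig_pack_int_id;
	const char *pack_name;
	unsigned expired : 1;
};

/*
 * Hypothetical helper: flag every pack that no object refers to.
 * object_pack_ids[i] is the pack-int-id holding object i.
 * Returns 0 on success, -1 on allocation failure or an out-of-range id.
 */
static int mark_unreferenced(struct pack_info *info, uint32_t nr_packs,
			     const uint32_t *object_pack_ids,
			     uint32_t nr_objects)
{
	uint32_t i, *count = calloc(nr_packs ? nr_packs : 1, sizeof(*count));

	if (!count)
		return -1;

	for (i = 0; i < nr_objects; i++) {
		if (object_pack_ids[i] >= nr_packs) {
			free(count);
			return -1;
		}
		count[object_pack_ids[i]]++;
	}

	for (i = 0; i < nr_packs; i++)
		info[i].expired = !count[info[i].orig_pack_int_id];

	free(count);
	return 0;
}

/*
 * Hypothetical helper: 'info' must already be sorted by name. Expired packs
 * map to PACK_EXPIRED; the remaining packs are renumbered without gaps.
 * Returns the number of packs that survive.
 */
static uint32_t build_perm_with_drops(const struct pack_info *info,
				      uint32_t nr, uint32_t *perm)
{
	uint32_t i, dropped = 0;

	for (i = 0; i < nr; i++) {
		assert(info[i].orig_pack_int_id < nr);
		if (info[i].expired) {
			dropped++;
			perm[info[i].orig_pack_int_id] = PACK_EXPIRED;
		} else {
			perm[info[i].orig_pack_int_id] = i - dropped;
		}
	}
	return nr - dropped;
}
```

The return value of build_perm_with_drops() corresponds to the 'packs.nr - dropped_packs' count that the patch passes to write_midx_header().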
Signed-off-by: Derrick Stolee --- midx.c | 120 +++++++++++++++++++++++++++++++++--- t/t5319-multi-pack-index.sh | 20 ++++++ 2 files changed, 130 insertions(+), 10 deletions(-) diff --git a/midx.c b/midx.c index 95c39106b2..299e9b2e8f 100644 --- a/midx.c +++ b/midx.c @@ -33,6 +33,8 @@ #define MIDX_CHUNK_LARGE_OFFSET_WIDTH (sizeof(uint64_t)) #define MIDX_LARGE_OFFSET_NEEDED 0x80000000 +#define PACK_EXPIRED UINT_MAX + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); @@ -381,6 +383,7 @@ struct pack_info { uint32_t orig_pack_int_id; char *pack_name; struct packed_git *p; + unsigned expired : 1; }; static int pack_info_compare(const void *_a, const void *_b) @@ -428,6 +431,7 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, packs->info[packs->nr].pack_name = xstrdup(file_name); packs->info[packs->nr].orig_pack_int_id = packs->nr; + packs->info[packs->nr].expired = 0; packs->nr++; } } @@ -587,13 +591,17 @@ static size_t write_midx_pack_names(struct hashfile *f, size_t written = 0; for (i = 0; i < num_packs; i++) { - size_t writelen = strlen(info[i].pack_name) + 1; + size_t writelen; + + if (info[i].expired) + continue; if (i && strcmp(info[i].pack_name, info[i - 1].pack_name) <= 0) BUG("incorrect pack-file order: %s before %s", info[i - 1].pack_name, info[i].pack_name); + writelen = strlen(info[i].pack_name) + 1; hashwrite(f, info[i].pack_name, writelen); written += writelen; } @@ -675,6 +683,11 @@ static size_t write_midx_object_offsets(struct hashfile *f, int large_offset_nee for (i = 0; i < nr_objects; i++) { struct pack_midx_entry *obj = list++; + if (perm[obj->pack_int_id] == PACK_EXPIRED) + BUG("object %s is in an expired pack with int-id %d", + oid_to_hex(&obj->oid), + obj->pack_int_id); + hashwrite_be32(f, perm[obj->pack_int_id]); if (large_offset_needed && obj->offset >> 31) @@ -721,7 +734,8 @@ static size_t write_midx_large_offsets(struct hashfile *f, uint32_t nr_large_off 
return written; } -int write_midx_file(const char *object_dir) +static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, + struct string_list *packs_to_drop) { unsigned char cur_chunk, num_chunks = 0; char *midx_name; @@ -737,6 +751,8 @@ int write_midx_file(const char *object_dir) struct pack_midx_entry *entries = NULL; int large_offsets_needed = 0; int pack_name_concat_len = 0; + int dropped_packs = 0; + int result = 0; midx_name = get_midx_filename(object_dir); if (safe_create_leading_directories(midx_name)) { @@ -745,7 +761,10 @@ int write_midx_file(const char *object_dir) midx_name); } - packs.m = load_multi_pack_index(object_dir, 1); + if (m) + packs.m = m; + else + packs.m = load_multi_pack_index(object_dir, 1); packs.nr = 0; packs.alloc = packs.m ? packs.m->num_packs : 16; @@ -759,13 +778,14 @@ int write_midx_file(const char *object_dir) packs.info[packs.nr].orig_pack_int_id = i; packs.info[packs.nr].pack_name = xstrdup(packs.m->pack_names[i]); packs.info[packs.nr].p = NULL; + packs.info[packs.nr].expired = 0; packs.nr++; } } for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &packs); - if (packs.m && packs.nr == packs.m->num_packs) + if (packs.m && packs.nr == packs.m->num_packs && !packs_to_drop) goto cleanup; entries = get_sorted_entries(packs.m, packs.info, packs.nr, &nr_entries); @@ -779,6 +799,34 @@ int write_midx_file(const char *object_dir) QSORT(packs.info, packs.nr, pack_info_compare); + if (packs_to_drop && packs_to_drop->nr) { + int drop_index = 0; + int missing_drops = 0; + + for (i = 0; i < packs.nr && drop_index < packs_to_drop->nr; i++) { + int cmp = strcmp(packs.info[i].pack_name, + packs_to_drop->items[drop_index].string); + + if (!cmp) { + drop_index++; + packs.info[i].expired = 1; + } else if (cmp > 0) { + error(_("did not see pack-file %s to drop"), + packs_to_drop->items[drop_index].string); + drop_index++; + missing_drops++; + i--; + } else { + packs.info[i].expired = 0; + } + } + + if (missing_drops) { + 
result = 1; + goto cleanup; + } + } + /* * pack_perm stores a permutation between pack-int-ids from the * previous multi-pack-index to the new one we are writing: @@ -787,11 +835,18 @@ int write_midx_file(const char *object_dir) */ ALLOC_ARRAY(pack_perm, packs.nr); for (i = 0; i < packs.nr; i++) { - pack_perm[packs.info[i].orig_pack_int_id] = i; + if (packs.info[i].expired) { + dropped_packs++; + pack_perm[packs.info[i].orig_pack_int_id] = PACK_EXPIRED; + } else { + pack_perm[packs.info[i].orig_pack_int_id] = i - dropped_packs; + } } - for (i = 0; i < packs.nr; i++) - pack_name_concat_len += strlen(packs.info[i].pack_name) + 1; + for (i = 0; i < packs.nr; i++) { + if (!packs.info[i].expired) + pack_name_concat_len += strlen(packs.info[i].pack_name) + 1; + } if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - @@ -807,7 +862,7 @@ int write_midx_file(const char *object_dir) cur_chunk = 0; num_chunks = large_offsets_needed ? 5 : 4; - written = write_midx_header(f, num_chunks, packs.nr); + written = write_midx_header(f, num_chunks, packs.nr - dropped_packs); chunk_ids[cur_chunk] = MIDX_CHUNKID_PACKNAMES; chunk_offsets[cur_chunk] = written + (num_chunks + 1) * MIDX_CHUNKLOOKUP_WIDTH; @@ -908,7 +963,12 @@ int write_midx_file(const char *object_dir) free(entries); free(pack_perm); free(midx_name); - return 0; + return result; +} + +int write_midx_file(const char *object_dir) +{ + return write_midx_internal(object_dir, NULL, NULL); } void clear_midx_file(struct repository *r) @@ -1010,5 +1070,45 @@ int verify_midx_file(const char *object_dir) int expire_midx_packs(const char *object_dir) { - return 0; + uint32_t i, *count, result = 0; + struct string_list packs_to_drop = STRING_LIST_INIT_DUP; + struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); + + if (!m) + return 0; + + count = xcalloc(m->num_packs, sizeof(uint32_t)); + for (i = 0; i < m->num_objects; i++) { + int pack_int_id = nth_midxed_pack_int_id(m, i); + 
count[pack_int_id]++; + } + + for (i = 0; i < m->num_packs; i++) { + char *pack_name; + + if (count[i]) + continue; + + if (prepare_midx_pack(m, i)) + continue; + + if (m->packs[i]->pack_keep) + continue; + + pack_name = xstrdup(m->packs[i]->pack_name); + close_pack(m->packs[i]); + FREE_AND_NULL(m->packs[i]); + + string_list_insert(&packs_to_drop, m->pack_names[i]); + unlink_pack_path(pack_name, 0); + free(pack_name); + } + + free(count); + + if (packs_to_drop.nr) + result = write_midx_internal(object_dir, m, &packs_to_drop); + + string_list_clear(&packs_to_drop, 0); + return result; } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index a8528f7da0..65e85debec 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -397,4 +397,24 @@ test_expect_success 'expire does not remove any packs' ' ) ' +test_expect_success 'expire removes unreferenced packs' ' + ( + cd dup && + git pack-objects --revs .git/objects/pack/pack-combined <<-EOF && + refs/heads/A + ^refs/heads/C + EOF + git multi-pack-index write && + ls .git/objects/pack | grep -v -e pack-[AB] >expect && + git multi-pack-index expire && + ls .git/objects/pack >actual && + test_cmp expect actual && + ls .git/objects/pack/ | grep idx >expect-idx && + test-tool read-midx .git/objects | grep idx >actual-midx && + test_cmp expect-idx actual-midx && + git multi-pack-index verify && + git fsck + ) +' + test_done From patchwork Thu Jan 24 21:51:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget X-Patchwork-Id: 10780203 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B66D5139A for ; Thu, 24 Jan 2019 21:52:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 
A37EC2FEE8 for ; Thu, 24 Jan 2019 21:52:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 97B752FEF6; Thu, 24 Jan 2019 21:52:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B6C212FEEF for ; Thu, 24 Jan 2019 21:52:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728275AbfAXVwL (ORCPT ); Thu, 24 Jan 2019 16:52:11 -0500 Received: from mail-ed1-f65.google.com ([209.85.208.65]:34962 "EHLO mail-ed1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728201AbfAXVwC (ORCPT ); Thu, 24 Jan 2019 16:52:02 -0500 Received: by mail-ed1-f65.google.com with SMTP id x30so5850256edx.2 for ; Thu, 24 Jan 2019 13:52:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=qa4nh8yOf4o5JOTfZ39IVv/zKm36nPFtf8y74fqLm/o=; b=bDSBPnFCVF6U9vpuitKXgMZYA9FLh/lS7OZN64DmSEGJDJabPI0FUqVxoAA8GuqiIl 32zIBl8JPyLh2aY9pVQlRllDcWIddvVbAx4Zh97Hpr2/ZTECN4A9KNbw/6j8w/BA4frF Ty6VM2Owjd4NKeeXP+t6LHQgwUByDi18SESzI1VaFtflKjZBt6QvS2eQJgyUksm0sorh 9UrZ4fX8sFY9Mib2QU23AokWvQu2o0RK7qFUm/8Tstj7nqQyIkui13diqWwY+JD/L5T+ NwHEVy/uLQPgzVSBYAEoisHR7ny8rGVzfV2PirYhFcrV4aK2Ye+ZskYFWmwQqfbXuPmM QrSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=qa4nh8yOf4o5JOTfZ39IVv/zKm36nPFtf8y74fqLm/o=; 
b=TfEGW5SLzYx0bYd6zSUOisqRMpTs7yJuRdMyOpN8T/GlDq1y9qirlJJs+d5n4MDvbc rCd+fmBIcauHneyBvHg6yF6SH6chiXygrHq91NiqlBELW1nDBsWHNp8TVDuvQwoQ0QbV vZx4JirRFUSX9DzeeWrxjucyh0+EySUmOj8dtSYsQm2mJ65W4P+qAec0wqKLKBcD34mN Rg4KT4Kewx2xEnxbRPFiZrW/k3xs0+Ijh4tr9dgxauh/aAojHIkd2cwoDCx49bHeytfV nxsjz3tdkjBq7aUlm+qSUpudV5fD4+m5N03VVpcD2dkr3uSnLi1TKIak9wHL9mUg4dEf 6Z6Q== X-Gm-Message-State: AJcUukerqn64ODSCv+B+08C1GfUq8th1gRpOxQjBJ+e9+W8O56ZCcVFU +nqyTJsdEbxVMSH/ndhgJ0IET3yg X-Google-Smtp-Source: ALg8bN6vmvbNMtceXqn1nMSZ62OOu+ybtUt5UekCHj6T22BUbJwGomjkbFv9s1ZV6Piqpn//morxng== X-Received: by 2002:a50:8b41:: with SMTP id l59mr8200246edl.44.1548366720262; Thu, 24 Jan 2019 13:52:00 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a14sm971218eju.24.2019.01.24.13.51.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 13:51:59 -0800 (PST) Date: Thu, 24 Jan 2019 13:51:59 -0800 (PST) X-Google-Original-Date: Thu, 24 Jan 2019 21:51:50 GMT Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v4 07/10] multi-pack-index: prepare 'repack' subcommand Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Derrick Stolee In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-file without too much effort. It may also be important to ensure the object store is highly available and the repack operation does not interrupt concurrent git commands. 
Introduce a 'repack' subcommand to 'git multi-pack-index' that takes a '--batch-size' option. The subcommand will inspect the multi-pack-index for referenced pack-files whose size is smaller than the batch size, until collecting a list of pack-files whose sizes sum to larger than the batch size. Then, a new pack-file will be created containing the objects from those pack-files that are referenced by the multi-pack-index. The resulting pack is likely to actually be smaller than the batch size due to compression and the fact that there may be objects in the pack-files that have duplicate copies in other pack-files. The current change introduces the command-line arguments, and we add a test that ensures we parse these options properly. Since we specify a small batch size, we will guarantee that future implementations do not change the list of pack-files. In addition, we hard-code the modified times of the packs in the pack directory to ensure the list of packs sorted by modified time matches the order if sorted by size (ascending). This will be important in a future test. Signed-off-by: Derrick Stolee --- Documentation/git-multi-pack-index.txt | 11 +++++++++++ builtin/multi-pack-index.c | 12 ++++++++++-- midx.c | 5 +++++ midx.h | 1 + t/t5319-multi-pack-index.sh | 17 +++++++++++++++++ 5 files changed, 44 insertions(+), 2 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 6186c4c936..de345c2400 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -36,6 +36,17 @@ expire:: have no objects referenced by the MIDX. Rewrite the MIDX file afterward to remove all references to these pack-files. +repack:: + Create a new pack-file containing objects in small pack-files + referenced by the multi-pack-index. Select the pack-files by + examining packs from oldest-to-newest, adding a pack if its + size is below the batch size. 
Stop adding packs when the sum + of sizes of the added packs is above the batch size. If the + total size does not reach the batch size, then do nothing. + Rewrite the multi-pack-index to reference the new pack-file. + A later run of 'git multi-pack-index expire' will delete the + pack-files that were part of this batch. + EXAMPLES -------- diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 145de3a46c..c66239de33 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -5,12 +5,13 @@ #include "midx.h" static char const * const builtin_multi_pack_index_usage[] = { - N_("git multi-pack-index [--object-dir=] (write|verify|expire)"), + N_("git multi-pack-index [--object-dir=] (write|verify|expire|repack --batch-size=)"), NULL }; static struct opts_multi_pack_index { const char *object_dir; + unsigned long batch_size; } opts; int cmd_multi_pack_index(int argc, const char **argv, @@ -19,6 +20,8 @@ int cmd_multi_pack_index(int argc, const char **argv, static struct option builtin_multi_pack_index_options[] = { OPT_FILENAME(0, "object-dir", &opts.object_dir, N_("object directory containing set of packfile and pack-index pairs")), + OPT_MAGNITUDE(0, "batch-size", &opts.batch_size, + N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")), OPT_END(), }; @@ -40,6 +43,11 @@ int cmd_multi_pack_index(int argc, const char **argv, return 1; } + if (!strcmp(argv[0], "repack")) + return midx_repack(opts.object_dir, (size_t)opts.batch_size); + if (opts.batch_size) + die(_("--batch-size option is only for 'repack' subcommand")); + if (!strcmp(argv[0], "write")) return write_midx_file(opts.object_dir); if (!strcmp(argv[0], "verify")) @@ -47,5 +55,5 @@ int cmd_multi_pack_index(int argc, const char **argv, if (!strcmp(argv[0], "expire")) return expire_midx_packs(opts.object_dir); - die(_("unrecognized verb: %s"), argv[0]); + die(_("unrecognized subcommand: %s"), argv[0]); } diff --git a/midx.c b/midx.c 
index 299e9b2e8f..768a7dff73 100644 --- a/midx.c +++ b/midx.c @@ -1112,3 +1112,8 @@ int expire_midx_packs(const char *object_dir) string_list_clear(&packs_to_drop, 0); return result; } + +int midx_repack(const char *object_dir, size_t batch_size) +{ + return 0; +} diff --git a/midx.h b/midx.h index e3a2b740b5..394a21ee96 100644 --- a/midx.h +++ b/midx.h @@ -50,6 +50,7 @@ int write_midx_file(const char *object_dir); void clear_midx_file(struct repository *r); int verify_midx_file(const char *object_dir); int expire_midx_packs(const char *object_dir); +int midx_repack(const char *object_dir, size_t batch_size); void close_midx(struct multi_pack_index *m); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 65e85debec..acc5e65ecc 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -417,4 +417,21 @@ test_expect_success 'expire removes unreferenced packs' ' ) ' +test_expect_success 'repack with minimum size does not alter existing packs' ' + ( + cd dup && + rm -rf .git/objects/pack && + mv .git/objects/pack-backup .git/objects/pack && + touch -m -t 201901010000 .git/objects/pack/pack-D* && + touch -m -t 201901010001 .git/objects/pack/pack-C* && + touch -m -t 201901010002 .git/objects/pack/pack-B* && + touch -m -t 201901010003 .git/objects/pack/pack-A* && + ls .git/objects/pack >expect && + MINSIZE=$(ls -l .git/objects/pack/*pack | awk "{print \$5;}" | sort -n | head -n 1) && + git multi-pack-index repack --batch-size=$MINSIZE && + ls .git/objects/pack >actual && + test_cmp expect actual + ) +' + test_done From patchwork Thu Jan 24 21:52:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget X-Patchwork-Id: 10780199 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A9BF14E5 for ; Thu, 24 Jan 2019 
21:52:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A6FE2FEBF for ; Thu, 24 Jan 2019 21:52:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 88BFD2FE9E; Thu, 24 Jan 2019 21:52:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E765F2FEEF for ; Thu, 24 Jan 2019 21:52:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728255AbfAXVwH (ORCPT ); Thu, 24 Jan 2019 16:52:07 -0500 Received: from mail-ed1-f66.google.com ([209.85.208.66]:39013 "EHLO mail-ed1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728210AbfAXVwE (ORCPT ); Thu, 24 Jan 2019 16:52:04 -0500 Received: by mail-ed1-f66.google.com with SMTP id b14so5822356edt.6 for ; Thu, 24 Jan 2019 13:52:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=3+NkqqpfkOyK4hrSrT6mLRHBKyg4HisiCb03FbN9nQ0=; b=dBquUccG66n0MstDopT6YR4M7yjh/l2tTdHKDefiWeI/hFmDP1e5esA4cQTj57tK3w 0/k23kmFuvtLewu9TFWkNVYV7UwjVeUz1BicS/MnM379NRx+KlQDUvzXweoGtSExCu5o XZRek/vEmK5m/Oq34Pyg0FQ/mOWscSkwLerizCrJoBcsGelzwRqFZb1oNARcRZS0uMqw ffLvs/m7XUsW0B4qJINBh2X8Vs+zD0R58O1W7AeecIgeji82XYQup4PGCjP0jycQnePU RuYAH6wMFm9Gj5Gf9jYr904xpNnh2PNOG2GyQusUkfUujP33vaJXozQ0dzJPdJTisz+o dNnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=3+NkqqpfkOyK4hrSrT6mLRHBKyg4HisiCb03FbN9nQ0=; b=P1kudMdr1NpIR6mVObjJSdy/Q/1NAlT96Y0ve0+nf3/f14/NizStVjMB2K0bTbdazh tnTr7ApUTspCd3Nl+FqasrheuIdOW6wiswgiwOqozqtP5BMnkmPFE/fdHH4bapRiMWJx HGBoAF7243dwF+EoBKW+QxPn7/xujKd+6IXTlWqjw6ZG8aBOMrz8ZIRAR8+CBi1PHX4M 5tSclh82+XWa/NkT3lR3qKQ4bvU+w2V7DmXAai4niCwRP0OoYVVI038gf31WNXeaQm8c vhd328aPvVsSwT1oLHcjkycHrqxSIJfCimwsuOWSi8Vb9bIiJq67D3B8UNbyMuG5QHt/ XWrw== X-Gm-Message-State: AJcUukd5USAZ4CTaqs5+8P+m8YIgdo3Y6dTKqiEDierqz3DXo5Np1xlG ObhZBOzp4XUXLM5NTFF4XbDyPXmU X-Google-Smtp-Source: ALg8bN7AxsrKKw8tKnnJb6Q47lq88+4ksWoFhsPfMKt1L9RoBioFcuz9KiWOh/W8mueEKMRRlkuUFA== X-Received: by 2002:a05:6402:694:: with SMTP id f20mr1294935edy.99.1548366721155; Thu, 24 Jan 2019 13:52:01 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c30sm11370912edc.70.2019.01.24.13.52.00 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 13:52:00 -0800 (PST) Date: Thu, 24 Jan 2019 13:52:00 -0800 (PST) X-Google-Original-Date: Thu, 24 Jan 2019 21:51:51 GMT Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v4 08/10] midx: implement midx_repack() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Derrick Stolee To repack using a multi-pack-index, first sort all pack-files by their modified time. Second, walk those pack-files from oldest to newest, adding the packs to a list if they are smaller than the given pack-size. Finally, collect the objects from the multi-pack- index that are in those packs and send them to 'git pack-objects'. 
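The selection step described above can be sketched as a stand-alone C function. This is only an illustration under stated assumptions, not the code added to midx.c: select_batch(), its parameters, and the plain arrays of modified times and sizes are hypothetical stand-ins for the data the patch reads from 'struct packed_git'.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Pairs a pack's modified time with its position in the input arrays. */
struct time_and_id {
	int64_t mtime;
	uint32_t id;
};

static int compare_by_mtime(const void *a_, const void *b_)
{
	const struct time_and_id *a = a_;
	const struct time_and_id *b = b_;

	if (a->mtime < b->mtime)
		return -1;
	if (a->mtime > b->mtime)
		return 1;
	return 0;
}

/*
 * Hypothetical helper: walk the packs from oldest to newest and mark
 * those smaller than batch_size in include[] until their combined
 * size reaches batch_size. Returns the number of packs selected, or 0
 * (with include[] cleared) when the batch is not worth repacking:
 * fewer than two packs, or a combined size below batch_size.
 */
uint32_t select_batch(const int64_t *mtimes, const size_t *sizes,
		      uint32_t nr, size_t batch_size,
		      unsigned char *include)
{
	struct time_and_id *order;
	size_t total_size = 0;
	uint32_t i, selected = 0;

	memset(include, 0, nr);
	if (!nr)
		return 0;

	order = malloc(nr * sizeof(*order));
	if (!order)
		return 0;

	for (i = 0; i < nr; i++) {
		order[i].mtime = mtimes[i];
		order[i].id = i;
	}
	qsort(order, nr, sizeof(*order), compare_by_mtime);

	for (i = 0; total_size < batch_size && i < nr; i++) {
		uint32_t id = order[i].id;

		/* A pack at or above the batch size is left alone. */
		if (sizes[id] >= batch_size)
			continue;

		selected++;
		total_size += sizes[id];
		include[id] = 1;
	}
	free(order);

	if (total_size < batch_size || selected < 2) {
		memset(include, 0, nr);
		return 0;
	}
	return selected;
}
```

With four packs of sizes 100, 50, 500 and 60 bytes whose modified times order them as second, fourth, third, first, a batch size of 101 selects the two oldest small packs (50 and 60 bytes). A batch size equal to the smallest pack selects nothing, because no pack is strictly smaller than it.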
While first designing a 'git multi-pack-index repack' operation, I
started by collecting the batches based on the size of the objects
instead of the size of the pack-files. This allows repacking a
large pack-file that has very few referenced objects. However, this
came at a significant cost of parsing pack-files instead of simply
reading the multi-pack-index and getting the file information for
the pack-files. This object-size idea could be a direction for
future expansion in this area.

Signed-off-by: Derrick Stolee
---
 midx.c                      | 109 +++++++++++++++++++++++++++++++++++-
 t/t5319-multi-pack-index.sh |  31 +++++++++-
 2 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index 768a7dff73..7d81bf943e 100644
--- a/midx.c
+++ b/midx.c
@@ -8,6 +8,7 @@
 #include "sha1-lookup.h"
 #include "midx.h"
 #include "progress.h"
+#include "run-command.h"

 #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
 #define MIDX_VERSION 1
@@ -1113,7 +1114,113 @@ int expire_midx_packs(const char *object_dir)
 	return result;
 }

-int midx_repack(const char *object_dir, size_t batch_size)
+struct time_and_id {
+	timestamp_t mtime;
+	uint32_t pack_int_id;
+};
+
+static int compare_by_mtime(const void *a_, const void *b_)
 {
+	const struct time_and_id *a, *b;
+
+	a = (const struct time_and_id *)a_;
+	b = (const struct time_and_id *)b_;
+
+	if (a->mtime < b->mtime)
+		return -1;
+	if (a->mtime > b->mtime)
+		return 1;
 	return 0;
 }
+
+int midx_repack(const char *object_dir, size_t batch_size)
+{
+	int result = 0;
+	uint32_t i, packs_to_repack;
+	size_t total_size;
+	struct time_and_id *pack_ti;
+	unsigned char *include_pack;
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	struct strbuf base_name = STRBUF_INIT;
+	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
+
+	if (!m)
+		return 0;
+
+	include_pack = xcalloc(m->num_packs, sizeof(unsigned char));
+	pack_ti = xcalloc(m->num_packs, sizeof(struct time_and_id));
+
+	for (i = 0; i < m->num_packs; i++) {
+		pack_ti[i].pack_int_id = i;
+
+		if (prepare_midx_pack(m, i))
+			continue;
+
+		pack_ti[i].mtime = m->packs[i]->mtime;
+	}
+	QSORT(pack_ti, m->num_packs, compare_by_mtime);
+
+	total_size = 0;
+	packs_to_repack = 0;
+	for (i = 0; total_size < batch_size && i < m->num_packs; i++) {
+		int pack_int_id = pack_ti[i].pack_int_id;
+		struct packed_git *p = m->packs[pack_int_id];
+
+		if (!p)
+			continue;
+		if (p->pack_size >= batch_size)
+			continue;
+
+		packs_to_repack++;
+		total_size += p->pack_size;
+		include_pack[pack_int_id] = 1;
+	}
+
+	if (total_size < batch_size || packs_to_repack < 2)
+		goto cleanup;
+
+	argv_array_push(&cmd.args, "pack-objects");
+
+	strbuf_addstr(&base_name, object_dir);
+	strbuf_addstr(&base_name, "/pack/pack");
+	argv_array_push(&cmd.args, base_name.buf);
+	strbuf_release(&base_name);
+
+	cmd.git_cmd = 1;
+	cmd.in = cmd.out = -1;
+
+	if (start_command(&cmd)) {
+		error(_("could not start pack-objects"));
+		result = 1;
+		goto cleanup;
+	}
+
+	for (i = 0; i < m->num_objects; i++) {
+		struct object_id oid;
+		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
+
+		if (!include_pack[pack_int_id])
+			continue;
+
+		nth_midxed_object_oid(&oid, m, i);
+		xwrite(cmd.in, oid_to_hex(&oid), the_hash_algo->hexsz);
+		xwrite(cmd.in, "\n", 1);
+	}
+	close(cmd.in);
+
+	if (finish_command(&cmd)) {
+		error(_("could not finish pack-objects"));
+		result = 1;
+		goto cleanup;
+	}
+
+	result = write_midx_internal(object_dir, m, NULL);
+	m = NULL;
+
+cleanup:
+	if (m)
+		close_midx(m);
+	free(include_pack);
+	free(pack_ti);
+	return result;
+}
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index acc5e65ecc..d6c1353514 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -383,7 +383,8 @@ test_expect_success 'setup expire tests' '
 		git pack-objects --revs .git/objects/pack/pack-E <<-EOF &&
 		refs/heads/E
 		EOF
-		git multi-pack-index write
+		git multi-pack-index write &&
+		cp -r .git/objects/pack .git/objects/pack-backup
 	)
 '

@@ -434,4
+435,32 @@ test_expect_success 'repack with minimum size does not alter existing packs' '
 	)
 '

+test_expect_success 'repack creates a new pack' '
+	(
+		cd dup &&
+		ls .git/objects/pack/*idx >idx-list &&
+		test_line_count = 5 idx-list &&
+		THIRD_SMALLEST_SIZE=$(ls -l .git/objects/pack/*pack | awk "{print \$5;}" | sort -n | head -n 3 | tail -n 1) &&
+		BATCH_SIZE=$(($THIRD_SMALLEST_SIZE + 1)) &&
+		git multi-pack-index repack --batch-size=$BATCH_SIZE &&
+		ls .git/objects/pack/*idx >idx-list &&
+		test_line_count = 6 idx-list &&
+		test-tool read-midx .git/objects | grep idx >midx-list &&
+		test_line_count = 6 midx-list
+	)
+'
+
+test_expect_success 'expire removes repacked packs' '
+	(
+		cd dup &&
+		ls -al .git/objects/pack/*pack &&
+		ls -S .git/objects/pack/*pack | head -n 4 >expect &&
+		git multi-pack-index expire &&
+		ls -S .git/objects/pack/*pack >actual &&
+		test_cmp expect actual &&
+		test-tool read-midx .git/objects | grep idx >midx-list &&
+		test_line_count = 4 midx-list
+	)
+'
+
 test_done

From patchwork Thu Jan 24 21:52:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget X-Patchwork-Id: 10780197 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ABD26139A for ; Thu, 24 Jan 2019 21:52:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 989492FE75 for ; Thu, 24 Jan 2019 21:52:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 96F3A2FEE5; Thu, 24 Jan 2019 21:52:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI
autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2DAD42FEA8 for ; Thu, 24 Jan 2019 21:52:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728264AbfAXVwI (ORCPT ); Thu, 24 Jan 2019 16:52:08 -0500 Received: from mail-ed1-f68.google.com ([209.85.208.68]:39017 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728223AbfAXVwD (ORCPT ); Thu, 24 Jan 2019 16:52:03 -0500 Received: by mail-ed1-f68.google.com with SMTP id b14so5822376edt.6 for ; Thu, 24 Jan 2019 13:52:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=nFDmTJ7EIYAEQqIc3KjFWf1HILA8Ab7U2MDpeTRtibg=; b=kND3cLbBGgUg4zm8RVTth0X+4X0fYryk8nSoHp6Z7r/DoFKRsNMdYkHYuopDGfYenD lbED+huOCyoaTpbYqMA/cLLAbt8x4EX3FfOLKhO5zOZiXu2jfTTHU5/ByW8d5ERH6Og+ 2zlKh8o7fODgqOLYRE3Qb1TZkXBa8FPc/BEeaPlQcKJ5QUAwLy7t2YYjOhq6jRMVCqcK 2opRsluayzRScqtT8zyl4XaJyvHJz7e7lYSHI3zzinhwG9evxE7Nr8vkQvwh3QTwzgIR OfDC0S1FtE3A6Htmvhw83/I+hkGA4ogxtDODV/RsEgrpCEFbZLKGEugomk6//Eevzg5n v4eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nFDmTJ7EIYAEQqIc3KjFWf1HILA8Ab7U2MDpeTRtibg=; b=WtDoVVedzsuVPVy8KnEgtTg1KO5Z9eGTfrNr40xdztKJmBaFCU2aH/hUefuZwlVpMc X5nnkEIg2VamZojC4IRVJS0YupdUdkhmNDy38UEHDSgruCkYbTwoqLsWmPQ4sAxGlzML V1z0yzteam8Zwav0UE+za7aXsF8PHm9YWHXpoIHHGwQdtGGIwdT6+T8uKJyjIVIs0VmH HZL56juirdOE9f0BVG6vNzHUb4O7rlIrE6lSdd6p+9IATXac9D+tudbwmkg94e5bkWaj GllaJHSALNAY1/5r6KPwvnTJF/CiFq+urTIU4zJohRrNJ4O3HODjlT49DkAIykLPU2SJ uHpA== X-Gm-Message-State: AJcUukfQGNI6IRxkClwdqWJ9TAicv2vY8z/XZWqpSntu63v1ffQMMVLv OGRq6xPd8GBsV5JwS4xPD5UsLQB9 X-Google-Smtp-Source: 
ALg8bN6jsyzDQc49ed1wvEqaMYIeEjCtZn16ZHSf537YULsLHVPHY3xNiWZppIQ/RwFOXpyV1UJL7Q== X-Received: by 2002:a17:906:1d01:: with SMTP id n1-v6mr7430877ejh.61.1548366721976; Thu, 24 Jan 2019 13:52:01 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m5-v6sm6169638ejq.21.2019.01.24.13.52.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 13:52:01 -0800 (PST) Date: Thu, 24 Jan 2019 13:52:01 -0800 (PST) X-Google-Original-Date: Thu, 24 Jan 2019 21:51:52 GMT Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v4 09/10] multi-pack-index: test expire while adding packs Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Derrick Stolee During development of the multi-pack-index expire subcommand, a version went out that improperly computed the pack order if a new pack was introduced while other packs were being removed. Part of the subtlety of the bug involved the new pack being placed before other packs that already existed in the multi-pack-index. Add a test to t5319-multi-pack-index.sh that catches this issue. The test adds new packs that cause another pack to be expired, and creates new packs that are lexicographically sorted before and after the existing packs. 
Signed-off-by: Derrick Stolee
---
 t/t5319-multi-pack-index.sh | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index d6c1353514..19b769eea0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -463,4 +463,36 @@ test_expect_success 'expire removes repacked packs' '
 	)
 '

+test_expect_success 'expire works when adding new packs' '
+	(
+		cd dup &&
+		git pack-objects --revs .git/objects/pack/pack-combined <<-EOF &&
+		refs/heads/A
+		^refs/heads/B
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-combined <<-EOF &&
+		refs/heads/B
+		^refs/heads/C
+		EOF
+		git pack-objects --revs .git/objects/pack/pack-combined <<-EOF &&
+		refs/heads/C
+		^refs/heads/D
+		EOF
+		git multi-pack-index write &&
+		git pack-objects --revs .git/objects/pack/a-pack <<-EOF &&
+		refs/heads/D
+		^refs/heads/E
+		EOF
+		git multi-pack-index write &&
+		git pack-objects --revs .git/objects/pack/z-pack <<-EOF &&
+		refs/heads/E
+		EOF
+		git multi-pack-index expire &&
+		ls .git/objects/pack/ | grep idx >expect &&
+		test-tool read-midx .git/objects | grep idx >actual &&
+		test_cmp expect actual &&
+		git multi-pack-index verify
+	)
+'
+
 test_done

From patchwork Thu Jan 24 21:52:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Schindelin via GitGitGadget X-Patchwork-Id: 10780201 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5D44D13B4 for ; Thu, 24 Jan 2019 21:52:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4E5A22FECF for ; Thu, 24 Jan 2019 21:52:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 429712FEE8; Thu, 24 Jan 2019 21:52:11 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DC92F2FEC7 for ; Thu, 24 Jan 2019 21:52:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728254AbfAXVwH (ORCPT ); Thu, 24 Jan 2019 16:52:07 -0500 Received: from mail-ed1-f66.google.com ([209.85.208.66]:45964 "EHLO mail-ed1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728180AbfAXVwE (ORCPT ); Thu, 24 Jan 2019 16:52:04 -0500 Received: by mail-ed1-f66.google.com with SMTP id d39so5800033edb.12 for ; Thu, 24 Jan 2019 13:52:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=phXU1dpxUp62jnnpSRXnqeaYcnvYczZREKAHf4PRcNs=; b=lrPbSCJ7JrvKA9V6fSqYz+P3IkGOxjlpbW/H+ITRrdCQGrumPgr4J6UQ2NYXt0ql27 lL4OxXQoEympNVvPVyHWx0XMwdgo36ECdX8wKK2bOLTK8qnusjZoUnbhEc3CSVlBm/2Q +M4QOgy4BpgEjrd+ahKEG79z/eIhiDaCAAbtqYio9i230xwGAHa4mjx8nOzMKHckid0I x+iKyLRihBmVh63caQSmKBZalY/n02AVInTY0gWJusP1k62EoVLoPaKVFxPR/B6mIh5P qvjhjxfpCqSaigNbksGY1kwLZ4U6XD7DFgYao6OXhOhB7BlGavMt3zzLeqMBOYOV2+4O cxbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=phXU1dpxUp62jnnpSRXnqeaYcnvYczZREKAHf4PRcNs=; b=Gc2SUf4CGbjyiPGY8MiodE1hBQs8UZav7zIN8/yVvoTq0sQojlO0jAMoA1K/V7UtYd 8X6loNTWASI647opeHM2Lc5j/5ZtWZyyeAzCa/zZbYuLZ1qAAdcCY8q21Qpr2o2sFrz4 G6HjheUMfrN3t97YH7vw3EjL7ZgWsvrJBXdnBdO1kNhKPGr9Omz1GzIaSX/A9rrmNMxV 
VGlon+vwWYDLligmI51Mx2h5hgxrYUJoK4nahtSo7GD+s4mi2jtPoXmfrDsGwhr7eoRV XrQCz0TvLft9TZbCGNZxz52aTbQ9LW8jSD6McUIBaaU4+YMYXDLNcLNlWYgYSPpUc96r XGjA== X-Gm-Message-State: AJcUukdc96GMBzp0u97O4GKSVJli5zHBRckbclZQZ36BLhyRmbs1fHf6 OeHU3qbMRO6vXcJsGs1nO60lEVlp X-Google-Smtp-Source: ALg8bN7kC280x0eKhUk3NqPQYbDHcvzmWMSVm5rL3Up4EogSHwAyE0YdWw79sbC9m+EAor9R0famuA== X-Received: by 2002:aa7:c594:: with SMTP id g20mr7780751edq.79.1548366722790; Thu, 24 Jan 2019 13:52:02 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k26-v6sm6333185ejv.59.2019.01.24.13.52.02 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Jan 2019 13:52:02 -0800 (PST) Date: Thu, 24 Jan 2019 13:52:02 -0800 (PST) X-Google-Original-Date: Thu, 24 Jan 2019 21:51:53 GMT Message-Id: <481b08890fa633a61995300b7dd47979a0db53dd.1548366713.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v4 10/10] midx: add test that 'expire' respects .keep files Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: sbeller@google.com, peff@peff.net, jrnieder@gmail.com, avarab@gmail.com, jonathantanmy@google.com, Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Derrick Stolee The 'git multi-pack-index expire' subcommand may delete packs that are not needed from the perspective of the multi-pack-index. If a pack has a .keep file, then we should not delete that pack. Add a test that ensures we preserve a pack that would otherwise be expired. First, create a new pack that contains every object in the repo, then add it to the multi-pack-index. Then create a .keep file for a pack starting with "a-pack" that was added in the previous test. Finally, expire and verify that the pack remains and the other packs were expired. 
Signed-off-by: Derrick Stolee
---
 t/t5319-multi-pack-index.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 19b769eea0..bcfa520401 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -495,4 +495,22 @@ test_expect_success 'expire works when adding new packs' '
 	)
 '

+test_expect_success 'expire respects .keep files' '
+	(
+		cd dup &&
+		git pack-objects --revs .git/objects/pack/pack-all <<-EOF &&
+		refs/heads/A
+		EOF
+		git multi-pack-index write &&
+		PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) &&
+		touch $PACKA.keep &&
+		git multi-pack-index expire &&
+		ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files &&
+		test_line_count = 3 a-pack-files &&
+		test-tool read-midx .git/objects | grep idx >midx-list &&
+		test_line_count = 2 midx-list
+	)
+'
+
+
 test_done