From patchwork Wed Feb 24 19:09:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102293 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AF43C433E0 for ; Wed, 24 Feb 2021 19:11:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D3C0A64E24 for ; Wed, 24 Feb 2021 19:11:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234927AbhBXTLM (ORCPT ); Wed, 24 Feb 2021 14:11:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236010AbhBXTKJ (ORCPT ); Wed, 24 Feb 2021 14:10:09 -0500 Received: from mail-qk1-x732.google.com (mail-qk1-x732.google.com [IPv6:2607:f8b0:4864:20::732]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5FC89C06174A for ; Wed, 24 Feb 2021 11:09:29 -0800 (PST) Received: by mail-qk1-x732.google.com with SMTP id h8so3278326qkk.6 for ; Wed, 24 Feb 2021 11:09:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=EWKQAPQYWkmrYzu/165SRPVqylKwo41lj7IqdytbI/E=; b=1twOEUAEhSYuZsx4NN2ScPgjWw4aShsFeWPg/hTTvLcpE/4SKReJ2wVwnzwH+LF3xc yTBiYeP2fYzjyAG3vlh74xIbO+n3wMe5iHo69AcEOXQQ42QW++mkYOiOJzc6/xh0Speg 
CBDT5zonMJCc8QSNquR9v4GkBOxLSQD8vJEscf+Ekywxho7w1NGfMh4KKe5TJrUADnHz 6YuMcGAgdr8ASiqkNAY2sKJmJ0CycdyZkVhyAho2M9zmyvMYFPVdJiqmxFFwn7OKx/cC y7q5EWVL+jibbM3ib8rphw6Ol3IJQSysYMwHI0moU00h6lpNtG/1iEeBLRK+aY2Rbydt u01A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=EWKQAPQYWkmrYzu/165SRPVqylKwo41lj7IqdytbI/E=; b=BUjA6KuQgoGpfyQcIEjiqs4viCL2+4mypar72ZZD2SbaqhKw0dKsVozSMjtkZS5vT7 KqDnsN9pJL4Fz3RCIRK9Sn4i3wJ01uUZIlex1i8OXlce3iYEKzs7e4LL0Aa1zGb2W7nH dr03GiSamTz9lFR6vD3i8ofmgMIN490BE11DY1DQ/iMGw/798VdALuPct+tSUupkO246 RsgACij9wI1XUYwpxIykicCRmqHJtsSk+OBuRzmyEO1EGaxooxNRl5RPthHgXlcKsoO8 aRkXU+gwEznBDHPJGwMT2xddhi8jxqD2f5ijIy1/aplFBfpmzllXLQvlxYnv0o4ni0yc sjZA== X-Gm-Message-State: AOAM530uhYFLqYq+caCr2dcumOuPnJjCxaM9xHdyl7duVXWe/6Qe2utU kkfOYimcOUKI5DFrRr738oAmxhOcXwvmmZ9d X-Google-Smtp-Source: ABdhPJyH9NTcIThwybz9vMkhCufEVcsB2PIchGSMlJAxnuhrC3J6LwzcLYYHf0J9YMVOuaz5pofs8g== X-Received: by 2002:a37:9f86:: with SMTP id i128mr7403130qke.20.1614193768385; Wed, 24 Feb 2021 11:09:28 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id q73sm2146438qke.65.2021.02.24.11.09.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:28 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:25 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 01/15] builtin/multi-pack-index.c: inline 'flags' with options Message-ID: <0527fa89a9a8a44df4d046f4efd04fd9fc23bb91.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Subcommands of the 'git multi-pack-index' command (e.g., 'write', 'verify', etc.) 
will want to optionally change a set of shared flags that are eventually
passed to the MIDX libraries.

Right now, options and flags are handled separately. Inline them into
the same structure so that sub-commands can more easily share the
'flags' data.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 5bf88cd2a8..4a0ddb06c4 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -14,13 +14,12 @@ static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
 	int progress;
+	unsigned flags;
 } opts;

 int cmd_multi_pack_index(int argc, const char **argv,
 			 const char *prefix)
 {
-	unsigned flags = 0;
-
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
@@ -40,7 +39,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
 	if (opts.progress)
-		flags |= MIDX_PROGRESS;
+		opts.flags |= MIDX_PROGRESS;

 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,
@@ -55,16 +54,16 @@ int cmd_multi_pack_index(int argc, const char **argv,

 	if (!strcmp(argv[0], "repack"))
 		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, flags);
+			(size_t)opts.batch_size, opts.flags);
 	if (opts.batch_size)
 		die(_("--batch-size option is only for 'repack' subcommand"));

 	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, flags);
+		return write_midx_file(opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, flags);
+		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, flags);
+		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);

 	die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Wed Feb 24 19:09:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102305
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
        autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 48130C433E0
        for ; Wed, 24 Feb 2021 19:12:06 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id F01F564E90
        for ; Wed, 24 Feb 2021 19:12:05 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S234917AbhBXTLr (ORCPT ); Wed, 24 Feb 2021 14:11:47 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59386 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S235284AbhBXTKN (ORCPT ); Wed, 24 Feb 2021 14:10:13 -0500
Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com
        [IPv6:2607:f8b0:4864:20::72f]) by lindbergh.monkeyblade.net (Postfix)
        with ESMTPS id 67CF5C061786 for ; Wed, 24 Feb 2021 11:09:33 -0800 (PST)
Received: by mail-qk1-x72f.google.com with SMTP id w19so3219814qki.13
        for ; Wed, 24 Feb 2021 11:09:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=ttaylorr-com.20150623.gappssmtp.com; s=20150623;
        h=date:from:to:cc:subject:message-id:references:mime-version
        :content-disposition:in-reply-to;
        bh=V4a9UY5s40LlGymKW7WgX8xoLwzSyTJIIO9ypByJRAw=;
b=UgG6U4+vozWy4Dt23RPO7ruKy/fFCLCI2qlcwYXdFT/0f30Q1PBm8+4uN9+bGZMeNJ 92KUWt4Q6P/GJR/AtOmEPV/Olm4J+B/aO5uT9+lgyjaysL9IzxuliC+u6efOJARCHXGW S/DmgzACoVxziD7NAwTnvWodgqH4kIl05uGU7H7GRC176CbI4iv1vy/6eWDK+9HtURxx Er+jfZVzwQH8ONeOCT6yL1lQZhTVDCYCfKVN0llazbTBNRBvZ+0SSmo0fhutarVPqvUu fvnFDpFripNDLDtGnNWnEnj9tlPt0T87OAJLSsG/dqDnth6GLA96sVydoK8ek/wuOfYK rfvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=V4a9UY5s40LlGymKW7WgX8xoLwzSyTJIIO9ypByJRAw=; b=CIuwOcUiqDjjLT4lbgHGZP6WTe9lVHFbcNHjFwymRv03/f+F78OijgiXRGRYufAXbI hS+Le60uNJTq+MDOd+oWgomiDj5LQ2/q2BxBHmrO9jhQynmwNaH11ThzZAQoIMOwjxdF b6g5GfF4bL18U7VJUQ7A3cgrIdsZPYcxOMzXNaHMYbFdp7hjgaujEIONOZP3QbcbkK/M NFMaNAMNLkioa5kVhUy1TwivNyn4Un3aLQzfGyIXQpYwq14/3N3rgEetnC1SICIX9a/m wxTOmeE7MJcbA5aaFC6X0/GpDhrAMue24pAQUmK/u5BeHcHi9IJcU1ptnjkDjMhkSbad tsUw== X-Gm-Message-State: AOAM531js8KV/0VH71DZ1fRsVp6iQklPmeSsARFWY/SmhpsGB7xW6SOb Y0UU5/83YrBXhSGA9wR1y24lYCyzTRG7g2eg X-Google-Smtp-Source: ABdhPJyIFVM8WoANEl8KDoE1VlqjwTdz33Gn+rLMa5jqWt1Pbubw5Zbt5RrpB2av5qVU6tQd60MhSw== X-Received: by 2002:a37:a5d0:: with SMTP id o199mr32416978qke.388.1614193772426; Wed, 24 Feb 2021 11:09:32 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id p10sm2187909qke.92.2021.02.24.11.09.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:31 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:30 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 02/15] builtin/multi-pack-index.c: don't handle 'progress' separately Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that there is a shared 'flags' member in the options structure, there is 
no need to keep track of whether to force progress or not, since
ultimately the decision of whether or not to show a progress meter is
controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 4a0ddb06c4..c70f020d8f 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -13,7 +13,6 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
-	int progress;
 	unsigned flags;
 } opts;

@@ -23,7 +22,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
+		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
@@ -31,15 +30,14 @@ int cmd_multi_pack_index(int argc, const char **argv,

 	git_config(git_default_config, NULL);

-	opts.progress = isatty(2);
+	if (isatty(2))
+		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
 			     builtin_multi_pack_index_usage, 0);

 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
-	if (opts.progress)
-		opts.flags |= MIDX_PROGRESS;

 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,

From patchwork Wed Feb 24 19:09:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102303 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 743E3C433DB for ; Wed, 24 Feb 2021 19:12:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 258B764F0C for ; Wed, 24 Feb 2021 19:12:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236024AbhBXTL4 (ORCPT ); Wed, 24 Feb 2021 14:11:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235860AbhBXTKR (ORCPT ); Wed, 24 Feb 2021 14:10:17 -0500 Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38096C061788 for ; Wed, 24 Feb 2021 11:09:37 -0800 (PST) Received: by mail-qk1-x72b.google.com with SMTP id 204so3231033qke.11 for ; Wed, 24 Feb 2021 11:09:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=By4ZDw+6iwX13Ip2dovM5NjuD4lu7Lq1pVXXR7wZA+M=; b=cIpA37NO3puKgAdnziqS7Hp8RKBHNrRXshWBF53nmbvhtgwEZoxnvSbOhpGXhuYW1+ JdjGQ5TPSIi3KYr2F/cP789nzqWbLFo+u5nE/qSNbIQMBMbwwIETSxN/oA4ujKwAabCI erPxNEzZtJmUtvaNDFJdanmZsJ8l5lX/uIm/Eu12TFECZxnaWL8PTdOmKT2vmGIkuHnk KdRhfce6DsvwVAneGXUVKQZ6oJ8hXev+A9SlhMlIwqaGOhvu+QyWbhE2k1lMsrLG2L4a kiO8JxoaY9NRWDOjOFCrybRNYk5G5ZDIbrreBXCBMBbod6paFO/PnX37aS7OINdOHhCR 
REqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=By4ZDw+6iwX13Ip2dovM5NjuD4lu7Lq1pVXXR7wZA+M=; b=b3NbJUmC5kW/J3zi0J8eXmHOhFPILMS8xHSLdVOItZR1KQvn51aTA8r4q0MbVx7qbe IvO8E3E2fapOqJNejzdmWkMUr5GUn1QVqH/PZhDTOZaG2Zk5GL8QvE1pvpvWkX8jIyZF a2tzsIgnyGksl21R6kQXyfAXPoCiJja4NSsZRksgCbqYVw2Dr8/3Em2vu+DUrLzGusw5 pf28uaB07/PMJKcZO5huTC4CamRmt5IhByoVGTo3LR8/Kvs+dXTSMoFClhaldhODeo5k 4P5hlKHAk3boMEY2kOG7wDQvajYLN7Eow9HoilKZ2OXlr+465gR72vZrdKiwPJgaf29K SZ8w== X-Gm-Message-State: AOAM532qcD6FkcDPihiQxm19iucm00SxzrU07bW5TWEFOYpxo+zNC86O ve3SKmAJWlVL618FwvVldZNVBzUR48/+oHUM X-Google-Smtp-Source: ABdhPJxW0M8QwB8Reax/bpuqEOJGdrX04vWkfUgdzN7X0xcE0DsLX+KEzx9XF+TQ42WXFHW08lG9Ew== X-Received: by 2002:a37:7044:: with SMTP id l65mr32561474qkc.417.1614193776252; Wed, 24 Feb 2021 11:09:36 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id 6sm2198271qkv.24.2021.02.24.11.09.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:35 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:34 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 03/15] builtin/multi-pack-index.c: define common usage with a macro Message-ID: <8679dfd2121cbc75818e97eb40a27dd3af021e38.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Factor out the usage message into pieces corresponding to each mode. This avoids options specific to one sub-command from being shared with another in the usage. A subsequent commit will use these #define macros to have usage variables for each sub-command without duplicating their contents. 
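As a rough illustration of the idea (a minimal sketch, not code from
this series): the DEMO_* macros, the 'demo' command name and
count_usage_lines() below are hypothetical, and N_() is stubbed out as a
no-op. Because each usage string lives in a macro, the combined usage
array and a per-sub-command array can both list it without the text
being written twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's N_() translation marker (a no-op here). */
#define N_(msg) msg

/* One macro per mode; the strings are made up for this sketch. */
#define DEMO_WRITE_USAGE \
	N_("demo write")
#define DEMO_REPACK_USAGE \
	N_("demo repack [--batch-size]")

/* Combined usage: lists every mode. */
static const char * const demo_usage[] = {
	DEMO_WRITE_USAGE,
	DEMO_REPACK_USAGE,
	NULL
};

/* Per-mode usage: reuses the same macro, so the text is written once. */
static const char * const demo_repack_usage[] = {
	DEMO_REPACK_USAGE,
	NULL
};

/* Count the entries in a NULL-terminated usage array. */
static size_t count_usage_lines(const char * const *usage)
{
	size_t n = 0;

	assert(usage != NULL);
	while (usage[n])
		n++;
	return n;
}
```

With this layout, changing the text of DEMO_REPACK_USAGE updates both
arrays at once.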
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index c70f020d8f..eea498e026 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -5,8 +5,23 @@
 #include "midx.h"
 #include "trace2.h"

+#define BUILTIN_MIDX_WRITE_USAGE \
+	N_("git multi-pack-index [<options>] write")
+
+#define BUILTIN_MIDX_VERIFY_USAGE \
+	N_("git multi-pack-index [<options>] verify")
+
+#define BUILTIN_MIDX_EXPIRE_USAGE \
+	N_("git multi-pack-index [<options>] expire")
+
+#define BUILTIN_MIDX_REPACK_USAGE \
+	N_("git multi-pack-index [<options>] repack [--batch-size=<size>]")
+
 static char const * const builtin_multi_pack_index_usage[] = {
-	N_("git multi-pack-index [<options>] (write|verify|expire|repack --batch-size=<size>)"),
+	BUILTIN_MIDX_WRITE_USAGE,
+	BUILTIN_MIDX_VERIFY_USAGE,
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	BUILTIN_MIDX_REPACK_USAGE,
 	NULL
 };

From patchwork Wed Feb 24 19:09:38 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102307
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
        autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id E9F0AC43381
        for ; Wed, 24 Feb 2021 19:12:10 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 9C55764EC4
        for ; Wed, 24 Feb 2021 19:12:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S236055AbhBXTMG (ORCPT ); Wed, 24 Feb 2021 14:12:06 -0500
Received: from
lindbergh.monkeyblade.net ([23.128.96.19]:59418 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235980AbhBXTKV (ORCPT ); Wed, 24 Feb 2021 14:10:21 -0500 Received: from mail-qv1-xf32.google.com (mail-qv1-xf32.google.com [IPv6:2607:f8b0:4864:20::f32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8919AC06178A for ; Wed, 24 Feb 2021 11:09:41 -0800 (PST) Received: by mail-qv1-xf32.google.com with SMTP id 2so1551512qvd.0 for ; Wed, 24 Feb 2021 11:09:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=/b52xhIUWI7qN7nCecignICrg1moxxqCSViEAWPuqHk=; b=KIGFCaqdpObHB0Vy6uMB90tyOGXJhmyhL4GvVc/drt/QUpW1D8+2aLhhSJvFn2vRnk goHz1dz5+krfrbRmfifkIVGlT/fNHkZgmPX3GzLYEuWL62BJA1zeRdCPWrRtIGpnUmEq vXUg4lDGdoOPjKCu1pL1415fD9iYzeG+bVP6DcJHr8uSlPnhdvChwymqRVQb+aQnsh8O 8496+Ji3lKVybo6rcCGdpdqz2WNTzkxIfEno7oHDnje1oCrVTiiI2ZgmNkuzhRGaKJ3Z rbM0DkdS+qhgHIPuB1qReM1Q6JcwJF/LtYqdGZDZxYe+c+n1axcSN25J2rH1Fv2DHOgT 5iaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=/b52xhIUWI7qN7nCecignICrg1moxxqCSViEAWPuqHk=; b=a9O2+l5bQ5YatH8AE5KnXYk9jfHXvdzh181ZoHQ29NGD83hlOyj8BDeRDN3SadNYlu puul0egZet8qQIx45AjjBKogYLQ9UfQiRfFBf0nV09SMqUSyAp7X6DvpTJziu9ML0LFS Liv/nbZsrMPo7iK35QjSx5KbuXkAcDX0aRTYT9prMJ79e+qGZUEfmIn5bi6dyuPm0lSz /oFsmte9RPrzwjVOECLo2rFaMREVxg6VCIx+jl/M0+bIAhCxHYHt8t2LcXDQYfA5xjLn rjYz54eKT75k0N8ritPGVg9OXZPQzOPTcn8TXu3z+hZ1pT9E2QXempcupvW/3tGJACom SEfg== X-Gm-Message-State: AOAM530CBgnPMNTL1fJbNqX5RrGHQI05ri/bGBWvj0UgAFY05R9VhMTI ZhBJ3ehyDqt+Fu1epTIs6NOd2alsc1j/tz8X X-Google-Smtp-Source: ABdhPJxku127M6mP5nEoqZlNk6wK+yVBYumOaywHWlm6tb/SYw8dIAQxTC/EyaPwQ7sXRPGvRPo8SA== 
X-Received: by 2002:a0c:b66c:: with SMTP id q44mr8050411qvf.3.1614193780501;
        Wed, 24 Feb 2021 11:09:40 -0800 (PST)
Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b])
        by smtp.gmail.com with ESMTPSA id v12sm1904898qtw.73.2021.02.24.11.09.39
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 24 Feb 2021 11:09:40 -0800 (PST)
Date: Wed, 24 Feb 2021 14:09:38 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com
Subject: [PATCH v2 04/15] builtin/multi-pack-index.c: split sub-commands
Message-ID:
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing
cmd_multi_pack_index() to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).
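The two-stage flow described above can be sketched roughly as follows.
This is a self-contained toy, not git's parse-options API: every demo_*
name, the hand-rolled option matching, and the result enum are
assumptions made for illustration. The top level consumes common options
and stops at the first non-option; the chosen sub-command then re-parses
what is left, accepting the common options again plus its own, and
treats anything left over as a usage error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum demo_result { DEMO_WRITE, DEMO_REPACK, DEMO_USAGE };

struct demo_opts {
	int progress;              /* common option */
	unsigned long batch_size;  /* 'repack'-only option */
};

/* Returns 1 and records the option if 'arg' is a common option, else 0. */
static int parse_common_opt(const char *arg, struct demo_opts *opts)
{
	if (!strcmp(arg, "--progress")) {
		opts->progress = 1;
		return 1;
	}
	return 0;
}

/* 'write' accepts only the common options; anything else is left over. */
static enum demo_result demo_write(int argc, const char **argv,
				   struct demo_opts *opts)
{
	int i;

	assert(argc > 0); /* argv[0] is the sub-command name */
	for (i = 1; i < argc; i++)
		if (!parse_common_opt(argv[i], opts))
			return DEMO_USAGE;
	return DEMO_WRITE;
}

/* 'repack' accepts the common options plus its own --batch-size=<n>. */
static enum demo_result demo_repack(int argc, const char **argv,
				    struct demo_opts *opts)
{
	static const char prefix[] = "--batch-size=";
	const size_t len = sizeof(prefix) - 1;
	int i;

	assert(argc > 0);
	for (i = 1; i < argc; i++) {
		if (parse_common_opt(argv[i], opts))
			continue;
		if (!strncmp(argv[i], prefix, len)) {
			opts->batch_size = strtoul(argv[i] + len, NULL, 10);
			continue;
		}
		return DEMO_USAGE;
	}
	return DEMO_REPACK;
}

/* Top level: consume common options, stop at the first non-option. */
static enum demo_result demo_main(int argc, const char **argv,
				  struct demo_opts *opts)
{
	int i = 0;

	while (i < argc && parse_common_opt(argv[i], opts))
		i++;
	if (i == argc)
		return DEMO_USAGE; /* no sub-command given */

	if (!strcmp(argv[i], "repack"))
		return demo_repack(argc - i, argv + i, opts);
	if (!strcmp(argv[i], "write"))
		return demo_write(argc - i, argv + i, opts);
	return DEMO_USAGE; /* unrecognized sub-command */
}
```

In this sketch, "--progress repack --batch-size=10" reaches demo_repack()
with both options recorded, while "write --batch-size=10" ends in
DEMO_USAGE because 'write' does not know that option. The top level never
has to check for that combination itself.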
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 131 ++++++++++++++++++++++++++++++-------
 1 file changed, 106 insertions(+), 25 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index eea498e026..caf0248a98 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -5,17 +5,33 @@
 #include "midx.h"
 #include "trace2.h"

+static char const * const builtin_multi_pack_index_write_usage[] = {
 #define BUILTIN_MIDX_WRITE_USAGE \
 	N_("git multi-pack-index [<options>] write")
+	BUILTIN_MIDX_WRITE_USAGE,
+	NULL
+};

+static char const * const builtin_multi_pack_index_verify_usage[] = {
 #define BUILTIN_MIDX_VERIFY_USAGE \
 	N_("git multi-pack-index [<options>] verify")
+	BUILTIN_MIDX_VERIFY_USAGE,
+	NULL
+};

+static char const * const builtin_multi_pack_index_expire_usage[] = {
 #define BUILTIN_MIDX_EXPIRE_USAGE \
 	N_("git multi-pack-index [<options>] expire")
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	NULL
+};

+static char const * const builtin_multi_pack_index_repack_usage[] = {
 #define BUILTIN_MIDX_REPACK_USAGE \
 	N_("git multi-pack-index [<options>] repack [--batch-size=<size>]")
+	BUILTIN_MIDX_REPACK_USAGE,
+	NULL
+};

 static char const * const builtin_multi_pack_index_usage[] = {
 	BUILTIN_MIDX_WRITE_USAGE,
@@ -31,25 +47,99 @@ static struct opts_multi_pack_index {
 	unsigned flags;
 } opts;

-int cmd_multi_pack_index(int argc, const char **argv,
-			 const char *prefix)
+static struct option common_opts[] = {
+	OPT_FILENAME(0, "object-dir", &opts.object_dir,
+	  N_("object directory containing set of packfile and pack-index pairs")),
+	OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	OPT_END(),
+};
+
+static struct option *add_common_options(struct option *prev)
 {
-	static struct option builtin_multi_pack_index_options[] = {
-		OPT_FILENAME(0, "object-dir", &opts.object_dir,
-		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	struct option *with_common = parse_options_concat(common_opts, prev);
+	free(prev);
+	return with_common;
+}
+
+static int cmd_multi_pack_index_write(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_write_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_write_usage,
+				   options);
+
+	return write_midx_file(opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_verify(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_verify_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_verify_usage,
+				   options);
+
+	return verify_midx_file(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_expire(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_expire_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_expire_usage,
+				   options);
+
+	return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_repack(int argc, const char **argv)
+{
+	struct option *options;
+	static struct option builtin_multi_pack_index_repack_options[] = {
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
 	};

+	options = parse_options_dup(builtin_multi_pack_index_repack_options);
+	options = add_common_options(options);
+
+	argc = parse_options(argc, argv, NULL,
+			     options,
+			     builtin_multi_pack_index_repack_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_repack_usage,
+				   options);
+
+	return midx_repack(the_repository, opts.object_dir,
+			   (size_t)opts.batch_size, opts.flags);
+}
+
+int cmd_multi_pack_index(int argc, const char **argv,
+			 const char *prefix)
+{
+	struct option *builtin_multi_pack_index_options = common_opts;
+
 	git_config(git_default_config, NULL);

 	if (isatty(2))
 		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
-			     builtin_multi_pack_index_usage, 0);
+			     builtin_multi_pack_index_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);

 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
@@ -58,25 +148,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);

-	if (argc > 1) {
-		die(_("too many arguments"));
-		return 1;
-	}
-
 	trace2_cmd_mode(argv[0]);

 	if (!strcmp(argv[0], "repack"))
-		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, opts.flags);
-	if (opts.batch_size)
-		die(_("--batch-size option is only for 'repack' subcommand"));
-
-	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
-
-	die(_("unrecognized subcommand: %s"), argv[0]);
+		return cmd_multi_pack_index_repack(argc, argv);
+	else if (!strcmp(argv[0], "write"))
+		return cmd_multi_pack_index_write(argc, argv);
+	else if (!strcmp(argv[0], "verify"))
+		return cmd_multi_pack_index_verify(argc, argv);
+	else if (!strcmp(argv[0], "expire"))
+		return cmd_multi_pack_index_expire(argc, argv);
+	else
+		die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Wed Feb 24 19:09:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102311
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50037C433E0 for ; Wed, 24 Feb 2021 19:12:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 03B4F64EC4 for ; Wed, 24 Feb 2021 19:12:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234468AbhBXTM2 (ORCPT ); Wed, 24 Feb 2021 14:12:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59600 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236052AbhBXTLM (ORCPT ); Wed, 24 Feb 2021 14:11:12 -0500 Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E0F3C06178B for ; Wed, 24 Feb 2021 11:09:45 -0800 (PST) Received: by mail-qt1-x835.google.com with SMTP id f17so2271218qth.7 for ; Wed, 24 Feb 2021 11:09:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=eKdnafm5TaJHG3wh9UuBGQ6K2jl+EAntVsZyVYWA1wU=; b=cyWP9bb+I5HLCAksb3G1qnanogTq6u11TsmOejWuuqU6+oeXZJYD0U/ltXHPP3wACG yY2CbdgXj6ySLiUCNNcP9pmX05rBOp4GwZfvhzz5Ixy5zaOMHsZM7WXbRH2oqD6VwKyi dBhScbUYJsM4baJ6rSURsLmYlWh65FHMc4dbbzMieevh6rh6Ovx1rWMVmoze5hiHEr02 tsXS73eiRwhQMegkPSnD9s5MmBKhufOfktjBS4eglrNpnmXjLeRJN+vGCvjxbyD4UzRM 98U8XEriij9sCdWv2n2mzznpFfMhak8v3a6XjIFVMEr2ciU58HAFn7rywI6g0QVamb06 NmnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=eKdnafm5TaJHG3wh9UuBGQ6K2jl+EAntVsZyVYWA1wU=; b=nRNt603eqccmvzgKUna6N6UWldDZIc6oK2glyWRTuTADiydwhEiQryV3WrYQEUeGGs i9dry/VNWefuW1yvx52VWVBMuYZgPJKBCjbD46hZAouRrNtecfeBUTr1XYjKGhoQinD4 1q94Gts4a5YhGHoLfDZAVSTSleb8Qpg7JpMPbZDqgRi8DIhdpwa3SYJl+2OKQANgjFHU lgV5VD2cZ/Yc40USkPkLHSp8lFc85ZTqlj7cMPaer5yo/F0jZlBCLlrtTxiTOLl7z+PU nbwQ1Qio8ws4IJtmufYvJAe3W1gZah/VSh0g8i2oQuUhV0MkPffi05t63nAMTBUKvKsK wTOA== X-Gm-Message-State: AOAM530fRiZ4OiphqIMZxlvDaFxOfsswIVUbo7OjIMfjbo3pOIWuDG3M RUU5yTIR5Z2wBmTBi+ihorsiJYYYleQzpta+ X-Google-Smtp-Source: ABdhPJxqsUBdOBaXzPcfCl6gC+nRvwjd9qVfKIT4popa6jNxSGI/GCFDySvgASuzS6UpCc/UHvskqA== X-Received: by 2002:ac8:3734:: with SMTP id o49mr14334594qtb.376.1614193784478; Wed, 24 Feb 2021 11:09:44 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id k188sm1425001qkd.132.2021.02.24.11.09.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:44 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:42 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 05/15] builtin/multi-pack-index.c: don't enter bogus cmd_mode Message-ID: <5daa2946d37b5662274f5565ec35ef4d169b55bf.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Even before the recent refactoring, 'git multi-pack-index' calls 'trace2_cmd_mode()' before verifying that the sub-command is recognized. Push this call down into the individual sub-commands so that we don't enter a bogus command mode. 
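A minimal sketch of the ordering issue, under stated assumptions:
demo_cmd_mode() and recorded_mode are hypothetical stand-ins for
trace2_cmd_mode() and whatever it records, and demo_dispatch() is a toy
dispatcher rather than the builtin itself. Recording the mode inside the
sub-command means an unrecognized name never gets recorded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the mode that trace2_cmd_mode() would record. */
static const char *recorded_mode;

static void demo_cmd_mode(const char *mode)
{
	recorded_mode = mode;
}

/* The sub-command records its own mode once it is actually running. */
static int demo_write(int argc, const char **argv)
{
	assert(argc > 0); /* argv[0] is the sub-command name */
	demo_cmd_mode(argv[0]);
	return 0;
}

/* The dispatcher no longer records anything before recognizing argv[0]. */
static int demo_dispatch(int argc, const char **argv)
{
	if (argc < 1)
		return -1;
	if (!strcmp(argv[0], "write"))
		return demo_write(argc, argv);
	return -1; /* unrecognized: no mode was recorded */
}
```

Calling demo_dispatch() with "bogus" leaves recorded_mode untouched,
whereas "write" sets it from inside demo_write().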
Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index caf0248a98..9fdfe168c2 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -65,6 +65,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
 	struct option *options = common_opts;

+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_write_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -79,6 +81,8 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv)
 {
 	struct option *options = common_opts;

+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_verify_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -93,6 +97,8 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv)
 {
 	struct option *options = common_opts;

+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_expire_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -115,6 +121,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv)
 	options = parse_options_dup(builtin_multi_pack_index_repack_options);
 	options = add_common_options(options);

+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options,
 			     builtin_multi_pack_index_repack_usage,
@@ -148,8 +156,6 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);

-	trace2_cmd_mode(argv[0]);
-
 	if (!strcmp(argv[0], "repack"))
 		return cmd_multi_pack_index_repack(argc, argv);
 	else if (!strcmp(argv[0], "write"))

From patchwork Wed Feb 24 19:09:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102309
Return-Path:
X-Spam-Checker-Version: SpamAssassin
3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97A9EC433E0 for ; Wed, 24 Feb 2021 19:12:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3C0AB64E90 for ; Wed, 24 Feb 2021 19:12:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235928AbhBXTMK (ORCPT ); Wed, 24 Feb 2021 14:12:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59602 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236053AbhBXTLM (ORCPT ); Wed, 24 Feb 2021 14:11:12 -0500 Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com [IPv6:2607:f8b0:4864:20::833]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3974C06178C for ; Wed, 24 Feb 2021 11:09:49 -0800 (PST) Received: by mail-qt1-x833.google.com with SMTP id b24so2232159qtp.13 for ; Wed, 24 Feb 2021 11:09:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=T/TrMmLxVFvCMM3Gy+nsVSLynHsbxrEsNMhxDSmKx94=; b=epwRI4st3OTMUlrYR/6TvT0KXDbBCMEvOH/W7c0Qo6YvZughoEBZ5AJeAe3Fgd5fGO 1Rmbc8ak4l/lqlWkqC9BBeR30dVK8/83dpnJj5LRVSjYb4Nx14rgsdu1Mslo5QZfrcXW 0PsQzyWSymUp/qVpgXhP0smmO6abKMurgXi/lLrGfNtotBeAgBJkDdgaPZPuicSWZxKH wmDWNBVUuZScueDabt/wF9wjeq21Lqiqw4tMuZhfrdE+P1/KRxxrdgQZRpsA0G34r9ry fmb72bmIoq4epl33dLutynpf8tRJ3WxTsmlrNiu9w+l+/l/AqvrFmQKF2Kc4NLPqJDRO pGpA== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=T/TrMmLxVFvCMM3Gy+nsVSLynHsbxrEsNMhxDSmKx94=; b=KqHjIr9i+dVod5ObdBA+ynZHi1MSLEFyuiWIN+dNgjL6xMBr6yj8QYGyae4pAn3J6r VqrBZzhppSkkOT97ibBiexwZR/tlXwThjuvfRiaZLHJN9Rq+IDhNoxyqiYNNRtEOERGJ X0n8UWljI6OB1vprNusQS2uidMBFjL/LqmzPetYT1BwnRJ0hz9vN2yFl3QzvXnFFmyna 4s1DH+qt6zxBmrjAz1E8nmRv3Iqj2irh3pLwfAYm6+r9sfLa5y7C0UgUWOMw5KgQyK68 HqaAuedZaI6lXnxUgNtKSLey17QsyYDRLX4HMUgio3G6NVE9+0zGD8OKn7kgosaFv3eE a1fg== X-Gm-Message-State: AOAM531vXg59XmKmPXBpzH3eOgHUgbMEByXKqeltqdg7H8EWs1WSDjOy OSefBtBxQfzksWDrTbXRyDmKqS7GPaVKgFUv X-Google-Smtp-Source: ABdhPJxmc6gUR+fI9y50lMiJ9yl9dE2K9PCdismmOvoH5XypIZ3k3uRl6msvi0Z9gfWNPSASrk1xCA== X-Received: by 2002:ac8:545:: with SMTP id c5mr30367863qth.296.1614193788674; Wed, 24 Feb 2021 11:09:48 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id k8sm2202406qkk.81.2021.02.24.11.09.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:48 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:46 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 06/15] builtin/multi-pack-index.c: display usage on unrecognized command Message-ID: <98d9ea0770ea38caaa71c6b9bf234c8bdde2639b.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When given a sub-command that it doesn't understand, 'git multi-pack-index' dies with the following message: $ git multi-pack-index bogus fatal: unrecognized subcommand: bogus Instead of 'die()'-ing, we can display the usage text, which is much more helpful: $ git.compile multi-pack-index bogus usage: git multi-pack-index [] write or: git multi-pack-index 
[] verify or: git multi-pack-index [] expire or: git multi-pack-index [] repack [--batch-size=] --object-dir object directory containing set of packfile and pack-index pairs --progress force progress reporting While we're at it, clean up some duplication between the "no sub-command" and "unrecognized sub-command" conditionals. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Taylor Blau --- builtin/multi-pack-index.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 9fdfe168c2..5b05e5ce39 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -153,8 +153,7 @@ int cmd_multi_pack_index(int argc, const char **argv, opts.object_dir = get_object_directory(); if (argc == 0) - usage_with_options(builtin_multi_pack_index_usage, - builtin_multi_pack_index_options); + goto usage; if (!strcmp(argv[0], "repack")) return cmd_multi_pack_index_repack(argc, argv); @@ -165,5 +164,7 @@ int cmd_multi_pack_index(int argc, const char **argv, else if (!strcmp(argv[0], "expire")) return cmd_multi_pack_index_expire(argc, argv); else - die(_("unrecognized subcommand: %s"), argv[0]); +usage: + usage_with_options(builtin_multi_pack_index_usage, + builtin_multi_pack_index_options); } From patchwork Wed Feb 24 19:09:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2CA42C433DB for ; Wed, 24 
Feb 2021 19:13:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CD87464E85 for ; Wed, 24 Feb 2021 19:13:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234888AbhBXTM6 (ORCPT ); Wed, 24 Feb 2021 14:12:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236019AbhBXTLs (ORCPT ); Wed, 24 Feb 2021 14:11:48 -0500 Received: from mail-qk1-x732.google.com (mail-qk1-x732.google.com [IPv6:2607:f8b0:4864:20::732]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA06DC061793 for ; Wed, 24 Feb 2021 11:09:53 -0800 (PST) Received: by mail-qk1-x732.google.com with SMTP id q85so3258450qke.8 for ; Wed, 24 Feb 2021 11:09:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=fK8WEtc2GFHptLJs2Vl6B9J5BbpyrxE1ckawIMUyd9Y=; b=bBUfxyD7Y8ZA9yGZ9XMWaZLUX8ZYC5s+sIQUdeHPVkyVWWkipDGbcVIKvOzqlWkkqc 49vhPrCd275nO/H0dqdq31FcI+JWaRocFp9WIqAx68fLumFmStz7HvqtAQhTiflKFST3 1C1jy3KTSkeSzPrr3H8yXQH0NfCljMW9UERSbPu/v0Vn+ULlXvkgGqN+5uZqQrglKNGK SMNoQ0DFwCW53I0VYOFtNjdaFimAr+Alw4h0yRhqHkser6CncUMHUtMqSxShO+bt5w5I KX57XN3uknvxQwenx6kujcPn8W8obhQ6mNvgyPQP3si9KjELFOTfyt04SzLshG3CMWd7 w0hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=fK8WEtc2GFHptLJs2Vl6B9J5BbpyrxE1ckawIMUyd9Y=; b=CkCZN3VpqHXKq3mZEAsSYyiggR9fPJDGJgzKMokHtFid7WcAmVl3epgo50AHWxSRM2 Bvt74q3VO4Hn8UKgknDG9G2CwMjywmyOk4c400+lnL7QBFNCpxkE9CItNbqtRTAvQtfu GTrYLTHw2VbIvMJSx1+d32h3mJ+AmOPNTg8UtcnUkpd6oQ1muMiiMrXlcctR/op7Xfwi xFUH2G2UhZUybSDF+SBBMlgTEuJGTdIfeuHwzTb6AVULx2fFZk4ZPxVuUhjriC4/1uoc 
g4GlPfYMJZnkZkvx/z/QA/ESKRYBacAYjgVFiV8sO2YxUN0eBpIh3YMfgJWnI3D7O7It zC0g== X-Gm-Message-State: AOAM5323lze36tRnTkQjx1BAKoWNm8Mym2nS7QycqUt6XqKKGIJ+OIJO iLdfd2EBn8C5xdwW1m2MIQHUGdMgtyELFwlX X-Google-Smtp-Source: ABdhPJwF8rSkp84nw3CZOLxSJagTk3/zXdWKAUZ+7MOrKmEk8DS05JNPYO32q3t2zdO4xbtxHjm+zw== X-Received: by 2002:a37:a654:: with SMTP id p81mr32033854qke.354.1614193792833; Wed, 24 Feb 2021 11:09:52 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id z5sm2179439qkc.61.2021.02.24.11.09.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:52 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:50 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 07/15] t/helper/test-read-midx.c: add '--show-objects' Message-ID: <2fd9f4debff480e18c902a919750cb7b7aba66bf.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The 'read-midx' helper is used in places like t5319 to display basic information about a multi-pack-index. In the next patch, the MIDX writing machinery will learn a new way to choose from which pack an object is selected when multiple copies of that object exist. To disambiguate which pack introduces an object so that this feature can be tested, add a '--show-objects' option which displays additional information about each object in the MIDX. 
Signed-off-by: Taylor Blau --- t/helper/test-read-midx.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 2430880f78..7c2eb11a8e 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -4,7 +4,7 @@ #include "repository.h" #include "object-store.h" -static int read_midx_file(const char *object_dir) +static int read_midx_file(const char *object_dir, int show_objects) { uint32_t i; struct multi_pack_index *m; @@ -43,13 +43,29 @@ static int read_midx_file(const char *object_dir) printf("object-dir: %s\n", m->object_dir); + if (show_objects) { + struct object_id oid; + struct pack_entry e; + + for (i = 0; i < m->num_objects; i++) { + nth_midxed_object_oid(&oid, m, i); + fill_midx_entry(the_repository, &oid, &e, m); + + printf("%s %"PRIu64"\t%s\n", + oid_to_hex(&oid), e.offset, e.p->pack_name); + } + return 0; + } + return 0; } int cmd__read_midx(int argc, const char **argv) { - if (argc != 2) - usage("read-midx "); + if (!(argc == 2 || argc == 3)) + usage("read-midx [--show-objects] "); - return read_midx_file(argv[1]); + if (!strcmp(argv[1], "--show-objects")) + return read_midx_file(argv[2], 1); + return read_midx_file(argv[1], 0); } From patchwork Wed Feb 24 19:09:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61AC4C433DB for ; Wed, 24 Feb 2021 
19:13:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 091AA64E90 for ; Wed, 24 Feb 2021 19:13:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236079AbhBXTNG (ORCPT ); Wed, 24 Feb 2021 14:13:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236071AbhBXTLs (ORCPT ); Wed, 24 Feb 2021 14:11:48 -0500 Received: from mail-qv1-xf2f.google.com (mail-qv1-xf2f.google.com [IPv6:2607:f8b0:4864:20::f2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F4F4C061794 for ; Wed, 24 Feb 2021 11:09:58 -0800 (PST) Received: by mail-qv1-xf2f.google.com with SMTP id cw15so789658qvb.11 for ; Wed, 24 Feb 2021 11:09:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=m8qi0J1ZNvQ8fs3ZZthOPdeaBR0hCDJJtllpSbpTp2o=; b=BvPnw96Mmw/ypqYZxT6z659y/AAzHmVAmmH2utOPhuougaVapcSppvRTPJjKwoOgL7 ksmMA55rCykDQVNEab8qHBGJbHunslNYtglJ2E8/DF7aUpWxY7ddVBMLq/3u6oRGuUvv aBJq90y7fyBnT25rqaWcEGRUTyI17EQSIwBJ8A0FVvwMT7+dvmUPRATWEm68xwH0DDvS VA5lN3S+rdVf3rrFpIuppM1zu6F9PiOcegjDv2JtwTkAZEwtGnX4jzYPAbNvf4/+aiEh gs5GCtWHzwy2O4nK5yHr/mrpSqF+/mmsX7+R2ADCV5/wnKDdZW77gM9Q+LaCIvmjFREK lMmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=m8qi0J1ZNvQ8fs3ZZthOPdeaBR0hCDJJtllpSbpTp2o=; b=dN+Abu4fd9id31HKwkIzoFUmWnRw6FrCCS7TsNfoGMpJzSu4g+TdEnvxmymi5RkZZR jvJEFPjkHFrpkQvvbXUaMhl7JbqlGv3agGkp5ahsjxvvMobCohXQ/+VzhGlcHzINLKOP Un45EVNzs/eB1IpHYzAzvJA8BbYm7CnXCFKy5gkgTfT8euZhhGIUVHyceGMpn6LOJZ6R KBKT3NLNL6oX689hoSnPOzXXTiTk8iLj5+qhYFP4vu72kUSXXCGVf9Z22tYta0sHspZX 
2rPltCg8fHji/R01Vc4a7KMUTIxrMtZgKgfSMrxwCjv4QWphA9OFZLi2Icn8otR4kUhB 4bzg== X-Gm-Message-State: AOAM531bgGxJnRVQhIkrUWZHpoe7k2SRK4qirHxszzspAoYhYKx8Q/EO USkZAh+GZnq35LFReD5Ux8NdGLI5A4mMkwCB X-Google-Smtp-Source: ABdhPJwoW4wV4tePoIIoNHU5fz7bAQxeYGZt/tFAsgv0VPGPNY376E77sDkVxPaT8+D2ejI3XQMOHA== X-Received: by 2002:a0c:a404:: with SMTP id w4mr21928627qvw.22.1614193796813; Wed, 24 Feb 2021 11:09:56 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id z1sm1869652qtu.83.2021.02.24.11.09.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:09:56 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:54 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 08/15] midx: allow marking a pack as preferred Message-ID: <223b89909416ec7c5505f9cedaa80bf86ecc7b2e.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When multiple packs in the multi-pack index contain the same object, the MIDX machinery must make a choice about which pack it associates with that object. Prior to this patch, the lowest-ordered[1] pack was always selected. Pack selection for duplicate objects is relatively unimportant today, but it will become important for multi-pack bitmaps. This is because we can only invoke the pack-reuse mechanism when all of the bits for reused objects come from the reuse pack (in order to ensure that all reused deltas can find their base objects in the same pack). To encourage the pack selection process to prefer one pack over another (the pack to be preferred is the one a caller would like to later use as a reuse pack), introduce the concept of a "preferred pack". When provided, the MIDX code will always prefer an object found in a preferred pack over any other. 
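The tie-breaking rule described above can be sketched on its own, under some simplifying assumptions: 'struct demo_entry' is a hypothetical, cut-down stand-in for the MIDX entry type, object IDs are plain strings, and the final tie-break on pack ID is added here only to keep the sort deterministic. Copies of the same object are ordered so that one from the preferred pack comes first, then the one from the most recently modified pack, and only the first copy is kept.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified stand-in for a MIDX entry. */
struct demo_entry {
	char oid[41];          /* object ID as a string (illustrative only) */
	uint32_t pack_int_id;  /* pack holding this copy of the object */
	time_t pack_mtime;     /* mtime of that pack */
	unsigned preferred : 1; /* set if the copy is in the preferred pack */
};

static int demo_entry_cmp(const void *_a, const void *_b)
{
	const struct demo_entry *a = _a;
	const struct demo_entry *b = _b;
	int cmp = strcmp(a->oid, b->oid);

	if (cmp)
		return cmp;
	/* Copies from the preferred pack sort first. */
	if (a->preferred != b->preferred)
		return a->preferred ? -1 : 1;
	/* Otherwise the copy in the most recently modified pack wins. */
	if (a->pack_mtime != b->pack_mtime)
		return a->pack_mtime > b->pack_mtime ? -1 : 1;
	/* Assumption of this sketch: fall back to pack ID for a stable order. */
	if (a->pack_int_id != b->pack_int_id)
		return a->pack_int_id < b->pack_int_id ? -1 : 1;
	return 0;
}

/*
 * Sort the entries and keep only the first copy of each object ID.
 * Returns the number of entries kept at the front of the array.
 */
static size_t demo_dedup(struct demo_entry *e, size_t nr)
{
	size_t i, out = 0;

	qsort(e, nr, sizeof(*e), demo_entry_cmp);
	for (i = 0; i < nr; i++) {
		if (out && !strcmp(e[out - 1].oid, e[i].oid))
			continue; /* a better-ranked copy was already kept */
		e[out++] = e[i];
	}
	return out;
}
```

For example, if object "bbbb" exists in pack 0 (newer mtime) and in pack 1 (older mtime, but marked preferred), the surviving entry is the one from pack 1; without the preferred bit the newer pack 0 would have been chosen.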
No format changes are required to store the preferred pack, since it can be inferred from a corresponding MIDX bitmap by looking up the pack associated with the object in the first bit position (this ordering is described in detail in a subsequent commit). [1]: the ordering is specified by MIDX internals; for our purposes we can consider the "lowest-ordered" pack to be "the one with the most-recent mtime". Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 14 ++- Documentation/technical/multi-pack-index.txt | 5 +- builtin/multi-pack-index.c | 18 +++- builtin/repack.c | 2 +- midx.c | 99 ++++++++++++++++++-- midx.h | 2 +- t/t5319-multi-pack-index.sh | 39 ++++++++ 7 files changed, 161 insertions(+), 18 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index eb0caa0439..ffd601bc17 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes SYNOPSIS -------- [verse] -'git multi-pack-index' [--object-dir=] [--[no-]progress] +'git multi-pack-index' [--object-dir=] [--[no-]progress] + [--preferred-pack=] DESCRIPTION ----------- @@ -30,7 +31,16 @@ OPTIONS The following subcommands are available: write:: - Write a new MIDX file. + Write a new MIDX file. The following options are available for + the `write` sub-command: ++ +-- + --preferred-pack=:: + Optionally specify the tie-breaking pack used when + multiple packs contain the same object. If not given, + ties are broken in favor of the pack with the lowest + mtime. +-- verify:: Verify the contents of the MIDX file. diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index e8e377a59f..fb688976c4 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -43,8 +43,9 @@ Design Details a change in format.
- The MIDX keeps only one record per object ID. If an object appears - in multiple packfiles, then the MIDX selects the copy in the most- - recently modified packfile. + in multiple packfiles, then the MIDX selects the copy in the + preferred packfile, otherwise selecting from the most-recently + modified packfile. - If there exist packfiles in the pack directory not registered in the MIDX, then those packfiles are loaded into the `packed_git` diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 5b05e5ce39..2329dc5ec0 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -4,10 +4,11 @@ #include "parse-options.h" #include "midx.h" #include "trace2.h" +#include "object-store.h" static char const * const builtin_multi_pack_index_write_usage[] = { #define BUILTIN_MIDX_WRITE_USAGE \ - N_("git multi-pack-index [] write") + N_("git multi-pack-index [] write [--preferred-pack=]") BUILTIN_MIDX_WRITE_USAGE, NULL }; @@ -43,6 +44,7 @@ static char const * const builtin_multi_pack_index_usage[] = { static struct opts_multi_pack_index { const char *object_dir; + const char *preferred_pack; unsigned long batch_size; unsigned flags; } opts; @@ -63,7 +65,16 @@ static struct option *add_common_options(struct option *prev) static int cmd_multi_pack_index_write(int argc, const char **argv) { - struct option *options = common_opts; + struct option *options; + static struct option builtin_multi_pack_index_write_options[] = { + OPT_STRING(0, "preferred-pack", &opts.preferred_pack, + N_("preferred-pack"), + N_("pack for reuse when computing a multi-pack bitmap")), + OPT_END(), + }; + + options = parse_options_dup(builtin_multi_pack_index_write_options); + options = add_common_options(options); trace2_cmd_mode(argv[0]); @@ -74,7 +85,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) usage_with_options(builtin_multi_pack_index_write_usage, options); - return write_midx_file(opts.object_dir, opts.flags); + return 
write_midx_file(opts.object_dir, opts.preferred_pack, + opts.flags); } static int cmd_multi_pack_index_verify(int argc, const char **argv) diff --git a/builtin/repack.c b/builtin/repack.c index 01440de2d5..9f00806805 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -523,7 +523,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) remove_temporary_files(); if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), 0); + write_midx_file(get_object_directory(), NULL, 0); string_list_clear(&names, 0); string_list_clear(&rollback, 0); diff --git a/midx.c b/midx.c index 971faa8cfc..d2c56c4bc6 100644 --- a/midx.c +++ b/midx.c @@ -431,6 +431,24 @@ static int pack_info_compare(const void *_a, const void *_b) return strcmp(a->pack_name, b->pack_name); } +static int lookup_idx_or_pack_name(struct pack_info *info, + uint32_t nr, + const char *pack_name) +{ + uint32_t lo = 0, hi = nr; + while (lo < hi) { + uint32_t mi = lo + (hi - lo) / 2; + int cmp = cmp_idx_or_pack_name(pack_name, info[mi].pack_name); + if (cmp < 0) + hi = mi; + else if (cmp > 0) + lo = mi + 1; + else + return mi; + } + return -1; +} + struct write_midx_context { struct pack_info *info; uint32_t nr; @@ -445,6 +463,8 @@ struct write_midx_context { uint32_t *pack_perm; unsigned large_offsets_needed:1; uint32_t num_large_offsets; + + int preferred_pack_idx; }; static void add_pack_to_midx(const char *full_path, size_t full_path_len, @@ -489,6 +509,7 @@ struct pack_midx_entry { uint32_t pack_int_id; time_t pack_mtime; uint64_t offset; + unsigned preferred : 1; }; static int midx_oid_compare(const void *_a, const void *_b) @@ -500,6 +521,12 @@ static int midx_oid_compare(const void *_a, const void *_b) if (cmp) return cmp; + /* Sort objects in a preferred pack first when multiple copies exist. 
*/ + if (a->preferred > b->preferred) + return -1; + if (a->preferred < b->preferred) + return 1; + if (a->pack_mtime > b->pack_mtime) return -1; else if (a->pack_mtime < b->pack_mtime) @@ -527,7 +554,8 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, static void fill_pack_entry(uint32_t pack_int_id, struct packed_git *p, uint32_t cur_object, - struct pack_midx_entry *entry) + struct pack_midx_entry *entry, + int preferred) { if (nth_packed_object_id(&entry->oid, p, cur_object) < 0) die(_("failed to locate object %d in packfile"), cur_object); @@ -536,6 +564,7 @@ static void fill_pack_entry(uint32_t pack_int_id, entry->pack_mtime = p->mtime; entry->offset = nth_packed_object_offset(p, cur_object); + entry->preferred = !!preferred; } /* @@ -552,7 +581,8 @@ static void fill_pack_entry(uint32_t pack_int_id, static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, struct pack_info *info, uint32_t nr_packs, - uint32_t *nr_objects) + uint32_t *nr_objects, + uint32_t preferred_pack) { uint32_t cur_fanout, cur_pack, cur_object; uint32_t alloc_fanout, alloc_objects, total_objects = 0; @@ -589,12 +619,17 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, nth_midxed_pack_midx_entry(m, &entries_by_fanout[nr_fanout], cur_object); + if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack) + entries_by_fanout[nr_fanout].preferred = 1; + else + entries_by_fanout[nr_fanout].preferred = 0; nr_fanout++; } } for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) { uint32_t start = 0, end; + int preferred = cur_pack == preferred_pack; if (cur_fanout) start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1); @@ -602,7 +637,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, for (cur_object = start; cur_object < end; cur_object++) { ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout); - fill_pack_entry(cur_pack, info[cur_pack].p, cur_object, 
&entries_by_fanout[nr_fanout]); + fill_pack_entry(cur_pack, + info[cur_pack].p, + cur_object, + &entries_by_fanout[nr_fanout], + preferred); nr_fanout++; } } @@ -777,7 +816,9 @@ static int write_midx_large_offsets(struct hashfile *f, } static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, - struct string_list *packs_to_drop, unsigned flags) + struct string_list *packs_to_drop, + const char *preferred_pack_name, + unsigned flags) { char *midx_name; uint32_t i; @@ -828,7 +869,19 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) goto cleanup; - ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr); + if (preferred_pack_name) { + for (i = 0; i < ctx.nr; i++) { + if (!cmp_idx_or_pack_name(preferred_pack_name, + ctx.info[i].pack_name)) { + ctx.preferred_pack_idx = i; + break; + } + } + } else + ctx.preferred_pack_idx = -1; + + ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, + ctx.preferred_pack_idx); ctx.large_offsets_needed = 0; for (i = 0; i < ctx.entries_nr; i++) { @@ -889,6 +942,31 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; } + /* + * Recompute the preferred_pack_idx (if applicable) according to the + * permuted pack order. 
+ */ + ctx.preferred_pack_idx = -1; + if (preferred_pack_name) { + ctx.preferred_pack_idx = lookup_idx_or_pack_name(ctx.info, + ctx.nr, + preferred_pack_name); + if (ctx.preferred_pack_idx < 0) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + else { + uint32_t orig = ctx.info[ctx.preferred_pack_idx].orig_pack_int_id; + uint32_t perm = ctx.pack_perm[orig]; + + if (perm == PACK_EXPIRED) { + warning(_("preferred pack '%s' is expired"), + preferred_pack_name); + ctx.preferred_pack_idx = -1; + } else + ctx.preferred_pack_idx = perm; + } + } + if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT); @@ -947,9 +1025,12 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * return result; } -int write_midx_file(const char *object_dir, unsigned flags) +int write_midx_file(const char *object_dir, + const char *preferred_pack_name, + unsigned flags) { - return write_midx_internal(object_dir, NULL, NULL, flags); + return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, + flags); } void clear_midx_file(struct repository *r) @@ -1184,7 +1265,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); if (packs_to_drop.nr) - result = write_midx_internal(object_dir, m, &packs_to_drop, flags); + result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags); string_list_clear(&packs_to_drop, 0); return result; @@ -1373,7 +1454,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, m, NULL, flags); + result = write_midx_internal(object_dir, m, NULL, NULL, flags); m = NULL; cleanup: diff --git a/midx.h b/midx.h index b18cf53bc4..e7fea61109 100644 --- a/midx.h +++ b/midx.h @@ -47,7 +47,7 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa int 
midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local); -int write_midx_file(const char *object_dir, unsigned flags); +int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags); void clear_midx_file(struct repository *r); int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index b4afab1dfc..fd94ba9053 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -31,6 +31,14 @@ midx_read_expect () { test_cmp expect actual } +midx_expect_object_offset () { + OID="$1" + OFFSET="$2" + OBJECT_DIR="$3" + test-tool read-midx --show-objects $OBJECT_DIR >actual && + grep "^$OID $OFFSET" actual +} + test_expect_success 'setup' ' test_oid_cache <<-EOF idxoff sha1:2999 @@ -234,6 +242,37 @@ test_expect_success 'warn on improper hash version' ' ) ' +test_expect_success 'midx picks objects from preferred pack' ' + test_when_finished rm -rf preferred.git && + git init --bare preferred.git && + ( + cd preferred.git && + + a=$(echo "a" | git hash-object -w --stdin) && + b=$(echo "b" | git hash-object -w --stdin) && + c=$(echo "c" | git hash-object -w --stdin) && + + # Set up two packs, duplicating the object "B" at different + # offsets. 
+ git pack-objects objects/pack/test-AB <<-EOF && + $a + $b + EOF + bc=$(git pack-objects objects/pack/test-BC <<-EOF + $b + $c + EOF + ) && + + git multi-pack-index --object-dir=objects \ + write --preferred-pack=test-BC-$bc.idx 2>err && + test_must_be_empty err && + + ofs=$(git show-index X-Patchwork-Id: 12102319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30562C433E6 for ; Wed, 24 Feb 2021 19:13:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E64C464E90 for ; Wed, 24 Feb 2021 19:13:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236125AbhBXTNd (ORCPT ); Wed, 24 Feb 2021 14:13:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236080AbhBXTMV (ORCPT ); Wed, 24 Feb 2021 14:12:21 -0500 Received: from mail-qt1-x836.google.com (mail-qt1-x836.google.com [IPv6:2607:f8b0:4864:20::836]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2CEAC061797 for ; Wed, 24 Feb 2021 11:10:01 -0800 (PST) Received: by mail-qt1-x836.google.com with SMTP id b24so2232621qtp.13 for ; Wed, 24 Feb 2021 11:10:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=FTPV/Vio+C3gf+tmi6dzfePMcY9p8Ik/AXdW3TRwAUo=; 
b=mW+xvpINfdcMu4dgr/xSj8RCS50hhQa+sRxctynNw1BzM3Qcizi1QtARu7iqIM9TC8 mSB827OeBAKxV4oQvI+GpCeI6+KGj1l2Ld23McaOeyChS7SPpdPtd9vSR0DVdMU6vwpX +ylWjzP5eMeJ/JWS6uQ0vsjH02FxilmC7SHrlFjjpW56aI4qQi0eoDQvzYigQz9eUBsd 1PVGrgScGxjPv83EZDP/FjKbud4XIMF8dM/08Ri/CtRDifCv2Xo256p0eq6pLbkfYnZ/ mMQ7fwVzZTLXw46ee5jBcJTXt2NWi8cLFW4R+UJArnSCZTt5m1T36wm58fxus/q5sYjB 05mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=FTPV/Vio+C3gf+tmi6dzfePMcY9p8Ik/AXdW3TRwAUo=; b=Zy4JQwoXp2KHlIm2Y7nK3eE5C9CQo/GGcRL1bxVuLworlCqr0Mj8RtN/EBHTCyAxGn 2ulF36GljmvMEdSDllcEZVVkNBk4EhMqiM9BMp4mpWXEwvB5gsDpNOsoKkKIFg53Xj3Y oN1+xsYFFgjMRm3Lifmaio1Ajo/bRytDeIzM24uAfzGzKm5vHOueB5MC+7zk+M4knKtO WYFR53XMKPZhCr7sFyWGb4Ks9B9BexhZ9QwAIPma3nMHcP5mLrR3EIVuPUo4KOp23Ybd j042mYPx9k55kQGbYc15dGLcep/qMBVlclswyB8fngRZLiLHKwUrl0qh9ZRWcLyEiaKC 1C7A== X-Gm-Message-State: AOAM533Fc36Ba5mQ4emav4d14KVx8uKwvOfdn1IWNmV0RxHW5bbP8KUF sQLz0YqpXREpiHiNAvsE/JxC9IdMeLu9Gyeb X-Google-Smtp-Source: ABdhPJyIvnUhfZTVi9BfKjJA2uk7rg5lOTf0zKwqmBfMAFTHiTtcDjz1gkYl6MvECRzdHsGHuvbPAQ== X-Received: by 2002:ac8:5786:: with SMTP id v6mr31275316qta.200.1614193800689; Wed, 24 Feb 2021 11:10:00 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id o20sm1893323qtp.92.2021.02.24.11.10.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:00 -0800 (PST) Date: Wed, 24 Feb 2021 14:09:58 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 09/15] midx: don't free midx_name early Message-ID: <976848bc4b40f1d1e7cdef5e5cf031ecd9f1ac0f.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A subsequent patch will need to 
refer back to 'midx_name' later on in the function. In fact, this variable is already free()'d later on, so this makes the later free() no longer redundant. Signed-off-by: Taylor Blau --- midx.c | 1 - 1 file changed, 1 deletion(-) diff --git a/midx.c b/midx.c index d2c56c4bc6..db043d3e65 100644 --- a/midx.c +++ b/midx.c @@ -973,7 +973,6 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR); f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); - FREE_AND_NULL(midx_name); if (ctx.m) close_midx(ctx.m); From patchwork Wed Feb 24 19:10:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58CDCC433E0 for ; Wed, 24 Feb 2021 19:13:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B32364EC4 for ; Wed, 24 Feb 2021 19:13:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236023AbhBXTNi (ORCPT ); Wed, 24 Feb 2021 14:13:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236083AbhBXTMV (ORCPT ); Wed, 24 Feb 2021 14:12:21 -0500 Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com [IPv6:2607:f8b0:4864:20::72f]) by lindbergh.monkeyblade.net (Postfix) with 
ESMTPS id C58A6C0617A7 for ; Wed, 24 Feb 2021 11:10:05 -0800 (PST) Received: by mail-qk1-x72f.google.com with SMTP id d20so2280282qkc.2 for ; Wed, 24 Feb 2021 11:10:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=iz6OksybEjkqiZCof/6e7vf6VJgo67PXi7MqX8uuQ34=; b=ubc/Hm2JpcoLoT7jbKJVj3TMyx2tSd+fe5JvUVAQ6n2C3APQx6mfg/pj18f3/+i8Ld RBFaQDBF4Tq+n4KfH7hmA4xzDBd/gLG3Qf73qUNr26vtI6VEPFrcvgwcTksDW6KDsFDe 6ZLLe3ijZ6mbKbL3evjGYLPjNgDMpnJoj59hkTmjtRNtCPkrUG+JwpnLefTbnxXFyKIt FtEzCn/6s4ltTrjFzsb6+pHxcQ4y7qSCOA0+39iNkUKLLiJdtNT67vVvNRXv2wEhn0gn bYtQ7QlJHErwaL6hpUWlQMIHFesgNpn6JC7yQ11BQ3I6CSO9p/DhN75lhDZcfgmVz+Dr sRbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=iz6OksybEjkqiZCof/6e7vf6VJgo67PXi7MqX8uuQ34=; b=kiR4siIc/Elimc/R8ZCsa/1KAWSga0R0DKB/wKbO5mncsBTiDHu3WTaNLbHyZYv0BA n/p1JDL9wNxzeyFHz21KK1ZF13dCeG2gcdegUC0bnXhhnCXWth9LaJp+cRbM0WreyknR pRYZJtMNSx+VhNvq/gkhfqmFYdjasVraGtAdWAvKnw7c6QY3+Kp7rNfqzdxS9NdsD4zd +xXJ6y52+ZrqQwkU+dUrkUyfruvnk10T3bGagSEUYvAHj1asDYia6+VnZzV0hxhYGEE7 JzX25huXqfEyDgUeaMfPvypZwKdpu51b/hbMwgw0RMaqmsD79ZApgr//6uw7Us/hmYqk pdZw== X-Gm-Message-State: AOAM531cCEcnpkosMicb/sbnHAFWWbvIG1rwg+yABbU7ibB7t0aT8Smn X3wH2K6Jm4vNmlfbxKr6v+G1qtxeATdsOShJ X-Google-Smtp-Source: ABdhPJwzpEUmgWHRMTETuUwnE9SHF4K2vVXa1Y7aQVhpQBEz1aDH5O2JLOHqslPdLzdBFO0uKRpnQg== X-Received: by 2002:a37:bd84:: with SMTP id n126mr32481988qkf.54.1614193804723; Wed, 24 Feb 2021 11:10:04 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id w5sm1869514qta.45.2021.02.24.11.10.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:04 -0800 (PST) Date: Wed, 24 Feb 2021 14:10:02 -0500 
From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 10/15] midx: keep track of the checksum Message-ID: <5ed47f7e3a8c42bae051243061debd7b97f630da.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org write_midx_internal() uses a hashfile to write the multi-pack index, but discards its checksum. This makes sense, since nothing that takes place after writing the MIDX cares about its checksum. That is about to change in a subsequent patch, when the optional reverse index corresponding to the MIDX will want to include the MIDX's checksum. Store the checksum of the MIDX in preparation for that. Signed-off-by: Taylor Blau --- midx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index db043d3e65..3ea795f416 100644 --- a/midx.c +++ b/midx.c @@ -821,6 +821,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * unsigned flags) { char *midx_name; + unsigned char midx_hash[GIT_MAX_RAWSZ]; uint32_t i; struct hashfile *f = NULL; struct lock_file lk; @@ -1004,7 +1005,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs); write_chunkfile(cf, &ctx); - finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); + finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); commit_lock_file(&lk); From patchwork Wed Feb 24 19:10:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81BB9C433E9 for ; Wed, 24 Feb 2021 19:13:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3A7F564E24 for ; Wed, 24 Feb 2021 19:13:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236122AbhBXTNn (ORCPT ); Wed, 24 Feb 2021 14:13:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236084AbhBXTMW (ORCPT ); Wed, 24 Feb 2021 14:12:22 -0500 Received: from mail-qk1-x72d.google.com (mail-qk1-x72d.google.com [IPv6:2607:f8b0:4864:20::72d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A481C0617A9 for ; Wed, 24 Feb 2021 11:10:09 -0800 (PST) Received: by mail-qk1-x72d.google.com with SMTP id x124so3305086qkc.1 for ; Wed, 24 Feb 2021 11:10:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=4ZXleYMf9s8ZNcYrdTh9UAftkdmYXWKQAjFFPrJovzM=; b=hJk2Mn45Osb711TJsjzDfB5Y2BnMrxuxc/ekcXuuC0cAfx3THNqRxMtKjQT9Vnwent bUacu8vydtBtfZ664r1PjRuEKEyfZ/mFQ9upjDDRf6Cib5fKv54EzDsYzYOD+rtT8BSC +jlp+337YHoVf3F0duzlUVfFk26NFHo8S24ed2sJQacWuy9oex7M2H6UBSA7/b+sS6tF ZgV7PH7N1p6Nei5e/2GP2b8IlDFi5Y8s7Pe1CYvobHWYxiH2rsBn00aKQSRDkFnhjmE2 IE67iSNiok6xDvc3Q82/lRii+d3Bxk8a/6TxFzL4+ulOz6xK1hUYwHfgOs4hHSZyzUGc VJeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; 
bh=4ZXleYMf9s8ZNcYrdTh9UAftkdmYXWKQAjFFPrJovzM=; b=BZUC191OG4y2lFd9/9MQPzkZ8a9YaWFiioSYfCXzPB7hdSrOquplO3LpiC48EfWSqu vboRjEtqGhfUuJy41vGLDUf5+bacp0EoR7DMKWf/aGONuimYpDP6pN45juy8xUYfPuzd JyJPHhARRvtm9YZQPhqVDKnbzBJ3VQPCgK8wO4AA2p0oJp9VkEMIAEfEXZtNi3htT+EX EAM6XBpdDoIEBNXifeAH8n1zuQ3ls+6bAYDZLIGT+mEtsKsyyhyuE/A0UlB6NHpe8S0U 3gvrg3ghL2eSpfYwN6k9ezbu9kBr6ZDKICkd4t3EeIC+MoqrI0EehZ8EzmqSmyXmXTxj Ttdw== X-Gm-Message-State: AOAM532HMMCOR7X+5bAZyui50OpkVq+o8oOmKmQwzghH8N0cGL0YQVHn YHUAHNaxNxbVhn3QM1vGVg9sJY+1Gz8+ojQP X-Google-Smtp-Source: ABdhPJxI5fL9T2yEs7+ZD5U/0Fc1AIWamagjuPQK3InHgXYX+OPZF7phMY8xg5nU87wZ5HXQ7Qikkg== X-Received: by 2002:a05:620a:b95:: with SMTP id k21mr31971645qkh.125.1614193808544; Wed, 24 Feb 2021 11:10:08 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id b190sm2218059qkg.103.2021.02.24.11.10.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:08 -0800 (PST) Date: Wed, 24 Feb 2021 14:10:06 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 11/15] midx: make some functions non-static Message-ID: <0292508e12582462b799d3ba6b190eb29915661d.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a subsequent commit, pack-revindex.c will become responsible for sorting a list of objects in the "MIDX pack order" (which will be defined in the following patch). To do so, it will need to know the pack identifier and offset within that pack for each object in the MIDX. The MIDX code already has functions for doing just that (nth_midxed_offset() and nth_midxed_pack_int_id()), but they are statically declared.
Since there is no reason that they couldn't be exposed publicly, and because they are already doing exactly what the caller in pack-revindex.c will want, expose them publicly so that they can be reused there. Signed-off-by: Taylor Blau --- midx.c | 4 ++-- midx.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index 3ea795f416..27a8b76dfe 100644 --- a/midx.c +++ b/midx.c @@ -239,7 +239,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid, return oid; } -static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) { const unsigned char *offset_data; uint32_t offset32; @@ -258,7 +258,7 @@ static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) return offset32; } -static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) { return get_be32(m->chunk_object_offsets + (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH); diff --git a/midx.h b/midx.h index e7fea61109..93bd68189e 100644 --- a/midx.h +++ b/midx.h @@ -40,6 +40,8 @@ struct multi_pack_index { struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos); +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos); struct object_id *nth_midxed_object_oid(struct object_id *oid, struct multi_pack_index *m, uint32_t n); From patchwork Wed Feb 24 19:10:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C040C433DB for ; Wed, 24 Feb 2021 19:14:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B53ED64E90 for ; Wed, 24 Feb 2021 19:14:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236136AbhBXTNr (ORCPT ); Wed, 24 Feb 2021 14:13:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236085AbhBXTMW (ORCPT ); Wed, 24 Feb 2021 14:12:22 -0500 Received: from mail-qv1-xf36.google.com (mail-qv1-xf36.google.com [IPv6:2607:f8b0:4864:20::f36]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C851C0617AA for ; Wed, 24 Feb 2021 11:10:13 -0800 (PST) Received: by mail-qv1-xf36.google.com with SMTP id k5so1551328qvu.2 for ; Wed, 24 Feb 2021 11:10:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=OlvHIpN/0fZchJUQMEcQiUlDWg8Z7UeC1EeMZ372rD8=; b=krHC6OrtIAcmtJFNIPowaN1m7iNed3cvywjIJQA1MEnP60i/XReCeI4K7zm7guveCk l8du0uAUvRNZpuZFtARQ0chZRTkE/AwDVmw88ofgaQXBMivsEyyDzVADr35Xm/Wyg2Lk 7sYOyoruAcM+emEoq561waQBlWdUOrYwps2Ygmdky0c42BLaX+X1XlkFyoi97BOvu9aC vPv90wKmOtyVHktloFvTvtmqETndoGiYRZpz805NNJS+nPVfkFJnCbkjCIXQUXi5ePSg CK66iiRQJHfY+sud2XoKxQSTObdx16MzFy0g5hEiGNbPqcnC42LsiyDb7g2k+pvbj2K7 g+yA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=OlvHIpN/0fZchJUQMEcQiUlDWg8Z7UeC1EeMZ372rD8=; b=eNiI24k0TR0VYeQ8pzBAZ4QH1NNf0dBrJvI5aFc1UfAroshabbNKfDU+o4EfLSeJ1o cHRdIsxlPM+71RZzwVEMs5MYZtvEp8PvUQyvX18KE9iHtbGetLgbqB6Nay4kwi69LeZp cgIjDvzayB0dTDrG/pgkBQ0NksV3ylOzF6iCYqAIhfkn/I0hiidIFE0XKCKSUGhlTj+N jW8KEU1BfVeJoqWcwSItmk8zfFxIDEYwawge8IJmxnaxRJ3zipjELNLrifSEgH4ybwDX Y/O08znzQPtBkwp1L51POBpYsqTdrkO+ZlF5Ij7kiUGb3asm0XpXW2GZuX2spNvfcBmt mMyA== X-Gm-Message-State: AOAM533gx+341zEaR2d3boiJRc0rYxMXUiUXjJnT2lwQ4V8yl6NO8goz TGTFmVfxHbSDDLsAsurvzv4+Upwk2ay2RTD2 X-Google-Smtp-Source: ABdhPJwjFxU6JGXMhO3+GffB4sUXEJC8d4NvN/UC10Z23aJ2oLeHgMlEtB+DBEhs3C8r+0uUs0J+qw== X-Received: by 2002:a0c:e9c2:: with SMTP id q2mr23139282qvo.48.1614193812469; Wed, 24 Feb 2021 11:10:12 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id m30sm849605qtd.30.2021.02.24.11.10.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:12 -0800 (PST) Date: Wed, 24 Feb 2021 14:10:10 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 12/15] Documentation/technical: describe multi-pack reverse indexes Message-ID: <404d730498938da034d860d894ddbb7d6dffc27d.1614193703.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org As a prerequisite to implementing multi-pack bitmaps, motivate and describe the format and ordering of the multi-pack reverse index. The subsequent patch will implement reading this format, and the patch after that will implement writing it while producing a multi-pack index. 
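As a rough illustration of the ordering being documented (preferred pack first, then pack ID, then offset within the pack), here is a minimal standalone sketch. It is not code from this series: 'struct pseudo_obj', pseudo_pack_cmp() and sort_pseudo_pack() are hypothetical names invented for the example, and it assumes packs compare by ascending numeric ID, as midx_pack_order_cmp() in the reading patch does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one object selected by the MIDX; not a Git struct. */
struct pseudo_obj {
	uint32_t pack_id; /* numeric pack ID as stored by the MIDX */
	uint64_t offset;  /* byte offset of the object within that pack */
};

/*
 * Returns <0 if 'a' sorts before 'b' in pseudo-pack order, >0 if it sorts
 * after, and 0 if both describe the same (pack, offset) pair.
 *
 * Assumption: pack IDs compare in ascending order.
 */
static int pseudo_pack_cmp(const struct pseudo_obj *a,
			   const struct pseudo_obj *b,
			   uint32_t preferred_pack)
{
	int a_pref = a->pack_id == preferred_pack;
	int b_pref = b->pack_id == preferred_pack;

	/* Objects from the preferred pack sort first. */
	if (a_pref != b_pref)
		return a_pref ? -1 : 1;
	/* Then break ties by pack ID. */
	if (a->pack_id != b->pack_id)
		return a->pack_id < b->pack_id ? -1 : 1;
	/* Finally, break ties by offset within the pack. */
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Insertion sort, so the sketch needs no global comparator state. */
static void sort_pseudo_pack(struct pseudo_obj *objs, size_t nr,
			     uint32_t preferred_pack)
{
	size_t i, j;

	for (i = 1; i < nr; i++) {
		struct pseudo_obj tmp = objs[i];

		for (j = i; j > 0 &&
		     pseudo_pack_cmp(&tmp, &objs[j - 1], preferred_pack) < 0; j--)
			objs[j] = objs[j - 1];
		objs[j] = tmp;
	}
}
```

The sketch takes de-duplication as already done, since the MIDX stores only one (pack, offset) pair per object.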
Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- Documentation/technical/pack-format.txt | 80 +++++++++++++++++++++++++ 1 file changed, 80 insertions(+) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 1faa949bf6..77eb591057 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -379,3 +379,83 @@ CHUNK DATA: TRAILER: Index checksum of the above contents. + +== multi-pack-index reverse indexes + +Similar to the pack-based reverse index, the multi-pack index can also +be used to generate a reverse index. + +Instead of mapping between offset, pack-, and index position, this +reverse index maps between an object's position within the MIDX, and +that object's position within a pseudo-pack that the MIDX describes. + +To clarify these three orderings, consider a multi-pack reachability +bitmap (which does not yet exist, but is what we are building towards +here). Each bit needs to correspond to an object in the MIDX, and so we +need an efficient mapping from bit position to MIDX position. + +One solution is to let bits occupy the same position in the oid-sorted +index stored by the MIDX. But because oids are effectively random, the +resulting reachability bitmaps would have no locality, and thus compress +poorly. (This is the reason that single-pack bitmaps use the pack +ordering, and not the .idx ordering, for the same purpose.) + +So we'd like to define an ordering for the whole MIDX based around +pack ordering, which has far better locality (and thus compresses more +efficiently). We can think of a pseudo-pack created by the concatenation +of all of the packs in the MIDX.
E.g., if we had a MIDX with three packs +(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an +ordering of the objects like: + + |a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19| + +where the ordering of the packs is defined by the MIDX's pack list, +and then the ordering of objects within each pack is the same as the +order in the actual packfile. + +Given the list of packs and their counts of objects, you can +naïvely reconstruct that pseudo-pack ordering (e.g., the object at +position 27 must be (c,1) because packs "a" and "b" consumed 25 of the +slots). But there's a catch. Objects may be duplicated between packs, in +which case the MIDX only stores one pointer to the object (and thus we'd +want only one slot in the bitmap). + +Callers could handle duplicates themselves by reading objects in order +of their bit-position, but that's linear in the number of objects, and +much too expensive for ordinary bitmap lookups. Building a reverse index +solves this, since it is the logical inverse of the index, and that +index has already removed duplicates. But, building a reverse index on +the fly can be expensive. Since we already have an on-disk format for +pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack, +too. + +Objects from the MIDX are ordered as follows to string together the +pseudo-pack. Let _pack(o)_ return the pack from which _o_ was selected +by the MIDX, and define an ordering of packs based on their numeric ID +(as stored by the MIDX). Let _offset(o)_ return the object offset of _o_ +within _pack(o)_. Then, compare _o~1~_ and _o~2~_ as follows: + + - If one of _pack(o~1~)_ and _pack(o~2~)_ is preferred and the other + is not, then the preferred one sorts first. ++ +(This is a detail that allows the MIDX bitmap to determine which +pack should be used by the pack-reuse mechanism, since it can ask +the MIDX for the pack containing the object at bit position 0). 
+ + - If _pack(o~1~) ≠ pack(o~2~)_, then sort the two objects in + descending order based on the pack ID. + + - Otherwise, _pack(o~1~) = pack(o~2~)_, and the objects are + sorted in pack-order (i.e., _o~1~_ sorts ahead of _o~2~_ exactly + when _offset(o~1~) < offset(o~2~)_). + +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of +objects in packs stored by the MIDX, laid out in pack order, and the +packs arranged in MIDX order (with the preferred pack coming first). + +Finally, note that the MIDX's reverse index is not stored as a chunk in +the multi-pack-index itself. This is done because the reverse index +includes the checksum of the pack or MIDX to which it belongs, which +makes it impossible to write in the MIDX. To avoid races when rewriting +the MIDX, a MIDX reverse index includes the MIDX's checksum in its +filename (e.g., `multi-pack-index-xyz.rev`). From patchwork Wed Feb 24 19:10:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2240C433DB for ; Wed, 24 Feb 2021 19:14:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 72E6464E24 for ; Wed, 24 Feb 2021 19:14:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235726AbhBXTOF (ORCPT ); Wed, 24 Feb 2021 14:14:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59872 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236086AbhBXTMY (ORCPT ); Wed, 24 Feb 2021 14:12:24 -0500 Received: from mail-qv1-xf2b.google.com (mail-qv1-xf2b.google.com [IPv6:2607:f8b0:4864:20::f2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4C98C0617AB for ; Wed, 24 Feb 2021 11:10:17 -0800 (PST) Received: by mail-qv1-xf2b.google.com with SMTP id k8so1549658qvm.6 for ; Wed, 24 Feb 2021 11:10:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=XvyqpOC23zxhU46+QFprrDTu/DghKhe7IhTeXG0hwwI=; b=b0IiC0lbtIafYRorlegfaTr7Vp7jTybydzap4QnQLa2QXjAzndU2ZY13mL525W7YNY TFRXMlwPAAljdax+Z9N781ae2YTw/ddRpUox+PUK6gZckOaTU+rlFcywi1RNZ6ZKImcE 0w5e6pKhI1llB4m60uQvjcFZWBgDOnDYShehhF2IAFn2qeGhbZTLkqr0WCl884V2M/y1 acgMaNdQB+3pZDQ8j2DWLNX8PlpjarGYj+fIJcpFKD7DpU8izBEi9cWHPf7Bh/cEFPbQ iTvv3YgUOgkGEcVfmuiDuAFUkuYVCACEhOnSsXdYMoRvXG2GT5mn1F9uWzfCPP2GpNfo hNHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=XvyqpOC23zxhU46+QFprrDTu/DghKhe7IhTeXG0hwwI=; b=SUHU9bb7n9aVgaNlQ88GqCwZHW6kcw+WDJc+OV+nvLe8XdDqdZwH2rhf6JRoCb5pt6 7yD2btKOJhyaaB+X0IYxm11XZM0PfMD31C5xlA8z6yl+v4RXztIs6ya4j3QqS+4yb1R/ STzwPO1da9DwAZZORi+18/Vzg5nR5BwKoqElEGAY7BaVrmvmiwVcZkqGAr9dYeZMdkiv RKGyJZWbXCDB3bi4gduhVF/xV8i1T3UdYdkrh1NXZ01YD1PDGVtjMMdWofXdKsiQvwFY G1f5DGCKJsEch0tZyTk21xRS2odXC+h7RYCTsEGkSRE6gCl1Ohi4MZ7gT95PuSLugGeb R96Q== X-Gm-Message-State: AOAM53313ED57OTVfqc5akVMvxB5daHZwOapClfn0D/oDls6oV/Ov3cp quZFNverdFw7pEmFbdJE/eUhrlS8I8RIQUca X-Google-Smtp-Source: ABdhPJw/u/xda9RJwFFs50yb5hus8U/UiDhxIREG8GWL6SqKVVrUA5twMiA876+/VV5IxmvT4RXl/g== X-Received: by 2002:ad4:5c87:: with SMTP id o7mr18313370qvh.31.1614193816430; Wed, 24 Feb 2021 11:10:16 -0800 (PST) 
Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id k26sm2108712qkj.131.2021.02.24.11.10.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:16 -0800 (PST) Date: Wed, 24 Feb 2021 14:10:14 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 13/15] pack-revindex: read multi-pack reverse indexes Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Implement reading for multi-pack reverse indexes, as described in the previous patch. Note that these functions don't yet have any callers, and won't until multi-pack reachability bitmaps are introduced in a later patch series. In the meantime, this patch implements some of the infrastructure necessary to support multi-pack bitmaps. There are three new functions exposed by the revindex API: - load_midx_revindex(): loads the reverse index corresponding to the given multi-pack index. - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the multi-pack index and pseudo-pack order. load_midx_revindex() and pack_pos_to_midx() are both relatively straightforward. load_midx_revindex() needs a few functions to be exposed from the midx API. One to get the checksum of a midx, and another to get the .rev's filename. Similar to recent changes in the packed_git struct, three new fields are added to the multi_pack_index struct: one to keep track of the size, one to keep track of the mmap'd pointer, and another to point past the header and at the reverse index's data. pack_pos_to_midx() simply reads the corresponding entry out of the table. midx_to_pack_pos() is the trickiest, since it needs to find an object's position in the pseudo-pack order, but that order can only be recovered in the .rev file itself.
This mapping can be implemented with a binary search, but note that the thing we're binary searching over isn't an array, but rather a _permutation_. So, when comparing two items, it's helpful to keep in mind the difference. Instead of a traditional binary search, where you are comparing two things directly, here we're comparing a (pack, offset) tuple with an index into the multi-pack index. That index describes another (pack, offset) tuple, and it is _those_ two tuples that are compared. Signed-off-by: Taylor Blau --- midx.c | 11 +++++ midx.h | 6 +++ pack-revindex.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ pack-revindex.h | 53 ++++++++++++++++++++ packfile.c | 3 ++ 5 files changed, 200 insertions(+) diff --git a/midx.c b/midx.c index 27a8b76dfe..8d7a8927b8 100644 --- a/midx.c +++ b/midx.c @@ -47,11 +47,22 @@ static uint8_t oid_version(void) } } +static const unsigned char *get_midx_checksum(struct multi_pack_index *m) +{ + return m->data + m->data_len - the_hash_algo->rawsz; +} + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } +char *get_midx_rev_filename(struct multi_pack_index *m) +{ + return xstrfmt("%s/pack/multi-pack-index-%s.rev", + m->object_dir, hash_to_hex(get_midx_checksum(m))); +} + static int midx_read_oid_fanout(const unsigned char *chunk_start, size_t chunk_size, void *data) { diff --git a/midx.h b/midx.h index 93bd68189e..0a8294d2ee 100644 --- a/midx.h +++ b/midx.h @@ -15,6 +15,10 @@ struct multi_pack_index { const unsigned char *data; size_t data_len; + const uint32_t *revindex_data; + const uint32_t *revindex_map; + size_t revindex_len; + uint32_t signature; unsigned char version; unsigned char hash_len; @@ -37,6 +41,8 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) +char *get_midx_rev_filename(struct multi_pack_index *m); + struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct 
multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); diff --git a/pack-revindex.c b/pack-revindex.c index 83fe4de773..2e15ba3a8f 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -3,6 +3,7 @@ #include "object-store.h" #include "packfile.h" #include "config.h" +#include "midx.h" struct revindex_entry { off_t offset; @@ -292,6 +293,44 @@ int load_pack_revindex(struct packed_git *p) return -1; } +int load_midx_revindex(struct multi_pack_index *m) +{ + char *revindex_name; + int ret; + if (m->revindex_data) + return 0; + + revindex_name = get_midx_rev_filename(m); + + ret = load_revindex_from_disk(revindex_name, + m->num_objects, + &m->revindex_map, + &m->revindex_len); + if (ret) + goto cleanup; + + m->revindex_data = (const uint32_t *)((const char *)m->revindex_map + RIDX_HEADER_SIZE); + +cleanup: + free(revindex_name); + return ret; +} + +int close_midx_revindex(struct multi_pack_index *m) +{ + if (!m) + return 0; + + if (munmap((void*)m->revindex_map, m->revindex_len)) + return -1; + + m->revindex_map = NULL; + m->revindex_data = NULL; + m->revindex_len = 0; + + return 0; +} + int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { unsigned lo, hi; @@ -346,3 +385,91 @@ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) else return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } + +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos) +{ + if (!m->revindex_data) + BUG("pack_pos_to_midx: reverse index not yet loaded"); + if (m->num_objects <= pos) + BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos); + return get_be32((const char *)m->revindex_data + (pos * sizeof(uint32_t))); +} + +struct midx_pack_key { + uint32_t pack; + off_t offset; + + uint32_t preferred_pack; + struct multi_pack_index *midx; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) +{ + const struct midx_pack_key *key = va; + struct 
multi_pack_index *midx = key->midx; + + uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data); + uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus); + off_t versus_offset; + + uint32_t key_preferred = key->pack == key->preferred_pack; + uint32_t versus_preferred = versus_pack == key->preferred_pack; + + /* + * First, compare the preferred-ness, noting that the preferred pack + * comes first. + */ + if (key_preferred && !versus_preferred) + return -1; + else if (!key_preferred && versus_preferred) + return 1; + + /* Then, break ties first by comparing the pack IDs. */ + if (key->pack < versus_pack) + return -1; + else if (key->pack > versus_pack) + return 1; + + /* Finally, break ties by comparing offsets within a pack. */ + versus_offset = nth_midxed_offset(midx, versus); + if (key->offset < versus_offset) + return -1; + else if (key->offset > versus_offset) + return 1; + + return 0; +} + +int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos) +{ + struct midx_pack_key key; + uint32_t *found; + + if (!m->revindex_data) + BUG("midx_to_pack_pos: reverse index not yet loaded"); + if (m->num_objects <= at) + BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at); + + key.pack = nth_midxed_pack_int_id(m, at); + key.offset = nth_midxed_offset(m, at); + key.midx = m; + /* + * The preferred pack sorts first, so determine its identifier by + * looking at the first object in pseudo-pack order. + * + * Note that if no --preferred-pack is explicitly given when writing a + * multi-pack index, then whichever pack has the lowest identifier + * implicitly is preferred (and includes all its objects, since ties are + * broken first by pack identifier). 
+ */ + key.preferred_pack = nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0)); + + found = bsearch(&key, m->revindex_data, m->num_objects, + sizeof(uint32_t), midx_pack_order_cmp); + + if (!found) + return error("bad offset for revindex"); + + *pos = found - m->revindex_data; + return 0; +} diff --git a/pack-revindex.h b/pack-revindex.h index ba7c82c125..479b8f2f9c 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -14,6 +14,20 @@ * * - offset: the byte offset within the .pack file at which the object contents * can be found + * + * The revindex can also be used with a multi-pack index (MIDX). In this + * setting: + * + * - index position refers to an object's numeric position within the MIDX + * + * - pack position refers to an object's position within a non-existent pack + * described by the MIDX. The pack structure is described in + * Documentation/technical/pack-format.txt. + * + * It is effectively a concatenation of all packs in the MIDX (ordered by + * their numeric ID within the MIDX) in their original order within each + * pack, removing duplicates, and placing the preferred pack (if any) + * first. */ @@ -24,6 +38,7 @@ #define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" struct packed_git; +struct multi_pack_index; /* * load_pack_revindex populates the revindex's internal data-structures for the @@ -34,6 +49,22 @@ struct packed_git; */ int load_pack_revindex(struct packed_git *p); +/* + * load_midx_revindex loads the '.rev' file corresponding to the given + * multi-pack index by mmap-ing it and assigning pointers in the + * multi_pack_index to point at it. + * + * A negative number is returned on error. + */ +int load_midx_revindex(struct multi_pack_index *m); + +/* + * Frees resources associated with a multi-pack reverse index. + * + * A negative number is returned on error. + */ +int close_midx_revindex(struct multi_pack_index *m); + /* * offset_to_pack_pos converts an object offset to a pack position.
This * function returns zero on success, and a negative number otherwise. The @@ -71,4 +102,26 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos); */ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos); +/* + * pack_pos_to_midx converts the object at position "pos" within the MIDX + * pseudo-pack into a MIDX position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in time O(log N) with the number of objects in the MIDX. + */ +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos); + +/* + * midx_to_pack_pos converts from the MIDX-relative position at "at" to the + * corresponding pack position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in constant time. + */ +int midx_to_pack_pos(struct multi_pack_index *midx, uint32_t at, uint32_t *pos); + #endif diff --git a/packfile.c b/packfile.c index 1fec12ac5f..82623e0cb4 100644 --- a/packfile.c +++ b/packfile.c @@ -862,6 +862,9 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; + if (starts_with(file_name, "multi-pack-index") && + ends_with(file_name, ".rev")) + return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || From patchwork Wed Feb 24 19:10:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12102329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no 
version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D613C433E0 for ; Wed, 24 Feb 2021 19:14:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 33F3664E90 for ; Wed, 24 Feb 2021 19:14:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236092AbhBXTOQ (ORCPT ); Wed, 24 Feb 2021 14:14:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236088AbhBXTMY (ORCPT ); Wed, 24 Feb 2021 14:12:24 -0500 Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com [IPv6:2607:f8b0:4864:20::833]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 44687C061356 for ; Wed, 24 Feb 2021 11:10:21 -0800 (PST) Received: by mail-qt1-x833.google.com with SMTP id s15so2299200qtq.0 for ; Wed, 24 Feb 2021 11:10:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=OdrrVVdY9eTH+bR/GFmli0pijBOQNZeXKRXpgiM9zIs=; b=0o5AXxiBbrLIPRwnAxKRngtenMAW+e3kG+pj+Rmhv4ZMYndPwf3dNtHO7Ce5AC6hg4 rtU5MVu4676G/+LN9oaYuh/CcP0kr5Cc2Eq3uJk1Gbg5x8VUp0rEQVBa70ZbDIhxNZnH xiZcGgaLJdxQ6lHyh/K/ziF4jfdrBuRuMaNec3fPWjg4g8hcsWNBzTbaQYUnZl5iFweB H9ua96w2tNRs9Zz/oULZDEHQ2pQ8Hh5SL2YYlECQXXPZ48Fmnlt/q0+S7jb3fY6UR/aS qIQZJKChNw1gSuot0qbqVjsJZMqv5xP++ipICBLnnCwbuzbcQkWopPdcS/hErR3HSfLg G47g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=OdrrVVdY9eTH+bR/GFmli0pijBOQNZeXKRXpgiM9zIs=; b=Rukox5IfpLR7PPTAVCEHxSnKrC39cf8jmF7PvkDSzH9B+nrlKPT5lxLokKBBGYmwJP 
leU2Y6p6bXi6nRvPxkHONdtnH+5nrNTZaEOxroXMMybi8A11OqaJOp/G84sqAbFYQYU/ mntynBttZCxfZ5rfNeHdfDtd/8Bu0vCCI2T6MdT6apG1icTRqTOhj1fX78e+ocsHMtAv GH1/CIJM8VNTI7aZ4xrq5RmkRAmanMxeCrgg5YokaTW3XfXMTFW/UrH5e6zO/LYFj9Px 8glQpvdjlsftRxvy42CuiXQ5tvvd8gXpJ4H5n0DMFbV+WBoMeJquvYVtInk0fZaLtAIk NKOg== X-Gm-Message-State: AOAM533JiEquTWBoTb9sAQGkUnSLP9zTF2WY10RnJqw8b3soAl8v1mPF ortHUegOs3t16ZS2a0WRJPCn23novWfr9rbU X-Google-Smtp-Source: ABdhPJymttxKqewFsADMdDLVh7/S/Y8qII4bX2gs+mhJi7/i7D4rTarL2xtUippsI5PZ2ybRIX/Qiw== X-Received: by 2002:a05:622a:3ca:: with SMTP id k10mr31439845qtx.270.1614193820245; Wed, 24 Feb 2021 11:10:20 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:268b:c46e:d22e:db6b]) by smtp.gmail.com with ESMTPSA id l60sm1893383qte.13.2021.02.24.11.10.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 11:10:19 -0800 (PST) Date: Wed, 24 Feb 2021 14:10:18 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com Subject: [PATCH v2 14/15] pack-write.c: extract 'write_rev_file_order' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Existing callers provide the reverse index code with an array of 'struct pack_idx_entry *'s, which is then sorted by pack order (comparing the offsets of each object within the pack). Prepare for the multi-pack index to write a .rev file by providing a way to write the reverse index without an array of pack_idx_entry (which the MIDX code does not have). Instead, callers can invoke 'write_rev_index_positions()', which takes an array of uint32_t's. The ith entry in this array specifies the ith object's (in index order) position within the pack (in pack order). Expose this new function for use in a later patch, and rewrite the existing write_rev_file() in terms of this new function. 
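As a rough illustration of the array described above (this is not code
from the series): the sketch below builds such an array for a toy set of
objects by sorting the identity permutation by offset. The names
'struct toy_entry', 'toy_pack_order()' and the file-scope 'toy_objects'
pointer are hypothetical, and plain qsort() stands in for git's
QSORT_S(), which is why the context is passed through a static variable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-object data; only the offset matters here. */
struct toy_entry {
	uint64_t offset;
};

/* qsort() takes no context argument, so the sketch uses a file-scope pointer. */
static const struct toy_entry *toy_objects;

/* Orders two index positions by the pack offset of the objects they name. */
static int toy_pack_order_cmp(const void *va, const void *vb)
{
	uint64_t a = toy_objects[*(const uint32_t *)va].offset;
	uint64_t b = toy_objects[*(const uint32_t *)vb].offset;

	return (a > b) - (a < b);
}

/*
 * Returns a newly allocated array in which entry i holds the index position
 * of the object that comes i-th in pack (offset) order. The caller frees it.
 */
static uint32_t *toy_pack_order(const struct toy_entry *objects, uint32_t nr)
{
	uint32_t *order = malloc(nr * sizeof(*order));
	uint32_t i;

	assert(order || !nr);
	for (i = 0; i < nr; i++)
		order[i] = i;
	toy_objects = objects;
	qsort(order, nr, sizeof(*order), toy_pack_order_cmp);
	return order;
}
```

With offsets {300, 12, 150} in index order, the resulting array would be
{1, 2, 0}: the object at index position 1 comes first in the pack.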
Signed-off-by: Taylor Blau
---
 pack-write.c | 39 ++++++++++++++++++++++++++++-----------
 pack.h       |  1 +
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/pack-write.c b/pack-write.c
index 680c36755d..75fcf70db1 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -201,21 +201,12 @@ static void write_rev_header(struct hashfile *f)
 }

 static void write_rev_index_positions(struct hashfile *f,
-				      struct pack_idx_entry **objects,
+				      uint32_t *pack_order,
 				      uint32_t nr_objects)
 {
-	uint32_t *pack_order;
 	uint32_t i;
-
-	ALLOC_ARRAY(pack_order, nr_objects);
-	for (i = 0; i < nr_objects; i++)
-		pack_order[i] = i;
-	QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
-
 	for (i = 0; i < nr_objects; i++)
 		hashwrite_be32(f, pack_order[i]);
-
-	free(pack_order);
 }

 static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
@@ -228,6 +219,32 @@ const char *write_rev_file(const char *rev_name,
 			   uint32_t nr_objects,
 			   const unsigned char *hash,
 			   unsigned flags)
+{
+	uint32_t *pack_order;
+	uint32_t i;
+	const char *ret;
+
+	if (!(flags & (WRITE_REV | WRITE_REV_VERIFY)))
+		return NULL;
+
+	ALLOC_ARRAY(pack_order, nr_objects);
+	for (i = 0; i < nr_objects; i++)
+		pack_order[i] = i;
+	QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
+
+	ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash,
+				   flags);
+
+	free(pack_order);
+
+	return ret;
+}
+
+const char *write_rev_file_order(const char *rev_name,
+				 uint32_t *pack_order,
+				 uint32_t nr_objects,
+				 const unsigned char *hash,
+				 unsigned flags)
 {
 	struct hashfile *f;
 	int fd;
@@ -262,7 +279,7 @@ const char *write_rev_file(const char *rev_name,

 	write_rev_header(f);

-	write_rev_index_positions(f, objects, nr_objects);
+	write_rev_index_positions(f, pack_order, nr_objects);
 	write_rev_trailer(f, hash);

 	if (rev_name && adjust_shared_perm(rev_name) < 0)
diff --git a/pack.h b/pack.h
index afdcf8f5c7..09c2a7dd3a 100644
--- a/pack.h
+++ b/pack.h
@@ -94,6 +94,7 @@ struct ref;

 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);

 const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
+const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);

 /*
  * The "hdr" output buffer should be at least this big, which will handle sizes

From patchwork Wed Feb 24 19:10:21 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12102317
Date: Wed, 24 Feb 2021 14:10:21 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, avarab@gmail.com, gitster@pobox.com
Subject: [PATCH v2 15/15] pack-revindex: write multi-pack reverse indexes
Message-ID: <01bd6a35c6c441a30a22a4c2d17e9cf53de6b148.1614193703.git.me@ttaylorr.com>

Implement the writing half of multi-pack reverse indexes. This is
nothing more than the format described a few patches ago, with a new set
of helper functions that will be used to clear out stale .rev files
corresponding to old MIDXs.

Unfortunately, a comparison function very similar to the one implemented
recently in pack-revindex.c is reimplemented here, this time accepting a
MIDX-internal type. An effort to DRY these up would create more
indirection and overhead than is necessary, so it isn't pursued here.

Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX
flag, meaning that this is all dead code. But, that won't be the case
for long, since subsequent patches will introduce the multi-pack bitmap,
which will begin passing this flag.

Signed-off-by: Taylor Blau
---
 midx.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 midx.h |   1 +
 2 files changed, 112 insertions(+)

diff --git a/midx.c b/midx.c
index 8d7a8927b8..820276cc45 100644
--- a/midx.c
+++ b/midx.c
@@ -12,6 +12,7 @@
 #include "run-command.h"
 #include "repository.h"
 #include "chunk-format.h"
+#include "pack.h"

 #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
 #define MIDX_VERSION 1
@@ -472,6 +473,7 @@ struct write_midx_context {
 	uint32_t entries_nr;

 	uint32_t *pack_perm;
+	uint32_t *pack_order;
 	unsigned large_offsets_needed:1;
 	uint32_t num_large_offsets;

@@ -826,6 +828,66 @@ static int write_midx_large_offsets(struct hashfile *f,
 	return 0;
 }

+static int midx_pack_order_cmp(const void *va, const void *vb, void *_ctx)
+{
+	struct write_midx_context *ctx = _ctx;
+
+	struct pack_midx_entry *a = &ctx->entries[*(const uint32_t *)va];
+	struct pack_midx_entry *b = &ctx->entries[*(const uint32_t *)vb];
+
+	uint32_t perm_a = ctx->pack_perm[a->pack_int_id];
+	uint32_t perm_b = ctx->pack_perm[b->pack_int_id];
+
+	/* Sort objects in the preferred pack ahead of any others. */
+	if (a->preferred > b->preferred)
+		return -1;
+	if (a->preferred < b->preferred)
+		return 1;
+
+	/* Then, order objects by which packs they appear in. */
+	if (perm_a < perm_b)
+		return -1;
+	if (perm_a > perm_b)
+		return 1;
+
+	/* Then, disambiguate by their offset within each pack. */
+	if (a->offset < b->offset)
+		return -1;
+	if (a->offset > b->offset)
+		return 1;
+
+	return 0;
+}
+
+static uint32_t *midx_pack_order(struct write_midx_context *ctx)
+{
+	uint32_t *pack_order;
+	uint32_t i;
+
+	ALLOC_ARRAY(pack_order, ctx->entries_nr);
+	for (i = 0; i < ctx->entries_nr; i++)
+		pack_order[i] = i;
+	QSORT_S(pack_order, ctx->entries_nr, midx_pack_order_cmp, ctx);
+
+	return pack_order;
+}
+
+static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
+				     struct write_midx_context *ctx)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash));
+
+	write_rev_file_order(buf.buf, ctx->pack_order, ctx->entries_nr,
+			     midx_hash, WRITE_REV);
+
+	strbuf_release(&buf);
+}
+
+static void clear_midx_files_ext(struct repository *r, const char *ext,
+				 unsigned char *keep_hash);
+
 static int write_midx_internal(const char *object_dir, struct multi_pack_index *m,
 			       struct string_list *packs_to_drop,
 			       const char *preferred_pack_name,
@@ -1018,6 +1080,14 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *

 	finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM);
 	free_chunkfile(cf);
+
+	if (flags & MIDX_WRITE_REV_INDEX)
+		ctx.pack_order = midx_pack_order(&ctx);
+
+	if (flags & MIDX_WRITE_REV_INDEX)
+		write_midx_reverse_index(midx_name, midx_hash, &ctx);
+	clear_midx_files_ext(the_repository, ".rev", midx_hash);
+
 	commit_lock_file(&lk);

 cleanup:
@@ -1032,6 +1102,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	free(ctx.info);
 	free(ctx.entries);
 	free(ctx.pack_perm);
+	free(ctx.pack_order);
 	free(midx_name);
 	return result;
 }
@@ -1044,6 +1115,44 @@ int write_midx_file(const char *object_dir,
 				   flags);
 }

+struct clear_midx_data {
+	char *keep;
+	const char *ext;
+};
+
+static void clear_midx_file_ext(const char *full_path, size_t full_path_len,
+				const char *file_name, void *_data)
+{
+	struct clear_midx_data *data = _data;
+
+	if (!(starts_with(file_name, "multi-pack-index-") &&
+	      ends_with(file_name, data->ext)))
+		return;
+	if (data->keep && !strcmp(data->keep, file_name))
+		return;
+
+	if (unlink(full_path))
+		die_errno(_("failed to remove %s"), full_path);
+}
+
+static void clear_midx_files_ext(struct repository *r, const char *ext,
+				 unsigned char *keep_hash)
+{
+	struct clear_midx_data data;
+	memset(&data, 0, sizeof(struct clear_midx_data));
+
+	if (keep_hash)
+		data.keep = xstrfmt("multi-pack-index-%s%s",
+				    hash_to_hex(keep_hash), ext);
+	data.ext = ext;
+
+	for_each_file_in_pack_dir(r->objects->odb->path,
+				  clear_midx_file_ext,
+				  &data);
+
+	free(data.keep);
+}
+
 void clear_midx_file(struct repository *r)
 {
 	char *midx = get_midx_filename(r->objects->odb->path);
@@ -1056,6 +1165,8 @@ void clear_midx_file(struct repository *r)
 	if (remove_path(midx))
 		die(_("failed to clear multi-pack-index at %s"), midx);

+	clear_midx_files_ext(r, ".rev", NULL);
+
 	free(midx);
 }

diff --git a/midx.h b/midx.h
index 0a8294d2ee..8684cf0fef 100644
--- a/midx.h
+++ b/midx.h
@@ -40,6 +40,7 @@ struct multi_pack_index {
 };

 #define MIDX_PROGRESS (1 << 0)
+#define MIDX_WRITE_REV_INDEX (1 << 1)

 char *get_midx_rev_filename(struct multi_pack_index *m);
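
For readers following along, here is a minimal standalone sketch of the
pseudo-pack ordering used in this series (preferred pack first, then
pack identifier, then offset) and of the two lookups built on it. It is
an illustration under simplifying assumptions, not the code from these
patches: every 'toy_*' name is hypothetical, plain qsort() replaces
QSORT_S(), and the preferred pack is stored in the struct rather than
derived from the first pseudo-pack entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified view of one object in a MIDX. */
struct toy_obj {
	uint32_t pack;   /* pack identifier */
	uint64_t offset; /* offset within that pack */
};

struct toy_midx {
	const struct toy_obj *objs; /* objects in MIDX (index) order */
	uint32_t *rev;              /* rev[pack position] = MIDX position */
	uint32_t nr;
	uint32_t preferred_pack;    /* assumption: stored, not derived */
};

/* Preferred pack first, then pack identifier, then offset. */
static int toy_cmp_obj(const struct toy_midx *m,
		       const struct toy_obj *a, const struct toy_obj *b)
{
	int a_pref = a->pack == m->preferred_pack;
	int b_pref = b->pack == m->preferred_pack;

	if (a_pref != b_pref)
		return a_pref ? -1 : 1;
	if (a->pack != b->pack)
		return a->pack < b->pack ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* qsort() has no context argument, so pass the MIDX via a static pointer. */
static const struct toy_midx *toy_sort_ctx;

static int toy_sort_cmp(const void *va, const void *vb)
{
	const struct toy_midx *m = toy_sort_ctx;

	return toy_cmp_obj(m, &m->objs[*(const uint32_t *)va],
			   &m->objs[*(const uint32_t *)vb]);
}

/* Builds the reverse index by sorting the identity permutation. */
static void toy_build_rev(struct toy_midx *m)
{
	uint32_t i;

	m->rev = malloc(m->nr * sizeof(*m->rev));
	assert(m->rev);
	for (i = 0; i < m->nr; i++)
		m->rev[i] = i;
	toy_sort_ctx = m;
	qsort(m->rev, m->nr, sizeof(*m->rev), toy_sort_cmp);
}

/* Pack position to MIDX position: a single array read. */
static uint32_t toy_pack_pos_to_midx(const struct toy_midx *m, uint32_t pos)
{
	assert(pos < m->nr);
	return m->rev[pos];
}

struct toy_key {
	const struct toy_midx *m;
	const struct toy_obj *obj;
};

static int toy_search_cmp(const void *vkey, const void *velem)
{
	const struct toy_key *key = vkey;

	return toy_cmp_obj(key->m, key->obj,
			   &key->m->objs[*(const uint32_t *)velem]);
}

/* MIDX position to pack position: binary search over the reverse index. */
static int toy_midx_to_pack_pos(const struct toy_midx *m, uint32_t at,
				uint32_t *pos)
{
	struct toy_key key = { m, &m->objs[at] };
	const uint32_t *found = bsearch(&key, m->rev, m->nr, sizeof(*m->rev),
					toy_search_cmp);

	if (!found)
		return -1;
	*pos = (uint32_t)(found - m->rev);
	return 0;
}
```

The asymmetry between the two lookups is the point of the sketch: going
from a pack position to a MIDX position is one array read, while the
opposite direction has to search, since only one direction of the
mapping is stored.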