From patchwork Tue Mar 30 15:03:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172865
Date: Tue, 30 Mar 2021 11:03:47 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 01/16] builtin/multi-pack-index.c: inline 'flags' with options
Message-ID: <90e021725f3aa26355bb60c48da754af2e00f049.1617116623.git.me@ttaylorr.com>

Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.)
will want to optionally change a set of shared flags
that are eventually passed to the MIDX libraries.

Right now, options and flags are handled separately. That's fine, since
the options structure is never passed around. But a future patch will
make it so that common options shared by all sub-commands are defined in
a common location. That means that "flags" would have to become a global
variable.

Group it with the options structure so that we reduce the number of
global variables we have overall.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 5bf88cd2a8..4a0ddb06c4 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -14,13 +14,12 @@ static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
 	int progress;
+	unsigned flags;
 } opts;
 
 int cmd_multi_pack_index(int argc, const char **argv,
 			 const char *prefix)
 {
-	unsigned flags = 0;
-
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
@@ -40,7 +39,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
 	if (opts.progress)
-		flags |= MIDX_PROGRESS;
+		opts.flags |= MIDX_PROGRESS;
 
 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,
@@ -55,16 +54,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 
 	if (!strcmp(argv[0], "repack"))
 		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, flags);
+			(size_t)opts.batch_size, opts.flags);
 	if (opts.batch_size)
 		die(_("--batch-size option is only for 'repack' subcommand"));
 
 	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, flags);
+		return write_midx_file(opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, flags);
+		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, flags);
+		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
 
 	die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Tue Mar 30 15:03:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172867
Date: Tue, 30 Mar 2021 11:03:51 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 02/16] builtin/multi-pack-index.c: don't handle 'progress' separately
Message-ID: <130c191b805917bdd146c4122b8b8ceaae700580.1617116623.git.me@ttaylorr.com>

Now that there is a shared 'flags' member in the options structure,
there is no need to keep track of whether to force progress or not,
since ultimately the decision of whether or not to show a progress meter
is controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 4a0ddb06c4..c70f020d8f 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -13,7 +13,6 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
-	int progress;
 	unsigned flags;
 } opts;
 
@@ -23,7 +22,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
+		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
@@ -31,15 +30,14 @@ int cmd_multi_pack_index(int argc, const char **argv,
 
 	git_config(git_default_config, NULL);
 
-	opts.progress = isatty(2);
+	if (isatty(2))
+		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
 			     builtin_multi_pack_index_usage, 0);
 
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
-	if (opts.progress)
-		opts.flags |= MIDX_PROGRESS;
 
 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,

From patchwork Tue Mar 30 15:03:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172871
Date: Tue, 30 Mar 2021 11:03:54 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 03/16] builtin/multi-pack-index.c: define common usage with a macro
Message-ID: <5a274b909674401fa682b08ce08ad5b6acb47784.1617116623.git.me@ttaylorr.com>

Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.

A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index c70f020d8f..eea498e026 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -5,8 +5,23 @@
 #include "midx.h"
 #include "trace2.h"
 
+#define BUILTIN_MIDX_WRITE_USAGE \
+	N_("git multi-pack-index [] write")
+
+#define BUILTIN_MIDX_VERIFY_USAGE \
+	N_("git multi-pack-index [] verify")
+
+#define BUILTIN_MIDX_EXPIRE_USAGE \
+	N_("git multi-pack-index [] expire")
+
+#define BUILTIN_MIDX_REPACK_USAGE \
+	N_("git multi-pack-index [] repack [--batch-size=]")
+
 static char const * const builtin_multi_pack_index_usage[] = {
-	N_("git multi-pack-index [] (write|verify|expire|repack --batch-size=)"),
+	BUILTIN_MIDX_WRITE_USAGE,
+	BUILTIN_MIDX_VERIFY_USAGE,
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	BUILTIN_MIDX_REPACK_USAGE,
 	NULL
 };
 

From patchwork Tue Mar 30 15:03:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172869
Date: Tue, 30 Mar 2021 11:03:57 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 04/16] builtin/multi-pack-index.c: split sub-commands

Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing
cmd_multi_pack_index() to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).
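
As a rough, standalone illustration of the two-stage parse described
above (this does not use git's parse-options API; demo_opts,
after_prefix(), parse_common(), parse_subcommand() and demo_dispatch()
are hypothetical names invented for this sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; none of these names exist in git. */
struct demo_opts {
	const char *object_dir;
	int progress;
	unsigned long batch_size;
};

/* Return the text after 'prefix' if 's' starts with it, else NULL. */
static const char *after_prefix(const char *s, const char *prefix)
{
	size_t n = strlen(prefix);
	return strncmp(s, prefix, n) ? NULL : s + n;
}

/* Consume one common option; returns 1 if 'arg' was recognized. */
static int parse_common(const char *arg, struct demo_opts *o)
{
	const char *v = after_prefix(arg, "--object-dir=");

	if (v) {
		o->object_dir = v;
		return 1;
	}
	if (!strcmp(arg, "--progress")) {
		o->progress = 1;
		return 1;
	}
	return 0;
}

/*
 * Stage 2: parse what follows the sub-command name. Common options are
 * accepted again; "--batch-size=" only when is_repack is set. Anything
 * else is kept and counted so that the caller can complain about it.
 */
static int parse_subcommand(int argc, const char **argv, int is_repack,
			    struct demo_opts *o)
{
	int left = 0;

	for (int i = 0; i < argc; i++) {
		const char *v;

		if (parse_common(argv[i], o))
			continue;
		v = is_repack ? after_prefix(argv[i], "--batch-size=") : NULL;
		if (v)
			o->batch_size = strtoul(v, NULL, 10);
		else
			left++;
	}
	return left;
}

/* Returns 0 on success, -1 where a real command would print usage. */
static int demo_dispatch(int argc, const char **argv, struct demo_opts *o)
{
	int i = 0, is_repack;

	assert(o != NULL);

	/* Stage 1: common options only, stopping at the first non-option. */
	while (i < argc && argv[i][0] == '-') {
		if (!parse_common(argv[i], o))
			return -1;
		i++;
	}
	if (i == argc)
		return -1; /* no sub-command given */

	is_repack = !strcmp(argv[i], "repack");
	if (!is_repack && strcmp(argv[i], "write") &&
	    strcmp(argv[i], "verify") && strcmp(argv[i], "expire"))
		return -1; /* unrecognized sub-command */

	return parse_subcommand(argc - i - 1, argv + i + 1, is_repack, o) ? -1 : 0;
}
```

In this sketch, "--progress repack --batch-size=10" is accepted, while
"write --batch-size=10" is rejected by the sub-command itself because
the unknown option is left over after its own parse.
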

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 130 ++++++++++++++++++++++++++++++-------
 1 file changed, 105 insertions(+), 25 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index eea498e026..a78640c061 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -17,6 +17,22 @@
 #define BUILTIN_MIDX_REPACK_USAGE \
 	N_("git multi-pack-index [] repack [--batch-size=]")
 
+static char const * const builtin_multi_pack_index_write_usage[] = {
+	BUILTIN_MIDX_WRITE_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_verify_usage[] = {
+	BUILTIN_MIDX_VERIFY_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_expire_usage[] = {
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_repack_usage[] = {
+	BUILTIN_MIDX_REPACK_USAGE,
+	NULL
+};
 static char const * const builtin_multi_pack_index_usage[] = {
 	BUILTIN_MIDX_WRITE_USAGE,
 	BUILTIN_MIDX_VERIFY_USAGE,
@@ -31,25 +47,98 @@ static struct opts_multi_pack_index {
 	unsigned flags;
 } opts;
 
-int cmd_multi_pack_index(int argc, const char **argv,
-			 const char *prefix)
+static struct option common_opts[] = {
+	OPT_FILENAME(0, "object-dir", &opts.object_dir,
+	  N_("object directory containing set of packfile and pack-index pairs")),
+	OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	OPT_END(),
+};
+
+static struct option *add_common_options(struct option *prev)
 {
-	static struct option builtin_multi_pack_index_options[] = {
-		OPT_FILENAME(0, "object-dir", &opts.object_dir,
-		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	return parse_options_concat(common_opts, prev);
+}
+
+static int cmd_multi_pack_index_write(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_write_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_write_usage,
+				   options);
+
+	return write_midx_file(opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_verify(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_verify_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_verify_usage,
+				   options);
+
+	return verify_midx_file(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_expire(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_expire_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_expire_usage,
+				   options);
+
+	return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_repack(int argc, const char **argv)
+{
+	struct option *options;
+	static struct option builtin_multi_pack_index_repack_options[] = {
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
 	};
 
+	options = add_common_options(builtin_multi_pack_index_repack_options);
+
+	argc = parse_options(argc, argv, NULL,
+			     options,
+			     builtin_multi_pack_index_repack_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_repack_usage,
+				   options);
+
+	FREE_AND_NULL(options);
+
+	return midx_repack(the_repository, opts.object_dir,
+			   (size_t)opts.batch_size, opts.flags);
+}
+
+int cmd_multi_pack_index(int argc, const char **argv,
+			 const char *prefix)
+{
+	struct option *builtin_multi_pack_index_options = common_opts;
+
 	git_config(git_default_config, NULL);
 
 	if (isatty(2))
 		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
-			     builtin_multi_pack_index_usage, 0);
+			     builtin_multi_pack_index_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
 
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
@@ -58,25 +147,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);
 
-	if (argc > 1) {
-		die(_("too many arguments"));
-		return 1;
-	}
-
 	trace2_cmd_mode(argv[0]);
 
 	if (!strcmp(argv[0], "repack"))
-		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, opts.flags);
-	if (opts.batch_size)
-		die(_("--batch-size option is only for 'repack' subcommand"));
-
-	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
-
-	die(_("unrecognized subcommand: %s"), argv[0]);
+		return cmd_multi_pack_index_repack(argc, argv);
+	else if (!strcmp(argv[0], "write"))
+		return cmd_multi_pack_index_write(argc, argv);
+	else if (!strcmp(argv[0], "verify"))
+		return cmd_multi_pack_index_verify(argc, argv);
+	else if (!strcmp(argv[0], "expire"))
+		return cmd_multi_pack_index_expire(argc, argv);
+	else
+		die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Tue Mar 30 15:04:01 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172875
Date: Tue, 30 Mar 2021 11:04:01 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 05/16] builtin/multi-pack-index.c: don't enter bogus cmd_mode

Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.

Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.
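
The effect can be sketched in isolation as follows. This is only an
illustration under assumed names: record_mode() and recorded_mode are
hypothetical stand-ins for a tracing facility and are not git's trace2
API, and run_write(), run_verify() and dispatch() are likewise made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a trace facility; not git's trace2 API. */
static const char *recorded_mode;

static void record_mode(const char *mode)
{
	recorded_mode = mode;
}

/* Each sub-command records its own mode once it is actually entered. */
static int run_write(int argc, const char **argv)
{
	(void)argc;
	record_mode(argv[0]);
	return 0;
}

static int run_verify(int argc, const char **argv)
{
	(void)argc;
	record_mode(argv[0]);
	return 0;
}

/* The dispatcher no longer records anything itself. */
static int dispatch(int argc, const char **argv)
{
	assert(argc > 0);

	if (!strcmp(argv[0], "write"))
		return run_write(argc, argv);
	if (!strcmp(argv[0], "verify"))
		return run_verify(argc, argv);
	return -1; /* unrecognized: no mode was recorded */
}
```

With the recording done inside the handlers, an unrecognized name such
as "bogus" leaves recorded_mode untouched, whereas recording it in
dispatch() before the comparisons would have stored "bogus".
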
Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index a78640c061..b590c4fc88 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -63,6 +63,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_write_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -77,6 +79,8 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_verify_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -91,6 +95,8 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_expire_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -112,6 +118,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv)
 
 	options = add_common_options(builtin_multi_pack_index_repack_options);
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options,
 			     builtin_multi_pack_index_repack_usage,
@@ -147,8 +155,6 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);
 
-	trace2_cmd_mode(argv[0]);
-
 	if (!strcmp(argv[0], "repack"))
 		return cmd_multi_pack_index_repack(argc, argv);
 	else if (!strcmp(argv[0], "write"))

From patchwork Tue Mar 30 15:04:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172873
Date: Tue, 30 Mar 2021 11:04:04 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 06/16] builtin/multi-pack-index.c: display usage on unrecognized command
Message-ID: <604a02ce8549be870dc7af3109289fa0a4b1e60f.1617116623.git.me@ttaylorr.com>

When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:

    $ git multi-pack-index bogus
    fatal: unrecognized subcommand: bogus

Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:

    $ git.compile multi-pack-index bogus
    error: unrecognized subcommand: bogus
    usage: git multi-pack-index [] write
       or: git multi-pack-index [] verify
       or: git multi-pack-index [] expire
       or: git multi-pack-index [] repack [--batch-size=]

        --object-dir          object directory containing set of packfile and pack-index pairs
        --progress            force progress reporting

While we're at it, clean up some duplication between the "no sub-command"
and "unrecognized sub-command" conditionals.

Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index b590c4fc88..8711174fae 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -152,8 +152,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		opts.object_dir = get_object_directory();
 
 	if (argc == 0)
-		usage_with_options(builtin_multi_pack_index_usage,
-				   builtin_multi_pack_index_options);
+		goto usage;
 
 	if (!strcmp(argv[0], "repack"))
 		return cmd_multi_pack_index_repack(argc, argv);
@@ -163,6 +162,10 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		return cmd_multi_pack_index_verify(argc, argv);
 	else if (!strcmp(argv[0], "expire"))
 		return cmd_multi_pack_index_expire(argc, argv);
-	else
-		die(_("unrecognized subcommand: %s"), argv[0]);
+	else {
+usage:
+		error(_("unrecognized subcommand: %s"), argv[0]);
+		usage_with_options(builtin_multi_pack_index_usage,
+				   builtin_multi_pack_index_options);
+	}
 }

From patchwork Tue Mar 30 15:04:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172877
Date: Tue, 30 Mar 2021 11:04:07 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 07/16] t/helper/test-read-midx.c: add '--show-objects'
Message-ID: <37e073ea277045f00cdb3dfa2827f8c8d67ed9cb.1617116623.git.me@ttaylorr.com>

The 'read-midx' helper is used in places like t5319 to display basic
information about a multi-pack-index.

In the next patch, the MIDX writing machinery will learn a new way to
choose from which pack an object is selected when multiple copies of
that object exist.

To disambiguate which pack introduces an object so that this feature can
be tested, add a '--show-objects' option which displays additional
information about each object in the MIDX.
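For illustration only (not part of the helper itself): each line printed under '--show-objects' consists of the object ID, a space, the offset, a tab, and the pack name. A minimal sketch of that formatting, where format_object_line() and its arguments are hypothetical stand-ins for the values the helper reads out of the MIDX:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one object line as
 * "<oid> <offset>\t<pack name>\n".
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_object_line(char *buf, size_t len, const char *hex_oid,
			      uint64_t offset, const char *pack_name)
{
	int n = snprintf(buf, len, "%s %" PRIu64 "\t%s\n",
			 hex_oid, offset, pack_name);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

Because the pack name is the last field, a test can grep for an object ID at the start of a line and compare the rest against the pack it expects.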
Signed-off-by: Taylor Blau
---
 t/helper/test-read-midx.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c
index 2430880f78..7c2eb11a8e 100644
--- a/t/helper/test-read-midx.c
+++ b/t/helper/test-read-midx.c
@@ -4,7 +4,7 @@
 #include "repository.h"
 #include "object-store.h"
 
-static int read_midx_file(const char *object_dir)
+static int read_midx_file(const char *object_dir, int show_objects)
 {
 	uint32_t i;
 	struct multi_pack_index *m;
@@ -43,13 +43,29 @@ static int read_midx_file(const char *object_dir)
 
 	printf("object-dir: %s\n", m->object_dir);
 
+	if (show_objects) {
+		struct object_id oid;
+		struct pack_entry e;
+
+		for (i = 0; i < m->num_objects; i++) {
+			nth_midxed_object_oid(&oid, m, i);
+			fill_midx_entry(the_repository, &oid, &e, m);
+
+			printf("%s %"PRIu64"\t%s\n",
+			       oid_to_hex(&oid), e.offset, e.p->pack_name);
+		}
+		return 0;
+	}
+
 	return 0;
 }
 
 int cmd__read_midx(int argc, const char **argv)
 {
-	if (argc != 2)
-		usage("read-midx ");
+	if (!(argc == 2 || argc == 3))
+		usage("read-midx [--show-objects] ");
 
-	return read_midx_file(argv[1]);
+	if (!strcmp(argv[1], "--show-objects"))
+		return read_midx_file(argv[2], 1);
+	return read_midx_file(argv[1], 0);
 }

From patchwork Tue Mar 30 15:04:11 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172879
Date: Tue, 30 Mar 2021 11:04:11 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 08/16] midx: allow marking a pack as preferred

When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.

Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).

To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.
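The tie-breaking rule can be sketched roughly as follows. This is a simplified illustration under assumed names, not the code in midx.c: struct entry, entry_cmp() and dedup_entries() are hypothetical, and a short string stands in for the object ID. Entries sort by object ID, then preferred-pack copies first, then newer pack mtime first; only the first copy of each object is kept.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified entry; the real struct pack_midx_entry differs. */
struct entry {
	char oid[8];		/* stand-in for an object ID */
	uint32_t pack_id;
	long pack_mtime;
	unsigned preferred : 1;
};

static int entry_cmp(const void *_a, const void *_b)
{
	const struct entry *a = _a;
	const struct entry *b = _b;
	int cmp = strcmp(a->oid, b->oid);

	if (cmp)
		return cmp;
	/* Copies in the preferred pack sort first. */
	if (a->preferred != b->preferred)
		return a->preferred ? -1 : 1;
	/* Otherwise, copies from more recently modified packs sort first. */
	if (a->pack_mtime != b->pack_mtime)
		return a->pack_mtime > b->pack_mtime ? -1 : 1;
	return 0;
}

/*
 * Sort the entries, then keep only the first copy of each object ID.
 * Returns the number of entries kept.
 */
static size_t dedup_entries(struct entry *e, size_t nr)
{
	size_t i, kept = 0;

	qsort(e, nr, sizeof(*e), entry_cmp);
	for (i = 0; i < nr; i++) {
		if (kept && !strcmp(e[kept - 1].oid, e[i].oid))
			continue;
		e[kept++] = e[i];
	}
	return kept;
}
```

In this sketch an object present in both an older preferred pack and a newer non-preferred pack resolves to the preferred copy, which is the behavior the paragraph above describes.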
No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).

[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime".

Signed-off-by: Taylor Blau
---
 Documentation/git-multi-pack-index.txt       | 14 +++-
 Documentation/technical/multi-pack-index.txt |  5 +-
 builtin/multi-pack-index.c                   | 19 ++++-
 builtin/repack.c                             |  2 +-
 midx.c                                       | 82 +++++++++++++++++---
 midx.h                                       |  2 +-
 t/t5319-multi-pack-index.sh                  | 43 ++++++++++
 7 files changed, 149 insertions(+), 18 deletions(-)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index eb0caa0439..ffd601bc17 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes
 SYNOPSIS
 --------
 [verse]
-'git multi-pack-index' [--object-dir=] [--[no-]progress]
+'git multi-pack-index' [--object-dir=] [--[no-]progress]
+	[--preferred-pack=]
 
 DESCRIPTION
 -----------
@@ -30,7 +31,16 @@ OPTIONS
 The following subcommands are available:
 
 write::
-	Write a new MIDX file.
+	Write a new MIDX file. The following options are available for
+	the `write` sub-command:
++
+--
+	--preferred-pack=::
+		Optionally specify the tie-breaking pack used when
+		multiple packs contain the same object. If not given,
+		ties are broken in favor of the pack with the lowest
+		mtime.
+--
 
 verify::
 	Verify the contents of the MIDX file.
diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
index e8e377a59f..fb688976c4 100644
--- a/Documentation/technical/multi-pack-index.txt
+++ b/Documentation/technical/multi-pack-index.txt
@@ -43,8 +43,9 @@ Design Details
   a change in format.
 
 - The MIDX keeps only one record per object ID. If an object appears
-  in multiple packfiles, then the MIDX selects the copy in the most-
-  recently modified packfile.
+  in multiple packfiles, then the MIDX selects the copy in the
+  preferred packfile, otherwise selecting from the most-recently
+  modified packfile.
 
 - If there exist packfiles in the pack directory not registered in
   the MIDX, then those packfiles are loaded into the `packed_git`
diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 8711174fae..5d3ea445fd 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -4,9 +4,10 @@
 #include "parse-options.h"
 #include "midx.h"
 #include "trace2.h"
+#include "object-store.h"
 
 #define BUILTIN_MIDX_WRITE_USAGE \
-	N_("git multi-pack-index [] write")
+	N_("git multi-pack-index [] write [--preferred-pack=]")
 
 #define BUILTIN_MIDX_VERIFY_USAGE \
 	N_("git multi-pack-index [] verify")
@@ -43,6 +44,7 @@ static char const * const builtin_multi_pack_index_usage[] = {
 
 static struct opts_multi_pack_index {
 	const char *object_dir;
+	const char *preferred_pack;
 	unsigned long batch_size;
 	unsigned flags;
 } opts;
@@ -61,7 +63,15 @@ static struct option *add_common_options(struct option *prev)
 
 static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
-	struct option *options = common_opts;
+	struct option *options;
+	static struct option builtin_multi_pack_index_write_options[] = {
+		OPT_STRING(0, "preferred-pack", &opts.preferred_pack,
+			   N_("preferred-pack"),
+			   N_("pack for reuse when computing a multi-pack bitmap")),
+		OPT_END(),
+	};
+
+	options = add_common_options(builtin_multi_pack_index_write_options);
 
 	trace2_cmd_mode(argv[0]);
 
@@ -72,7 +82,10 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 		usage_with_options(builtin_multi_pack_index_write_usage,
 				   options);
 
-	return write_midx_file(opts.object_dir, opts.flags);
+	FREE_AND_NULL(options);
+
+	return write_midx_file(opts.object_dir, opts.preferred_pack,
+			       opts.flags);
 }
 
 static int cmd_multi_pack_index_verify(int argc, const char **argv)
diff --git a/builtin/repack.c b/builtin/repack.c
index 6ce2556c9e..2847fdfbab 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -721,7 +721,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	remove_temporary_files();
 
 	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
-		write_midx_file(get_object_directory(), 0);
+		write_midx_file(get_object_directory(), NULL, 0);
 
 	string_list_clear(&names, 0);
 	string_list_clear(&rollback, 0);
diff --git a/midx.c b/midx.c
index becfafe65e..4a9d84352c 100644
--- a/midx.c
+++ b/midx.c
@@ -431,6 +431,14 @@ static int pack_info_compare(const void *_a, const void *_b)
 	return strcmp(a->pack_name, b->pack_name);
 }
 
+static int idx_or_pack_name_cmp(const void *_va, const void *_vb)
+{
+	const char *pack_name = _va;
+	const struct pack_info *compar = _vb;
+
+	return cmp_idx_or_pack_name(pack_name, compar->pack_name);
+}
+
 struct write_midx_context {
 	struct pack_info *info;
 	uint32_t nr;
@@ -445,6 +453,8 @@ struct write_midx_context {
 	uint32_t *pack_perm;
 	unsigned large_offsets_needed:1;
 	uint32_t num_large_offsets;
+
+	int preferred_pack_idx;
 };
 
 static void add_pack_to_midx(const char *full_path, size_t full_path_len,
@@ -489,6 +499,7 @@ struct pack_midx_entry {
 	uint32_t pack_int_id;
 	time_t pack_mtime;
 	uint64_t offset;
+	unsigned preferred : 1;
 };
 
 static int midx_oid_compare(const void *_a, const void *_b)
@@ -500,6 +511,12 @@ static int midx_oid_compare(const void *_a, const void *_b)
 	if (cmp)
 		return cmp;
 
+	/* Sort objects in a preferred pack first when multiple copies exist. */
+	if (a->preferred > b->preferred)
+		return -1;
+	if (a->preferred < b->preferred)
+		return 1;
+
 	if (a->pack_mtime > b->pack_mtime)
 		return -1;
 	else if (a->pack_mtime < b->pack_mtime)
@@ -527,7 +544,8 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m,
 static void fill_pack_entry(uint32_t pack_int_id,
 			    struct packed_git *p,
 			    uint32_t cur_object,
-			    struct pack_midx_entry *entry)
+			    struct pack_midx_entry *entry,
+			    int preferred)
 {
 	if (nth_packed_object_id(&entry->oid, p, cur_object) < 0)
 		die(_("failed to locate object %d in packfile"), cur_object);
@@ -536,6 +554,7 @@ static void fill_pack_entry(uint32_t pack_int_id,
 	entry->pack_mtime = p->mtime;
 
 	entry->offset = nth_packed_object_offset(p, cur_object);
+	entry->preferred = !!preferred;
 }
 
 /*
@@ -552,7 +571,8 @@ static void fill_pack_entry(uint32_t pack_int_id,
 static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 						  struct pack_info *info,
 						  uint32_t nr_packs,
-						  uint32_t *nr_objects)
+						  uint32_t *nr_objects,
+						  int preferred_pack)
 {
 	uint32_t cur_fanout, cur_pack, cur_object;
 	uint32_t alloc_fanout, alloc_objects, total_objects = 0;
@@ -589,12 +609,17 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 				nth_midxed_pack_midx_entry(m,
 							   &entries_by_fanout[nr_fanout],
 							   cur_object);
+				if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
+					entries_by_fanout[nr_fanout].preferred = 1;
+				else
+					entries_by_fanout[nr_fanout].preferred = 0;
 				nr_fanout++;
 			}
 		}
 
 		for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) {
 			uint32_t start = 0, end;
+			int preferred = cur_pack == preferred_pack;
 
 			if (cur_fanout)
 				start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1);
@@ -602,7 +627,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 
 			for (cur_object = start; cur_object < end; cur_object++) {
 				ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout);
-				fill_pack_entry(cur_pack, info[cur_pack].p, cur_object, &entries_by_fanout[nr_fanout]);
+				fill_pack_entry(cur_pack,
+						info[cur_pack].p,
+						cur_object,
+						&entries_by_fanout[nr_fanout],
+						preferred);
 				nr_fanout++;
 			}
 		}
@@ -777,7 +806,9 @@ static int write_midx_large_offsets(struct hashfile *f,
 }
 
 static int write_midx_internal(const char *object_dir, struct multi_pack_index *m,
-			       struct string_list *packs_to_drop, unsigned flags)
+			       struct string_list *packs_to_drop,
+			       const char *preferred_pack_name,
+			       unsigned flags)
 {
 	char *midx_name;
 	uint32_t i;
@@ -828,7 +859,19 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop)
 		goto cleanup;
 
-	ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr);
+	ctx.preferred_pack_idx = -1;
+	if (preferred_pack_name) {
+		for (i = 0; i < ctx.nr; i++) {
+			if (!cmp_idx_or_pack_name(preferred_pack_name,
+						  ctx.info[i].pack_name)) {
+				ctx.preferred_pack_idx = i;
+				break;
+			}
+		}
+	}
+
+	ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr,
+					 ctx.preferred_pack_idx);
 
 	ctx.large_offsets_needed = 0;
 	for (i = 0; i < ctx.entries_nr; i++) {
@@ -889,6 +932,24 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 			pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1;
 	}
 
+	/* Check that the preferred pack wasn't expired (if given). */
+	if (preferred_pack_name) {
+		struct pack_info *preferred = bsearch(preferred_pack_name,
+						      ctx.info, ctx.nr,
+						      sizeof(*ctx.info),
+						      idx_or_pack_name_cmp);
+
+		if (!preferred)
+			warning(_("unknown preferred pack: '%s'"),
+				preferred_pack_name);
+		else {
+			uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id];
+			if (perm == PACK_EXPIRED)
+				warning(_("preferred pack '%s' is expired"),
+					preferred_pack_name);
+		}
+	}
+
 	if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT)
 		pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
 					(pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
@@ -947,9 +1008,12 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	return result;
 }
 
-int write_midx_file(const char *object_dir, unsigned flags)
+int write_midx_file(const char *object_dir,
+		    const char *preferred_pack_name,
+		    unsigned flags)
 {
-	return write_midx_internal(object_dir, NULL, NULL, flags);
+	return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name,
+				   flags);
 }
 
 void clear_midx_file(struct repository *r)
@@ -1184,7 +1248,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 	free(count);
 
 	if (packs_to_drop.nr)
-		result = write_midx_internal(object_dir, m, &packs_to_drop, flags);
+		result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags);
 
 	string_list_clear(&packs_to_drop, 0);
 	return result;
@@ -1373,7 +1437,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
 		goto cleanup;
 	}
 
-	result = write_midx_internal(object_dir, m, NULL, flags);
+	result = write_midx_internal(object_dir, m, NULL, NULL, flags);
 	m = NULL;
 
 cleanup:
diff --git a/midx.h b/midx.h
index b18cf53bc4..e7fea61109 100644
--- a/midx.h
+++ b/midx.h
@@ -47,7 +47,7 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa
 int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
-int write_midx_file(const char *object_dir, unsigned flags);
+int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
 void clear_midx_file(struct repository *r);
 int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
 int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index b4afab1dfc..031a5570c0 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -234,6 +234,49 @@ test_expect_success 'warn on improper hash version' '
 	)
 '
 
+test_expect_success 'midx picks objects from preferred pack' '
+	test_when_finished rm -rf preferred.git &&
+	git init --bare preferred.git &&
+	(
+		cd preferred.git &&
+
+		a=$(echo "a" | git hash-object -w --stdin) &&
+		b=$(echo "b" | git hash-object -w --stdin) &&
+		c=$(echo "c" | git hash-object -w --stdin) &&
+
+		# Set up two packs, duplicating the object "B" at different
+		# offsets.
+		#
+		# Note that the "BC" pack (the one we choose as preferred) sorts
+		# lexically after the "AB" pack, meaning that omitting the
+		# --preferred-pack argument would cause this test to fail (since
+		# the MIDX code would select the copy of "b" in the "AB" pack).
+		git pack-objects objects/pack/test-AB <<-EOF &&
+		$a
+		$b
+		EOF
+		bc=$(git pack-objects objects/pack/test-BC <<-EOF
+		$b
+		$c
+		EOF
+		) &&
+
+		git multi-pack-index --object-dir=objects \
+			write --preferred-pack=test-BC-$bc.idx 2>err &&
+		test_must_be_empty err &&
+
+		echo hi &&
+		test-tool read-midx --show-objects objects >out &&
+
+		ofs=$(git show-index expect &&
+		grep ^$b out >actual &&
+
+		test_cmp expect actual
+	)
+'
 test_expect_success 'verify multi-pack-index success' '
 	git multi-pack-index verify --object-dir=$objdir

From patchwork Tue Mar 30 15:04:14 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172883
Date: Tue, 30 Mar 2021 11:04:14 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 09/16] midx: don't free midx_name early
Message-ID: <33b8af97e7c6fb2d3016422d084eae7aca7cb70c.1617116623.git.me@ttaylorr.com>

A subsequent patch will need to refer back to 'midx_name' later on in
the function. In fact, this variable is already free()'d later on, so
this makes the later free() no longer redundant.

Signed-off-by: Taylor Blau
---
 midx.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/midx.c b/midx.c
index 4a9d84352c..3edde2b68d 100644
--- a/midx.c
+++ b/midx.c
@@ -956,7 +956,6 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 
 	hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
 	f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
-	FREE_AND_NULL(midx_name);
 
 	if (ctx.m)
 		close_midx(ctx.m);

From patchwork Tue Mar 30 15:04:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12172881
Date: Tue, 30 Mar 2021 11:04:17 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com
Subject: [PATCH v4 10/16] midx: keep track of the checksum
Message-ID: <3fc9b83dc61837409b3d68f4676fc7a982f10593.1617116623.git.me@ttaylorr.com>

write_midx_internal() uses a hashfile to write the multi-pack index, but
discards its checksum. This makes sense, since nothing that takes place
after writing the MIDX cares about its checksum.

That is about to change in a subsequent patch, when the optional
reverse index corresponding to the MIDX will want to include the MIDX's
checksum.

Store the checksum of the MIDX in preparation for that.
Signed-off-by: Taylor Blau --- midx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 3edde2b68d..526795ff0e 100644 --- a/midx.c +++ b/midx.c @@ -811,6 +811,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * unsigned flags) { char *midx_name; + unsigned char midx_hash[GIT_MAX_RAWSZ]; uint32_t i; struct hashfile *f = NULL; struct lock_file lk; @@ -987,7 +988,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs); write_chunkfile(cf, &ctx); - finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); + finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); commit_lock_file(&lk); From patchwork Tue Mar 30 15:04:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18250C433E1 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E301F619D0 for ; Tue, 30 Mar 2021 15:05:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232221AbhC3PEw (ORCPT ); Tue, 30 Mar 2021 11:04:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53084 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S231808AbhC3PEX (ORCPT ); Tue, 30 Mar 2021 11:04:23 -0400 Received: from mail-io1-xd2f.google.com (mail-io1-xd2f.google.com [IPv6:2607:f8b0:4864:20::d2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD9B3C061574 for ; Tue, 30 Mar 2021 08:04:22 -0700 (PDT) Received: by mail-io1-xd2f.google.com with SMTP id r193so16750219ior.9 for ; Tue, 30 Mar 2021 08:04:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=chAfj2MhKdOAg1JAo4haib3UAHcOP3zHfIDIZEwHVZI=; b=Pdh04Ma3r4Ial0FNpz7f3Ns9ZxKCKiPBFQ0Lfs9Y3aPYM4dzcHy0l5TlvXu5mDbHAb TGPf3f0/F3ebI1c4sLbVh36l5nOfcXx9aDBCwl2wiCwgpF1EfryczhOsDryiePNp0Evl 9PJrUq+sQESQMXX1Wkx/zYiNS4Im7nWfnNNICpTnyjVE1luXJu4Y7LneG5jqjj/NC4Xs swAUIe1UGYJ+09DDCcI1S68uycKCngOAD+sBsl1sTpHyaEyPZm99lqoXFmd8F5vc7gBL a9TMtqtvOsigeL2FDtH4Wc/YGDGolB5+o1sjFzlY+ilGTptSEItChr5mE+RjxNkW567L BZqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=chAfj2MhKdOAg1JAo4haib3UAHcOP3zHfIDIZEwHVZI=; b=T/bEglm2udKM2qMdAsxl+K8jmZ+LsjAJqzQhEBF8DE8hrEIP/7LpL+As2Zl9CSxWTY /+CoRaK9l/jJ8HflqnnBeJVgU1lb89s64VqePGwOLQ6Yt7nnFu7yhHVbu75gZXcWCnFx YRHS7Z+U4n+76cekJJv6pbwHmk4IIUqS1FjHJgfWzRrqhI/LcAty0aEoPiw+eRQfXE59 rUGdCotg5fqT7VtvDbhjIlBsMTyr7nhLoehcYygsfOrtUvvC0QgP//1d857/dFAiYNaB CIQrrrlZKrnMEZ210ahYSwmmgEMDirUyuc8bkE53UbJVvnvBP65cv31yA5GDw3Gd+IJh ZdJA== X-Gm-Message-State: AOAM532g3i7TBv8OSPaa7HQxUG63RsS9WFyn9KA17/d034o9Y/kTpqtB qxp95FzGjkuloyb0uDEGCaLhGenRlC77Gg== X-Google-Smtp-Source: ABdhPJwsij/o1acmcDRJ8KOVnu0ObAHIvDXpLxeszpqLXTIxu5SDMUrIrzntI+3I88oxpGnLgwgLcQ== X-Received: by 2002:a5e:c00a:: with SMTP id u10mr24866690iol.165.1617116662029; Tue, 30 Mar 2021 08:04:22 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by 
smtp.gmail.com with ESMTPSA id v17sm11169535ios.46.2021.03.30.08.04.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:21 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:20 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com Subject: [PATCH v4 11/16] midx: make some functions non-static Message-ID: <2ada3973202feb737af6c8cbeafae0241afdff90.1617116623.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a subsequent commit, pack-revindex.c will become responsible for sorting a list of objects in the "MIDX pack order" (which will be defined in the following patch). To do so, it will need to know the pack identifier and offset within that pack for each object in the MIDX. The MIDX code already has functions for doing just that (nth_midxed_offset() and nth_midxed_pack_int_id()), but they are statically declared. Since there is no reason that they couldn't be exposed publicly, and because they are already doing exactly what the caller in pack-revindex.c will want, expose them publicly so that they can be reused there. 
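For illustration only, here is a minimal sketch of the kind of lookup these two functions perform. It assumes the object-offsets chunk holds one 8-byte record per object (a big-endian 4-byte pack-int-id followed by a big-endian 4-byte offset) and ignores large (64-bit) offsets entirely. The names toy_read_be32(), toy_entry_pack_id() and toy_entry_offset() are hypothetical stand-ins for this sketch, not git's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed record width: 4-byte pack-int-id followed by a 4-byte offset. */
#define TOY_ENTRY_WIDTH 8

/* Hypothetical stand-in for a big-endian 32-bit read. */
static uint32_t toy_read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Pack identifier of the object at MIDX position 'pos'. */
static uint32_t toy_entry_pack_id(const unsigned char *chunk, uint32_t nr,
				  uint32_t pos)
{
	assert(pos < nr);
	return toy_read_be32(chunk + (size_t)pos * TOY_ENTRY_WIDTH);
}

/* Offset of that object within its pack (small offsets only). */
static uint32_t toy_entry_offset(const unsigned char *chunk, uint32_t nr,
				 uint32_t pos)
{
	assert(pos < nr);
	return toy_read_be32(chunk + (size_t)pos * TOY_ENTRY_WIDTH + 4);
}
```

Given a chunk whose first record is the bytes 00 00 00 01 00 00 00 0c, toy_entry_pack_id() yields 1 and toy_entry_offset() yields 12 for position 0.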
Signed-off-by: Taylor Blau --- midx.c | 4 ++-- midx.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index 526795ff0e..2173a9b45c 100644 --- a/midx.c +++ b/midx.c @@ -239,7 +239,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid, return oid; } -static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) { const unsigned char *offset_data; uint32_t offset32; @@ -258,7 +258,7 @@ static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) return offset32; } -static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) { return get_be32(m->chunk_object_offsets + (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH); diff --git a/midx.h b/midx.h index e7fea61109..93bd68189e 100644 --- a/midx.h +++ b/midx.h @@ -40,6 +40,8 @@ struct multi_pack_index { struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos); +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos); struct object_id *nth_midxed_object_oid(struct object_id *oid, struct multi_pack_index *m, uint32_t n); From patchwork Tue Mar 30 15:04:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2B20AC433E2 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0009C619D9 for ; Tue, 30 Mar 2021 15:05:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232283AbhC3PEy (ORCPT ); Tue, 30 Mar 2021 11:04:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232225AbhC3PE0 (ORCPT ); Tue, 30 Mar 2021 11:04:26 -0400 Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com [IPv6:2607:f8b0:4864:20::72f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 48D42C061574 for ; Tue, 30 Mar 2021 08:04:26 -0700 (PDT) Received: by mail-qk1-x72f.google.com with SMTP id v70so16147085qkb.8 for ; Tue, 30 Mar 2021 08:04:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=DVLp9viqHo1IVfT/JBQyS6Zq8DFqJrl3YTkPfOwX8sY=; b=e0/UuIq88+XmXg1Q7zQmRO6ar+KxFkNXNNy5mTpJWRdOh4X0QxDYxj4WaL6mXo5F5s idM5TDTvetNzK3NVDYEWJUKVKgjUdtSjUQOq8E9HZusNN176fcUc01C99KO+vBhmw2pi qFA2/BufZTMvPkp+fQJjbaQL8ldCKSf8mcTJP1b6J/3k1ShUnifVSTEh9pyRlYSk6kPj LxxWDN+90B3Pt+9UIussawcdCWyfW3mFWh1xJ2JdrjDIziwVX7GpTUhT6v+dIdcF/3PA 2Ao6L4Z0t4pgdZrHvW2hXY9jeUr4m6vRt0/NHk/wotLyaLpGQf6X9q7voFFkUEKdPUST kKPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; 
bh=DVLp9viqHo1IVfT/JBQyS6Zq8DFqJrl3YTkPfOwX8sY=; b=AfKbR7r9CW7v1bninNOd/MRrC9fDPNNIVIianzlQtKnA2FLd0cx+ipO9fN+bP53MqF TGvdVmcjjV6q5hVeIip8TccGGSrlnllt6eJ69dFs9fWQEzCvJyo2iCDGvRc8kYAeLiJo OdwFrY7graQwvoznciV57aAggOayZleiUtAwZn9EWinm0f8TRbo6Zv/AKTpQ+3AFJnxC 8K/SZGcS67jeFy/aGn4pyCyOrSk4cFjl2qDxdodMGTuoj1iNfmz9kJWs8I5I1tM4AClo HpckZqDLlr2K23Q9W9tx89pcn9SsEbTk+qNCD8Tdk/RAwPov3AqKEq8TSFTevETHI2P8 r0ng== X-Gm-Message-State: AOAM532K26zmK2J4CLPZG9zZcpU6pFr/3QAYv4CDJ9iVqrL0kBd8tOCq w6gxbCJzyEaPDW9m3ftjmSDYJsIwS7DotA== X-Google-Smtp-Source: ABdhPJx7HsrFbN3eCbhRijSjOJCo8gDkine8Zq32TgT1LSjCd5JfwxQvxZKegNvVVUUtfbBlzT6new== X-Received: by 2002:a37:d2c1:: with SMTP id f184mr30286777qkj.107.1617116665175; Tue, 30 Mar 2021 08:04:25 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by smtp.gmail.com with ESMTPSA id 7sm16056886qkm.64.2021.03.30.08.04.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:24 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:23 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com Subject: [PATCH v4 12/16] Documentation/technical: describe multi-pack reverse indexes Message-ID: <8bb3dd24a7caaef36d041108c5a0503f6f84e410.1617116623.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org As a prerequisite to implementing multi-pack bitmaps, motivate and describe the format and ordering of the multi-pack reverse index. The subsequent patch will implement reading this format, and the patch after that will implement writing it while producing a multi-pack index. 
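As a rough sketch of the ordering that the documentation in this patch defines, the comparator below sorts objects by preferred pack first, then by pack ID, then by offset within the pack. It assumes lower pack IDs sort first (that is, packs arranged in MIDX order). The struct and the toy_-prefixed names are hypothetical and exist only for this illustration; they are not part of git.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One object as seen through the MIDX (hypothetical struct for this sketch). */
struct toy_object {
	uint32_t midx_pos; /* position in the oid-sorted MIDX */
	uint32_t pack;     /* numeric pack ID within the MIDX */
	uint64_t offset;   /* byte offset within that pack */
	int preferred;     /* non-zero if 'pack' is the preferred pack */
};

/* Preferred pack first, then pack ID (assumed ascending), then offset. */
static int toy_pseudo_pack_cmp(const void *va, const void *vb)
{
	const struct toy_object *a = va, *b = vb;

	if (!a->preferred != !b->preferred)
		return a->preferred ? -1 : 1;
	if (a->pack != b->pack)
		return a->pack < b->pack ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Sort 'objs' into pseudo-pack order and fill 'rev' so that rev[i] is the
 * MIDX position of the ith object in that order.
 */
static void toy_build_midx_revindex(struct toy_object *objs, size_t nr,
				    uint32_t *rev)
{
	size_t i;

	qsort(objs, nr, sizeof(*objs), toy_pseudo_pack_cmp);
	for (i = 0; i < nr; i++)
		rev[i] = objs[i].midx_pos;
}
```

For four objects at MIDX positions 0..3 stored at (pack 1, offset 40), (pack 0, offset 12), (pack 1, offset 12) and (pack 0, offset 100), with pack 1 preferred, the resulting table is {2, 0, 1, 3}.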
Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- Documentation/technical/pack-format.txt | 83 +++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 1faa949bf6..8d2f42f29e 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -379,3 +379,86 @@ CHUNK DATA: TRAILER: Index checksum of the above contents. + +== multi-pack-index reverse indexes + +Similar to the pack-based reverse index, the multi-pack index can also +be used to generate a reverse index. + +Instead of mapping between offset, pack-, and index position, this +reverse index maps between an object's position within the MIDX, and +that object's position within a pseudo-pack that the MIDX describes +(i.e., the ith entry of the multi-pack reverse index holds the MIDX +position of the ith object in pseudo-pack order). + +To clarify the difference between these orderings, consider a multi-pack +reachability bitmap (which does not yet exist, but is what we are +building towards here). Each bit needs to correspond to an object in the +MIDX, and so we need an efficient mapping from bit position to MIDX +position. + +One solution is to let bits occupy the same position in the oid-sorted +index stored by the MIDX. But because oids are effectively random, their +resulting reachability bitmaps would have no locality, and thus compress +poorly. (This is the reason that single-pack bitmaps use the pack +ordering, and not the .idx ordering, for the same purpose.) + +So we'd like to define an ordering for the whole MIDX based around +pack ordering, which has far better locality (and thus compresses more +efficiently). We can think of a pseudo-pack created by the concatenation +of all of the packs in the MIDX. 
E.g., if we had a MIDX with three packs +(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an +ordering of the objects like: + + |a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19| + +where the ordering of the packs is defined by the MIDX's pack list, +and then the ordering of objects within each pack is the same as the +order in the actual packfile. + +Given the list of packs and their counts of objects, you can +naïvely reconstruct that pseudo-pack ordering (e.g., the object at +position 27 must be (c,1) because packs "a" and "b" consumed 25 of the +slots). But there's a catch. Objects may be duplicated between packs, in +which case the MIDX only stores one pointer to the object (and thus we'd +want only one slot in the bitmap). + +Callers could handle duplicates themselves by reading objects in order +of their bit-position, but that's linear in the number of objects, and +much too expensive for ordinary bitmap lookups. Building a reverse index +solves this, since it is the logical inverse of the index, and that +index has already removed duplicates. But, building a reverse index on +the fly can be expensive. Since we already have an on-disk format for +pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack, +too. + +Objects from the MIDX are ordered as follows to string together the +pseudo-pack. Let `pack(o)` return the pack from which `o` was selected +by the MIDX, and define an ordering of packs based on their numeric ID +(as stored by the MIDX). Let `offset(o)` return the object offset of `o` +within `pack(o)`. Then, compare `o1` and `o2` as follows: + + - If one of `pack(o1)` and `pack(o2)` is preferred and the other + is not, then the preferred one sorts first. ++ +(This is a detail that allows the MIDX bitmap to determine which +pack should be used by the pack-reuse mechanism, since it can ask +the MIDX for the pack containing the object at bit position 0). 
+ + - If `pack(o1) ≠ pack(o2)`, then sort the two objects in ascending + order based on the pack ID. + + - Otherwise, `pack(o1) = pack(o2)`, and the objects are sorted in + pack-order (i.e., `o1` sorts ahead of `o2` exactly when `offset(o1) + < offset(o2)`). + +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of +objects in packs stored by the MIDX, laid out in pack order, and the +packs arranged in MIDX order (with the preferred pack coming first). + +Finally, note that the MIDX's reverse index is not stored as a chunk in +the multi-pack-index itself. This is done because the reverse index +includes the checksum of the pack or MIDX to which it belongs, which +makes it impossible to write in the MIDX. To avoid races when rewriting +the MIDX, a MIDX reverse index includes the MIDX's checksum in its +filename (e.g., `multi-pack-index-xyz.rev`). From patchwork Tue Mar 30 15:04:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172891 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2462BC433E3 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E28A619DC for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232298AbhC3PE4 (ORCPT ); Tue, 30 Mar 2021 11:04:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53110 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232249AbhC3PE3 (ORCPT ); Tue, 30 Mar 2021 11:04:29 -0400 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 676A0C061574 for ; Tue, 30 Mar 2021 08:04:29 -0700 (PDT) Received: by mail-qk1-x731.google.com with SMTP id y5so16130608qkl.9 for ; Tue, 30 Mar 2021 08:04:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=xklA/J4YMarRCg2zjTV1xfE1mmzJQEg1xUYjYS6whic=; b=lawUqpn78u3Bk5iuOG6T7eehL0ig7mhrHbPpJ1Rs1gw1FkOuzHPpKOTchJ94f6J3Gh lSOXtFfJf6MWqSmoJOTWPnbMkeztd9RR+5bDF40MDP+DgnuaMWf9cxVFzJU+pcufeECc kGuTv6nv3jE5OtpSgaX/4T1RKXhbflNItoTjHn3+dwJNFGD4dw5mbJ+MgVc5Ysxg4dlE Hha7QhxGrrMvsw2ZjjtX8BZNFo9pcgjadS6T40IWnBxlaeHOV22nXnRj+TdTKOxjn8XE L+ovPHh5keUWSeXLe2VCGu94MMP+uadO9xmu/WP5r5SEiR8IwFBPbPLrYL55Qz8pQPTH noRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=xklA/J4YMarRCg2zjTV1xfE1mmzJQEg1xUYjYS6whic=; b=QlV6USzrSGQ5JFdX3Z2ZNvtO5y8dqR7VNan3pEj/ScXG1yZ4k1yx+/he1IuTwojA2l CfFDgRhgPx8R0Y3nia3Agknd0qiAdipk+wnfV8OFZ5AP8ZsbJCf9JMlCT3j0C2bUyS3Y EgeenRtVgNZIN/VXF9DTME55aUEYLlhO6eib2DMyR87lYx9Gt81s4hmqko/drmVkUU02 39O2TmulyVdjWK84MRqy6/SRVBtxGAMx4r1x10g/AjU0WJwrkxNgsHnu5bKoebMwhNKj MMTdK6JXDFthMFi7mxXhctBuXNKOsJBSgyYcRlVJ5HkPYHlvVlJBi87D0Pp4XAlshDyE gshg== X-Gm-Message-State: AOAM530s4zChxbpKYtcplc6XO7nFxDSVC/aK42YLK7LFQ7i3ulQ94kIK 4Iu3q8YdQQ5HyTlk6dr2KzR0lOMoWgXTkg== X-Google-Smtp-Source: ABdhPJyKvvbEa0Nz6p+WoPN3vypZLulH+XmuPvnBI2NkboJvokW1KZY4PsjvJ46dbD7YjDwir5pCfw== X-Received: by 2002:a37:9dc1:: with SMTP id g184mr32255433qke.285.1617116668167; Tue, 30 Mar 2021 08:04:28 -0700 
(PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by smtp.gmail.com with ESMTPSA id g2sm13474453qtu.0.2021.03.30.08.04.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:27 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:26 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com Subject: [PATCH v4 13/16] pack-revindex: read multi-pack reverse indexes Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Implement reading for multi-pack reverse indexes, as described in the previous patch. Note that these functions don't yet have any callers, and won't until multi-pack reachability bitmaps are introduced in a later patch series. In the meantime, this patch implements some of the infrastructure necessary to support multi-pack bitmaps. There are three new functions exposed by the revindex API: - load_midx_revindex(): loads the reverse index corresponding to the given multi-pack index. - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the multi-pack index and pseudo-pack order. load_midx_revindex() and pack_pos_to_midx() are both relatively straightforward. load_midx_revindex() needs a few functions to be exposed from the midx API: one to get the checksum of a midx, and another to get the .rev's filename. Similar to recent changes in the packed_git struct, three new fields are added to the multi_pack_index struct: one to keep track of the size, one to keep track of the mmap'd pointer, and another to point past the header and at the reverse index's data. pack_pos_to_midx() simply reads the corresponding entry out of the table. midx_to_pack_pos() is the trickiest, since it needs to find an object's position in the pseudo-pack order, but that order can only be recovered from the .rev file itself. 
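To make the two directions concrete, here is a self-contained toy model (not the code in this patch). It represents the MIDX as plain arrays and the reverse index as an in-memory table in host byte order, and it uses a hand-written binary search rather than bsearch(). All names prefixed with toy_ are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_midx {
	const uint32_t *pack;   /* pack[i]: pack ID of the object at MIDX position i */
	const uint64_t *offset; /* offset[i]: its offset within that pack */
	const uint32_t *rev;    /* rev[p]: MIDX position of pseudo-pack position p */
	uint32_t nr;            /* number of objects */
	uint32_t preferred_pack;
};

/* Forward direction: a single table read. */
static uint32_t toy_pack_pos_to_midx(const struct toy_midx *m, uint32_t pos)
{
	assert(pos < m->nr);
	return m->rev[pos];
}

/* Compare the key's (pack, offset) with the tuple that rev[pos] points at. */
static int toy_cmp(const struct toy_midx *m, uint32_t key_pack,
		   uint64_t key_offset, uint32_t pos)
{
	uint32_t versus = toy_pack_pos_to_midx(m, pos);
	int key_pref = key_pack == m->preferred_pack;
	int versus_pref = m->pack[versus] == m->preferred_pack;

	if (key_pref != versus_pref)
		return key_pref ? -1 : 1;
	if (key_pack != m->pack[versus])
		return key_pack < m->pack[versus] ? -1 : 1;
	if (key_offset != m->offset[versus])
		return key_offset < m->offset[versus] ? -1 : 1;
	return 0;
}

/* Inverse direction: binary search; returns 0 on success, -1 if not found. */
static int toy_midx_to_pack_pos(const struct toy_midx *m, uint32_t at,
				uint32_t *pos)
{
	uint32_t lo = 0, hi = m->nr;

	assert(at < m->nr);
	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = toy_cmp(m, m->pack[at], m->offset[at], mid);

		if (!cmp) {
			*pos = mid;
			return 0;
		}
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

The forward lookup never inspects packs or offsets, while the inverse lookup has to translate every probed slot back into a (pack, offset) tuple before it can compare anything.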
This mapping can be implemented with a binary search, but note that the thing we're binary searching over isn't an array of values, but rather a permuted order of those values. So, when comparing two items, it's helpful to keep in mind the difference. Instead of a traditional binary search, where you are comparing two things directly, here we're comparing a (pack, offset) tuple with an index into the multi-pack index. That index describes another (pack, offset) tuple, and it is _those_ two tuples that are compared. Signed-off-by: Taylor Blau --- midx.c | 11 +++++ midx.h | 6 +++ pack-revindex.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++++ pack-revindex.h | 53 ++++++++++++++++++++ packfile.c | 3 ++ 5 files changed, 199 insertions(+) diff --git a/midx.c b/midx.c index 2173a9b45c..c04e794888 100644 --- a/midx.c +++ b/midx.c @@ -47,11 +47,22 @@ static uint8_t oid_version(void) } } +static const unsigned char *get_midx_checksum(struct multi_pack_index *m) +{ + return m->data + m->data_len - the_hash_algo->rawsz; +} + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } +char *get_midx_rev_filename(struct multi_pack_index *m) +{ + return xstrfmt("%s/pack/multi-pack-index-%s.rev", + m->object_dir, hash_to_hex(get_midx_checksum(m))); +} + static int midx_read_oid_fanout(const unsigned char *chunk_start, size_t chunk_size, void *data) { diff --git a/midx.h b/midx.h index 93bd68189e..0a8294d2ee 100644 --- a/midx.h +++ b/midx.h @@ -15,6 +15,10 @@ struct multi_pack_index { const unsigned char *data; size_t data_len; + const uint32_t *revindex_data; + const uint32_t *revindex_map; + size_t revindex_len; + uint32_t signature; unsigned char version; unsigned char hash_len; @@ -37,6 +41,8 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) +char *get_midx_rev_filename(struct multi_pack_index *m); + struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int 
prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); diff --git a/pack-revindex.c b/pack-revindex.c index 4262530449..0e4a31d9db 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -3,6 +3,7 @@ #include "object-store.h" #include "packfile.h" #include "config.h" +#include "midx.h" struct revindex_entry { off_t offset; @@ -293,6 +294,43 @@ int load_pack_revindex(struct packed_git *p) return -1; } +int load_midx_revindex(struct multi_pack_index *m) +{ + char *revindex_name; + int ret; + if (m->revindex_data) + return 0; + + revindex_name = get_midx_rev_filename(m); + + ret = load_revindex_from_disk(revindex_name, + m->num_objects, + &m->revindex_map, + &m->revindex_len); + if (ret) + goto cleanup; + + m->revindex_data = (const uint32_t *)((const char *)m->revindex_map + RIDX_HEADER_SIZE); + +cleanup: + free(revindex_name); + return ret; +} + +int close_midx_revindex(struct multi_pack_index *m) +{ + if (!m || !m->revindex_map) + return 0; + + munmap((void*)m->revindex_map, m->revindex_len); + + m->revindex_map = NULL; + m->revindex_data = NULL; + m->revindex_len = 0; + + return 0; +} + int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { unsigned lo, hi; @@ -347,3 +385,91 @@ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) else return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } + +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos) +{ + if (!m->revindex_data) + BUG("pack_pos_to_midx: reverse index not yet loaded"); + if (m->num_objects <= pos) + BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos); + return get_be32(m->revindex_data + pos); +} + +struct midx_pack_key { + uint32_t pack; + off_t offset; + + uint32_t preferred_pack; + struct multi_pack_index *midx; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) +{ + const struct midx_pack_key *key = 
va; + struct multi_pack_index *midx = key->midx; + + uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data); + uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus); + off_t versus_offset; + + uint32_t key_preferred = key->pack == key->preferred_pack; + uint32_t versus_preferred = versus_pack == key->preferred_pack; + + /* + * First, compare the preferred-ness, noting that the preferred pack + * comes first. + */ + if (key_preferred && !versus_preferred) + return -1; + else if (!key_preferred && versus_preferred) + return 1; + + /* Then, break ties first by comparing the pack IDs. */ + if (key->pack < versus_pack) + return -1; + else if (key->pack > versus_pack) + return 1; + + /* Finally, break ties by comparing offsets within a pack. */ + versus_offset = nth_midxed_offset(midx, versus); + if (key->offset < versus_offset) + return -1; + else if (key->offset > versus_offset) + return 1; + + return 0; +} + +int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos) +{ + struct midx_pack_key key; + uint32_t *found; + + if (!m->revindex_data) + BUG("midx_to_pack_pos: reverse index not yet loaded"); + if (m->num_objects <= at) + BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at); + + key.pack = nth_midxed_pack_int_id(m, at); + key.offset = nth_midxed_offset(m, at); + key.midx = m; + /* + * The preferred pack sorts first, so determine its identifier by + * looking at the first object in pseudo-pack order. + * + * Note that if no --preferred-pack is explicitly given when writing a + * multi-pack index, then whichever pack has the lowest identifier + * implicitly is preferred (and includes all its objects, since ties are + * broken first by pack identifier). 
+ */ + key.preferred_pack = nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0)); + + found = bsearch(&key, m->revindex_data, m->num_objects, + sizeof(*m->revindex_data), midx_pack_order_cmp); + + if (!found) + return error("bad offset for revindex"); + + *pos = found - m->revindex_data; + return 0; +} diff --git a/pack-revindex.h b/pack-revindex.h index ba7c82c125..479b8f2f9c 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -14,6 +14,20 @@ * * - offset: the byte offset within the .pack file at which the object contents * can be found + * + * The revindex can also be used with a multi-pack index (MIDX). In this + * setting: + * + * - index position refers to an object's numeric position within the MIDX + * + * - pack position refers to an object's position within a non-existent pack + * described by the MIDX. The pack structure is described in + * Documentation/technical/pack-format.txt. + * + * It is effectively a concatenation of all packs in the MIDX (ordered by + * their numeric ID within the MIDX, in their original order within each + * pack), removing duplicates, and placing the preferred pack (if any) + * first. */ @@ -24,6 +38,7 @@ #define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" struct packed_git; +struct multi_pack_index; /* * load_pack_revindex populates the revindex's internal data-structures for the @@ -34,6 +49,22 @@ struct packed_git; */ int load_pack_revindex(struct packed_git *p); +/* + * load_midx_revindex loads the '.rev' file corresponding to the given + * multi-pack index by mmap-ing it and assigning pointers in the + * multi_pack_index to point at it. + * + * A negative number is returned on error. + */ +int load_midx_revindex(struct multi_pack_index *m); + +/* + * Frees resources associated with a multi-pack reverse index. + * + * A negative number is returned on error. + */ +int close_midx_revindex(struct multi_pack_index *m); + /* * offset_to_pack_pos converts an object offset to a pack position. 
This * function returns zero on success, and a negative number otherwise. The @@ -71,4 +102,26 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos); */ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos); +/* + * pack_pos_to_midx converts the object at position "pos" within the MIDX + * pseudo-pack into a MIDX position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in constant time. + */ +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos); + +/* + * midx_to_pack_pos converts from the MIDX-relative position at "at" to the + * corresponding pack position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in time O(log N) with the number of objects in the MIDX. + */ +int midx_to_pack_pos(struct multi_pack_index *midx, uint32_t at, uint32_t *pos); + #endif diff --git a/packfile.c b/packfile.c index 6661f3325a..8668345d93 100644 --- a/packfile.c +++ b/packfile.c @@ -862,6 +862,9 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; + if (starts_with(file_name, "multi-pack-index") && + ends_with(file_name, ".rev")) + return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || From patchwork Tue Mar 30 15:04:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no 
version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DED3C433E5 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 20B17619D3 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232067AbhC3PE6 (ORCPT ); Tue, 30 Mar 2021 11:04:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232007AbhC3PEc (ORCPT ); Tue, 30 Mar 2021 11:04:32 -0400 Received: from mail-io1-xd2c.google.com (mail-io1-xd2c.google.com [IPv6:2607:f8b0:4864:20::d2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A43CC061574 for ; Tue, 30 Mar 2021 08:04:32 -0700 (PDT) Received: by mail-io1-xd2c.google.com with SMTP id e8so16765823iok.5 for ; Tue, 30 Mar 2021 08:04:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=iHZGzO8Jecmr9vlHZnZ2tnw49/4/HdyfZGEjBh6XtVQ=; b=wgmvkB37z7YSVqeK422WpciZrQSss+edb6C5EDpJ18evLx6VDaJaXGoxDtrdkiBoOw TMyxu4naN2VpCREoP/ixxlPD8vG/PMwhJ9nf71qqn/qFXM15D+RzDArR/Lc+NdNd/hHJ kTQK/IXzcKXb1o7Arx8FZvdCh0RtRUQyifRQmUf8I74U+p7Flb3rhAN9blFty0TK4trY jeoram3pMxhbdr1NETsiPOaGMJhsPH80KMLjmQxB0ztE7XDNX/ZoxBebGIfBGkno/1Fx UakrhULVrw1k5PrmUT2uqoO3KdZaAC6Rd5JlscYy9chgNC4AtiykWTBxV6sLQVALoSSh 7Dmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=iHZGzO8Jecmr9vlHZnZ2tnw49/4/HdyfZGEjBh6XtVQ=; b=i3r81ylsDBcmb9ZlDZ/czT2LsZJpIKQeMy0FQosevjNAzApVA+4MPXEHK/ORlDRYlo 
WTzlFwylzMUxcfd3IkJaETR/De4OB27pkPhKzCR1R3thYXWfPjjK74kJaG93wNGV8RHz kY5G5ihi2C+7DU9+SZC6hueYgmf3mVA0TPyn3orMjNlsNBGj2EpGlaQdDTzLOFw03X5S Xf9c8GkzKPAgSz4Q05iDraZ7fX0h8UirxxbGvM/GyIognarHOLfDAPZtkRoQ2d8fDW51 /7NLQIe1vgghKf6vEcm5qAkw9c8mwHAqELO9zqUc/mE+++I5LCQFIL7CGVycMT5ihCrn le4g== X-Gm-Message-State: AOAM530h5rQMPhVcaLzirdLRCp+sGRWSvi690BrCDhSLLiM2LDFM5yCH D43Hk/r+7N9dLPT1YrfoaJVf6Xz57x4UpQ== X-Google-Smtp-Source: ABdhPJxgXP5ZcpeOCnF54xNzjz5K/nHsGorRu/mt/Ions6SQHXXyYgfZm6RSzWX5BgreDjWsO6pPtw== X-Received: by 2002:a5e:cb4c:: with SMTP id h12mr24140210iok.183.1617116671227; Tue, 30 Mar 2021 08:04:31 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by smtp.gmail.com with ESMTPSA id u10sm11419022ill.30.2021.03.30.08.04.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:30 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:29 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com Subject: [PATCH v4 14/16] pack-write.c: extract 'write_rev_file_order' Message-ID: <9f40019eb3d3315cb03440e6237bced4feb6cf67.1617116623.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Existing callers provide the reverse index code with an array of 'struct pack_idx_entry *'s, which is then sorted by pack order (comparing the offsets of each object within the pack). Prepare for the multi-pack index to write a .rev file by providing a way to write the reverse index without an array of pack_idx_entry (which the MIDX code does not have). Instead, callers can invoke 'write_rev_file_order()', which takes an array of uint32_t's. The ith entry in this array gives the index position of the ith object in pack order.
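As an illustration of the array layout, here is a minimal standalone sketch that derives such an array from per-object offsets, mirroring the "initialize to identity, then sort by offset" approach visible in the diff below. The names build_pack_order(), struct order_entry and order_entry_cmp() are hypothetical and not part of Git; plain qsort() stands in for QSORT_S().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-object data the sort needs. */
struct order_entry {
	uint32_t index_pos; /* position in index order */
	uint64_t offset;    /* byte offset within the pack */
};

static int order_entry_cmp(const void *va, const void *vb)
{
	const struct order_entry *a = va, *b = vb;

	if (a->offset < b->offset)
		return -1;
	if (a->offset > b->offset)
		return 1;
	return 0;
}

/*
 * Fill pack_order so that pack_order[i] is the index position of the
 * object found at pack position i. offsets[] is given in index order.
 * Returns 0 on success, -1 on allocation failure.
 */
int build_pack_order(const uint64_t *offsets, uint32_t nr,
		     uint32_t *pack_order)
{
	struct order_entry *entries;
	uint32_t i;

	if (!nr)
		return 0;
	assert(offsets && pack_order);

	entries = malloc((size_t)nr * sizeof(*entries));
	if (!entries)
		return -1;

	for (i = 0; i < nr; i++) {
		entries[i].index_pos = i;
		entries[i].offset = offsets[i];
	}
	qsort(entries, nr, sizeof(*entries), order_entry_cmp);

	for (i = 0; i < nr; i++)
		pack_order[i] = entries[i].index_pos;

	free(entries);
	return 0;
}
```

With offsets { 300, 12, 150 } given in index order, the object at index position 1 comes first in the pack, so the resulting array starts with 1.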
Expose this new function for use in a later patch, and rewrite the existing write_rev_file() in terms of this new function. Signed-off-by: Taylor Blau --- pack-write.c | 36 +++++++++++++++++++++++++----------- pack.h | 1 + 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/pack-write.c b/pack-write.c index 2ca85a9d16..f1fc3ecafa 100644 --- a/pack-write.c +++ b/pack-write.c @@ -201,21 +201,12 @@ static void write_rev_header(struct hashfile *f) } static void write_rev_index_positions(struct hashfile *f, - struct pack_idx_entry **objects, + uint32_t *pack_order, uint32_t nr_objects) { - uint32_t *pack_order; uint32_t i; - - ALLOC_ARRAY(pack_order, nr_objects); - for (i = 0; i < nr_objects; i++) - pack_order[i] = i; - QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); - for (i = 0; i < nr_objects; i++) hashwrite_be32(f, pack_order[i]); - - free(pack_order); } static void write_rev_trailer(struct hashfile *f, const unsigned char *hash) @@ -228,6 +219,29 @@ const char *write_rev_file(const char *rev_name, uint32_t nr_objects, const unsigned char *hash, unsigned flags) +{ + uint32_t *pack_order; + uint32_t i; + const char *ret; + + ALLOC_ARRAY(pack_order, nr_objects); + for (i = 0; i < nr_objects; i++) + pack_order[i] = i; + QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); + + ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash, + flags); + + free(pack_order); + + return ret; +} + +const char *write_rev_file_order(const char *rev_name, + uint32_t *pack_order, + uint32_t nr_objects, + const unsigned char *hash, + unsigned flags) { struct hashfile *f; int fd; @@ -262,7 +276,7 @@ const char *write_rev_file(const char *rev_name, write_rev_header(f); - write_rev_index_positions(f, objects, nr_objects); + write_rev_index_positions(f, pack_order, nr_objects); write_rev_trailer(f, hash); if (rev_name && adjust_shared_perm(rev_name) < 0) diff --git a/pack.h b/pack.h index 857cbd5bd4..fa13954526 100644 --- a/pack.h +++ b/pack.h @@ -94,6 
+94,7 @@ struct ref; void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought); const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags); +const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags); /* * The "hdr" output buffer should be at least this big, which will handle sizes From patchwork Tue Mar 30 15:04:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172895 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50C36C433E4 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 34FB9619D0 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232321AbhC3PE7 (ORCPT ); Tue, 30 Mar 2021 11:04:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232257AbhC3PEf (ORCPT ); Tue, 30 Mar 2021 11:04:35 -0400 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D575C061574 for ; Tue, 30 Mar 2021 08:04:35 -0700 (PDT) Received: by mail-qk1-x731.google.com with SMTP id 
c4so16152448qkg.3 for ; Tue, 30 Mar 2021 08:04:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=2VZuxjpFKVQ5urOrPRWad94MAn0l3QPVsVA9sXDgayg=; b=CI9Fy4cmp0H26Bp6cM1HyTWE+SlVQPxywvOeru0WXgTl9cmkWuTR8nkpJeO+hLjPtr UkOKeCnF6HIbo8DzZ2FIkYe4LqfSOaXHBltlEcHu7YbuTbrVbZorzP7lWnMBdIgXrGUf fkxxpFHvLfR9SKf9g/a//zprmE+tS/n4pHcvFgLm+RAkTpxffoUyRBCvtkEAO4TIlEIj yXijZofaGThBZGZZU1QKpoN2+cDgfVsYzg2BgT8WyQs7FLuQctC/UauZ2iLlCrE7HPOR pBetmkpnTTQxqnYrJBsYKGZkUf9YMkWE00zPef7iQBCZtuHzgmTxVgaYMw9JQkTYD/AA vfDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=2VZuxjpFKVQ5urOrPRWad94MAn0l3QPVsVA9sXDgayg=; b=WW49K3NsLR0Bc/DRZb9njNg+AxZq+Cf66qhNIKRJTZdOMZbv/fqjXhZPEq/FgcIL6q KlUwWCJnm7OP0QCc9G60UpI3FlmwMpAoq3kBLyTyYtCciIGipvAxLUygBPXOiikpKZfe EjltjJ2qlGMhcjObA6MaEsLL+iXB6EuoMtP4yRAe83+Zw6Ai4/pqQyaNirtLDBjcpIPD fWp/Ot4UK1WzX3FqzZdS4iAXOlf4FK0s+qzBQIedfI4zURktLz/T42InBfWVLt/ftYSK FxzNqf2aysha8lUgZJoYlwpfbhETA0SrGk9fHxD/g1s7y5lhNZKgyBnjv+7xseUbXQDv ZB0Q== X-Gm-Message-State: AOAM533WQMgS/IP+exQR7g4m8uTj4ncmQXISPEGopWpc/+7fXscD9QGY k7OWAWtQiDfKBXGMl41059hrGEdR4QFBNQ== X-Google-Smtp-Source: ABdhPJzybHNZQAbpi4TyFljMqaAummq7Upa+X51AlMDFrN/8aroSDATSq6f1uhsbcTHJZXSfCd6DAg== X-Received: by 2002:a37:a390:: with SMTP id m138mr29487828qke.59.1617116674388; Tue, 30 Mar 2021 08:04:34 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by smtp.gmail.com with ESMTPSA id d70sm15893460qkg.30.2021.03.30.08.04.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:33 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:32 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, 
jonathantanmy@google.com Subject: [PATCH v4 15/16] pack-revindex: write multi-pack reverse indexes Message-ID: <47409cc50835fb03d535d3817f01d62a68512144.1617116623.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Implement the writing half of multi-pack reverse indexes. This is nothing more than the format described a few patches ago, with a new set of helper functions that will be used to clear out stale .rev files corresponding to old MIDXs. Unfortunately, a comparison function very similar to the one implemented recently in pack-revindex.c is reimplemented here, this time accepting a MIDX-internal type. An effort to DRY these up would create more indirection and overhead than is necessary, so it isn't pursued here. Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX flag, meaning that this is all dead code. But, that won't be the case for long, since subsequent patches will introduce the multi-pack bitmap, which will begin passing this flag. (In midx.c:write_midx_internal(), the two adjacent if statements share a conditional, but are written separately since the first one will eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet exist.)
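For readers skimming the diff, the stale-file check boils down to a filename test. The following is a rough standalone sketch of that test only; is_stale_midx_file(), has_prefix() and has_suffix() are hypothetical names that stand in for Git's starts_with() and ends_with(), and nothing here touches the filesystem.

```c
#include <assert.h>
#include <string.h>

static int has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

static int has_suffix(const char *s, const char *suffix)
{
	size_t len = strlen(s), suffix_len = strlen(suffix);

	return len >= suffix_len && !strcmp(s + len - suffix_len, suffix);
}

/*
 * Return 1 if file_name looks like a MIDX auxiliary file with the given
 * extension and is not the file named by keep (which may be NULL,
 * meaning nothing is kept). Return 0 otherwise.
 */
int is_stale_midx_file(const char *file_name, const char *ext,
		       const char *keep)
{
	assert(file_name && ext);

	if (!has_prefix(file_name, "multi-pack-index-") ||
	    !has_suffix(file_name, ext))
		return 0;
	if (keep && !strcmp(keep, file_name))
		return 0;
	return 1;
}
```

Passing NULL for keep corresponds to removing every matching file, which is what happens when the MIDX itself is being cleared.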
Signed-off-by: Taylor Blau --- midx.c | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ midx.h | 1 + 2 files changed, 116 insertions(+) diff --git a/midx.c b/midx.c index c04e794888..b96eaa12fb 100644 --- a/midx.c +++ b/midx.c @@ -12,6 +12,7 @@ #include "run-command.h" #include "repository.h" #include "chunk-format.h" +#include "pack.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -462,6 +463,7 @@ struct write_midx_context { uint32_t entries_nr; uint32_t *pack_perm; + uint32_t *pack_order; unsigned large_offsets_needed:1; uint32_t num_large_offsets; @@ -816,6 +818,70 @@ static int write_midx_large_offsets(struct hashfile *f, return 0; } +static int midx_pack_order_cmp(const void *va, const void *vb, void *_ctx) +{ + struct write_midx_context *ctx = _ctx; + + struct pack_midx_entry *a = &ctx->entries[*(const uint32_t *)va]; + struct pack_midx_entry *b = &ctx->entries[*(const uint32_t *)vb]; + + uint32_t perm_a = ctx->pack_perm[a->pack_int_id]; + uint32_t perm_b = ctx->pack_perm[b->pack_int_id]; + + /* Sort objects in the preferred pack ahead of any others. */ + if (a->preferred > b->preferred) + return -1; + if (a->preferred < b->preferred) + return 1; + + /* Then, order objects by which packs they appear in. */ + if (perm_a < perm_b) + return -1; + if (perm_a > perm_b) + return 1; + + /* Then, disambiguate by their offset within each pack. 
*/ + if (a->offset < b->offset) + return -1; + if (a->offset > b->offset) + return 1; + + return 0; +} + +static uint32_t *midx_pack_order(struct write_midx_context *ctx) +{ + uint32_t *pack_order; + uint32_t i; + + ALLOC_ARRAY(pack_order, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) + pack_order[i] = i; + QSORT_S(pack_order, ctx->entries_nr, midx_pack_order_cmp, ctx); + + return pack_order; +} + +static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, + struct write_midx_context *ctx) +{ + struct strbuf buf = STRBUF_INIT; + const char *tmp_file; + + strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); + + tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, + midx_hash, WRITE_REV); + + if (finalize_object_file(tmp_file, buf.buf)) + die(_("cannot store reverse index file")); + + strbuf_release(&buf); +} + +static void clear_midx_files_ext(struct repository *r, const char *ext, + unsigned char *keep_hash); + static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, struct string_list *packs_to_drop, const char *preferred_pack_name, @@ -1001,6 +1067,14 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); + + if (flags & MIDX_WRITE_REV_INDEX) + ctx.pack_order = midx_pack_order(&ctx); + + if (flags & MIDX_WRITE_REV_INDEX) + write_midx_reverse_index(midx_name, midx_hash, &ctx); + clear_midx_files_ext(the_repository, ".rev", midx_hash); + commit_lock_file(&lk); cleanup: @@ -1015,6 +1089,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * free(ctx.info); free(ctx.entries); free(ctx.pack_perm); + free(ctx.pack_order); free(midx_name); return result; } @@ -1027,6 +1102,44 @@ int write_midx_file(const char *object_dir, flags); } +struct clear_midx_data { + char *keep; + const char *ext; +}; + +static void clear_midx_file_ext(const 
char *full_path, size_t full_path_len, + const char *file_name, void *_data) +{ + struct clear_midx_data *data = _data; + + if (!(starts_with(file_name, "multi-pack-index-") && + ends_with(file_name, data->ext))) + return; + if (data->keep && !strcmp(data->keep, file_name)) + return; + + if (unlink(full_path)) + die_errno(_("failed to remove %s"), full_path); +} + +static void clear_midx_files_ext(struct repository *r, const char *ext, + unsigned char *keep_hash) +{ + struct clear_midx_data data; + memset(&data, 0, sizeof(struct clear_midx_data)); + + if (keep_hash) + data.keep = xstrfmt("multi-pack-index-%s%s", + hash_to_hex(keep_hash), ext); + data.ext = ext; + + for_each_file_in_pack_dir(r->objects->odb->path, + clear_midx_file_ext, + &data); + + free(data.keep); +} + void clear_midx_file(struct repository *r) { char *midx = get_midx_filename(r->objects->odb->path); @@ -1039,6 +1152,8 @@ void clear_midx_file(struct repository *r) if (remove_path(midx)) die(_("failed to clear multi-pack-index at %s"), midx); + clear_midx_files_ext(r, ".rev", NULL); + free(midx); } diff --git a/midx.h b/midx.h index 0a8294d2ee..8684cf0fef 100644 --- a/midx.h +++ b/midx.h @@ -40,6 +40,7 @@ struct multi_pack_index { }; #define MIDX_PROGRESS (1 << 0) +#define MIDX_WRITE_REV_INDEX (1 << 1) char *get_midx_rev_filename(struct multi_pack_index *m); From patchwork Tue Mar 30 15:04:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12172893 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 62F33C433E6 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 48A4B619D9 for ; Tue, 30 Mar 2021 15:05:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232327AbhC3PFB (ORCPT ); Tue, 30 Mar 2021 11:05:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231782AbhC3PEi (ORCPT ); Tue, 30 Mar 2021 11:04:38 -0400 Received: from mail-il1-x134.google.com (mail-il1-x134.google.com [IPv6:2607:f8b0:4864:20::134]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58090C061574 for ; Tue, 30 Mar 2021 08:04:38 -0700 (PDT) Received: by mail-il1-x134.google.com with SMTP id h7so7290529ilj.8 for ; Tue, 30 Mar 2021 08:04:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=3nVkSUnCuRKdX/20i8p1f+C/avnQBDdAKi+Iv4n5nPE=; b=uMg5e8aCTkmRjo/9WvJbMYgFRUmm1nXJivCweYpTdFpeszndwXkAMQMWOO+yKKmTr/ evvDxIzNYxH9PyF8Iye3HkVR7xrwyUz40M7/B5ylqkjsIwcIwhR66HH+2z6Am8lQRr+3 TPgf40nz0T8aVJ0HoAXyF/1fQw/9z0RBNf4FslcPePnqvf9AxO2/QvTdiYicrIvpoBSj qTpTzknTX/Ha/7+5X8YVuEM4E4LKX6p5KVKiboegzZAtxv5d3uHinIddz9Fr5Q4OReHo MQf9jIZRKuPOjoreUc/PmCYX8MjV4LrP6q4pTpPm9IdgSSonvrhxeJFGkC5/jgXzPHYU kp2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=3nVkSUnCuRKdX/20i8p1f+C/avnQBDdAKi+Iv4n5nPE=; b=XFBYRtKhwodpkD2qmX8Xqu6B5mDC0WkXGAmPQdfQ8SAU1MEAN+7PW/PcftcO9aUue7 /YBNx7JLBojUp87MKgBwjImprD+NFaOFA0/qgYVssWoOAH9wKujuk4r4ZpBhFRs99qSn nD0j+HhphN/ZpkbqLcPAk9nXgUxkFKtvfztOhvG8d87yD38VPWE4PdZnguW1rbMZLPJw 
oKmYm90SVEQD2XVupP9QDORwR4JtJZk8phnB8kbylKCHlgkMtYC4nLeEzqJUtj+onNpd zWMe274NLv3FVinGZoQ7pctY0tuhH/u/iLd3A8j9TaKWwH6BfOwyme73XHdadbmeM5Z+ VWfw== X-Gm-Message-State: AOAM531e50uDBJt7XLg/rqxhkyVt6PXcQV8DV29bA6W1atCkwBurCLv7 5jwPXz46k32imypQH+s5q5Qh0zPJMRYsYA== X-Google-Smtp-Source: ABdhPJwQV6eR+5JAw1MZpwCox/qup65jKSUY4oyVlMRNLB0/59A52cdu/MaZb7rEt661MmxyYJTp4w== X-Received: by 2002:a05:6e02:1a8d:: with SMTP id k13mr14029580ilv.3.1617116677566; Tue, 30 Mar 2021 08:04:37 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:7b00:4f79:8763:6261]) by smtp.gmail.com with ESMTPSA id x4sm8595121ilm.39.2021.03.30.08.04.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Mar 2021 08:04:37 -0700 (PDT) Date: Tue, 30 Mar 2021 11:04:36 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: gitster@pobox.com, peff@peff.net, dstolee@microsoft.com, jonathantanmy@google.com Subject: [PATCH v4 16/16] midx.c: improve cache locality in midx_pack_order_cmp() Message-ID: <7b793e7d0918bc5869f9c2aef818fdd7353ea77b.1617116623.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King There is a lot of pointer dereferencing in the pre-image version of 'midx_pack_order_cmp()', which this patch gets rid of. Instead of comparing the pack preferred-ness and then the pack id, both of these checks are done at the same time by setting the high-order bit of the pack id when the pack is not preferred, so that objects from the preferred pack sort first. Then the pack id and offset are compared as usual. This produces the same result so long as there are fewer than 2^31 packs, which seems like a safe assumption to make in practice.
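To make the bit trick concrete, here is a small standalone sketch of the key folding and the resulting two-level comparison. fold_pack_key() and cmp_pack_key() are hypothetical helpers written for this note, not functions from midx.c; the assertion encodes the "fewer than 2^31 packs" assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold "not preferred" into the top bit of the pack id, so that every
 * object in the preferred pack gets a smaller key than any other object.
 */
uint32_t fold_pack_key(uint32_t pack_id, int preferred)
{
	/* The top bit must be free for the trick to be lossless. */
	assert(pack_id < (UINT32_C(1) << 31));
	return preferred ? pack_id : (pack_id | (UINT32_C(1) << 31));
}

/* Compare by folded pack key first, then by offset within the pack. */
int cmp_pack_key(uint32_t pack_a, uint64_t off_a,
		 uint32_t pack_b, uint64_t off_b)
{
	if (pack_a != pack_b)
		return pack_a < pack_b ? -1 : 1;
	if (off_a != off_b)
		return off_a < off_b ? -1 : 1;
	return 0;
}
```

An object in preferred pack 5 therefore sorts ahead of an object in non-preferred pack 0, because 5 is smaller than 0 with the top bit set.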
Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- midx.c | 55 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 26 deletions(-) diff --git a/midx.c b/midx.c index b96eaa12fb..9e86583172 100644 --- a/midx.c +++ b/midx.c @@ -818,46 +818,49 @@ static int write_midx_large_offsets(struct hashfile *f, return 0; } -static int midx_pack_order_cmp(const void *va, const void *vb, void *_ctx) +struct midx_pack_order_data { + uint32_t nr; + uint32_t pack; + off_t offset; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) { - struct write_midx_context *ctx = _ctx; - - struct pack_midx_entry *a = &ctx->entries[*(const uint32_t *)va]; - struct pack_midx_entry *b = &ctx->entries[*(const uint32_t *)vb]; - - uint32_t perm_a = ctx->pack_perm[a->pack_int_id]; - uint32_t perm_b = ctx->pack_perm[b->pack_int_id]; - - /* Sort objects in the preferred pack ahead of any others. */ - if (a->preferred > b->preferred) + const struct midx_pack_order_data *a = va, *b = vb; + if (a->pack < b->pack) return -1; - if (a->preferred < b->preferred) + else if (a->pack > b->pack) return 1; - - /* Then, order objects by which packs they appear in. */ - if (perm_a < perm_b) + else if (a->offset < b->offset) return -1; - if (perm_a > perm_b) + else if (a->offset > b->offset) return 1; - - /* Then, disambiguate by their offset within each pack. 
*/ - if (a->offset < b->offset) - return -1; - if (a->offset > b->offset) - return 1; - - return 0; + else + return 0; } static uint32_t *midx_pack_order(struct write_midx_context *ctx) { + struct midx_pack_order_data *data; uint32_t *pack_order; uint32_t i; + ALLOC_ARRAY(data, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *e = &ctx->entries[i]; + data[i].nr = i; + data[i].pack = ctx->pack_perm[e->pack_int_id]; + if (!e->preferred) + data[i].pack |= (1U << 31); + data[i].offset = e->offset; + } + + QSORT(data, ctx->entries_nr, midx_pack_order_cmp); + ALLOC_ARRAY(pack_order, ctx->entries_nr); for (i = 0; i < ctx->entries_nr; i++) - pack_order[i] = i; - QSORT_S(pack_order, ctx->entries_nr, midx_pack_order_cmp, ctx); + pack_order[i] = data[i].nr; + free(data); return pack_order; }