From patchwork Thu Mar 11 17:04:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132159
Date: Thu, 11 Mar 2021 12:04:36 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 01/16] builtin/multi-pack-index.c: inline 'flags' with options
Message-ID: <43fc0ad276406ff77283613c45188e102a6dc515.1615482270.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.)
will want to optionally change a set of shared flags that are
eventually passed to the MIDX libraries.

Right now, options and flags are handled separately. Inline them into
the same structure so that sub-commands can more easily share the
'flags' data.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 5bf88cd2a8..4a0ddb06c4 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -14,13 +14,12 @@ static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
 	int progress;
+	unsigned flags;
 } opts;
 
 int cmd_multi_pack_index(int argc, const char **argv,
 			 const char *prefix)
 {
-	unsigned flags = 0;
-
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
@@ -40,7 +39,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
 	if (opts.progress)
-		flags |= MIDX_PROGRESS;
+		opts.flags |= MIDX_PROGRESS;
 
 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,
@@ -55,16 +54,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 
 	if (!strcmp(argv[0], "repack"))
 		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, flags);
+			(size_t)opts.batch_size, opts.flags);
 	if (opts.batch_size)
 		die(_("--batch-size option is only for 'repack' subcommand"));
 
 	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, flags);
+		return write_midx_file(opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, flags);
+		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
 	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, flags);
+		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
 
 	die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Thu Mar 11 17:04:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132155
Date: Thu, 11 Mar 2021 12:04:40 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 02/16] builtin/multi-pack-index.c: don't handle 'progress' separately
Message-ID: <181f11e4c55b364dc7f6a6530f397779171671a9.1615482270.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

Now that there is a shared 'flags' member in the
options structure, there is no need to keep track of whether to force
progress or not, since ultimately the decision of whether or not to
show a progress meter is controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 4a0ddb06c4..c70f020d8f 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -13,7 +13,6 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	const char *object_dir;
 	unsigned long batch_size;
-	int progress;
 	unsigned flags;
 } opts;
 
@@ -23,7 +22,7 @@ int cmd_multi_pack_index(int argc, const char **argv,
 	static struct option builtin_multi_pack_index_options[] = {
 		OPT_FILENAME(0, "object-dir", &opts.object_dir,
 		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
+		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
@@ -31,15 +30,14 @@ int cmd_multi_pack_index(int argc, const char **argv,
 
 	git_config(git_default_config, NULL);
 
-	opts.progress = isatty(2);
+	if (isatty(2))
+		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
 			     builtin_multi_pack_index_usage, 0);
 
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
-	if (opts.progress)
-		opts.flags |= MIDX_PROGRESS;
 
 	if (argc == 0)
 		usage_with_options(builtin_multi_pack_index_usage,

From patchwork Thu Mar 11 17:04:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132161
Date: Thu, 11 Mar 2021 12:04:45 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 03/16] builtin/multi-pack-index.c: define common usage with a macro
Message-ID: <94c498f0e25c8dde093b9a1ad8044c4ef37a6c5d.1615482270.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.
A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index c70f020d8f..eea498e026 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -5,8 +5,23 @@
 #include "midx.h"
 #include "trace2.h"
 
+#define BUILTIN_MIDX_WRITE_USAGE \
+	N_("git multi-pack-index [<options>] write")
+
+#define BUILTIN_MIDX_VERIFY_USAGE \
+	N_("git multi-pack-index [<options>] verify")
+
+#define BUILTIN_MIDX_EXPIRE_USAGE \
+	N_("git multi-pack-index [<options>] expire")
+
+#define BUILTIN_MIDX_REPACK_USAGE \
+	N_("git multi-pack-index [<options>] repack [--batch-size=<size>]")
+
 static char const * const builtin_multi_pack_index_usage[] = {
-	N_("git multi-pack-index [<options>] (write|verify|expire|repack --batch-size=<size>)"),
+	BUILTIN_MIDX_WRITE_USAGE,
+	BUILTIN_MIDX_VERIFY_USAGE,
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	BUILTIN_MIDX_REPACK_USAGE,
 	NULL
 };
 

From patchwork Thu Mar 11 17:04:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132163
Date: Thu, 11 Mar 2021 12:04:49 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 04/16] builtin/multi-pack-index.c: split sub-commands
X-Mailing-List: git@vger.kernel.org

Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing
cmd_multi_pack_index() to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).

Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 131 ++++++++++++++++++++++++++++++-------
 1 file changed, 106 insertions(+), 25 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index eea498e026..23e51dfeb4 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -17,6 +17,22 @@
 #define BUILTIN_MIDX_REPACK_USAGE \
 	N_("git multi-pack-index [<options>] repack [--batch-size=<size>]")
 
+static char const * const builtin_multi_pack_index_write_usage[] = {
+	BUILTIN_MIDX_WRITE_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_verify_usage[] = {
+	BUILTIN_MIDX_VERIFY_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_expire_usage[] = {
+	BUILTIN_MIDX_EXPIRE_USAGE,
+	NULL
+};
+static char const * const builtin_multi_pack_index_repack_usage[] = {
+	BUILTIN_MIDX_REPACK_USAGE,
+	NULL
+};
 static char const * const builtin_multi_pack_index_usage[] = {
 	BUILTIN_MIDX_WRITE_USAGE,
 	BUILTIN_MIDX_VERIFY_USAGE,
@@ -31,25 +47,99 @@ static struct opts_multi_pack_index {
 	unsigned flags;
 } opts;
 
-int cmd_multi_pack_index(int argc, const char **argv,
-			 const char *prefix)
+static struct option common_opts[] = {
+	OPT_FILENAME(0, "object-dir", &opts.object_dir,
+	  N_("object directory containing set of packfile and pack-index pairs")),
+	OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	OPT_END(),
+};
+
+static struct option *add_common_options(struct option *prev)
 {
-	static struct option builtin_multi_pack_index_options[] = {
-		OPT_FILENAME(0, "object-dir", &opts.object_dir,
-		  N_("object directory containing set of packfile and pack-index pairs")),
-		OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	struct option *with_common = parse_options_concat(common_opts, prev);
+	free(prev);
+	return with_common;
+}
+
+static int cmd_multi_pack_index_write(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_write_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_write_usage,
+				   options);
+
+	return write_midx_file(opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_verify(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_verify_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_verify_usage,
+				   options);
+
+	return verify_midx_file(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_expire(int argc, const char **argv)
+{
+	struct option *options = common_opts;
+
+	argc = parse_options(argc, argv, NULL,
+			     options, builtin_multi_pack_index_expire_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_expire_usage,
+				   options);
+
+	return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
+}
+
+static int cmd_multi_pack_index_repack(int argc, const char **argv)
+{
+	struct option *options;
+	static struct option builtin_multi_pack_index_repack_options[] = {
 		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
 	};
 
+	options = parse_options_dup(builtin_multi_pack_index_repack_options);
+	options = add_common_options(options);
+
+	argc = parse_options(argc, argv, NULL,
+			     options,
+			     builtin_multi_pack_index_repack_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+	if (argc)
+		usage_with_options(builtin_multi_pack_index_repack_usage,
+				   options);
+
+	return midx_repack(the_repository, opts.object_dir,
+			   (size_t)opts.batch_size, opts.flags);
+}
+
+int cmd_multi_pack_index(int argc, const char **argv,
+			 const char *prefix)
+{
+	struct option *builtin_multi_pack_index_options = common_opts;
+
 	git_config(git_default_config, NULL);
 
 	if (isatty(2))
 		opts.flags |= MIDX_PROGRESS;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
-			     builtin_multi_pack_index_usage, 0);
+			     builtin_multi_pack_index_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
 
 	if (!opts.object_dir)
 		opts.object_dir = get_object_directory();
@@ -58,25 +148,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);
 
-	if (argc > 1) {
-		die(_("too many arguments"));
-		return 1;
-	}
-
 	trace2_cmd_mode(argv[0]);
 
 	if (!strcmp(argv[0], "repack"))
-		return midx_repack(the_repository, opts.object_dir,
-			(size_t)opts.batch_size, opts.flags);
-	if (opts.batch_size)
-		die(_("--batch-size option is only for 'repack' subcommand"));
-
-	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir, opts.flags);
-	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
-
-	die(_("unrecognized subcommand: %s"), argv[0]);
+		return cmd_multi_pack_index_repack(argc, argv);
+	else if (!strcmp(argv[0], "write"))
+		return cmd_multi_pack_index_write(argc, argv);
+	else if (!strcmp(argv[0], "verify"))
+		return cmd_multi_pack_index_verify(argc, argv);
+	else if (!strcmp(argv[0], "expire"))
+		return cmd_multi_pack_index_expire(argc, argv);
+	else
+		die(_("unrecognized subcommand: %s"), argv[0]);
 }

From patchwork Thu Mar 11 17:04:53 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132171
Date: Thu, 11 Mar 2021 12:04:53 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 05/16] builtin/multi-pack-index.c: don't enter bogus cmd_mode
X-Mailing-List: git@vger.kernel.org

Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.

Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.

Signed-off-by: Ævar Arnfjörð Bjarmason
Signed-off-by: Taylor Blau
---
 builtin/multi-pack-index.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 23e51dfeb4..b5678cc2bb 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -65,6 +65,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_write_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -79,6 +81,8 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_verify_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -93,6 +97,8 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv)
 {
 	struct option *options = common_opts;
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options, builtin_multi_pack_index_expire_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
@@ -115,6 +121,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv)
 	options = parse_options_dup(builtin_multi_pack_index_repack_options);
 	options = add_common_options(options);
 
+	trace2_cmd_mode(argv[0]);
+
 	argc = parse_options(argc, argv, NULL,
 			     options,
 			     builtin_multi_pack_index_repack_usage,
@@ -148,8 +156,6 @@ int cmd_multi_pack_index(int argc, const char **argv,
 		usage_with_options(builtin_multi_pack_index_usage,
 				   builtin_multi_pack_index_options);
 
-	trace2_cmd_mode(argv[0]);
-
 	if (!strcmp(argv[0], "repack"))
 		return cmd_multi_pack_index_repack(argc, argv);
 	else if (!strcmp(argv[0], "write"))

From patchwork Thu Mar 11 17:04:57 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12132169
Date: Thu, 11 Mar 2021 12:04:57 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net
Subject: [PATCH v3 06/16] builtin/multi-pack-index.c: display usage on unrecognized command
X-Mailing-List: git@vger.kernel.org

When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:

    $ git multi-pack-index bogus
    fatal: unrecognized subcommand: bogus

Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:

    $ git.compile multi-pack-index bogus
    usage: git multi-pack-index [<options>] write
       or: git multi-pack-index [<options>] verify
       or: git multi-pack-index [<options>] expire
       or: git
multi-pack-index [] repack [--batch-size=] --object-dir object directory containing set of packfile and pack-index pairs --progress force progress reporting While we're at it, clean up some duplication between the "no sub-command" and "unrecognized sub-command" conditionals. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Taylor Blau --- builtin/multi-pack-index.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index b5678cc2bb..243b6ccc7c 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -153,8 +153,7 @@ int cmd_multi_pack_index(int argc, const char **argv, opts.object_dir = get_object_directory(); if (argc == 0) - usage_with_options(builtin_multi_pack_index_usage, - builtin_multi_pack_index_options); + goto usage; if (!strcmp(argv[0], "repack")) return cmd_multi_pack_index_repack(argc, argv); @@ -165,5 +164,7 @@ int cmd_multi_pack_index(int argc, const char **argv, else if (!strcmp(argv[0], "expire")) return cmd_multi_pack_index_expire(argc, argv); else - die(_("unrecognized subcommand: %s"), argv[0]); +usage: + usage_with_options(builtin_multi_pack_index_usage, + builtin_multi_pack_index_options); } From patchwork Thu Mar 11 17:05:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132167 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EA98C43603 for ; Thu, 11 Mar 2021 17:05:54 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 267F464FE9 for ; Thu, 11 Mar 2021 17:05:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229978AbhCKRF1 (ORCPT ); Thu, 11 Mar 2021 12:05:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229673AbhCKRFH (ORCPT ); Thu, 11 Mar 2021 12:05:07 -0500 Received: from mail-qt1-x82d.google.com (mail-qt1-x82d.google.com [IPv6:2607:f8b0:4864:20::82d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 12B7CC061574 for ; Thu, 11 Mar 2021 09:05:07 -0800 (PST) Received: by mail-qt1-x82d.google.com with SMTP id l13so1661904qtu.9 for ; Thu, 11 Mar 2021 09:05:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=fK8WEtc2GFHptLJs2Vl6B9J5BbpyrxE1ckawIMUyd9Y=; b=Dkl8ezAWk/lfIMP7SrSigLIGx7AJaPOZnTwFqFvYxX+j05HfHoy+uhtJXF20iOQcd3 UIajWd+E/VVA+J8Az8bJLdk7GKTmAAgdPihu6E1a9j5oJByDv2VFvU0/ySH6kEEJr1xz kEumDJl+7oNIN1Q7ybp9ny/TmL5att9Ls7ouB9FqC+UQkqtRgDElZLgWyF5ZIGI/kXxG WusulMd9+kB7ccfdlJ9wy9itKLj5GEEFSUM/2WS8UEvmkQ4SRnroXFZeSGtPRJKdiCuC ftvf8dgYdvfjrIj7Er03uk7a8OGRujMi3KqUPNE/58nNoZfztnK4kEi8iTwnVKdSHBNW 1hlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=fK8WEtc2GFHptLJs2Vl6B9J5BbpyrxE1ckawIMUyd9Y=; b=rVrodFlSoIOo9V9eLLH7BfmorZt1MeXbMYeGQefNXX8RA+XjHs3j8n0e6/3+CPn0hl 7M+yZUoG3aodsf+UwuxOiPaSVNeMU4/st6KE6MleaQr2SNEM12fA3F14pq+ln6wztYcW wUrnJVH7fHTrgFXvy7TXBPGEkJkCGQluHegBi9lAsP/LlA0y7Gwb6gS0eL3zGgk8rkAc T/eTztILJqrlCTaoDXwpGdWja66NBKKtfMrbAjGidoH41CT42Dp6tzAQWWVw3gMVhtEO //GrbYUAPDv47xkZH1eEZtoslJm9eNJY0RTSpqT/26lOmRXIyqKcq71CZfMe7nKgmdHt 
2I/Q== X-Gm-Message-State: AOAM532dLlVkI0z/Esvi/QyIuk29a/CO0UtcYefwqK7Qyc4Vyz7wKGuK G9s8LUW9650iDn9EuDZAyUrRaLERyzGwlwIK X-Google-Smtp-Source: ABdhPJzH0RIwKD44rTo3fuCGb98F3wrqbwmkEeDRrzdbX/qLTILcowIHNGg8sD68av5cNVe7uwvL2Q== X-Received: by 2002:ac8:6e9c:: with SMTP id c28mr8176175qtv.117.1615482305913; Thu, 11 Mar 2021 09:05:05 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id p8sm2185940qtu.8.2021.03.11.09.05.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:05 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:03 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 07/16] t/helper/test-read-midx.c: add '--show-objects' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The 'read-midx' helper is used in places like t5319 to display basic information about a multi-pack-index. In the next patch, the MIDX writing machinery will learn a new way to choose from which pack an object is selected when multiple copies of that object exist. To disambiguate which pack introduces an object so that this feature can be tested, add a '--show-objects' option which displays additional information about each object in the MIDX. 
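As a rough sketch of what the option involves (not the helper's actual code), the following standalone C shows one way to accept an optional leading '--show-objects' and to format one output line per object as "oid offset<TAB>pack name". Both function names are hypothetical. Unlike a bare strcmp() on argv[1], this version rejects '--show-objects' when no object directory follows it; that stricter check is an assumption of the sketch, not something the patch does.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse "read-midx [--show-objects] <object-dir>" (hypothetical helper).
 * Returns 0 on success and fills in the outputs, or -1 on a usage error.
 */
static int parse_read_midx_args(int argc, const char **argv,
				const char **object_dir, int *show_objects)
{
	assert(argv && object_dir && show_objects);

	if (argc == 3 && !strcmp(argv[1], "--show-objects")) {
		*show_objects = 1;
		*object_dir = argv[2];
		return 0;
	}
	if (argc == 2 && strcmp(argv[1], "--show-objects")) {
		*show_objects = 0;
		*object_dir = argv[1];
		return 0;
	}
	return -1;
}

/*
 * Format one object line the way the new option prints it:
 * hex object name, a space, the offset, a tab, then the pack name.
 */
static int format_object_line(char *buf, size_t len, const char *hex,
			      uint64_t offset, const char *pack_name)
{
	assert(buf && len);
	return snprintf(buf, len, "%s %"PRIu64"\t%s\n", hex, offset, pack_name);
}
```

Keeping the object name and offset at the start of the line lets a test match them with a simple anchored grep.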
Signed-off-by: Taylor Blau --- t/helper/test-read-midx.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 2430880f78..7c2eb11a8e 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -4,7 +4,7 @@ #include "repository.h" #include "object-store.h" -static int read_midx_file(const char *object_dir) +static int read_midx_file(const char *object_dir, int show_objects) { uint32_t i; struct multi_pack_index *m; @@ -43,13 +43,29 @@ static int read_midx_file(const char *object_dir) printf("object-dir: %s\n", m->object_dir); + if (show_objects) { + struct object_id oid; + struct pack_entry e; + + for (i = 0; i < m->num_objects; i++) { + nth_midxed_object_oid(&oid, m, i); + fill_midx_entry(the_repository, &oid, &e, m); + + printf("%s %"PRIu64"\t%s\n", + oid_to_hex(&oid), e.offset, e.p->pack_name); + } + return 0; + } + return 0; } int cmd__read_midx(int argc, const char **argv) { - if (argc != 2) - usage("read-midx "); + if (!(argc == 2 || argc == 3)) + usage("read-midx [--show-objects] "); - return read_midx_file(argv[1]); + if (!strcmp(argv[1], "--show-objects")) + return read_midx_file(argv[2], 1); + return read_midx_file(argv[1], 0); } From patchwork Thu Mar 11 17:05:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132173 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8B21C4360C for ; Thu, 11 Mar 2021 
17:05:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 96B4E64ECE for ; Thu, 11 Mar 2021 17:05:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230053AbhCKRF1 (ORCPT ); Thu, 11 Mar 2021 12:05:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229876AbhCKRFM (ORCPT ); Thu, 11 Mar 2021 12:05:12 -0500 Received: from mail-io1-xd2d.google.com (mail-io1-xd2d.google.com [IPv6:2607:f8b0:4864:20::d2d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3B83C061574 for ; Thu, 11 Mar 2021 09:05:11 -0800 (PST) Received: by mail-io1-xd2d.google.com with SMTP id a7so22653949iok.12 for ; Thu, 11 Mar 2021 09:05:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=o7kdbT0FcRN7KdLdpb1qj/OP+dI7U4COzNjs19x8PB8=; b=sxqBsbGSBfFQWMBHb7A+DBndTeivb4aHs4qdRQuhABB5ZAfVzYM+uXFfJQI7bq3+/G XtWarXPy8a5SzNjfUH0L/AazcXd3sG2JViugbHuRM8aVnLCgGERSUfxfTDUXRGYrab48 fXvMWTFASCLsXioZB53jaGhIx/0YS0w4Lc/YgmrKa3B5eJsgtrn+hpdorxWK3JvDAwDr KMRkvFSzLexXcna1+ljAGkT4E9oX7VBMEBFtlmX+BCtos3W/rUXc1gThwxrgc446F5Xw TqH75YcL/PjubhNjDJvcyqTm1BCStWZY+CHOw1jLV927WrIBnWC+6VO2gd/4G2uIRiii bhew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=o7kdbT0FcRN7KdLdpb1qj/OP+dI7U4COzNjs19x8PB8=; b=cWwEaLdjLyPjHygMPXD5zNLNWEhZypmgMOIxhJNxMxrD4iQjJzYBZxFyta2MNsqipq 4eiSe5jnDy+BqsTRJnIt5GDrWp60qZVTHH5YB9GQp4YRQ0ct2hQbHi7995MRQa8uZlAc c2edKpz1cYYz0Onfdoy+YwU2TeADDQam4iJR54ebOkrKy3vcllqRfuQv6X5LxccdcQ5l weXyBWyGBy/fAjx9KK1bz+QkjbtS8T6uzKgPKH4viYjQnJFY7xk7gBq3d7AztctY/f2L 
8gKMPjCQUr5GhwrosKx1QKbSO7iCL2PHbHcAS7QCKBzX6M/4QVHn1Q1+x5w408dGpHzB IHFQ== X-Gm-Message-State: AOAM533uxIpjs9oPDLF6/WqVR3psyZUGmuD6pxDIgl77m1aGttidSHcB YRnhAUg6rjE51MLY5K2LeEha6Op4t5jbof56 X-Google-Smtp-Source: ABdhPJwyWG3K/e8ydupB6IprjKrcB7Dn6VQBMUtzDRDDXtlUhqXUTxATtqeNmzxjXKVSREZZlkUYfQ== X-Received: by 2002:a02:9a0a:: with SMTP id b10mr4565191jal.132.1615482310610; Thu, 11 Mar 2021 09:05:10 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id y13sm306080ioc.36.2021.03.11.09.05.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:10 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:07 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 08/16] midx: allow marking a pack as preferred Message-ID: <30194a6786bec51e0f41de0e6c855dc2297806c6.1615482270.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When multiple packs in the multi-pack index contain the same object, the MIDX machinery must make a choice about which pack it associates with that object. Prior to this patch, the lowest-ordered[1] pack was always selected. Pack selection for duplicate objects is relatively unimportant today, but it will become important for multi-pack bitmaps. This is because we can only invoke the pack-reuse mechanism when all of the bits for reused objects come from the reuse pack (in order to ensure that all reused deltas can find their base objects in the same pack). To encourage the pack selection process to prefer one pack over another (the pack to be preferred is the one a caller would like to later use as a reuse pack), introduce the concept of a "preferred pack". When provided, the MIDX code will always prefer an object found in a preferred pack over any other. 
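The selection rule can be sketched outside of Git as a sort followed by de-duplication. The struct and function names below are illustrative and are not the ones in midx.c; object names are plain strings here, and the final tie-break on pack_id is an assumption of this sketch. Copies of the same object sort with the preferred pack first, then by most-recent mtime, and only the first copy of each object is kept.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Illustrative entry: one copy of an object in one pack. */
struct entry {
	const char *oid;	/* object name as a string, for the sketch only */
	uint32_t pack_id;
	time_t pack_mtime;
	unsigned preferred : 1;	/* set when the copy lives in the preferred pack */
};

/* Order by object name, then preferred pack first, then newest mtime first. */
static int entry_cmp(const void *_a, const void *_b)
{
	const struct entry *a = _a, *b = _b;
	int cmp = strcmp(a->oid, b->oid);

	if (cmp)
		return cmp;
	if (a->preferred != b->preferred)
		return a->preferred ? -1 : 1;
	if (a->pack_mtime != b->pack_mtime)
		return a->pack_mtime > b->pack_mtime ? -1 : 1;
	/* Assumed final tie-break so that the order is fully determined. */
	return (a->pack_id > b->pack_id) - (a->pack_id < b->pack_id);
}

/* Sort in place, keep the first copy of each object, return the new count. */
static size_t dedup_entries(struct entry *e, size_t nr)
{
	size_t i, out = 0;

	assert(e || !nr);
	if (!nr)
		return 0;

	qsort(e, nr, sizeof(*e), entry_cmp);
	for (i = 0; i < nr; i++) {
		if (out && !strcmp(e[out - 1].oid, e[i].oid))
			continue;	/* a better-ranked copy was already kept */
		e[out++] = e[i];
	}
	return out;
}
```

With an object present in both an older preferred pack and a newer non-preferred pack, the preferred copy is the one that survives; with no preferred pack, the copy from the newer pack wins.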
No format changes are required to store the preferred pack, since it can be inferred from a corresponding MIDX bitmap, by looking up the pack associated with the object in the first bit position (this ordering is described in detail in a subsequent commit). [1]: the ordering is specified by MIDX internals; for our purposes we can consider the "lowest ordered" pack to be "the one with the most-recent mtime". Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 14 ++- Documentation/technical/multi-pack-index.txt | 5 +- builtin/multi-pack-index.c | 18 +++- builtin/repack.c | 2 +- midx.c | 92 ++++++++++++++++++-- midx.h | 2 +- t/t5319-multi-pack-index.sh | 39 +++++++++ 7 files changed, 154 insertions(+), 18 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index eb0caa0439..ffd601bc17 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes SYNOPSIS -------- [verse] -'git multi-pack-index' [--object-dir=] [--[no-]progress] +'git multi-pack-index' [--object-dir=] [--[no-]progress] + [--preferred-pack=] DESCRIPTION ----------- @@ -30,7 +31,16 @@ OPTIONS The following subcommands are available: write:: - Write a new MIDX file. + Write a new MIDX file. The following options are available for + the `write` sub-command: ++ +-- + --preferred-pack=:: + Optionally specify the tie-breaking pack used when + multiple packs contain the same object. If not given, + ties are broken in favor of the pack with the lowest + mtime. +-- verify:: Verify the contents of the MIDX file. diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index e8e377a59f..fb688976c4 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -43,8 +43,9 @@ Design Details a change in format.
- The MIDX keeps only one record per object ID. If an object appears - in multiple packfiles, then the MIDX selects the copy in the most- - recently modified packfile. + in multiple packfiles, then the MIDX selects the copy in the + preferred packfile, otherwise selecting from the most-recently + modified packfile. - If there exist packfiles in the pack directory not registered in the MIDX, then those packfiles are loaded into the `packed_git` diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 243b6ccc7c..92f358f212 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -4,9 +4,10 @@ #include "parse-options.h" #include "midx.h" #include "trace2.h" +#include "object-store.h" #define BUILTIN_MIDX_WRITE_USAGE \ - N_("git multi-pack-index [] write") + N_("git multi-pack-index [] write [--preferred-pack=]") #define BUILTIN_MIDX_VERIFY_USAGE \ N_("git multi-pack-index [] verify") @@ -43,6 +44,7 @@ static char const * const builtin_multi_pack_index_usage[] = { static struct opts_multi_pack_index { const char *object_dir; + const char *preferred_pack; unsigned long batch_size; unsigned flags; } opts; @@ -63,7 +65,16 @@ static struct option *add_common_options(struct option *prev) static int cmd_multi_pack_index_write(int argc, const char **argv) { - struct option *options = common_opts; + struct option *options; + static struct option builtin_multi_pack_index_write_options[] = { + OPT_STRING(0, "preferred-pack", &opts.preferred_pack, + N_("preferred-pack"), + N_("pack for reuse when computing a multi-pack bitmap")), + OPT_END(), + }; + + options = parse_options_dup(builtin_multi_pack_index_write_options); + options = add_common_options(options); trace2_cmd_mode(argv[0]); @@ -74,7 +85,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) usage_with_options(builtin_multi_pack_index_write_usage, options); - return write_midx_file(opts.object_dir, opts.flags); + return write_midx_file(opts.object_dir, 
opts.preferred_pack, + opts.flags); } static int cmd_multi_pack_index_verify(int argc, const char **argv) diff --git a/builtin/repack.c b/builtin/repack.c index 01440de2d5..9f00806805 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -523,7 +523,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) remove_temporary_files(); if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), 0); + write_midx_file(get_object_directory(), NULL, 0); string_list_clear(&names, 0); string_list_clear(&rollback, 0); diff --git a/midx.c b/midx.c index 971faa8cfc..46f55ff6cf 100644 --- a/midx.c +++ b/midx.c @@ -431,6 +431,24 @@ static int pack_info_compare(const void *_a, const void *_b) return strcmp(a->pack_name, b->pack_name); } +static int lookup_idx_or_pack_name(struct pack_info *info, + uint32_t nr, + const char *pack_name) +{ + uint32_t lo = 0, hi = nr; + while (lo < hi) { + uint32_t mi = lo + (hi - lo) / 2; + int cmp = cmp_idx_or_pack_name(pack_name, info[mi].pack_name); + if (cmp < 0) + hi = mi; + else if (cmp > 0) + lo = mi + 1; + else + return mi; + } + return -1; +} + struct write_midx_context { struct pack_info *info; uint32_t nr; @@ -445,6 +463,8 @@ struct write_midx_context { uint32_t *pack_perm; unsigned large_offsets_needed:1; uint32_t num_large_offsets; + + int preferred_pack_idx; }; static void add_pack_to_midx(const char *full_path, size_t full_path_len, @@ -489,6 +509,7 @@ struct pack_midx_entry { uint32_t pack_int_id; time_t pack_mtime; uint64_t offset; + unsigned preferred : 1; }; static int midx_oid_compare(const void *_a, const void *_b) @@ -500,6 +521,12 @@ static int midx_oid_compare(const void *_a, const void *_b) if (cmp) return cmp; + /* Sort objects in a preferred pack first when multiple copies exist. 
*/ + if (a->preferred > b->preferred) + return -1; + if (a->preferred < b->preferred) + return 1; + if (a->pack_mtime > b->pack_mtime) return -1; else if (a->pack_mtime < b->pack_mtime) @@ -527,7 +554,8 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, static void fill_pack_entry(uint32_t pack_int_id, struct packed_git *p, uint32_t cur_object, - struct pack_midx_entry *entry) + struct pack_midx_entry *entry, + int preferred) { if (nth_packed_object_id(&entry->oid, p, cur_object) < 0) die(_("failed to locate object %d in packfile"), cur_object); @@ -536,6 +564,7 @@ static void fill_pack_entry(uint32_t pack_int_id, entry->pack_mtime = p->mtime; entry->offset = nth_packed_object_offset(p, cur_object); + entry->preferred = !!preferred; } /* @@ -552,7 +581,8 @@ static void fill_pack_entry(uint32_t pack_int_id, static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, struct pack_info *info, uint32_t nr_packs, - uint32_t *nr_objects) + uint32_t *nr_objects, + int preferred_pack) { uint32_t cur_fanout, cur_pack, cur_object; uint32_t alloc_fanout, alloc_objects, total_objects = 0; @@ -589,12 +619,17 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, nth_midxed_pack_midx_entry(m, &entries_by_fanout[nr_fanout], cur_object); + if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack) + entries_by_fanout[nr_fanout].preferred = 1; + else + entries_by_fanout[nr_fanout].preferred = 0; nr_fanout++; } } for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) { uint32_t start = 0, end; + int preferred = cur_pack == preferred_pack; if (cur_fanout) start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1); @@ -602,7 +637,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m, for (cur_object = start; cur_object < end; cur_object++) { ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout); - fill_pack_entry(cur_pack, info[cur_pack].p, cur_object, &entries_by_fanout[nr_fanout]); 
+ fill_pack_entry(cur_pack, + info[cur_pack].p, + cur_object, + &entries_by_fanout[nr_fanout], + preferred); nr_fanout++; } } @@ -777,7 +816,9 @@ static int write_midx_large_offsets(struct hashfile *f, } static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, - struct string_list *packs_to_drop, unsigned flags) + struct string_list *packs_to_drop, + const char *preferred_pack_name, + unsigned flags) { char *midx_name; uint32_t i; @@ -828,7 +869,19 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) goto cleanup; - ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr); + ctx.preferred_pack_idx = -1; + if (preferred_pack_name) { + for (i = 0; i < ctx.nr; i++) { + if (!cmp_idx_or_pack_name(preferred_pack_name, + ctx.info[i].pack_name)) { + ctx.preferred_pack_idx = i; + break; + } + } + } + + ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, + ctx.preferred_pack_idx); ctx.large_offsets_needed = 0; for (i = 0; i < ctx.entries_nr; i++) { @@ -889,6 +942,24 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1; } + /* Check that the preferred pack wasn't expired (if given). 
*/ + if (preferred_pack_name) { + int preferred_idx = lookup_idx_or_pack_name(ctx.info, + ctx.nr, + preferred_pack_name); + if (preferred_idx < 0) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + else { + uint32_t orig = ctx.info[preferred_idx].orig_pack_int_id; + uint32_t perm = ctx.pack_perm[orig]; + + if (perm == PACK_EXPIRED) + warning(_("preferred pack '%s' is expired"), + preferred_pack_name); + } + } + if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT) pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT); @@ -947,9 +1018,12 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * return result; } -int write_midx_file(const char *object_dir, unsigned flags) +int write_midx_file(const char *object_dir, + const char *preferred_pack_name, + unsigned flags) { - return write_midx_internal(object_dir, NULL, NULL, flags); + return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, + flags); } void clear_midx_file(struct repository *r) @@ -1184,7 +1258,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); if (packs_to_drop.nr) - result = write_midx_internal(object_dir, m, &packs_to_drop, flags); + result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags); string_list_clear(&packs_to_drop, 0); return result; @@ -1373,7 +1447,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, m, NULL, flags); + result = write_midx_internal(object_dir, m, NULL, NULL, flags); m = NULL; cleanup: diff --git a/midx.h b/midx.h index b18cf53bc4..e7fea61109 100644 --- a/midx.h +++ b/midx.h @@ -47,7 +47,7 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name); int prepare_multi_pack_index_one(struct repository *r, const char 
*object_dir, int local); -int write_midx_file(const char *object_dir, unsigned flags); +int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags); void clear_midx_file(struct repository *r); int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index b4afab1dfc..fd94ba9053 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -31,6 +31,14 @@ midx_read_expect () { test_cmp expect actual } +midx_expect_object_offset () { + OID="$1" + OFFSET="$2" + OBJECT_DIR="$3" + test-tool read-midx --show-objects $OBJECT_DIR >actual && + grep "^$OID $OFFSET" actual +} + test_expect_success 'setup' ' test_oid_cache <<-EOF idxoff sha1:2999 @@ -234,6 +242,37 @@ test_expect_success 'warn on improper hash version' ' ) ' +test_expect_success 'midx picks objects from preferred pack' ' + test_when_finished rm -rf preferred.git && + git init --bare preferred.git && + ( + cd preferred.git && + + a=$(echo "a" | git hash-object -w --stdin) && + b=$(echo "b" | git hash-object -w --stdin) && + c=$(echo "c" | git hash-object -w --stdin) && + + # Set up two packs, duplicating the object "B" at different + # offsets. 
+ git pack-objects objects/pack/test-AB <<-EOF && + $a + $b + EOF + bc=$(git pack-objects objects/pack/test-BC <<-EOF + $b + $c + EOF + ) && + + git multi-pack-index --object-dir=objects \ + write --preferred-pack=test-BC-$bc.idx 2>err && + test_must_be_empty err && + + ofs=$(git show-index X-Patchwork-Id: 12132165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F83AC4321A for ; Thu, 11 Mar 2021 17:05:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 41E6C64ECE for ; Thu, 11 Mar 2021 17:05:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230070AbhCKRF2 (ORCPT ); Thu, 11 Mar 2021 12:05:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230000AbhCKRFQ (ORCPT ); Thu, 11 Mar 2021 12:05:16 -0500 Received: from mail-qt1-x82d.google.com (mail-qt1-x82d.google.com [IPv6:2607:f8b0:4864:20::82d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EC0AC061574 for ; Thu, 11 Mar 2021 09:05:16 -0800 (PST) Received: by mail-qt1-x82d.google.com with SMTP id g24so1662645qts.6 for ; Thu, 11 Mar 2021 09:05:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=+Gr9GtCqTJRc1UVgIa/SsSLlVVG77SqMxRQG9i/0Fxc=; 
b=AMltkhTUSjlWM3gMvaj10V3jYSrjqm7v2xO4cUJhAd9NAHeonhCmcO3+v5w0b914vn nutt1lVlqUE0EN0vVWbyTuqs5yZSfEQLhLUVftsA1DwNUkzDcrjDiwhFOE4sFMwe2kZK 83SPISzzC7WxFppgDssB7s6wv9lkKl7CtmqIHpMpyc/vNwl/vmERUwS4Xc3RbSZezEiW tjdwBVhM3du6GsflAIRe+5GGuEk3ClxHXrWJg2kolI6dE1GNxDuWF0jpV74nTcbgNFqI 6XyBlqdxvte1Vr6P7TxKXm3LRkURXG3GBa46g7wyYJffxsdGSxhfxQVhROctKqMrvSiR 5z1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=+Gr9GtCqTJRc1UVgIa/SsSLlVVG77SqMxRQG9i/0Fxc=; b=ZeZ7ulw7+742I556D6MmIeO1UX0EJTwTD9EGbt/mAyRF/MVlIIs/NFid60cXTkLkH8 KgDNFoge7ty1sBla5NXJdBpu/K+gOKj7H+gZ6s9i6dOY8Wq6x3NjKlpYF9+pjvea/iUM +vl8Tzua79ioI0y/e4YVY9+6p73w/w0r0j+uE83UIsHXE9TKKdDj+yTDIkAKrlKR7/ON HBUSNfJ9oB8MNHBEbZF2uLhgE3GO6H5ps/8DUmLLUhMLYJUYUiJCaIrnwKjPzg68KlPC xArFjQQHL8bObPXjDrWoTaNfNrZzH5jpKcdeLPsHrAyrtQ/WZAbKIZmyjLkBWW4CFPKI oEcQ== X-Gm-Message-State: AOAM533j0XlL8L02oo0HbpqWAeW9Iwsm9srS+4jlqj6ND+OdeRbpvsyq DhReTgaPHsKNtHS64c9O6h3wvsWvM/kIB0HA X-Google-Smtp-Source: ABdhPJx5q/Xyaf8RpTaRL1zk+HqSS+iGmCHsmd/HfCnjrbWPJPG333R8UkM5wJyle9tFN1bRzDvd2w== X-Received: by 2002:a05:622a:183:: with SMTP id s3mr8260303qtw.223.1615482315372; Thu, 11 Mar 2021 09:05:15 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id d128sm2353700qkb.44.2021.03.11.09.05.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:14 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:12 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 09/16] midx: don't free midx_name early Message-ID: <5c5aca761a69e245eaece97703366c9f8d06a889.1615482270.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A 
subsequent patch will need to refer back to 'midx_name' later on in the function. In fact, this variable is already free()'d later on, so this makes the later free() no longer redundant. Signed-off-by: Taylor Blau --- midx.c | 1 - 1 file changed, 1 deletion(-) diff --git a/midx.c b/midx.c index 46f55ff6cf..e0009d3314 100644 --- a/midx.c +++ b/midx.c @@ -966,7 +966,6 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR); f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); - FREE_AND_NULL(midx_name); if (ctx.m) close_midx(ctx.m); From patchwork Thu Mar 11 17:05:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8F66C433E0 for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7B96964FEC for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230125AbhCKRFw (ORCPT ); Thu, 11 Mar 2021 12:05:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229718AbhCKRFU (ORCPT ); Thu, 11 Mar 2021 12:05:20 -0500 Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CAFEC061574 for ; Thu, 11 Mar 2021 09:05:20 -0800 (PST) Received: by mail-il1-x12c.google.com with SMTP id v14so19537518ilj.11 for ; Thu, 11 Mar 2021 09:05:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=m59jTgUtPTE+tTllhj5RaUNUSio2EyIO756R58/KdLU=; b=OVbKj0b8fbywmLr1ZIXmW2etGMMXcRxNuBvgacyMrW4P+l1VkbRvnu0d3lpSJ+cHhb sAsZIt0rySCQ9diC8zpN3+n29mzyz1xP+ExnhPf7sDwn82VzinCpIxD7KmsXO3pXumDh dZt74POBN8oqH9j5/uAJfCTZn8jZ3mBdz+O+Pyjb2egd9ZXMjntfxSwpQ4EGhklrNt3Z P/kAEqwn0H7iWEy4TVjFxyHjRI0DHOTZDWdPz4V5r5yR5LzhaB6eKtlunqj6X9/h01Xn OQwT3nMpZ7XEZAFtBtGMTgcDwreuBNDqDLJHZqDTfwku40J0p+akdGwDSSeqy72JJ+dq NqsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=m59jTgUtPTE+tTllhj5RaUNUSio2EyIO756R58/KdLU=; b=fF8Z/WGJFOcQ+QwGHv6/bnvb4WkKyT/o2Rq1JnVY2kMMIB5oV18u1jMkIQijQ/MJtr Fc7vP++Wv1BvCYi/APALKv9KCP6FOI3avyNr5cToJ9e4YaVPnn+VPFCJlWh3mFxV/8sG 3WJsvx76ZhA0q10DQvJPsExa/YO53G4Dr3xVu9l/lMCqi8a8y7ThuoP/mEd7PZVsLuvl W+1nvq/SNbARjbaFs7XvqUTC/c89XpY2Xl/7dot8T1EdvozhN3x6fx7ZZhjmtrtBLczD poGvaEupIlq57GIr0+WjGWWO7KtjCg9WrRsKAnUppFYc3PsLNhIWNFWjAA7BgeCqn44U zJUw== X-Gm-Message-State: AOAM533YKA+c4QYJTt5qCI4Thm/m0AaJvCIney7X/3gZ3BT/iX9addHP P37pfzMTv5ljHvzx0UH+4aTvLd7H6nDMdXma X-Google-Smtp-Source: ABdhPJwt4a/WXBh+lbG+8o7BSCctXTnrwQnSryzW7IPiwgGEjHd+Y+Fezxkumi+5qENlheHUGrnALg== X-Received: by 2002:a92:d5d2:: with SMTP id d18mr7016977ilq.50.1615482319668; Thu, 11 Mar 2021 09:05:19 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id r3sm1574197ilq.42.2021.03.11.09.05.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:19 -0800 (PST) 
Date: Thu, 11 Mar 2021 12:05:17 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 10/16] midx: keep track of the checksum Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org write_midx_internal() uses a hashfile to write the multi-pack index, but discards its checksum. This makes sense, since nothing that takes place after writing the MIDX cares about its checksum. That is about to change in a subsequent patch, when the optional reverse index corresponding to the MIDX will want to include the MIDX's checksum. Store the checksum of the MIDX in preparation for that. Signed-off-by: Taylor Blau --- midx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index e0009d3314..31e6d3d2df 100644 --- a/midx.c +++ b/midx.c @@ -821,6 +821,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * unsigned flags) { char *midx_name; + unsigned char midx_hash[GIT_MAX_RAWSZ]; uint32_t i; struct hashfile *f = NULL; struct lock_file lk; @@ -997,7 +998,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs); write_chunkfile(cf, &ctx); - finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); + finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); commit_lock_file(&lk); From patchwork Thu Mar 11 17:05:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB86DC433E6 for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A3DF64FF3 for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229930AbhCKRFx (ORCPT ); Thu, 11 Mar 2021 12:05:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36762 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230048AbhCKRFZ (ORCPT ); Thu, 11 Mar 2021 12:05:25 -0500 Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com [IPv6:2607:f8b0:4864:20::833]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 083B5C061762 for ; Thu, 11 Mar 2021 09:05:25 -0800 (PST) Received: by mail-qt1-x833.google.com with SMTP id s2so1653468qtx.10 for ; Thu, 11 Mar 2021 09:05:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=5A2fGCQ6jdCYNzw2r+1F9r2Ve0DTa4xEVfbRET9qGqw=; b=XbAtLJJl/0FQYvl5Iy3R2C4WnQbVV+inMkO0BzCNuSItk4A0HA097JOWSogGjNchgF rFQZWXY295nsH7rz+GZIcccuJPyZfvMgIz8TuhZd3KLqVRdugtIWEZd7JkgZalJ4UuDg JdLiFNkg77JxYvJ5tquCdHKr+XmlvQEb2hePDIfZn9c0b5aTPujE8xUTITrgliBFFEJ8 IHKRztClKP9Dpj6JiUNWrXGthdpKs6QkNDk1ZKk6QzfCNJ82tuXplIwlRSCj8X4Z047H 9gbtPomX+y2IhjREJgPa/fDcMRljtc5ZR8KithylIHBlqczbW4MJ7fVcjrzeA3rIGpS8 fwLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; 
bh=5A2fGCQ6jdCYNzw2r+1F9r2Ve0DTa4xEVfbRET9qGqw=; b=exFzs5Q1rNT+zUvQsVRYlM/C/ibAJydB/pO9J11nIcWiHCn/KWVG0n59BdxhcOHrbJ 7Wq/0BoQQYbfQFNCj9JgB/ngnc83oQ14v4/Aw+MIGwCWJ+o5CRTtIZwA6bl7NrHuJd0y V74cxu30FEi9oqIDG7VCX6Xx5CEqW5pWkdXWexD14l6BdftLCabsqSfaPHvkz1jYzQAR UmoDCEfZatzzF+xoMo74yw2/qw/BwW/lyyzB2MZ2Lm6mTCsAJGMySTfB+T+7VbA9DekN 6GoCLzgaJ6tuZawWdoxZEz/Ecy+Ys+QaTl7GtOXKjCAktYycVfqW5QFqxLXfks4c9Izz zNRg== X-Gm-Message-State: AOAM531eicuETaruiTLlxGSUwk3klQhGtPDJD4TvpZ5pRndB1Rs2IhyA dXZp65HFT8vWY2nSuCJ6FEJoJPOf/o2fq+3B X-Google-Smtp-Source: ABdhPJx06nMixOPyKUdLg9qiUtOkbYf5BxFRvAj9W8RzjJpQPlzQkR9h4gLZNwh8hHPBpBs5n3HeHw== X-Received: by 2002:ac8:7747:: with SMTP id g7mr4730213qtu.144.1615482323972; Thu, 11 Mar 2021 09:05:23 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id k7sm2021331qtm.10.2021.03.11.09.05.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:23 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:21 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 11/16] midx: make some functions non-static Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a subsequent commit, pack-revindex.c will become responsible for sorting a list of objects in the "MIDX pack order" (which will be defined in the following patch). To do so, it will need to know the pack identifier and offset within that pack for each object in the MIDX. The MIDX code already has functions for doing just that (nth_midxed_offset() and nth_midxed_pack_int_id()), but they are statically declared. Since there is no reason that they couldn't be exposed publicly, and because they are already doing exactly what the caller in pack-revindex.c will want, expose them publicly so that they can be reused there.
Signed-off-by: Taylor Blau --- midx.c | 4 ++-- midx.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index 31e6d3d2df..0a5da49ed6 100644 --- a/midx.c +++ b/midx.c @@ -239,7 +239,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid, return oid; } -static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) { const unsigned char *offset_data; uint32_t offset32; @@ -258,7 +258,7 @@ static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) return offset32; } -static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) { return get_be32(m->chunk_object_offsets + (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH); diff --git a/midx.h b/midx.h index e7fea61109..93bd68189e 100644 --- a/midx.h +++ b/midx.h @@ -40,6 +40,8 @@ struct multi_pack_index { struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos); +uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos); struct object_id *nth_midxed_object_oid(struct object_id *oid, struct multi_pack_index *m, uint32_t n); From patchwork Thu Mar 11 17:05:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132179 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CBCCC433E9 for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BAAF264FF9 for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230000AbhCKRFz (ORCPT ); Thu, 11 Mar 2021 12:05:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36778 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230074AbhCKRF3 (ORCPT ); Thu, 11 Mar 2021 12:05:29 -0500 Received: from mail-io1-xd2a.google.com (mail-io1-xd2a.google.com [IPv6:2607:f8b0:4864:20::d2a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16AD4C061574 for ; Thu, 11 Mar 2021 09:05:29 -0800 (PST) Received: by mail-io1-xd2a.google.com with SMTP id y20so4361470iot.4 for ; Thu, 11 Mar 2021 09:05:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=G/wg4ROlEWl9gs4TbtB1/b/GUqHGdO2QzRK4YRW/sMs=; b=KEUpTeUZIbrL3CPpXFN7bKsYvLEtkNeS+2+aJvkr4RoKxfpLcDWOO7FlMEyUzs0nDF jCHsmmWAAE4D1YpjwacNla4GCWJqU9pwAF5iJ53r65dKmilQ01W70vpo1mU7/1g/xFYm BDWwCNUxAGXMHq9y8dXnLDM8Gjj52LqfeBQjs0yuzz38eZzewsPVgWbBx00fIrU2hKJF 4+9VD6ES6FuqmCiHsa62/Y8lpYPmHQ36kTq2WxuS+J67tkcRR8v62XeCtdtFYMSLDU1y gA0rTnHA9PFnbHf2i8NhWadPfm1AR6HCmmmsehwaq9dNrJ5YK0fexOSEZPMISLW7Ykc+ Vj2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=G/wg4ROlEWl9gs4TbtB1/b/GUqHGdO2QzRK4YRW/sMs=; 
b=D9jt42UVdP6dAbm7g0k9j3MhKQ/4YnJMCit72bfE5Ta+ajj7icq2ddN52kJv+UcFwn ZSCtws/R4Vs7TzRVFCwowpE7sV6shnuAraPTRGDa//XGRF7xK+qEb6iXT1Oc51KrPLiy vM8XwzZvSc0otRlR8gN44BUg/qQzJnIjt7telFK6uC0uig3dYOpAUz+vyU2AhmuNkZzk iMhku87weW2C5PTNjbxWd0vlS5bjNNmJs3XltxJK2jRHx1L/maxMN/gwbjSLq5em6oUc gPa6roWtcnEW2iUnSWRn7XAR00XONqErtkvqu74dbyHzlVTqE9G8IfuwvgCCOC9ENs8i seFQ== X-Gm-Message-State: AOAM533UqxQVxnwMZMXCUae+tfwFV8mcIPs6Oul4aoWWqcq0tEWEy1B0 BG94SOU53owqrxUCx+zQPcExL1VqW5oHzI6M X-Google-Smtp-Source: ABdhPJwVggTGd+1eH6JVFmJdiGafkVyyhSW1I+U9AWXsMGnUSLHBMOaXQwCC9wOfhr661awScuTNrQ== X-Received: by 2002:a02:7086:: with SMTP id f128mr4578993jac.104.1615482328065; Thu, 11 Mar 2021 09:05:28 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id w9sm1724537iox.20.2021.03.11.09.05.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:27 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:25 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 12/16] Documentation/technical: describe multi-pack reverse indexes Message-ID: <4745bb8590f5cdc24445618dd63ba6bd541227b4.1615482270.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org As a prerequisite to implementing multi-pack bitmaps, motivate and describe the format and ordering of the multi-pack reverse index. The subsequent patch will implement reading this format, and the patch after that will implement writing it while producing a multi-pack index. 
Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- Documentation/technical/pack-format.txt | 83 +++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 1faa949bf6..4bbbb188a4 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -379,3 +379,86 @@ CHUNK DATA: TRAILER: Index checksum of the above contents. + +== multi-pack-index reverse indexes + +Similar to the pack-based reverse index, the multi-pack index can also +be used to generate a reverse index. + +Instead of mapping between offset, pack-, and index position, this +reverse index maps between an object's position within the MIDX, and +that object's position within a pseudo-pack that the MIDX describes +(i.e., the ith entry of the multi-pack reverse index holds the MIDX +position of the ith object in pseudo-pack order). + +To clarify the difference between these orderings, consider a multi-pack +reachability bitmap (which does not yet exist, but is what we are +building towards here). Each bit needs to correspond to an object in the +MIDX, and so we need an efficient mapping from bit position to MIDX +position. + +One solution is to let bits occupy the same position in the oid-sorted +index stored by the MIDX. But because oids are effectively random, their +resulting reachability bitmaps would have no locality, and thus compress +poorly. (This is the reason that single-pack bitmaps use the pack +ordering, and not the .idx ordering, for the same purpose.) + +So we'd like to define an ordering for the whole MIDX based around +pack ordering, which has far better locality (and thus compresses more +efficiently). We can think of a pseudo-pack created by the concatenation +of all of the packs in the MIDX.
E.g., if we had a MIDX with three packs +(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an +ordering of the objects like: + + |a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19| + +where the ordering of the packs is defined by the MIDX's pack list, +and then the ordering of objects within each pack is the same as the +order in the actual packfile. + +Given the list of packs and their counts of objects, you can +naïvely reconstruct that pseudo-pack ordering (e.g., the object at +position 27 must be (c,1) because packs "a" and "b" consumed 25 of the +slots). But there's a catch. Objects may be duplicated between packs, in +which case the MIDX only stores one pointer to the object (and thus we'd +want only one slot in the bitmap). + +Callers could handle duplicates themselves by reading objects in order +of their bit-position, but that's linear in the number of objects, and +much too expensive for ordinary bitmap lookups. Building a reverse index +solves this, since it is the logical inverse of the index, and that +index has already removed duplicates. But, building a reverse index on +the fly can be expensive. Since we already have an on-disk format for +pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack, +too. + +Objects from the MIDX are ordered as follows to string together the +pseudo-pack. Let _pack(o)_ return the pack from which _o_ was selected +by the MIDX, and define an ordering of packs based on their numeric ID +(as stored by the MIDX). Let _offset(o)_ return the object offset of _o_ +within _pack(o)_. Then, compare _o~1~_ and _o~2~_ as follows: + + - If one of _pack(o~1~)_ and _pack(o~2~)_ is preferred and the other + is not, then the preferred one sorts first. ++ +(This is a detail that allows the MIDX bitmap to determine which +pack should be used by the pack-reuse mechanism, since it can ask +the MIDX for the pack containing the object at bit position 0). 
+ + - If _pack(o~1~) ≠ pack(o~2~)_, then sort the two objects in + ascending order based on the pack ID. + + - Otherwise, _pack(o~1~) = pack(o~2~)_, and the objects are + sorted in pack-order (i.e., _o~1~_ sorts ahead of _o~2~_ exactly + when _offset(o~1~) < offset(o~2~)_). + +In short, a MIDX's pseudo-pack is the de-duplicated concatenation of +objects in packs stored by the MIDX, laid out in pack order, and the +packs arranged in MIDX order (with the preferred pack coming first). + +Finally, note that the MIDX's reverse index is not stored as a chunk in +the multi-pack-index itself. This is done because the reverse index +includes the checksum of the pack or MIDX to which it belongs, which +makes it impossible to write in the MIDX. To avoid races when rewriting +the MIDX, a MIDX reverse index includes the MIDX's checksum in its +filename (e.g., `multi-pack-index-xyz.rev`). From patchwork Thu Mar 11 17:05:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EEB2C43381 for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DA15A64FFA for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230045AbhCKRFz (ORCPT ); Thu, 11 Mar 2021 12:05:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36794 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230094AbhCKRFd (ORCPT ); Thu, 11 Mar 2021 12:05:33 -0500 Received: from mail-io1-xd2c.google.com (mail-io1-xd2c.google.com [IPv6:2607:f8b0:4864:20::d2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65AF4C061574 for ; Thu, 11 Mar 2021 09:05:33 -0800 (PST) Received: by mail-io1-xd2c.google.com with SMTP id n14so22713866iog.3 for ; Thu, 11 Mar 2021 09:05:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=R5EUO0bkHQBgwE7QCQLC3SrcWoX6BWyONPud0LDjAnA=; b=xeXJtipfFglzCdbyu1ci8LQlGK5FCGVh8ci8frdAuVaPLw7P8eSI4krsbxEYQMtIjh TOJgoe/m4ccypxiy/eCEFJWIULmT/YMK2UQLbVZkBiyC+C7vMaIoHenjVvxTDYeI3153 25DaJArAA3fYRK/vbHkgpUN/6iQxf47rKoLlV3FoR03XDah8+br5bZlhyUyxp1gHYv4i vJeDtF5CWtMM0/43+7if4JeG4g9GN0cciVW5JuaHeffj6PLoUwX8Qp5XjeJn0vs+bxkD kAOkSTP58CD/V0NW/TyMJO10UpS9M1YcdwXKusdkc31cyfR/WIKorj3dx/CgdcYGcSE0 AIwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=R5EUO0bkHQBgwE7QCQLC3SrcWoX6BWyONPud0LDjAnA=; b=JsSFcQM1KlLQh/dnDJAEM/CVY3JtpRl6ZxYwSms74ny5El83LVa+iOMcbyBpQ8DjR8 QyqvwEewDr+4r6owi3k8VGKu99HyvzfSBB9KzPhv6oVNlZgMicKQIemIoUrSg1hLaH6J VA2PtM6Z+abR6tJGu6CdbWidOrgBt+OojvI23yFgEtlt2vxXSjcxPpDeQQ9zjX+wOIrg cW2NM+liaZ/SgTRioOCYyRKKQPcnSbqrizqdc5Xgx8DHda1aj5wqhwm1Q/iugfUk1qp+ m4GkS657YqMRHLP5V8ZbhaRd+MRpLC5jjY6OTbQMzkrZb5+ACNRB4GpmE+Vw8gEj2ye3 WNjg== X-Gm-Message-State: AOAM533yOShfQjiBwKJJ936HCXIBLXQWC0slQtfaDbJo78GOm9+FQh3F VMFq8trNOP6A15sJWFVnboNPo4/HMbV0bI2/ X-Google-Smtp-Source: ABdhPJwIlyHXTME4R3qQ7rkjDq+ojBelSpvXV88tT98eh3g3ocWTzTA4OWp2wy6B/Sd2ShSZIZ1GyQ== X-Received: by 2002:a02:caa9:: with SMTP id e9mr4518461jap.59.1615482332450; Thu, 11 Mar 2021 09:05:32 -0800 
(PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id o13sm1615661iob.17.2021.03.11.09.05.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:32 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:29 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 13/16] pack-revindex: read multi-pack reverse indexes Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Implement reading for multi-pack reverse indexes, as described in the previous patch. Note that these functions don't yet have any callers, and won't until multi-pack reachability bitmaps are introduced in a later patch series. In the meantime, this patch implements some of the infrastructure necessary to support multi-pack bitmaps. There are three new functions exposed by the revindex API: - load_midx_revindex(): loads the reverse index corresponding to the given multi-pack index. - midx_to_pack_pos() and pack_pos_to_midx(): these convert between the multi-pack index and pseudo-pack order. load_midx_revindex() and pack_pos_to_midx() are both relatively straightforward. load_midx_revindex() needs a few functions to be exposed from the midx API: one to get the checksum of a MIDX, and another to get the .rev's filename. Similar to recent changes in the packed_git struct, three new fields are added to the multi_pack_index struct: one to keep track of the size, one to keep track of the mmap'd pointer, and another to point past the header and at the reverse index's data. pack_pos_to_midx() simply reads the corresponding entry out of the table. midx_to_pack_pos() is the trickiest, since it needs to find an object's position in the pseudo-pack order, but that order can only be recovered from the .rev file itself.
This mapping can be implemented with a binary search, but note that the thing we're binary searching over isn't an array of values, but rather a permuted order of those values. So, when comparing two items, it's helpful to keep in mind the difference. Instead of a traditional binary search, where you are comparing two things directly, here we're comparing a (pack, offset) tuple with an index into the multi-pack index. That index describes another (pack, offset) tuple, and it is _those_ two tuples that are compared. Signed-off-by: Taylor Blau --- midx.c | 11 +++++ midx.h | 6 +++ pack-revindex.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ pack-revindex.h | 53 ++++++++++++++++++++ packfile.c | 3 ++ 5 files changed, 200 insertions(+) diff --git a/midx.c b/midx.c index 0a5da49ed6..55f4567fca 100644 --- a/midx.c +++ b/midx.c @@ -47,11 +47,22 @@ static uint8_t oid_version(void) } } +static const unsigned char *get_midx_checksum(struct multi_pack_index *m) +{ + return m->data + m->data_len - the_hash_algo->rawsz; +} + static char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } +char *get_midx_rev_filename(struct multi_pack_index *m) +{ + return xstrfmt("%s/pack/multi-pack-index-%s.rev", + m->object_dir, hash_to_hex(get_midx_checksum(m))); +} + static int midx_read_oid_fanout(const unsigned char *chunk_start, size_t chunk_size, void *data) { diff --git a/midx.h b/midx.h index 93bd68189e..0a8294d2ee 100644 --- a/midx.h +++ b/midx.h @@ -15,6 +15,10 @@ struct multi_pack_index { const unsigned char *data; size_t data_len; + const uint32_t *revindex_data; + const uint32_t *revindex_map; + size_t revindex_len; + uint32_t signature; unsigned char version; unsigned char hash_len; @@ -37,6 +41,8 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) +char *get_midx_rev_filename(struct multi_pack_index *m); + struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int 
prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); diff --git a/pack-revindex.c b/pack-revindex.c index 83fe4de773..2e15ba3a8f 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -3,6 +3,7 @@ #include "object-store.h" #include "packfile.h" #include "config.h" +#include "midx.h" struct revindex_entry { off_t offset; @@ -292,6 +293,44 @@ int load_pack_revindex(struct packed_git *p) return -1; } +int load_midx_revindex(struct multi_pack_index *m) +{ + char *revindex_name; + int ret; + if (m->revindex_data) + return 0; + + revindex_name = get_midx_rev_filename(m); + + ret = load_revindex_from_disk(revindex_name, + m->num_objects, + &m->revindex_map, + &m->revindex_len); + if (ret) + goto cleanup; + + m->revindex_data = (const uint32_t *)((const char *)m->revindex_map + RIDX_HEADER_SIZE); + +cleanup: + free(revindex_name); + return ret; +} + +int close_midx_revindex(struct multi_pack_index *m) +{ + if (!m || !m->revindex_map) + return 0; + + if (munmap((void*)m->revindex_map, m->revindex_len)) + return -1; + + m->revindex_map = NULL; + m->revindex_data = NULL; + m->revindex_len = 0; + + return 0; +} + int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) { unsigned lo, hi; @@ -346,3 +385,91 @@ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) else return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } + +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos) +{ + if (!m->revindex_data) + BUG("pack_pos_to_midx: reverse index not yet loaded"); + if (m->num_objects <= pos) + BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos); + return get_be32((const char *)m->revindex_data + (pos * sizeof(uint32_t))); +} + +struct midx_pack_key { + uint32_t pack; + off_t offset; + + uint32_t preferred_pack; + struct multi_pack_index *midx; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) +{ +
const struct midx_pack_key *key = va; + struct multi_pack_index *midx = key->midx; + + uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data); + uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus); + off_t versus_offset; + + uint32_t key_preferred = key->pack == key->preferred_pack; + uint32_t versus_preferred = versus_pack == key->preferred_pack; + + /* + * First, compare the preferred-ness, noting that the preferred pack + * comes first. + */ + if (key_preferred && !versus_preferred) + return -1; + else if (!key_preferred && versus_preferred) + return 1; + + /* Then, break ties first by comparing the pack IDs. */ + if (key->pack < versus_pack) + return -1; + else if (key->pack > versus_pack) + return 1; + + /* Finally, break ties by comparing offsets within a pack. */ + versus_offset = nth_midxed_offset(midx, versus); + if (key->offset < versus_offset) + return -1; + else if (key->offset > versus_offset) + return 1; + + return 0; +} + +int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos) +{ + struct midx_pack_key key; + uint32_t *found; + + if (!m->revindex_data) + BUG("midx_to_pack_pos: reverse index not yet loaded"); + if (m->num_objects <= at) + BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at); + + key.pack = nth_midxed_pack_int_id(m, at); + key.offset = nth_midxed_offset(m, at); + key.midx = m; + /* + * The preferred pack sorts first, so determine its identifier by + * looking at the first object in pseudo-pack order. + * + * Note that if no --preferred-pack is explicitly given when writing a + * multi-pack index, then whichever pack has the lowest identifier + * implicitly is preferred (and includes all its objects, since ties are + * broken first by pack identifier). 
+ */ + key.preferred_pack = nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0)); + + found = bsearch(&key, m->revindex_data, m->num_objects, + sizeof(uint32_t), midx_pack_order_cmp); + + if (!found) + return error("bad offset for revindex"); + + *pos = found - m->revindex_data; + return 0; +} diff --git a/pack-revindex.h b/pack-revindex.h index ba7c82c125..479b8f2f9c 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -14,6 +14,20 @@ * * - offset: the byte offset within the .pack file at which the object contents * can be found + * + * The revindex can also be used with a multi-pack index (MIDX). In this + * setting: + * + * - index position refers to an object's numeric position within the MIDX + * + * - pack position refers to an object's position within a non-existent pack + * described by the MIDX. The pack structure is described in + * Documentation/technical/pack-format.txt. + * + * It is effectively a concatenation of all packs in the MIDX (ordered by + * their numeric ID within the MIDX) in their original order within each + * pack, removing duplicates, and placing the preferred pack (if any) + * first. */ @@ -24,6 +38,7 @@ #define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" struct packed_git; +struct multi_pack_index; /* * load_pack_revindex populates the revindex's internal data-structures for the @@ -34,6 +49,22 @@ struct packed_git; */ int load_pack_revindex(struct packed_git *p); +/* + * load_midx_revindex loads the '.rev' file corresponding to the given + * multi-pack index by mmap-ing it and assigning pointers in the + * multi_pack_index to point at it. + * + * A negative number is returned on error. + */ +int load_midx_revindex(struct multi_pack_index *m); + +/* + * Frees resources associated with a multi-pack reverse index. + * + * A negative number is returned on error. + */ +int close_midx_revindex(struct multi_pack_index *m); + /* * offset_to_pack_pos converts an object offset to a pack position.
This * function returns zero on success, and a negative number otherwise. The @@ -71,4 +102,26 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos); */ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos); +/* + * pack_pos_to_midx converts the object at position "pos" within the MIDX + * pseudo-pack into a MIDX position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in constant time. + */ +uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos); + +/* + * midx_to_pack_pos converts from the MIDX-relative position at "at" to the + * corresponding pack position. + * + * If the reverse index has not yet been loaded, or the position is out of + * bounds, this function aborts. + * + * This function runs in time O(log N) with the number of objects in the MIDX. + */ +int midx_to_pack_pos(struct multi_pack_index *midx, uint32_t at, uint32_t *pos); + #endif diff --git a/packfile.c b/packfile.c index 1fec12ac5f..82623e0cb4 100644 --- a/packfile.c +++ b/packfile.c @@ -862,6 +862,9 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; + if (starts_with(file_name, "multi-pack-index") && + ends_with(file_name, ".rev")) + return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || From patchwork Thu Mar 11 17:05:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no
version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E09BC4332D for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 01E3C64FFD for ; Thu, 11 Mar 2021 17:06:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230092AbhCKRF5 (ORCPT ); Thu, 11 Mar 2021 12:05:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230104AbhCKRFh (ORCPT ); Thu, 11 Mar 2021 12:05:37 -0500 Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com [IPv6:2607:f8b0:4864:20::829]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57B5AC061574 for ; Thu, 11 Mar 2021 09:05:37 -0800 (PST) Received: by mail-qt1-x829.google.com with SMTP id j7so1663074qtx.5 for ; Thu, 11 Mar 2021 09:05:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=iHZGzO8Jecmr9vlHZnZ2tnw49/4/HdyfZGEjBh6XtVQ=; b=s5App8Cwzrk3eytwCziKcfHs1ktStJE4xMfksY+TRsT0s9vb3guUPKQ9BF5t0S1Bu0 CSBYen0AlI3kIcn5HDn76XITBQH9g/532JhrV4/nuBKq44AaFmp6hzxi9NKTsHE1OGkT IX60zEM/u855qhhxKMB2UiWTEruRQ8LbfH9l7LZE4R5lrT/uoVB9Mj7c9kY9wPOgxMQs 2jdC9k3CjFIecq3lSYpxp2L5/4uUrz+Tdiep6gfZ6YFS6UIEQXubJ5w7CfPErhozSFf3 Axo1beOdwqb393ui3eCI+Pl5goBTZCm5oJuK9kFE/4zaogXxNmmv7YTfADDGQrtriZGg ID4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=iHZGzO8Jecmr9vlHZnZ2tnw49/4/HdyfZGEjBh6XtVQ=; b=LI2hUl9fX8vRLp7unqTSOsr0bX057Zzm9FJpIp3KU96SEl+fHSZaM4GVQb9sLYFLMv pumG1u0Np7U2KCkcDlIAoihgd2dpXj2qXVOIACMzdl96V+Ech8s7DbNri6p1iIsqIdeu 
+mFQbGAFKozmcHWE8UbSN0TGgwKp+VxU1qHk+6uxREoXodP/rKdRPsiQ8lxmVWuhM0M7 dqPFMvWzHcqTahPNPBRXvQ2IO10ToMT+0PY2Hmx1Ul0hyM5bRI97B3L2Hs88MUAuKGvs fknrTwBmhegwzXcspp71yc70RrdUVb6hcl04p0hfFllXv6HuIEOGkMpBcbwFtLSqxV6A l5ag== X-Gm-Message-State: AOAM5302eBT1YVgG0DQKo+qUeHqzxWUspGSnq4aCxr/FvocIL45Gxf4w lNyzXtfx6s2MRd3AB5Ur90QxYss50XmDpeIQ X-Google-Smtp-Source: ABdhPJwydU6mVF7Yfv1LxL1IjwTVIdd2a7kXyr88Ct1JoISpKGpDLzhiA/C0dIXtGUwSRJh6g3YgDA== X-Received: by 2002:ac8:5a0d:: with SMTP id n13mr7885460qta.345.1615482336360; Thu, 11 Mar 2021 09:05:36 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id y1sm2294554qkf.55.2021.03.11.09.05.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:35 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:34 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 14/16] pack-write.c: extract 'write_rev_file_order' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Existing callers provide the reverse index code with an array of 'struct pack_idx_entry *'s, which is then sorted by pack order (comparing the offsets of each object within the pack). Prepare for the multi-pack index to write a .rev file by providing a way to write the reverse index without an array of pack_idx_entry (which the MIDX code does not have). Instead, callers can invoke 'write_rev_file_order()', which takes an array of uint32_t's. The ith entry in this array is the index-order position of the object that comes ith in pack order. Expose this new function for use in a later patch, and rewrite the existing write_rev_file() in terms of this new function.
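As a rough illustration of the kind of array such a caller would pass in, here is a minimal sketch in plain C. It assumes the caller knows only each object's offset; 'build_pack_order()', 'by_offset()' and 'sort_offsets' are hypothetical names that are not part of this patch, and standard qsort() stands in for git's QSORT_S(). In the result, entry k is the index-order position of the object with the k-th smallest offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Offsets consulted by the comparator; standard qsort() takes no context. */
static const uint64_t *sort_offsets;

/* Compare two index positions by the offsets they refer to. */
static int by_offset(const void *va, const void *vb)
{
	uint64_t a = sort_offsets[*(const uint32_t *)va];
	uint64_t b = sort_offsets[*(const uint32_t *)vb];

	return (a > b) - (a < b);
}

/*
 * Hypothetical helper (not from the patch): returns a malloc'd array in
 * which entry k is the index-order position of the object with the k-th
 * smallest offset. The caller frees the result.
 */
static uint32_t *build_pack_order(const uint64_t *offsets, uint32_t nr)
{
	uint32_t *order = malloc((nr ? nr : 1) * sizeof(*order));
	uint32_t i;

	assert(order);
	for (i = 0; i < nr; i++)
		order[i] = i;

	sort_offsets = offsets;
	qsort(order, nr, sizeof(*order), by_offset);
	return order;
}
```

The file-scope pointer keeps the sketch short but makes it non-reentrant, which is one reason the real code passes its context through QSORT_S() instead.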
Signed-off-by: Taylor Blau --- pack-write.c | 36 +++++++++++++++++++++++++----------- pack.h | 1 + 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/pack-write.c b/pack-write.c index 2ca85a9d16..f1fc3ecafa 100644 --- a/pack-write.c +++ b/pack-write.c @@ -201,21 +201,12 @@ static void write_rev_header(struct hashfile *f) } static void write_rev_index_positions(struct hashfile *f, - struct pack_idx_entry **objects, + uint32_t *pack_order, uint32_t nr_objects) { - uint32_t *pack_order; uint32_t i; - - ALLOC_ARRAY(pack_order, nr_objects); - for (i = 0; i < nr_objects; i++) - pack_order[i] = i; - QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); - for (i = 0; i < nr_objects; i++) hashwrite_be32(f, pack_order[i]); - - free(pack_order); } static void write_rev_trailer(struct hashfile *f, const unsigned char *hash) @@ -228,6 +219,29 @@ const char *write_rev_file(const char *rev_name, uint32_t nr_objects, const unsigned char *hash, unsigned flags) +{ + uint32_t *pack_order; + uint32_t i; + const char *ret; + + ALLOC_ARRAY(pack_order, nr_objects); + for (i = 0; i < nr_objects; i++) + pack_order[i] = i; + QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); + + ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash, + flags); + + free(pack_order); + + return ret; +} + +const char *write_rev_file_order(const char *rev_name, + uint32_t *pack_order, + uint32_t nr_objects, + const unsigned char *hash, + unsigned flags) { struct hashfile *f; int fd; @@ -262,7 +276,7 @@ const char *write_rev_file(const char *rev_name, write_rev_header(f); - write_rev_index_positions(f, objects, nr_objects); + write_rev_index_positions(f, pack_order, nr_objects); write_rev_trailer(f, hash); if (rev_name && adjust_shared_perm(rev_name) < 0) diff --git a/pack.h b/pack.h index 857cbd5bd4..fa13954526 100644 --- a/pack.h +++ b/pack.h @@ -94,6 +94,7 @@ struct ref; void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought); const char 
*write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags); +const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags); /* * The "hdr" output buffer should be at least this big, which will handle sizes From patchwork Thu Mar 11 17:05:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132185 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6191CC4332B for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 408A965002 for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230131AbhCKRF6 (ORCPT ); Thu, 11 Mar 2021 12:05:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230115AbhCKRFl (ORCPT ); Thu, 11 Mar 2021 12:05:41 -0500 Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com [IPv6:2607:f8b0:4864:20::72f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80902C061574 for ; Thu, 11 Mar 2021 09:05:41 -0800 (PST) Received: by mail-qk1-x72f.google.com with SMTP id a9so21310129qkn.13 for ; Thu, 11 Mar 2021 09:05:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=zupcjOX+OjuhNAqQZ1wmKITSMCyqlzkRPDe7kRUGDz8=; b=c8o/xYfGrTIIe4Zih7o/15UYVm77toPrqLb7JVH970jFu3ou7Y0XNxI0N7rV0YA8ii XXitxq89/NskcIb1maXY3rXx0ra9YhEq0Xh9VGJ9RgIxBMO7MNsM7mfkGDPpKyoG+fmH hAssj02HAFzp/x+tB5gRh0M+sX8uF6xLPiZ+PVH5Pd/dOULnAT08FEVo6PAdzRkKWmoF tHfUWXUZbqQlCX884BezJo6uGv/x6q0cqYrDM1otA6xtmYFYp511JZDEuNhh7lHDptqr v0Y2w2jmoKsUl4EzQxYcNaOsmnsIa+LxRaoLCwwtcnMnC310CIdoZiuVx9clYifXuVs6 JfCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=zupcjOX+OjuhNAqQZ1wmKITSMCyqlzkRPDe7kRUGDz8=; b=hm/LZSz9Yzdkwzn9iTc6SHgs9uXVbaqRDIqLFyLK73j7kngfzkcpPvo7fpQxaoTeS/ ja5AXLW/VV0CE+sAGJw1mjhF398cAPN564Oc+JlHPbtjA1k11JGobLre0F2Rl7AR55Rp +neakdg2NUN6xYerZbQpL1tV43e9I3RrU4wy+nsGJNnO9vnKDLAKWq5np4Ewo2Fb2Vcu FRyJJrFg65WRBYjokZ5knMy1PdpimYbaFBX8lFsxsX+593cWjjlV0hhhgrwPieL78V+c 5Fvj81tNrJQrNW6p5R26L9yWRKmlgw13SMw4FixmIFURoPzBSL+M0OPcu/eB18slrEop CZ2A== X-Gm-Message-State: AOAM533ndm3l7r+DLf/548BQothNYsYBJk5WOOoJDvM807YgPZ7kI377 JayAezhfzgzsCJkJUTSuA+NtfVKrM4jQhYZK X-Google-Smtp-Source: ABdhPJx//h5X0BW42+68fF739KeENPsI/lnCb5pa8KAWivbcl1/3rVRksEymsH3qk1fSKKVSj6u/Qg== X-Received: by 2002:a37:9d57:: with SMTP id g84mr7982786qke.71.1615482340407; Thu, 11 Mar 2021 09:05:40 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id w197sm2278253qkb.89.2021.03.11.09.05.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:40 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:38 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 15/16] pack-revindex: write multi-pack reverse indexes Message-ID: 
References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Implement the writing half of multi-pack reverse indexes. This is nothing more than the format described a few patches ago, with a new set of helper functions that will be used to clear out stale .rev files corresponding to old MIDXs. Unfortunately, a comparison function very similar to the one implemented recently in pack-revindex.c is reimplemented here, this time accepting a MIDX-internal type. An effort to DRY these up would create more indirection and overhead than is necessary, so it isn't pursued here. Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX flag, meaning that this is all dead code. But that won't be the case for long, since subsequent patches will introduce the multi-pack bitmap, which will begin passing this flag. (In midx.c:write_midx_internal(), the two adjacent if statements share a conditional, but are written separately since the first one will eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet exist.)
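The stale-file cleanup comes down to a filename filter. The following is a minimal sketch of that filter in plain C, under the assumption that companion files are named "multi-pack-index-<hash><ext>"; 'is_stale_midx_file()', 'has_prefix()' and 'has_suffix()' are hypothetical stand-ins rather than functions from this patch, which uses git's own starts_with() and ends_with().

```c
#include <assert.h>
#include <string.h>

/* Nonzero if s begins with prefix. */
static int has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Nonzero if s ends with suffix. */
static int has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/*
 * Hypothetical predicate (not from the patch): nonzero if file_name looks
 * like a MIDX companion file with extension ext and is not the file named
 * by keep. keep may be NULL, in which case every match is stale.
 */
static int is_stale_midx_file(const char *file_name, const char *ext,
			      const char *keep)
{
	assert(file_name && ext);

	if (!has_prefix(file_name, "multi-pack-index-") ||
	    !has_suffix(file_name, ext))
		return 0;
	if (keep && !strcmp(keep, file_name))
		return 0;
	return 1;
}
```

Requiring the trailing dash in the prefix means the "multi-pack-index" file itself never matches, so only hash-suffixed companions are candidates for removal.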
Signed-off-by: Taylor Blau --- midx.c | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ midx.h | 1 + 2 files changed, 116 insertions(+) diff --git a/midx.c b/midx.c index 55f4567fca..eea9574d92 100644 --- a/midx.c +++ b/midx.c @@ -12,6 +12,7 @@ #include "run-command.h" #include "repository.h" #include "chunk-format.h" +#include "pack.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -472,6 +473,7 @@ struct write_midx_context { uint32_t entries_nr; uint32_t *pack_perm; + uint32_t *pack_order; unsigned large_offsets_needed:1; uint32_t num_large_offsets; @@ -826,6 +828,70 @@ static int write_midx_large_offsets(struct hashfile *f, return 0; } +static int midx_pack_order_cmp(const void *va, const void *vb, void *_ctx) +{ + struct write_midx_context *ctx = _ctx; + + struct pack_midx_entry *a = &ctx->entries[*(const uint32_t *)va]; + struct pack_midx_entry *b = &ctx->entries[*(const uint32_t *)vb]; + + uint32_t perm_a = ctx->pack_perm[a->pack_int_id]; + uint32_t perm_b = ctx->pack_perm[b->pack_int_id]; + + /* Sort objects in the preferred pack ahead of any others. */ + if (a->preferred > b->preferred) + return -1; + if (a->preferred < b->preferred) + return 1; + + /* Then, order objects by which packs they appear in. */ + if (perm_a < perm_b) + return -1; + if (perm_a > perm_b) + return 1; + + /* Then, disambiguate by their offset within each pack. 
*/ + if (a->offset < b->offset) + return -1; + if (a->offset > b->offset) + return 1; + + return 0; +} + +static uint32_t *midx_pack_order(struct write_midx_context *ctx) +{ + uint32_t *pack_order; + uint32_t i; + + ALLOC_ARRAY(pack_order, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) + pack_order[i] = i; + QSORT_S(pack_order, ctx->entries_nr, midx_pack_order_cmp, ctx); + + return pack_order; +} + +static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, + struct write_midx_context *ctx) +{ + struct strbuf buf = STRBUF_INIT; + const char *tmp_file; + + strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash)); + + tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr, + midx_hash, WRITE_REV); + + if (finalize_object_file(tmp_file, buf.buf)) + die(_("cannot store reverse index file")); + + strbuf_release(&buf); +} + +static void clear_midx_files_ext(struct repository *r, const char *ext, + unsigned char *keep_hash); + static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, struct string_list *packs_to_drop, const char *preferred_pack_name, @@ -1011,6 +1077,14 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); + + if (flags & MIDX_WRITE_REV_INDEX) + ctx.pack_order = midx_pack_order(&ctx); + + if (flags & MIDX_WRITE_REV_INDEX) + write_midx_reverse_index(midx_name, midx_hash, &ctx); + clear_midx_files_ext(the_repository, ".rev", midx_hash); + commit_lock_file(&lk); cleanup: @@ -1025,6 +1099,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * free(ctx.info); free(ctx.entries); free(ctx.pack_perm); + free(ctx.pack_order); free(midx_name); return result; } @@ -1037,6 +1112,44 @@ int write_midx_file(const char *object_dir, flags); } +struct clear_midx_data { + char *keep; + const char *ext; +}; + +static void clear_midx_file_ext(const 
char *full_path, size_t full_path_len, + const char *file_name, void *_data) +{ + struct clear_midx_data *data = _data; + + if (!(starts_with(file_name, "multi-pack-index-") && + ends_with(file_name, data->ext))) + return; + if (data->keep && !strcmp(data->keep, file_name)) + return; + + if (unlink(full_path)) + die_errno(_("failed to remove %s"), full_path); +} + +static void clear_midx_files_ext(struct repository *r, const char *ext, + unsigned char *keep_hash) +{ + struct clear_midx_data data; + memset(&data, 0, sizeof(struct clear_midx_data)); + + if (keep_hash) + data.keep = xstrfmt("multi-pack-index-%s%s", + hash_to_hex(keep_hash), ext); + data.ext = ext; + + for_each_file_in_pack_dir(r->objects->odb->path, + clear_midx_file_ext, + &data); + + free(data.keep); +} + void clear_midx_file(struct repository *r) { char *midx = get_midx_filename(r->objects->odb->path); @@ -1049,6 +1162,8 @@ void clear_midx_file(struct repository *r) if (remove_path(midx)) die(_("failed to clear multi-pack-index at %s"), midx); + clear_midx_files_ext(r, ".rev", NULL); + free(midx); } diff --git a/midx.h b/midx.h index 0a8294d2ee..8684cf0fef 100644 --- a/midx.h +++ b/midx.h @@ -40,6 +40,7 @@ struct multi_pack_index { }; #define MIDX_PROGRESS (1 << 0) +#define MIDX_WRITE_REV_INDEX (1 << 1) char *get_midx_rev_filename(struct multi_pack_index *m); From patchwork Thu Mar 11 17:05:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12132187 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 4CA08C4332E for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 27B7164FF9 for ; Thu, 11 Mar 2021 17:06:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230137AbhCKRF6 (ORCPT ); Thu, 11 Mar 2021 12:05:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230116AbhCKRFp (ORCPT ); Thu, 11 Mar 2021 12:05:45 -0500 Received: from mail-il1-x12d.google.com (mail-il1-x12d.google.com [IPv6:2607:f8b0:4864:20::12d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F409C061574 for ; Thu, 11 Mar 2021 09:05:45 -0800 (PST) Received: by mail-il1-x12d.google.com with SMTP id d5so19566749iln.6 for ; Thu, 11 Mar 2021 09:05:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=dFq0lRmBt/kR5avjX9ISuKsERFydf+Naiz91SkYuCko=; b=zlI2PQVXO27vaqGk4oG9wZMB4QlB0QVvBE38OyiH2NyrBGY0WlY2mQga9ERrbEw3lV ay9t/3DfZiLUV0sHkI9Unz1x6LA7I/3E1t7DOOg4IfSXt3/vgCoAdBKnSRBuhTLFe0ZG W9U3A/+ao/oGjRQkVyzOZUU+wvikvZpzPneFweQPTF1VlybgdA8p5ZuC2Um5xst2kWSF rPvertcrXAJXKTNkpM+ZFyKTPMEKg/K7O39BHlhXW0PxHLeRpU5ndg7xe87pPM9qwz5E yS139BzEONdYtdXjItq1o0jYKVOAFknvPC5xpt8zb3nNK25JjRUpJ5FpCvX9HzOxRDND ZsUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=dFq0lRmBt/kR5avjX9ISuKsERFydf+Naiz91SkYuCko=; b=JtGCIcvO6im1+3es1Ns1Z2KPR1p/9uy5uok56j7K7zocOTKsvuL1q68wFW7d5HpFMU AKUbzKoj6Dpu/nHqYzfpeApDhs6yt1wIBmbKk1JVlN76QIR1IiU7eG7av5irryUnHjgB b+4FpoqkiYPwOm5lOPV2CrwBuYqVhOkX0qWHiOUDBcamtc6N1iSrYh+FDx6WJIyZd3oL 
ySOXh0aL3EAvWLj1oU4fZciMb3ALK5O/dAQwq8eC1A2EYeqXCVoNam06QZfSsNvWRm8C wCXb0Ai1FIG5I1EfUIiVvVe2mN3Kp1cnaeIT56MUje0mxeB8rFDDmTD4obzYGRop8xd7 baVg== X-Gm-Message-State: AOAM532FvksZvY+RgOCEw5JiT7l5Pib0J7xqS8tVHvYIHCyXitf7fJmL IGd8f6X6f06VldAbcsqsJk3t/fHM2q6p17fw X-Google-Smtp-Source: ABdhPJwPRvu2rfEXth1ikQGhz/HDQQBl5Hk+gUTIFE5wr9aX9byikUKf4ZrYoryLF1SMbrgAUfctww== X-Received: by 2002:a05:6e02:489:: with SMTP id b9mr7559612ils.37.1615482344693; Thu, 11 Mar 2021 09:05:44 -0800 (PST) Received: from localhost ([2605:9480:22e:ff10:f947:1686:6ada:db5b]) by smtp.gmail.com with ESMTPSA id l17sm1651629ilt.27.2021.03.11.09.05.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Mar 2021 09:05:44 -0800 (PST) Date: Thu, 11 Mar 2021 12:05:42 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: avarab@gmail.com, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com, peff@peff.net Subject: [PATCH v3 16/16] midx.c: improve cache locality in midx_pack_order_cmp() Message-ID: <550e785f10ba14f166958501c007b75a04052a0d.1615482270.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King There is a lot of pointer dereferencing in the pre-image version of 'midx_pack_order_cmp()', which this patch gets rid of. Instead of comparing the pack preferred-ness and then the pack id, both of these checks are done at the same time by using the high-order bit of the pack id to represent whether it's preferred (the bit is set for objects outside the preferred pack, so preferred objects sort first). Then the pack id and offset are compared as usual. This produces the same result so long as there are fewer than 2^31 packs, which seems like a safe assumption to make in practice.
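To make the bit trick concrete, here is a small standalone sketch in plain C. The names 'pack_sort_key()' and 'cmp_key_offset()' are hypothetical and exist only for illustration; the patch itself stores the combined value in a struct field and compares struct members. The sketch assumes pack ids fit in 31 bits, as the paragraph above requires.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold "preferred" into the top bit of the pack id.
 * Non-preferred objects get the bit set, so any preferred object has a
 * smaller key than any non-preferred one.
 */
static uint32_t pack_sort_key(uint32_t pack_id, int preferred)
{
	assert(pack_id < (UINT32_C(1) << 31)); /* the top bit must be free */
	return preferred ? pack_id : (pack_id | (UINT32_C(1) << 31));
}

/* Two-field comparison: combined key first, then offset within the pack. */
static int cmp_key_offset(uint32_t key_a, uint64_t off_a,
			  uint32_t key_b, uint64_t off_b)
{
	if (key_a != key_b)
		return key_a < key_b ? -1 : 1;
	return (off_a > off_b) - (off_a < off_b);
}
```

Because the key is computed once per object before sorting, the comparator reads only two plain integers per side instead of following pointers into the entries and permutation arrays.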
Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- midx.c | 55 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 26 deletions(-) diff --git a/midx.c b/midx.c index eea9574d92..4835cc13d1 100644 --- a/midx.c +++ b/midx.c @@ -828,46 +828,49 @@ static int write_midx_large_offsets(struct hashfile *f, return 0; } -static int midx_pack_order_cmp(const void *va, const void *vb, void *_ctx) +struct midx_pack_order_data { + uint32_t nr; + uint32_t pack; + off_t offset; +}; + +static int midx_pack_order_cmp(const void *va, const void *vb) { - struct write_midx_context *ctx = _ctx; - - struct pack_midx_entry *a = &ctx->entries[*(const uint32_t *)va]; - struct pack_midx_entry *b = &ctx->entries[*(const uint32_t *)vb]; - - uint32_t perm_a = ctx->pack_perm[a->pack_int_id]; - uint32_t perm_b = ctx->pack_perm[b->pack_int_id]; - - /* Sort objects in the preferred pack ahead of any others. */ - if (a->preferred > b->preferred) + const struct midx_pack_order_data *a = va, *b = vb; + if (a->pack < b->pack) return -1; - if (a->preferred < b->preferred) + else if (a->pack > b->pack) return 1; - - /* Then, order objects by which packs they appear in. */ - if (perm_a < perm_b) + else if (a->offset < b->offset) return -1; - if (perm_a > perm_b) + else if (a->offset > b->offset) return 1; - - /* Then, disambiguate by their offset within each pack. 
*/ - if (a->offset < b->offset) - return -1; - if (a->offset > b->offset) - return 1; - - return 0; + else + return 0; } static uint32_t *midx_pack_order(struct write_midx_context *ctx) { + struct midx_pack_order_data *data; uint32_t *pack_order; uint32_t i; + ALLOC_ARRAY(data, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *e = &ctx->entries[i]; + data[i].nr = i; + data[i].pack = ctx->pack_perm[e->pack_int_id]; + if (!e->preferred) + data[i].pack |= (1U << 31); + data[i].offset = e->offset; + } + + QSORT(data, ctx->entries_nr, midx_pack_order_cmp); + ALLOC_ARRAY(pack_order, ctx->entries_nr); for (i = 0; i < ctx->entries_nr; i++) - pack_order[i] = i; - QSORT_S(pack_order, ctx->entries_nr, midx_pack_order_cmp, ctx); + pack_order[i] = data[i].nr; + free(data); return pack_order; }