From patchwork Wed Feb 9 02:10:03 2022
X-Patchwork-Submitter: John Cai
X-Patchwork-Id: 12739558
Date: Wed, 09 Feb 2022 02:10:03 +0000
Subject: [PATCH v2 1/4] pack-objects: allow --filter without --stdout
To: git@vger.kernel.org
Cc: John Cai
From: John Cai

9535ce7 taught pack-objects to use filtering, but required --stdout
because no partial clone mechanism was yet in place to handle missing
objects. Since then, changes such as 9e27beaa have added support for
dynamically fetching missing objects.

Remove the --stdout requirement so that in the next commit, repack can
pass --filter to pack-objects to omit certain objects from the packfile.
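
For illustration only (not part of the patch), here is a minimal sketch
of what dropping the requirement permits: writing a filtered pack to
disk instead of to stdout. The throwaway repository, the file name and
the "filtered" base name are made up, and the sketch assumes nothing
about the git build beyond pack-objects itself, so it reports which of
the two behaviours it observed.

```shell
#!/bin/sh
# Sketch only: build a throwaway repository, then ask pack-objects to write
# a blob-less pack to disk (no --stdout). All names here are hypothetical.
tmp=$(mktemp -d) || exit 1
git init -q "$tmp/repo" &&
echo content >"$tmp/repo/file1" &&
git -C "$tmp/repo" add file1 &&
git -C "$tmp/repo" -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial || exit 1

# --revs makes pack-objects walk from the revisions read on stdin, which is
# what lets --filter decide which objects to leave out.
if echo HEAD | git -C "$tmp/repo" pack-objects --revs --filter=blob:none \
	"$tmp/filtered" >/dev/null 2>&1
then
	result="wrote filtered pack to disk"
else
	# Builds without this change are expected to die with
	# "cannot use --filter without --stdout".
	result="--filter still requires --stdout"
fi
echo "$result"
rm -rf "$tmp"
```

On a build that includes this change, the first branch should be taken
and the pack files appear next to the given base name; on an unpatched
build the second branch is expected.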

Based-on-patch-by: Christian Couder
Signed-off-by: John Cai
---
 builtin/pack-objects.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index ba2006f2212..2d1ecb18784 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4075,8 +4075,6 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		unpack_unreachable_expiration = 0;
 
 	if (filter_options.choice) {
-		if (!pack_to_stdout)
-			die(_("cannot use --filter without --stdout"));
 		if (stdin_packs)
 			die(_("cannot use --filter with --stdin-packs"));
 	}

From patchwork Wed Feb 9 02:10:04 2022
X-Patchwork-Submitter: John Cai
X-Patchwork-Id: 12739555
Message-Id: <6e7c8410b8dcd2f4a7e188eb5b55ae8eecb54e40.1644372606.git.gitgitgadget@gmail.com>
Date: Wed, 09 Feb 2022 02:10:04 +0000
Subject: [PATCH v2 2/4] repack: add --filter=<filter-spec> option
To: git@vger.kernel.org
Cc: John Cai
From: John Cai

In order to use a separate http server as a remote to offload large
blobs, imagine the following:

A. an http server to use as a generalized object store.
B. a server update hook that uploads large blobs to (A).
C. a git server.
D. a remote helper that knows how to download objects from the http
   server.
E. a regular job that runs `git repack --filter` to remove large blobs
   from (C).

Clients would need to configure both (C) and (A) as promisor remotes to
be able to get everything. When they push new large blobs, they can
still push them to (C), as (B) will upload them to (A), and (E) will
regularly remove those large blobs from (C).

This way, with a little bit of client and server configuration, we can
have a native way to support offloading large files without git LFS.
It would be more flexible, as you can easily tweak which blobs are
considered large files by tweaking (B) and (E).

A fuller demo can be found at http://tiny.cc/object_storage_demo

Based-on-patch-by: Christian Couder
Signed-off-by: John Cai
---
 Documentation/git-repack.txt |  5 +++++
 builtin/repack.c             | 22 +++++++++++++++-------
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ee30edc178a..e394ec52ab1 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -126,6 +126,11 @@ depth is 4095.
 	a larger and slower repository; see the discussion in
 	`pack.packSizeLimit`.
 
+--filter=<filter-spec>::
+	Omits certain objects (usually blobs) from the resulting
+	packfile. See linkgit:git-rev-list[1] for valid
+	`<filter-spec>` forms.
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index da1e364a756..3f1e8a39a2b 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -152,6 +152,7 @@ struct pack_objects_args {
 	const char *depth;
 	const char *threads;
 	const char *max_pack_size;
+	const char *filter;
 	int no_reuse_delta;
 	int no_reuse_object;
 	int quiet;
@@ -172,6 +173,8 @@ static void prepare_pack_objects(struct child_process *cmd,
 		strvec_pushf(&cmd->args, "--threads=%s", args->threads);
 	if (args->max_pack_size)
 		strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size);
+	if (args->filter)
+		strvec_pushf(&cmd->args, "--filter=%s", args->filter);
 	if (args->no_reuse_delta)
 		strvec_pushf(&cmd->args, "--no-reuse-delta");
 	if (args->no_reuse_object)
@@ -238,6 +241,13 @@ static unsigned populate_pack_exts(char *name)
 	return ret;
 }
 
+static void write_promisor_file_1(char *p)
+{
+	char *promisor_name = mkpathdup("%s-%s.promisor", packtmp, p);
+	write_promisor_file(promisor_name, NULL, 0);
+	free(promisor_name);
+}
+
 static void repack_promisor_objects(const struct pack_objects_args *args,
 				    struct string_list *names)
 {
@@ -269,7 +279,6 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
 	out = xfdopen(cmd.out, "r");
 	while (strbuf_getline_lf(&line, out) != EOF) {
 		struct string_list_item *item;
-		char *promisor_name;
 
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
@@ -286,13 +295,8 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
 		 * concatenate the contents of all .promisor files instead of
 		 * just creating a new empty file.
 		 */
-		promisor_name = mkpathdup("%s-%s.promisor", packtmp,
-					  line.buf);
-		write_promisor_file(promisor_name, NULL, 0);
-
+		write_promisor_file_1(line.buf);
 		item->util = (void *)(uintptr_t)populate_pack_exts(item->string);
-
-		free(promisor_name);
 	}
 	fclose(out);
 	if (finish_command(&cmd))
@@ -660,6 +664,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("limits the maximum number of threads")),
 		OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"),
 				N_("maximum size of each packfile")),
+		OPT_STRING(0, "filter", &po_args.filter, N_("args"),
+				N_("object filtering")),
 		OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects,
 				N_("repack objects in packs marked with .keep")),
 		OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"),
@@ -819,6 +825,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
 		string_list_append(&names, line.buf);
+		if (po_args.filter)
+			write_promisor_file_1(line.buf);
 	}
 	fclose(out);
 	ret = finish_command(&cmd);

From patchwork Wed Feb 9 02:10:05 2022
X-Patchwork-Submitter: John Cai
X-Patchwork-Id: 12739557
Message-Id: <40612b9663b8d20e8cfa25ccfce76c7f97e4934d.1644372606.git.gitgitgadget@gmail.com>
Date: Wed, 09 Feb 2022 02:10:05 +0000
Subject: [PATCH v2 3/4] upload-pack: allow missing promisor objects
To: git@vger.kernel.org
Cc: John Cai
From: John Cai

When a git server (A) is used alongside an http server remote (B) that
stores large blobs, and a client fetches objects from both (A) and (B),
we do not want (A) to fetch missing objects during object traversal.

Add a config value uploadpack.allowmissingpromisor that, when set to
true, will allow (A) to skip fetching missing objects.

Based-on-patch-by: Christian Couder
Signed-off-by: John Cai
---
 upload-pack.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/upload-pack.c b/upload-pack.c
index 8acc98741bb..39b56650b77 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -112,6 +112,7 @@ struct upload_pack_data {
 	unsigned allow_ref_in_want : 1;	/* v2 only */
 	unsigned allow_sideband_all : 1;	/* v2 only */
 	unsigned advertise_sid : 1;
+	unsigned allow_missing_promisor : 1;
 };
 
 static void upload_pack_data_init(struct upload_pack_data *data)
@@ -309,6 +310,8 @@ static void create_pack_file(struct upload_pack_data *pack_data,
 		strvec_push(&pack_objects.args, "--delta-base-offset");
 	if (pack_data->use_include_tag)
 		strvec_push(&pack_objects.args, "--include-tag");
+	if (pack_data->allow_missing_promisor)
+		strvec_push(&pack_objects.args, "--missing=allow-promisor");
 	if (pack_data->filter_options.choice) {
 		const char *spec =
 			expand_list_objects_filter_spec(&pack_data->filter_options);
@@ -1315,6 +1318,8 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->allow_ref_in_want = git_config_bool(var, value);
 	} else if (!strcmp("uploadpack.allowsidebandall", var)) {
 		data->allow_sideband_all = git_config_bool(var, value);
+	} else if (!strcmp("uploadpack.allowmissingpromisor", var)) {
+		data->allow_missing_promisor = git_config_bool(var, value);
 	} else if (!strcmp("core.precomposeunicode", var)) {
 		precomposed_unicode = git_config_bool(var, value);
 	} else if (!strcmp("transfer.advertisesid", var)) {

From patchwork Wed Feb 9 02:10:06 2022
X-Patchwork-Submitter: John Cai
X-Patchwork-Id: 12739560
Date: Wed, 09 Feb 2022 02:10:06 +0000
Subject: [PATCH v2 4/4] tests for repack --filter mode
To: git@vger.kernel.org
Cc: John Cai
From: John Cai

This patch adds tests for the repack --filter functionality, both in
isolation (in t7700-repack.sh) and as a way to offload large blobs
(in t0410-partial-clone.sh).

Several scripts are added so we can test the process of using a remote
helper to upload blobs to an http server.

- t/lib-httpd/list.sh lists blobs uploaded to the http server.
- t/lib-httpd/upload.sh uploads blobs to the http server.
- t/t0410/git-remote-testhttpgit a remote helper that can access blobs
  on an http server.
  Copied over from t/t5801/git-remote-testgit and modified to upload
  blobs to an http server.
- t/t0410/lib-http-promisor.sh convenience functions for uploading blobs

Based-on-patch-by: Christian Couder
Signed-off-by: John Cai
---
 t/lib-httpd.sh                 |   2 +
 t/lib-httpd/apache.conf        |   8 ++
 t/lib-httpd/list.sh            |  43 +++++++++
 t/lib-httpd/upload.sh          |  46 +++++++++
 t/t0410-partial-clone.sh       |  81 ++++++++++++++++
 t/t0410/git-remote-testhttpgit | 170 +++++++++++++++++++++++++++++++++
 t/t7700-repack.sh              |  20 ++++
 7 files changed, 370 insertions(+)
 create mode 100644 t/lib-httpd/list.sh
 create mode 100644 t/lib-httpd/upload.sh
 create mode 100755 t/t0410/git-remote-testhttpgit

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 782891908d7..fc6587c6d39 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -136,6 +136,8 @@ prepare_httpd() {
 	install_script error-smart-http.sh
 	install_script error.sh
 	install_script apply-one-time-perl.sh
+	install_script upload.sh
+	install_script list.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 497b9b9d927..1ea382750f0 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -129,6 +129,8 @@ ScriptAlias /broken_smart/ broken-smart-http.sh/
 ScriptAlias /error_smart/ error-smart-http.sh/
 ScriptAlias /error/ error.sh/
 ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1
+ScriptAlias /upload/ upload.sh/
+ScriptAlias /list/ list.sh/
 	Options FollowSymlinks
@@ -156,6 +158,12 @@ ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1
 	Options ExecCGI
+
+	Options ExecCGI
+
+
+	Options ExecCGI
+
 RewriteEngine on
 RewriteRule ^/dumb-redir/(.*)$ /dumb/$1 [R=301]
diff --git a/t/lib-httpd/list.sh b/t/lib-httpd/list.sh
new file mode 100644
index 00000000000..e63406be3b2
--- /dev/null
+++ b/t/lib-httpd/list.sh
@@ -0,0 +1,43 @@
+#!/bin/sh
+
+# Used in the httpd test server; called by a remote helper to list objects.
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+	key=${1%%=*}
+	val=${1#*=}
+
+	case "$key" in
+	"sha1") sha1="$val" ;;
+	*) echo >&2 "unknown key '$key'" ;;
+	esac
+
+	shift
+done
+
+if test -d "$FILES_DIR"
+then
+	if test -z "$sha1"
+	then
+		echo 'Status: 200 OK'
+		echo
+		ls "$FILES_DIR" | tr '-' ' '
+	else
+		if test -f "$FILES_DIR/$sha1"-*
+		then
+			echo 'Status: 200 OK'
+			echo
+			cat "$FILES_DIR/$sha1"-*
+		else
+			echo 'Status: 404 Not Found'
+			echo
+		fi
+	fi
+fi
diff --git a/t/lib-httpd/upload.sh b/t/lib-httpd/upload.sh
new file mode 100644
index 00000000000..202de63b2dc
--- /dev/null
+++ b/t/lib-httpd/upload.sh
@@ -0,0 +1,46 @@
+#!/bin/sh
+
+# In part from http://codereview.stackexchange.com/questions/79549/bash-cgi-upload-file
+# Used in the httpd test server for a remote helper to call to upload blobs.
+
+FILES_DIR="www/files"
+
+OLDIFS="$IFS"
+IFS='&'
+set -- $QUERY_STRING
+IFS="$OLDIFS"
+
+while test $# -gt 0
+do
+	key=${1%%=*}
+	val=${1#*=}
+
+	case "$key" in
+	"sha1") sha1="$val" ;;
+	"type") type="$val" ;;
+	"size") size="$val" ;;
+	"delete") delete=1 ;;
+	*) echo >&2 "unknown key '$key'" ;;
+	esac
+
+	shift
+done
+
+case "$REQUEST_METHOD" in
+POST)
+	if test "$delete" = "1"
+	then
+		rm -f "$FILES_DIR/$sha1-$size-$type"
+	else
+		mkdir -p "$FILES_DIR"
+		cat >"$FILES_DIR/$sha1-$size-$type"
+	fi
+
+	echo 'Status: 204 No Content'
+	echo
+	;;
+
+*)
+	echo 'Status: 405 Method Not Allowed'
+	echo
+esac
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index f17abd298c8..0724043ffb7 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -30,6 +30,31 @@ promise_and_delete () {
 	delete_object repo "$HASH"
 }
 
+upload_blob() {
+	SERVER_REPO="$1"
+	HASH="$2"
+
+	test -n "$HASH" || die "Invalid argument '$HASH'"
+	HASH_SIZE=$(git -C "$SERVER_REPO" cat-file -s "$HASH") || {
+		echo >&2 "Cannot get blob size of '$HASH'"
+		return 1
+	}
+
+	UPLOAD_URL="http://127.0.0.1:$LIB_HTTPD_PORT/upload/?sha1=$HASH&size=$HASH_SIZE&type=blob"
+
+	git -C "$SERVER_REPO" cat-file blob "$HASH" >object &&
+	curl --data-binary @object --include "$UPLOAD_URL"
+}
+
+upload_blobs_from_stdin() {
+	SERVER_REPO="$1"
+	while read -r blob
+	do
+		echo "uploading $blob"
+		upload_blob "$SERVER_REPO" "$blob" || return
+	done
+}
+
 test_expect_success 'extensions.partialclone without filter' '
 	test_create_repo server &&
 	git clone --filter="blob:none" "file://$(pwd)/server" client &&
@@ -668,6 +693,62 @@ test_expect_success 'fetching of missing objects from an HTTP server' '
 	grep "$HASH" out
 '
 
+PATH="$TEST_DIRECTORY/t0410:$PATH"
+
+test_expect_success 'fetch of missing objects through remote helper' '
+	rm -rf origin server &&
+	test_create_repo origin &&
+	dd if=/dev/zero of=origin/file1 bs=801k count=1 &&
+	git -C origin add file1 &&
+	git -C origin commit -m "large blob" &&
+	sha="$(git -C origin rev-parse :file1)" &&
+	expected="?$(git -C origin rev-parse :file1)" &&
+	git clone --bare --no-local origin server &&
+	git -C server remote add httpremote "testhttpgit::${PWD}/server" &&
+	git -C server config remote.httpremote.promisor true &&
+	git -C server config --remove-section remote.origin &&
+	git -C server rev-list --all --objects --filter-print-omitted \
+		--filter=blob:limit=800k | perl -ne "print if s/^[~]//" \
+		>large_blobs.txt &&
+	upload_blobs_from_stdin server objects &&
+	grep "$expected" objects &&
+	HTTPD_URL=$HTTPD_URL git -C server show $sha &&
+	git -C server rev-list --objects --all --missing=print >objects &&
+	grep "$sha" objects
+'
+
+test_expect_success 'fetch does not cause server to fetch missing objects' '
+	rm -rf origin server client &&
+	test_create_repo origin &&
+	dd if=/dev/zero of=origin/file1 bs=801k count=1 &&
+	git -C origin add file1 &&
+	git -C origin commit -m "large blob" &&
+	sha="$(git -C origin rev-parse :file1)" &&
+	expected="?$(git -C origin rev-parse :file1)" &&
+	git clone --bare --no-local origin server &&
+	git -C server remote add httpremote "testhttpgit::${PWD}/server" &&
+	git -C server config remote.httpremote.promisor true &&
+	git -C server config --remove-section remote.origin &&
+	git -C server rev-list --all --objects --filter-print-omitted \
+		--filter=blob:limit=800k | perl -ne "print if s/^[~]//" \
+		>large_blobs.txt &&
+	upload_blobs_from_stdin server client_objects &&
+	grep "$expected" client_objects &&
+	git -C server rev-list --objects --all --missing=print >server_objects &&
+	grep "$expected" server_objects
+'
+
 # DO NOT add non-httpd-specific tests here, because the last part of this
 # test script is only executed when httpd is available and enabled.
 
diff --git a/t/t0410/git-remote-testhttpgit b/t/t0410/git-remote-testhttpgit
new file mode 100755
index 00000000000..e5e187243ed
--- /dev/null
+++ b/t/t0410/git-remote-testhttpgit
@@ -0,0 +1,170 @@
+#!/bin/sh
+# Copyright (c) 2012 Felipe Contreras
+# Copyright (c) 2020 Christian Couder
+
+# This is a git remote helper that can be used to store blobs on an http server
+
+# The first argument can be a url when the fetch/push command was a url
+# instead of a configured remote. In this case, use a generic alias.
+if test "$1" = "testhttpgit::$2"; then
+	alias=_
+else
+	alias=$1
+fi
+url=$2
+
+unset GIT_DIR
+
+h_refspec="refs/heads/*:refs/testhttpgit/$alias/heads/*"
+t_refspec="refs/tags/*:refs/testhttpgit/$alias/tags/*"
+
+if test -n "$GIT_REMOTE_TESTHTTPGIT_NOREFSPEC"
+then
+	h_refspec=""
+	t_refspec=""
+fi
+
+die () {
+	echo >&2 "fatal: $*"
+	echo "fatal: $*" >>/tmp/t0430.txt
+	echo >>/tmp/t0430.txt
+	exit 1
+}
+
+force=
+
+mark_count_tmp=$(mktemp -t git-remote-http-mark-count_XXXXXX) || die "Failed to create temp file"
+echo "1" >"$mark_count_tmp"
+
+get_mark_count() {
+	mark=$(cat "$mark_count_tmp")
+	echo "$mark"
+	mark=$((mark+1))
+	echo "$mark" >"$mark_count_tmp"
+}
+
+export_blob_from_file() {
+	file="$1"
+	echo "blob"
+	echo "mark :$(get_mark_count)"
+	size=$(wc -c <"$file") || return
+	echo "data $size"
+	cat "$file" || return
+	echo
+}
+
+while read line
+do
+	case $line in
+	capabilities)
+		echo 'import'
+		echo 'export'
+		test -n "$h_refspec" && echo "refspec $h_refspec"
+		test -n "$t_refspec" && echo "refspec $t_refspec"
+		test -n "$GIT_REMOTE_TESTHTTPGIT_SIGNED_TAGS" && echo "signed-tags"
+		test -n "$GIT_REMOTE_TESTHTTPGIT_NO_PRIVATE_UPDATE" && echo "no-private-update"
+		echo 'option'
+		echo
+		;;
+	list)
+		git -C "$url" for-each-ref --format='? %(refname)' 'refs/heads/' 'refs/tags/'
+		head=$(git -C "$url" symbolic-ref HEAD)
+		echo "@$head HEAD"
+		echo
+		;;
+	import*)
+		# read all import lines
+		while true
+		do
+			ref="${line#* }"
+			refs="$refs $ref"
+			read line
+			test "${line%% *}" != "import" && break
+		done
+
+		echo "refs: $refs" >>/tmp/t0430.txt
+
+		if test -n "$GIT_REMOTE_TESTHTTPGIT_FAILURE"
+		then
+			echo "feature done"
+			exit 1
+		fi
+
+		echo "feature done"
+
+		tmpdir=$(mktemp -d -t git-remote-http-import_XXXXXX) || die "Failed to create temp directory"
+
+		for ref in $refs
+		do
+			get_url="$HTTPD_URL/list/?sha1=$ref"
+			echo "curl url: $get_url" >>/tmp/t0430.txt
+			echo "curl output: $tmpdir/$ref" >>/tmp/t0430.txt
+			curl -s -o "$tmpdir/$ref" "$get_url" ||
+			die "curl '$get_url' failed"
+			echo "exporting from: $tmpdir/$ref" >>/tmp/t0430.txt
+			export_blob_from_file "$tmpdir/$ref" ||
+			die "failed to export blob from '$tmpdir/$ref'"
+			echo "done exporting" >>/tmp/t0430.txt
+		done
+
+		echo "done"
+		;;
+	export)
+		if test -n "$GIT_REMOTE_TESTHTTPGIT_FAILURE"
+		then
+			# consume input so fast-export doesn't get SIGPIPE;
+			# git would also notice that case, but we want
+			# to make sure we are exercising the later
+			# error checks
+			while read line; do
+				test "done" = "$line" && break
+			done
+			exit 1
+		fi
+
+		before=$(git -C "$url" for-each-ref --format=' %(refname) %(objectname) ')
+
+		git -C "$url" fast-import \
+			${force:+--force} \
+			${testhttpgitmarks:+"--import-marks=$testhttpgitmarks"} \
+			${testhttpgitmarks:+"--export-marks=$testhttpgitmarks"} \
+			--quiet
+
+		# figure out which refs were updated
+		git -C "$url" for-each-ref --format='%(refname) %(objectname)' |
+		while read ref a
+		do
+			case "$before" in
+			*" $ref $a "*)
+				continue ;;	# unchanged
+			esac
+			if test -z "$GIT_REMOTE_TESTHTTPGIT_PUSH_ERROR"
+			then
+				echo "ok $ref"
+			else
+				echo "error $ref $GIT_REMOTE_TESTHTTPGIT_PUSH_ERROR"
+			fi
+		done
+
+		echo
+		;;
+	option\ *)
+		read cmd opt val <<-EOF
+		$line
+		EOF
+		case $opt in
+		force)
+			test $val = "true" && force="true" || force=
+			echo "ok"
+			;;
+		*)
+			echo "unsupported"
+			;;
+		esac
+		;;
+	'')
+		exit
+		;;
+	esac
+done
+
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index e489869dd94..78cc1858cb6 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -237,6 +237,26 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'repack with filter does not fetch from remote' '
+	rm -rf server client &&
+	test_create_repo server &&
+	git -C server config uploadpack.allowFilter true &&
+	git -C server config uploadpack.allowAnySHA1InWant true &&
+	echo content1 >server/file1 &&
+	git -C server add file1 &&
+	git -C server commit -m initial_commit &&
+	expected="?$(git -C server rev-parse :file1)" &&
+	git clone --bare --no-local server client &&
+	git -C client config remote.origin.promisor true &&
+	git -C client -c repack.writebitmaps=false repack -a -d --filter=blob:none &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	grep "$expected" objects &&
+	git -C client repack -a -d &&
+	expected="$(git -C server rev-parse :file1)" &&
+	git -C client rev-list --objects --all --missing=print >objects &&
+	grep "$expected" objects
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index