From patchwork Thu Sep 19 14:43:10 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood via GitGitGadget
X-Patchwork-Id: 11152729
Date: Thu, 19 Sep 2019 07:43:10 -0700 (PDT)
X-Google-Original-Date: Thu, 19 Sep 2019 14:42:58 GMT
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v2 01/11] sparse-checkout: create builtin with 'list' subcommand
To: git@vger.kernel.org
Cc: Junio C Hamano, Derrick Stolee
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

The sparse-checkout feature is mostly hidden to users, as its
only documentation is supplementary information in the docs for
'git read-tree'. In addition, users need to know how to edit the
.git/info/sparse-checkout file with the right patterns, then run
the appropriate 'git read-tree -mu HEAD' command. Keeping the
working directory in sync with the sparse-checkout file requires
care.

Begin an effort to make the sparse-checkout feature a porcelain
feature by creating a new 'git sparse-checkout' builtin.
This builtin will be the preferred mechanism for manipulating the
sparse-checkout file and syncing the working directory.

The `$GIT_DIR/info/sparse-checkout` file defines the skip-worktree
reference bitmap. When Git updates the working directory, it updates
the skip-worktree bits in the index based on this file and removes or
restores files in the working copy to match.

The documentation provided is adapted from the "git read-tree"
documentation with a few edits for clarity in the new context.
Extra sections are added to hint toward a future change to
a more restricted pattern set.

Helped-by: Elijah Newren
Signed-off-by: Derrick Stolee
---
 .gitignore                            |  1 +
 Documentation/git-read-tree.txt       |  2 +-
 Documentation/git-sparse-checkout.txt | 90 +++++++++++++++++++++++++++
 Makefile                              |  1 +
 builtin.h                             |  1 +
 builtin/sparse-checkout.c             | 86 +++++++++++++++++++++++++
 git.c                                 |  1 +
 t/t1091-sparse-checkout-builtin.sh    | 51 +++++++++++++++
 8 files changed, 232 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/git-sparse-checkout.txt
 create mode 100644 builtin/sparse-checkout.c
 create mode 100755 t/t1091-sparse-checkout-builtin.sh

diff --git a/.gitignore b/.gitignore
index 4470d7cfc0..5ccc3d00dd 100644
--- a/.gitignore
+++ b/.gitignore
@@ -156,6 +156,7 @@
 /git-show-branch
 /git-show-index
 /git-show-ref
+/git-sparse-checkout
 /git-stage
 /git-stash
 /git-status
diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index d271842608..da33f84f33 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -436,7 +436,7 @@ support.
 SEE ALSO
 --------
 linkgit:git-write-tree[1]; linkgit:git-ls-files[1];
-linkgit:gitignore[5]
+linkgit:gitignore[5]; linkgit:git-sparse-checkout[1];
 
 GIT
 ---
diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
new file mode 100644
index 0000000000..cdef451642
--- /dev/null
+++ b/Documentation/git-sparse-checkout.txt
@@ -0,0 +1,90 @@
+git-sparse-checkout(1)
+=======================
+
+NAME
+----
+git-sparse-checkout - Initialize and modify the sparse-checkout
+configuration, which reduces the checkout to a set of directories
+given by a list of prefixes.
+
+
+SYNOPSIS
+--------
+[verse]
+'git sparse-checkout [options]'
+
+
+DESCRIPTION
+-----------
+
+Initialize and modify the sparse-checkout configuration, which reduces
+the checkout to a set of directories given by a list of prefixes.
+
+
+COMMANDS
+--------
+'list'::
+	Provide a list of the contents in the sparse-checkout file.
+
+
+SPARSE CHECKOUT
+----------------
+
+"Sparse checkout" allows populating the working directory sparsely.
+It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
+Git whether a file in the working directory is worth looking at. If
+the skip-worktree bit is set, then the file is ignored in the working
+directory. Git will not populate the contents of those files, which
+makes a sparse checkout helpful when working in a repository with many
+files, but only a few are important to the current user.
+
+The `$GIT_DIR/info/sparse-checkout` file is used to define the
+skip-worktree reference bitmap. When Git updates the working
+directory, it resets the skip-worktree bit in the index based on this
+file. If an entry
+matches a pattern in this file, skip-worktree will not be set on
+that entry. Otherwise, skip-worktree will be set.
+
+Then it compares the new skip-worktree value with the previous one. If
+skip-worktree turns from set to unset, it will add the corresponding
+file back. If it turns from unset to set, that file will be removed.
+
+## FULL PATTERN SET
+
+By default, the sparse-checkout file uses the same syntax as `.gitignore`
+files.
+
+While `$GIT_DIR/info/sparse-checkout` is usually used to specify what
+files are included, you can also specify what files are _not_ included,
+using negative patterns. For example, to remove the file `unwanted`:
+
+----------------
+/*
+!unwanted
+----------------
+
+Another tricky thing is fully repopulating the working directory when you
+no longer want sparse checkout. You cannot just disable "sparse
+checkout" because skip-worktree bits are still in the index and your working
+directory is still sparsely populated. You should re-populate the working
+directory with the `$GIT_DIR/info/sparse-checkout` file content as
+follows:
+
+----------------
+/*
+----------------
+
+Then you can disable sparse checkout. Sparse checkout support in 'git
+read-tree' and similar commands is disabled by default. You need to
+set `core.sparseCheckout` to `true` in order to have sparse checkout
+support.
+
+SEE ALSO
+--------
+
+linkgit:git-read-tree[1]
+linkgit:gitignore[5]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index f58bf14c7b..f3322b75dd 100644
--- a/Makefile
+++ b/Makefile
@@ -1121,6 +1121,7 @@ BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-index.o
 BUILTIN_OBJS += builtin/show-ref.o
+BUILTIN_OBJS += builtin/sparse-checkout.o
 BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
diff --git a/builtin.h b/builtin.h
index ec7e0954c4..d517068faa 100644
--- a/builtin.h
+++ b/builtin.h
@@ -223,6 +223,7 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix);
 int cmd_show(int argc, const char **argv, const char *prefix);
 int cmd_show_branch(int argc, const char **argv, const char *prefix);
 int cmd_show_index(int argc, const char **argv, const char *prefix);
+int cmd_sparse_checkout(int argc, const char **argv, const char *prefix);
 int cmd_status(int argc, const char **argv, const char *prefix);
 int cmd_stash(int argc, const char **argv, const char *prefix);
 int cmd_stripspace(int argc, const char **argv, const char *prefix);
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
new file mode 100644
index 0000000000..eed9625a05
--- /dev/null
+++ b/builtin/sparse-checkout.c
@@ -0,0 +1,86 @@
+#include "builtin.h"
+#include "config.h"
+#include "dir.h"
+#include "parse-options.h"
+#include "pathspec.h"
+#include "repository.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+static char const * const builtin_sparse_checkout_usage[] = {
+	N_("git sparse-checkout [list]"),
+	NULL
+};
+
+static char *get_sparse_checkout_filename(void)
+{
+	return git_pathdup("info/sparse-checkout");
+}
+
+static void write_patterns_to_file(FILE *fp, struct pattern_list *pl)
+{
+	int i;
+
+	for (i = 0; i < pl->nr; i++) {
+		struct path_pattern *p = pl->patterns[i];
+
+		if (p->flags & PATTERN_FLAG_NEGATIVE)
+			fprintf(fp, "!");
+
+		fprintf(fp, "%s", p->pattern);
+
+		if (p->flags & PATTERN_FLAG_MUSTBEDIR)
+			fprintf(fp, "/");
+
+		fprintf(fp, "\n");
+	}
+}
+
+static int sparse_checkout_list(int argc, const char **argv)
+{
+	struct pattern_list pl;
+	char *sparse_filename;
+	int res;
+
+	memset(&pl, 0, sizeof(pl));
+
+	sparse_filename = get_sparse_checkout_filename();
+	res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL);
+	free(sparse_filename);
+
+	if (res < 0) {
+		warning(_("this worktree is not sparse (sparse-checkout file may not exist)"));
+		return 0;
+	}
+
+	write_patterns_to_file(stdout, &pl);
+	clear_pattern_list(&pl);
+
+	return 0;
+}
+
+int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
+{
+	static struct option builtin_sparse_checkout_options[] = {
+		OPT_END(),
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_sparse_checkout_usage,
+				   builtin_sparse_checkout_options);
+
+	argc = parse_options(argc, argv, prefix,
+			     builtin_sparse_checkout_options,
+			     builtin_sparse_checkout_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	git_config(git_default_config, NULL);
+
+	if (argc > 0) {
+		if (!strcmp(argv[0], "list"))
+			return sparse_checkout_list(argc, argv);
+	}
+
+	usage_with_options(builtin_sparse_checkout_usage,
+			   builtin_sparse_checkout_options);
+}
diff --git a/git.c b/git.c
index c2eec470c9..e775fbad42 100644
--- a/git.c
+++ b/git.c
@@ -576,6 +576,7 @@ static struct cmd_struct commands[] = {
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
 	{ "show-index", cmd_show_index },
 	{ "show-ref", cmd_show_ref, RUN_SETUP },
+	{ "sparse-checkout", cmd_sparse_checkout, RUN_SETUP | NEED_WORK_TREE },
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
 	/*
 	 * NEEDSWORK: Until the builtin stash is thoroughly robust and no
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
new file mode 100755
index 0000000000..46e7b2dded
--- /dev/null
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+
+test_description='sparse checkout builtin tests'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	git init repo &&
+	(
+		cd repo &&
+		echo "initial" >a &&
+		mkdir folder1 folder2 deep &&
+		mkdir deep/deeper1 deep/deeper2 &&
+		mkdir deep/deeper1/deepest &&
+		cp a folder1 &&
+		cp a folder2 &&
+		cp a deep &&
+		cp a deep/deeper1 &&
+		cp a deep/deeper2 &&
+		cp a deep/deeper1/deepest &&
+		git add . &&
+		git commit -m "initial commit"
+	)
+'
+
+test_expect_success 'git sparse-checkout list (empty)' '
+	git -C repo sparse-checkout list >list 2>err &&
+	test_line_count = 0 list &&
+	test_i18ngrep "this worktree is not sparse (sparse-checkout file may not exist)" err
+'
+
+test_expect_success 'git sparse-checkout list (populated)' '
+	test_when_finished rm -f repo/.git/info/sparse-checkout &&
+	cat >repo/.git/info/sparse-checkout <<-EOF &&
+		/folder1/*
+		/deep/
+		**/a
+		!*bin*
+	EOF
+	git -C repo sparse-checkout list >list &&
+	cat >expect <<-EOF &&
+		/folder1/*
+		/deep/
+		**/a
+		!*bin*
+	EOF
+	test_cmp expect list
+'
+
+test_done
+

From patchwork Thu Sep 19 14:43:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood via GitGitGadget
X-Patchwork-Id: 11152731
Date: Thu, 19 Sep 2019 07:43:11 -0700 (PDT)
X-Google-Original-Date: Thu, 19 Sep 2019 14:42:59 GMT
Message-Id: <412211f5dd6d4d995f258403bf377bc0cb6332b4.1568904188.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v2 02/11] sparse-checkout: create 'init' subcommand
To: git@vger.kernel.org
Cc: Junio C Hamano, Derrick Stolee
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

Getting started with a sparse-checkout file can be daunting. Help
users start their sparse enlistment using 'git sparse-checkout init'.
This will set 'core.sparseCheckout=true' in their config, write
an initial set of patterns to the sparse-checkout file, and update
their working directory.

Using 'git read-tree' to clear directories does not work cleanly
on Windows, so manually delete directories that are tracked by Git
before running read-tree.

The use of running another process for 'git read-tree' is likely
suboptimal, but that can be improved in a later change, if valuable.

Signed-off-by: Derrick Stolee
---
 Documentation/git-sparse-checkout.txt |  7 +++
 builtin/sparse-checkout.c             | 69 ++++++++++++++++++++++++++-
 t/t1091-sparse-checkout-builtin.sh    | 41 ++++++++++++++++
 3 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index cdef451642..9707ef93b1 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -26,6 +26,13 @@ COMMANDS
 'list'::
 	Provide a list of the contents in the sparse-checkout file.
 
+'init'::
+	Enable the `core.sparseCheckout` setting. If the
+	sparse-checkout file does not exist, then populate it with
+	patterns that match every file in the root directory and
+	no other directories, then will remove all directories tracked
+	by Git. Add patterns to the sparse-checkout file to
+	repopulate the working directory.
 
 SPARSE CHECKOUT
 ----------------
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index eed9625a05..895479970d 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -8,7 +8,7 @@
 #include "strbuf.h"
 
 static char const * const builtin_sparse_checkout_usage[] = {
-	N_("git sparse-checkout [list]"),
+	N_("git sparse-checkout [init|list]"),
 	NULL
 };
 
@@ -59,6 +59,71 @@ static int sparse_checkout_list(int argc, const char **argv)
 	return 0;
 }
 
+static int update_working_directory(void)
+{
+	struct argv_array argv = ARGV_ARRAY_INIT;
+	int result = 0;
+	argv_array_pushl(&argv, "read-tree", "-m", "-u", "HEAD", NULL);
+
+	if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
+		error(_("failed to update index with new sparse-checkout paths"));
+		result = 1;
+	}
+
+	argv_array_clear(&argv);
+	return result;
+}
+
+static int sc_enable_config(void)
+{
+	struct argv_array argv = ARGV_ARRAY_INIT;
+
+	if (git_config_set_gently("extensions.worktreeConfig", "true")) {
+		error(_("failed to set extensions.worktreeConfig setting"));
+		return 1;
+	}
+
+	argv_array_pushl(&argv, "config", "--worktree", "core.sparseCheckout", "true", NULL);
+
+	if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
+		error(_("failed to enable core.sparseCheckout"));
+		return 1;
+	}
+
+	return 0;
+}
+
+static int sparse_checkout_init(int argc, const char **argv)
+{
+	struct pattern_list pl;
+	char *sparse_filename;
+	FILE *fp;
+	int res;
+
+	if (sc_enable_config())
+		return 1;
+
+	memset(&pl, 0, sizeof(pl));
+
+	sparse_filename = get_sparse_checkout_filename();
+	res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL);
+
+	/* If we already have a sparse-checkout file, use it. */
+	if (res >= 0) {
+		free(sparse_filename);
+		goto reset_dir;
+	}
+
+	/* initial mode: all blobs at root */
+	fp = fopen(sparse_filename, "w");
+	free(sparse_filename);
+	fprintf(fp, "/*\n!/*/\n");
+	fclose(fp);
+
+reset_dir:
+	return update_working_directory();
+}
+
 int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
 {
 	static struct option builtin_sparse_checkout_options[] = {
@@ -79,6 +144,8 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
 	if (argc > 0) {
 		if (!strcmp(argv[0], "list"))
 			return sparse_checkout_list(argc, argv);
+		if (!strcmp(argv[0], "init"))
+			return sparse_checkout_init(argc, argv);
 	}
 
 	usage_with_options(builtin_sparse_checkout_usage,
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 46e7b2dded..a6c6b336c9 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -47,5 +47,46 @@ test_expect_success 'git sparse-checkout list (populated)' '
 	test_cmp expect list
 '
 
+test_expect_success 'git sparse-checkout init' '
+	git -C repo sparse-checkout init &&
+	cat >expect <<-EOF &&
+		/*
+		!/*/
+	EOF
+	test_cmp expect repo/.git/info/sparse-checkout &&
+	git -C repo config --list >config &&
+	test_i18ngrep "core.sparsecheckout=true" config &&
+	ls repo >dir &&
+	echo a >expect &&
+	test_cmp expect dir
+'
+
+test_expect_success 'git sparse-checkout list after init' '
+	git -C repo sparse-checkout list >actual &&
+	cat >expect <<-EOF &&
+		/*
+		!/*/
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'init with existing sparse-checkout' '
+	echo "*folder*" >> repo/.git/info/sparse-checkout &&
+	git -C repo sparse-checkout init &&
+	cat >expect <<-EOF &&
+		/*
+		!/*/
+		*folder*
+	EOF
+	test_cmp expect repo/.git/info/sparse-checkout &&
+	ls repo >dir &&
+	cat >expect <<-EOF &&
+		a
+		folder1
+		folder2
+	EOF
+	test_cmp expect dir
+'
+
 test_done
 

From patchwork Thu Sep 19 14:43:12 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood via GitGitGadget
X-Patchwork-Id: 11152733
Date: Thu, 19 Sep 2019 07:43:12 -0700 (PDT)
X-Google-Original-Date: Thu, 19 Sep 2019 14:43:00 GMT
From: "Derrick Stolee via GitGitGadget"
Subject: [PATCH v2 03/11] clone: add --sparse mode
To: git@vger.kernel.org
Cc: Junio C Hamano, Derrick Stolee
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

When someone wants to clone a large repository, but plans to work
using a sparse-checkout file, they either need to do a full
checkout first and then reduce the patterns they included, or
clone with --no-checkout, set up their patterns, and then run
a checkout manually. This requires knowing a lot about the repo
shape and how sparse-checkout works.

Add a new '--sparse' option to 'git clone' that initializes the
sparse-checkout file to include the following patterns:

	/*
	!/*/

These patterns include every file in the root directory, but
no directories. This allows a repo to include files like a
README or a bootstrapping script to grow enlistments from that
point.
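
As a rough illustration of what the two patterns above select, the
following C sketch treats a repository-relative path as included
exactly when it has no directory component. This is a simplified
stand-in for this one pattern pair, not Git's actual pattern matcher,
and the function name root_only_includes() is hypothetical.

```c
#include <string.h>

// Hypothetical helper, not part of Git. It models the combined effect
// of the two default patterns on a repository-relative path: the first
// pattern selects every entry at the root, and the negated second
// pattern drops every root-level directory, so only root-level files
// remain. Returns 1 if the path would stay in the working directory.
static int root_only_includes(const char *path)
{
	// An empty path names nothing; a path containing a slash
	// lives under some directory and is therefore excluded.
	return path[0] != '\0' && strchr(path, '/') == NULL;
}
```

Under this model a root-level file such as "README" is kept, while
"folder1/a" is left out until a pattern covering folder1 is added.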

During the 'git sparse-checkout init' call, we must first look
to see if HEAD is valid, or else we will fail while trying to
update the working directory. The first checkout will actually
update the working directory correctly.

Signed-off-by: Derrick Stolee
---
 Documentation/git-clone.txt        |  8 +++++++-
 builtin/clone.c                    | 27 +++++++++++++++++++++++++++
 builtin/sparse-checkout.c          |  6 ++++++
 t/t1091-sparse-checkout-builtin.sh | 13 +++++++++++++
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 5fc97f14de..03299a8adb 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -15,7 +15,7 @@ SYNOPSIS
 	  [--dissociate] [--separate-git-dir ] [--depth ]
 	  [--[no-]single-branch] [--no-tags]
 	  [--recurse-submodules[=]] [--[no-]shallow-submodules]
-	  [--[no-]remote-submodules] [--jobs ] [--]
+	  [--[no-]remote-submodules] [--jobs ] [--sparse] [--]
 	  []
 
 DESCRIPTION
@@ -156,6 +156,12 @@ objects from the source repository into a pack in the cloned repository.
 	used, neither remote-tracking branches nor the related
 	configuration variables are created.
 
+--sparse::
+	Initialize the sparse-checkout file so the working
+	directory starts with only the files in the root
+	of the repository. The sparse-checkout file can be
+	modified to grow the working directory as needed.
+
 --mirror::
 	Set up a mirror of the source repository. This implies `--bare`.
 	Compared to `--bare`, `--mirror` not only maps local branches of the
diff --git a/builtin/clone.c b/builtin/clone.c
index a693e6ca44..16f4e8b6fd 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -58,6 +58,7 @@ static const char *real_git_dir;
 static char *option_upload_pack = "git-upload-pack";
 static int option_verbosity;
 static int option_progress = -1;
+static int option_sparse_checkout;
 static enum transport_family family;
 static struct string_list option_config = STRING_LIST_INIT_NODUP;
 static struct string_list option_required_reference = STRING_LIST_INIT_NODUP;
@@ -145,6 +146,8 @@ static struct option builtin_clone_options[] = {
 	OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
 	OPT_BOOL(0, "remote-submodules", &option_remote_submodules,
 		    N_("any cloned submodules will use their remote-tracking branch")),
+	OPT_BOOL(0, "sparse", &option_sparse_checkout,
+		    N_("initialize sparse-checkout file to include only files at root")),
 	OPT_END()
 };
 
@@ -723,6 +726,27 @@ static void update_head(const struct ref *our, const struct ref *remote,
 	}
 }
 
+static int git_sparse_checkout_init(const char *repo)
+{
+	struct argv_array argv = ARGV_ARRAY_INIT;
+	int result = 0;
+	argv_array_pushl(&argv, "-C", repo, "sparse-checkout", "init", NULL);
+
+	/*
+	 * We must apply the setting in the current process
+	 * for the later checkout to use the sparse-checkout file.
+ */ + core_apply_sparse_checkout = 1; + + if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) { + error(_("failed to initialize sparse-checkout")); + result = 1; + } + + argv_array_clear(&argv); + return result; +} + static int checkout(int submodule_progress) { struct object_id oid; @@ -1096,6 +1120,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix) if (option_required_reference.nr || option_optional_reference.nr) setup_reference(); + if (option_sparse_checkout && git_sparse_checkout_init(repo)) + return 1; + remote = remote_get(option_origin); strbuf_addf(&default_refspec, "+%s*:%s*", src_ref_prefix, diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 895479970d..656e6ebdd5 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -99,6 +99,7 @@ static int sparse_checkout_init(int argc, const char **argv) char *sparse_filename; FILE *fp; int res; + struct object_id oid; if (sc_enable_config()) return 1; @@ -120,6 +121,11 @@ static int sparse_checkout_init(int argc, const char **argv) fprintf(fp, "/*\n!/*/\n"); fclose(fp); + if (get_oid("HEAD", &oid)) { + /* assume we are in a fresh repo */ + return 0; + } + reset_dir: return update_working_directory(); } diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index a6c6b336c9..26b4ce9acd 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -88,5 +88,18 @@ test_expect_success 'init with existing sparse-checkout' ' test_cmp expect dir ' +test_expect_success 'clone --sparse' ' + git clone --sparse repo clone && + git -C clone sparse-checkout list >actual && + cat >expect <<-EOF && + /* + !/*/ + EOF + test_cmp expect actual && + ls clone >dir && + echo a >expect && + test_cmp expect dir +' + test_done From patchwork Thu Sep 19 14:43:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 
11152735 Date: Thu, 19 Sep 2019 07:43:13 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:01 GMT Message-Id: <9a78f9ea0fe8d1988654f52a86a01031607621fe.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 04/11] sparse-checkout: 'set' subcommand Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The 'git sparse-checkout set' subcommand takes a list of patterns as arguments and writes them to the sparse-checkout file. Then, it updates the working directory using 'git read-tree -mu HEAD'. The 'set' subcommand will replace the entire contents of the sparse-checkout file. The write_patterns_and_update() method is extracted from cmd_sparse_checkout() to make it easier to implement 'add' and/or 'remove' subcommands in the future.
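The "replace the entire contents" step can be sketched in standalone C. This is only an illustration under stated assumptions, not code from the patch: write_patterns() is a hypothetical helper built on the C standard library alone, and it checks the results of fopen() and fclose(), which the posted write_patterns_and_update() does not do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: replace the contents of
 * `path` with `nr` patterns, one per line.
 * Returns 0 on success and -1 on any I/O failure.
 */
static int write_patterns(const char *path, const char **patterns, size_t nr)
{
	FILE *fp;
	size_t i;
	int ret = 0;

	assert(path && patterns);

	/* Mode "w" truncates the file, so earlier patterns are discarded. */
	fp = fopen(path, "w");
	if (!fp)
		return -1;

	for (i = 0; i < nr; i++)
		if (fprintf(fp, "%s\n", patterns[i]) < 0)
			ret = -1;

	/* fclose() flushes buffered output; a failure here means data loss. */
	if (fclose(fp) != 0)
		ret = -1;

	return ret;
}
```

Because the file is opened with "w", calling the helper twice leaves only the second set of patterns on disk, which matches the replace-everything behavior described for 'set'.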
Signed-off-by: Derrick Stolee --- Documentation/git-sparse-checkout.txt | 5 +++++ builtin/sparse-checkout.c | 31 ++++++++++++++++++++++++++- t/t1091-sparse-checkout-builtin.sh | 19 ++++++++++++++++ 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index 9707ef93b1..87813e5797 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -34,6 +34,11 @@ COMMANDS by Git. Add patterns to the sparse-checkout file to repopulate the working directory. +'set':: + Write a set of patterns to the sparse-checkout file, as given as + a list of arguments following the 'set' subcommand. Update the + working directory to match the new patterns. + SPARSE CHECKOUT ---------------- diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 656e6ebdd5..13333fba6a 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -8,7 +8,7 @@ #include "strbuf.h" static char const * const builtin_sparse_checkout_usage[] = { - N_("git sparse-checkout [init|list]"), + N_("git sparse-checkout [init|list|set] "), NULL }; @@ -130,6 +130,33 @@ static int sparse_checkout_init(int argc, const char **argv) return update_working_directory(); } +static int write_patterns_and_update(struct pattern_list *pl) +{ + char *sparse_filename; + FILE *fp; + + sparse_filename = get_sparse_checkout_filename(); + fp = fopen(sparse_filename, "w"); + write_patterns_to_file(fp, pl); + fclose(fp); + free(sparse_filename); + + clear_pattern_list(pl); + return update_working_directory(); +} + +static int sparse_checkout_set(int argc, const char **argv, const char *prefix) +{ + int i; + struct pattern_list pl; + memset(&pl, 0, sizeof(pl)); + + for (i = 1; i < argc; i++) + add_pattern(argv[i], NULL, 0, &pl, 0); + + return write_patterns_and_update(&pl); +} + int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) { static struct option 
builtin_sparse_checkout_options[] = { @@ -152,6 +179,8 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) return sparse_checkout_list(argc, argv); if (!strcmp(argv[0], "init")) return sparse_checkout_init(argc, argv); + if (!strcmp(argv[0], "set")) + return sparse_checkout_set(argc, argv, prefix); } usage_with_options(builtin_sparse_checkout_usage, diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 26b4ce9acd..f21ea61494 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -101,5 +101,24 @@ test_expect_success 'clone --sparse' ' test_cmp expect dir ' +test_expect_success 'set sparse-checkout using builtin' ' + git -C repo sparse-checkout set "/*" "!/*/" "*folder*" && + cat >expect <<-EOF && + /* + !/*/ + *folder* + EOF + git -C repo sparse-checkout list >actual && + test_cmp expect actual && + test_cmp expect repo/.git/info/sparse-checkout && + ls repo >dir && + cat >expect <<-EOF && + a + folder1 + folder2 + EOF + test_cmp expect dir +' + test_done From patchwork Thu Sep 19 14:43:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152737 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 987CC197C for ; Thu, 19 Sep 2019 14:43:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7761720882 for ; Thu, 19 Sep 2019 14:43:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bYlr8A5Q" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389047AbfISOnS (ORCPT ); Thu, 19 Sep 2019 10:43:18 -0400 Received: from mail-wr1-f67.google.com ([209.85.221.67]:44547 "EHLO 
mail-wr1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388953AbfISOnR (ORCPT ); Thu, 19 Sep 2019 10:43:17 -0400 Received: by mail-wr1-f67.google.com with SMTP id i18so3349971wru.11 for ; Thu, 19 Sep 2019 07:43:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=NsKP+NJWULFzo+Nul6WMF/A63W/28C3bXfTdPlDK/1w=; b=bYlr8A5Qi1VJ6xMUb5Td7Fx8X7Ya2RJ4SQpM3ZmuYrUaNTd4PFv7d7TBqHMvD7iGCf Iwp5g84a3RPtntMq2UZ4z9ePuk/av/QdhbjhHu3ZO6sT9dle5OIBVBgJ27CMgXLcPURQ HuAQ3N0TkCML4ZYKL+gz9Yp9LlgQ9/5C9H0OQL6/VeCCDgFMFJQMvQH6ZNVyZ4rg17GS b0kT3ujndDbbTa3OHbQds5xbH7a6Hgag6SeNyW2ElrwZKlxAu4DZiIKi2VesD3s6mEP+ PXuqfuiybUC394oLsvrwtO4Q61vwYVGcrU2e9XUjuGA/HZ3TJU6G3NHk/Brn/1JVgCqJ YHFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=NsKP+NJWULFzo+Nul6WMF/A63W/28C3bXfTdPlDK/1w=; b=MfvWd9eoeIbL6g33J+MP2aJbX7jY1slArsfab1bVIOLRRDqSQXP7/IrEP1QK7FIfMK 9xz+k4aBXGke3qtel8AHOTKBhBEAEKASYhPZ3HzfOByB01vqU6HJNS1ZzvSvneGgUlaz m0lmWi7tIocRIgDtNEQLecMVpCZ1VyBgGIgbVBlV3Q9DO0mZJV2wepJ4xXsHZV1ZGgeN 3VjRufjMGA8Tb6xuV2YMLCxbnRICJKzeR5OTaIHhkdNUYg94TQlzIv0bj7WWLisBBgo5 nIl4fmCvFPyjRQgCJe5lBrM70llsAVShmTRj5NccPcwWb18++0xC1OXn+d2e3nTYQNds ZNAA== X-Gm-Message-State: APjAAAVe3S1y1YiHLyd43mzKdi01DNTMCP8yiJ5vXLqBrYwUF0MOXwC6 dpPvRMEw0Y14VYoGv76+6Pw4liKr X-Google-Smtp-Source: APXvYqx0ZXBRc7jQXbKsbWkEt0+56Nq1Bbsdz4cIiDbJayeM75jktS4z97HeIgX7B5RB0NkGD7R0Ow== X-Received: by 2002:a5d:500b:: with SMTP id e11mr7458233wrt.285.1568904194344; Thu, 19 Sep 2019 07:43:14 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a10sm9943392wrm.52.2019.09.19.07.43.13 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 19 Sep 2019 07:43:13 -0700 (PDT) 
Date: Thu, 19 Sep 2019 07:43:13 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:02 GMT Message-Id: <21a0165be73dccf25c2c83a37d506ec061fd1d07.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 05/11] sparse-checkout: add '--stdin' option to set subcommand Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The 'git sparse-checkout set' subcommand takes a list of patterns and places them in the sparse-checkout file. Then, it updates the working directory to match those patterns. For a large list of patterns, the command-line call can get very cumbersome. Add a '--stdin' option to instead read patterns over standard in. Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 35 ++++++++++++++++++++++++++++-- t/t1091-sparse-checkout-builtin.sh | 20 +++++++++++++++++ 2 files changed, 53 insertions(+), 2 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 13333fba6a..f726fcd6b8 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -145,14 +145,45 @@ static int write_patterns_and_update(struct pattern_list *pl) return update_working_directory(); } +static char const * const builtin_sparse_checkout_set_usage[] = { + N_("git sparse-checkout set [--stdin|]"), + NULL +}; + +static struct sparse_checkout_set_opts { + int use_stdin; +} set_opts; + static int sparse_checkout_set(int argc, const char **argv, const char *prefix) { int i; struct pattern_list pl; + + static struct option builtin_sparse_checkout_set_options[] = { + OPT_BOOL(0, "stdin", &set_opts.use_stdin, + N_("read patterns from standard in")), + OPT_END(), + }; + memset(&pl, 0, sizeof(pl)); - for (i = 1; i < argc; i++) - add_pattern(argv[i], NULL, 0, &pl, 0); + argc = parse_options(argc, argv, prefix, + 
builtin_sparse_checkout_set_options, + builtin_sparse_checkout_set_usage, + PARSE_OPT_KEEP_UNKNOWN); + + if (set_opts.use_stdin) { + struct strbuf line = STRBUF_INIT; + + while (!strbuf_getline(&line, stdin)) { + size_t len; + char *buf = strbuf_detach(&line, &len); + add_pattern(buf, buf, len, &pl, 0); + } + } else { + for (i = 0; i < argc; i++) + add_pattern(argv[i], argv[i], strlen(argv[i]), &pl, 0); + } return write_patterns_and_update(&pl); } diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index f21ea61494..02ba9ec314 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -120,5 +120,25 @@ test_expect_success 'set sparse-checkout using builtin' ' test_cmp expect dir ' +test_expect_success 'set sparse-checkout using --stdin' ' + cat >expect <<-EOF && + /* + !/*/ + /folder1/ + /folder2/ + EOF + git -C repo sparse-checkout set --stdin <expect && + git -C repo sparse-checkout list >actual && + test_cmp expect actual && + test_cmp expect repo/.git/info/sparse-checkout && + ls repo >dir && + cat >expect <<-EOF && + a + folder1 + folder2 + EOF + test_cmp expect dir +' + test_done From patchwork Thu Sep 19 14:43:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152747 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9561914DB for ; Thu, 19 Sep 2019 14:43:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7147E20882 for ; Thu, 19 Sep 2019 14:43:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="df/tl8OD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389260AbfISOnW (ORCPT ); Thu, 19 Sep 2019 10:43:22 -0400 Received: from
mail-wm1-f65.google.com ([209.85.128.65]:33615 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388859AbfISOnS (ORCPT ); Thu, 19 Sep 2019 10:43:18 -0400 Received: by mail-wm1-f65.google.com with SMTP id r17so7348080wme.0 for ; Thu, 19 Sep 2019 07:43:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=WWtbC5nTP7f/ZxACRDMdPISN1ij1tPLUd2relRVCRwg=; b=df/tl8ODno5bALxRAmIpsmwFi105Me/KCBaGMXWR4TD4HRjvH6xlY7WRnz998wneGC FVEkqtZ2TFfqqaRl/JROb9GNFoADLrIHzVM73exaCRELUlsBqDGihf42YS+dD4Q5SwH9 nwGnX4oklAZCZEaycwv36X/Xl0RKy8RsRgoavubrwtdsrFI2idIaiSvTl96hdT8ErtnP /2bzUOKNREAxFR9S8pvSkcVhPf0rRuDOQRcbVimJYs96VLXRlrx1wTqLYYgfrU0uwjIX V45/PETJfIdAq0mNNJGyaybALKZpjV9i4JnnBt0YVYU+9NgTdgucBIU9AGUnDsSxNp69 K2PQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=WWtbC5nTP7f/ZxACRDMdPISN1ij1tPLUd2relRVCRwg=; b=QBt/OcdU9LfrQNT9DOzCs1P4Xzp4zfe0FAMZD+6NLLHFghvOudZdVTBDRx7Rad5Xz+ FRNXzQQrWIMVTRiirYvILXBJRwp65qVMVlj+gieJ14BY/jColkJ7DpC5/RRJFdmjG3i9 Tt/DXKNyyaxdffN6RDlkgBD8aedeqW4lAOMG4gAqACpNam4VoSBHAnEfnalx2t8gK4wy 807z/yClaDngpNK1FfRMupecPt5mL2EpUBLIG3GHFQY4J4hRwQMyBqDRkuScJo6dnaWD rUm3kawfuwDcA8LluulP7hka4729f3slnjurkhZ8+iSQKGaaXxD7Lj7ydXxw0FshvY5P h9Vg== X-Gm-Message-State: APjAAAUK0x8yOtJ4Uj+ctYdcWChScfyShqNugD51hq26Sp7+/SxbGIUd EqPiDObPOqju3TsfjwPGHxbNfvkN X-Google-Smtp-Source: APXvYqz8FTNswUyqlwnI/Nb5J9x6olVVFRpyi/9kGcM46k/FwXqW7940fvm8Ea/59PLuqNjEQnpzJw== X-Received: by 2002:a1c:9e46:: with SMTP id h67mr3304073wme.48.1568904195020; Thu, 19 Sep 2019 07:43:15 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q66sm8716353wme.39.2019.09.19.07.43.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 
bits=128/128); Thu, 19 Sep 2019 07:43:14 -0700 (PDT) Date: Thu, 19 Sep 2019 07:43:14 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:03 GMT Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 06/11] sparse-checkout: create 'disable' subcommand Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The instructions for disabling a sparse-checkout to a full working directory are complicated and non-intuitive. Add a subcommand, 'git sparse-checkout disable', to perform those steps for the user. Signed-off-by: Derrick Stolee --- Documentation/git-sparse-checkout.txt | 26 ++++++++----------- builtin/sparse-checkout.c | 37 ++++++++++++++++++++++++--- t/t1091-sparse-checkout-builtin.sh | 15 +++++++++++ 3 files changed, 59 insertions(+), 19 deletions(-) diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index 87813e5797..da95b28b1c 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -39,6 +39,10 @@ COMMANDS a list of arguments following the 'set' subcommand. Update the working directory to match the new patterns. +'disable':: + Remove the sparse-checkout file, set `core.sparseCheckout` to + `false`, and restore the working directory to include all files. + SPARSE CHECKOUT ---------------- @@ -61,6 +65,13 @@ Then it compares the new skip-worktree value with the previous one. If skip-worktree turns from set to unset, it will add the corresponding file back. If it turns from unset to set, that file will be removed. +To repopulate the working directory with all files, use the +`git sparse-checkout disable` command. + +Sparse checkout support in 'git checkout' and similar commands is +disabled by default. 
You need to set `core.sparseCheckout` to `true` +in order to have sparse checkout support. + ## FULL PATTERN SET By default, the sparse-checkout file uses the same syntax as `.gitignore` @@ -75,21 +86,6 @@ using negative patterns. For example, to remove the file `unwanted`: !unwanted ---------------- -Another tricky thing is fully repopulating the working directory when you -no longer want sparse checkout. You cannot just disable "sparse -checkout" because skip-worktree bits are still in the index and your working -directory is still sparsely populated. You should re-populate the working -directory with the `$GIT_DIR/info/sparse-checkout` file content as -follows: - ----------------- -/* ----------------- - -Then you can disable sparse checkout. Sparse checkout support in 'git -read-tree' and similar commands is disabled by default. You need to -set `core.sparseCheckout` to `true` in order to have sparse checkout -support. SEE ALSO -------- diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index f726fcd6b8..f858f0b1b5 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -8,7 +8,7 @@ #include "strbuf.h" static char const * const builtin_sparse_checkout_usage[] = { - N_("git sparse-checkout [init|list|set] "), + N_("git sparse-checkout [init|list|set|disable] "), NULL }; @@ -74,7 +74,7 @@ static int update_working_directory(void) return result; } -static int sc_enable_config(void) +static int sc_set_config(int mode) { struct argv_array argv = ARGV_ARRAY_INIT; @@ -83,7 +83,12 @@ static int sc_enable_config(void) return 1; } - argv_array_pushl(&argv, "config", "--worktree", "core.sparseCheckout", "true", NULL); + argv_array_pushl(&argv, "config", "--worktree", "core.sparseCheckout", NULL); + + if (mode) + argv_array_pushl(&argv, "true", NULL); + else + argv_array_pushl(&argv, "false", NULL); if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) { error(_("failed to enable core.sparseCheckout")); @@ -101,7 +106,7 @@ static int 
sparse_checkout_init(int argc, const char **argv) int res; struct object_id oid; - if (sc_enable_config()) + if (sc_set_config(1)) return 1; memset(&pl, 0, sizeof(pl)); @@ -188,6 +193,28 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix) return write_patterns_and_update(&pl); } +static int sparse_checkout_disable(int argc, const char **argv) +{ + char *sparse_filename; + FILE *fp; + + if (sc_set_config(1)) + die(_("failed to change config")); + + sparse_filename = get_sparse_checkout_filename(); + fp = fopen(sparse_filename, "w"); + fprintf(fp, "/*\n"); + fclose(fp); + + if (update_working_directory()) + die(_("error while refreshing working directory")); + + unlink(sparse_filename); + free(sparse_filename); + + return sc_set_config(0); +} + int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) { static struct option builtin_sparse_checkout_options[] = { @@ -212,6 +239,8 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) return sparse_checkout_init(argc, argv); if (!strcmp(argv[0], "set")) return sparse_checkout_set(argc, argv, prefix); + if (!strcmp(argv[0], "disable")) + return sparse_checkout_disable(argc, argv); } usage_with_options(builtin_sparse_checkout_usage, diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 02ba9ec314..22fa032d6d 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -140,5 +140,20 @@ test_expect_success 'set sparse-checkout using --stdin' ' test_cmp expect dir ' +test_expect_success 'sparse-checkout disable' ' + git -C repo sparse-checkout disable && + test_path_is_missing repo/.git/info/sparse-checkout && + git -C repo config --list >config && + test_i18ngrep "core.sparsecheckout=false" config && + ls repo >dir && + cat >expect <<-EOF && + a + deep + folder1 + folder2 + EOF + test_cmp expect dir +' + test_done From patchwork Thu Sep 19 14:43:15 2019 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152739 Date: Thu, 19 Sep 2019 07:43:15 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:04 GMT Message-Id: <25642f8df28825cce61812a24cbd87bf7cb2025f.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Jeff Hostetler via GitGitGadget" Subject: [PATCH v2 07/11] trace2: add region in clear_ce_flags Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Jeff Hostetler Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff Hostetler When Git updates the working directory with the sparse-checkout feature enabled, the unpack_trees() method calls clear_ce_flags() to update the skip-worktree bits on the cache entries. This check can be expensive, depending on the patterns used. Add trace2 regions around the method, including some flag information, so we can get granular performance data during experiments. This data will be used to measure improvements to the pattern-matching algorithms for sparse-checkout.
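The enter/leave pairing described above can be sketched outside of Git. This is a rough illustration under stated assumptions: struct region, region_enter() and region_leave() are hypothetical stand-ins for the trace2 region calls, and clock() stands in for whatever timing trace2 records. Only the label format string is taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for a trace2 region: a label plus a start time. */
struct region {
	char label[100];
	clock_t start;
};

/* Format the label the same way the patch does for clear_ce_flags(). */
static void region_enter(struct region *r,
			 unsigned long select_mask, unsigned long clear_mask)
{
	int n = snprintf(r->label, sizeof(r->label),
			 "clear_ce_flags(0x%08lx,0x%08lx)",
			 select_mask, clear_mask);

	/* The patch uses xsnprintf(), which dies on truncation; assert here. */
	assert(n > 0 && (size_t)n < sizeof(r->label));
	r->start = clock();
}

/* Return the CPU time spent inside the region, in seconds. */
static double region_leave(const struct region *r)
{
	return (double)(clock() - r->start) / CLOCKS_PER_SEC;
}
```

Putting both masks into the label means each combination of select_mask and clear_mask shows up as its own region, so their costs can be compared separately.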
Signed-off-by: Jeff Hostetler Signed-off-by: Derrick Stolee --- unpack-trees.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/unpack-trees.c b/unpack-trees.c index cd548f4fa2..26be8f3569 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1404,15 +1404,23 @@ static int clear_ce_flags(struct index_state *istate, struct pattern_list *pl) { static struct strbuf prefix = STRBUF_INIT; + char label[100]; + int rval; strbuf_reset(&prefix); - return clear_ce_flags_1(istate, + xsnprintf(label, sizeof(label), "clear_ce_flags(0x%08lx,0x%08lx)", + (unsigned long)select_mask, (unsigned long)clear_mask); + trace2_region_enter("unpack_trees", label, the_repository); + rval = clear_ce_flags_1(istate, istate->cache, istate->cache_nr, &prefix, select_mask, clear_mask, pl, 0); + trace2_region_leave("unpack_trees", label, the_repository); + + return rval; } /* From patchwork Thu Sep 19 14:43:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152743 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1F10015E6 for ; Thu, 19 Sep 2019 14:43:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E06F02067B for ; Thu, 19 Sep 2019 14:43:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="n3YBZ/RG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389172AbfISOnV (ORCPT ); Thu, 19 Sep 2019 10:43:21 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:54502 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389099AbfISOnU (ORCPT ); Thu, 19 Sep 2019 10:43:20 -0400 Received: by mail-wm1-f66.google.com with SMTP id 
p7so4880330wmp.4 for ; Thu, 19 Sep 2019 07:43:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=t/Ap4W5f4tarbhO3g+AQilFQK2ONrOzMhIiaKf4QM2A=; b=n3YBZ/RGOCEjiWjitquF2cYYGkeMtyfRf+pFnBstJ8++ra/iQP1tFnIaBfLMVPA0nd zoEu/ZQrqij8FT9Gi5HmbEnnj3VxqL9zz6Lg6QioqwI9XJpCqFAVk60+GK8A+lN0ZI0y /kDFgB7PJIE7s3Te3BBRfFAVG2xSOWHWl2ixeayMIDXFG3PLHhQeLj3dDXLuc9KfGumz iXN2sPs/IC/mQprA7/BaaVDP6JRehfLUjITwVz0nC+Q+5RVLzHLuKjr+JiI1aUG0bwac vSy/p9RqgP3vKA3zv3lfsK5Gnfsc0E0bpXmzq2Kx9KWMMcIn5ADy1YtgMino4bI1L+qD Z/sg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=t/Ap4W5f4tarbhO3g+AQilFQK2ONrOzMhIiaKf4QM2A=; b=laVN21I4OMN/sURJbYcy2IMXk1ZnBcKpAlPri0t3GToZ/wOQUbtwRS7V2R9SbfGVbv sGoDp0qIWyeN5JpxHYGy4shECFdII/gPntdvtfMzTNKJ+2GlBzfW9Ydq29lQwaFmHCym MiUXwGNmTgKzh73fHyl3Iekv1HQ+x7s581w/WZuL5+9QNvhnH/NDofhhoU8QQ/RqVEeK xqey6hc1rG4V9dM248lhAomv0wQi7xnxkutEm4UTO3FK518mmcqMmy1tBUA/CuiyUtJa Us1KzndpwYZjcw7DXwwYj8YZIo5T4ePeV7qNVDRidO0qSOOCP5OEGt6m0JoPKN4rNSQx hs5Q== X-Gm-Message-State: APjAAAUxQmsO9V/WsfjiluhzAEDJETqHwR54onLtck9dS4FvgG/S769r NTQSFmz76zBjlIllYaSNI0s5Qibc X-Google-Smtp-Source: APXvYqxYQUVkiMO2vE5DX7ULyDurZpKMhrPZdSAtOJP0VOWKyX+RS6oWPxsjtOtCvNxGQUq9orBHQA== X-Received: by 2002:a7b:c152:: with SMTP id z18mr3127491wmi.70.1568904196499; Thu, 19 Sep 2019 07:43:16 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x6sm8429128wmf.38.2019.09.19.07.43.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 19 Sep 2019 07:43:16 -0700 (PDT) Date: Thu, 19 Sep 2019 07:43:16 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:05 GMT Message-Id: <84511255d1f28e1bdcec3de6096d2d9ac2a9f483.1568904188.git.gitgitgadget@gmail.com> 
In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 08/11] sparse-checkout: add 'cone' mode Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The sparse-checkout feature can have quadratic performance as the number of patterns and number of entries in the index grow. If there are 1,000 patterns and 1,000,000 entries, this time can be very significant. Create a new Boolean config option, core.sparseCheckoutCone, to indicate that we expect the sparse-checkout file to contain a more limited set of patterns. This is a separate config setting from core.sparseCheckout to avoid breaking older clients by introducing a tri-state option. The config option does nothing right now, but will be expanded upon in a later commit. Signed-off-by: Derrick Stolee --- Documentation/config/core.txt | 7 ++-- Documentation/git-sparse-checkout.txt | 50 +++++++++++++++++++++++++++ cache.h | 4 ++- config.c | 5 +++ environment.c | 1 + t/t1091-sparse-checkout-builtin.sh | 14 ++++++++ 6 files changed, 78 insertions(+), 3 deletions(-) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 75538d27e7..9b8ab2a6d4 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -591,8 +591,11 @@ core.multiPackIndex:: multi-pack-index design document]. core.sparseCheckout:: - Enable "sparse checkout" feature. See section "Sparse checkout" in - linkgit:git-read-tree[1] for more information. + Enable "sparse checkout" feature. If "false", then sparse-checkout + is disabled. If "true", then sparse-checkout is enabled with the full + .gitignore pattern set. If "cone", then sparse-checkout is enabled with + a restricted pattern set. See linkgit:git-sparse-checkout[1] for more + information. core.abbrev:: Set the length object names are abbreviated to.
If diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index da95b28b1c..757326618d 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -87,6 +87,56 @@ using negative patterns. For example, to remove the file `unwanted`: ---------------- +## CONE PATTERN SET + +The full pattern set allows for arbitrary pattern matches and complicated +inclusion/exclusion rules. These can result in O(N*M) pattern matches when +updating the index, where N is the number of patterns and M is the number +of paths in the index. To combat this performance issue, a more restricted +pattern set is allowed when `core.sparseCheckoutCone` is enabled. + +The accepted patterns in the cone pattern set are: + +1. *Recursive:* All paths inside a directory are included. + +2. *Parent:* All files immediately inside a directory are included. + +In addition to the above two patterns, we also expect that all files in the +root directory are included. If a recursive pattern is added, then all +leading directories are added as parent patterns. + +By default, when running `git sparse-checkout init`, the root directory is +added as a parent pattern. At this point, the sparse-checkout file contains +the following patterns: + +``` +/* +!/*/ +``` + +This says "include everything in root, but nothing two levels below root." +If we then add the folder `A/B/C` as a recursive pattern, the folders `A` and +`A/B` are added as parent patterns. The resulting sparse-checkout file is +now + +``` +/* +!/*/ +/A/ +!/A/*/ +/A/B/ +!/A/B/*/ +/A/B/C/ +``` + +Here, order matters, so the negative patterns are overridden by the positive +patterns that appear lower in the file. + +If `core.sparseCheckoutCone=true`, then Git will parse the sparse-checkout file +expecting patterns of these types. Git will warn if the patterns do not match.
+If the patterns do match the expected format, then Git will use faster hash- +based algorithms to compute inclusion in the sparse-checkout. + SEE ALSO -------- diff --git a/cache.h b/cache.h index cf5d70c196..8e8ea67efa 100644 --- a/cache.h +++ b/cache.h @@ -911,12 +911,14 @@ extern char *git_replace_ref_base; extern int fsync_object_files; extern int core_preload_index; -extern int core_apply_sparse_checkout; extern int precomposed_unicode; extern int protect_hfs; extern int protect_ntfs; extern const char *core_fsmonitor; +int core_apply_sparse_checkout; +int core_sparse_checkout_cone; + /* * Include broken refs in all ref iterations, which will * generally choke dangerous operations rather than letting diff --git a/config.c b/config.c index 296a6d9cc4..f65c74f5b7 100644 --- a/config.c +++ b/config.c @@ -1329,6 +1329,11 @@ static int git_default_core_config(const char *var, const char *value, void *cb) return 0; } + if (!strcmp(var, "core.sparsecheckoutcone")) { + core_sparse_checkout_cone = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "core.precomposeunicode")) { precomposed_unicode = git_config_bool(var, value); return 0; diff --git a/environment.c b/environment.c index 89af47cb85..670d92bcc0 100644 --- a/environment.c +++ b/environment.c @@ -69,6 +69,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE; char *notes_ref_name; int grafts_replace_parents = 1; int core_apply_sparse_checkout; +int core_sparse_checkout_cone; int merge_log_config = -1; int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */ unsigned long pack_size_limit_cfg; diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 22fa032d6d..9b089c98c4 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -140,6 +140,20 @@ test_expect_success 'set sparse-checkout using --stdin' ' test_cmp expect dir ' +test_expect_success 'cone mode: match patterns' ' + git -C repo 
config --worktree core.sparseCheckoutCone true && + rm -rf repo/a repo/folder1 repo/folder2 && + git -C repo read-tree -mu HEAD && + git -C repo reset --hard && + ls repo >dir && + cat >expect <<-EOF && + a + folder1 + folder2 + EOF + test_cmp expect dir +' + test_expect_success 'sparse-checkout disable' ' git -C repo sparse-checkout disable && test_path_is_missing repo/.git/info/sparse-checkout && From patchwork Thu Sep 19 14:43:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152745 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D733314DB for ; Thu, 19 Sep 2019 14:43:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A7DCE2067B for ; Thu, 19 Sep 2019 14:43:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IdZLXi6Y" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389226AbfISOnV (ORCPT ); Thu, 19 Sep 2019 10:43:21 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:54505 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389040AbfISOnT (ORCPT ); Thu, 19 Sep 2019 10:43:19 -0400 Received: by mail-wm1-f68.google.com with SMTP id p7so4880398wmp.4 for ; Thu, 19 Sep 2019 07:43:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=P73PzyKQxO7LFzalWzLbks2iJjAj49RGhuHcshMsQfo=; b=IdZLXi6YyAxn5F52cgmt9s5O/Y6ex70p7Ov7ZP6WQzlj6y5AK0PyiXUANXNVgI0tLf N1S9C+9GKPFL4J9DUGSTSKnTPk+jmdZTAoUaLzwFlu8fhiLbvGTsNc5+bZy+kC1/bmZ7 
qgmXblwopdxRhdXnWY3RuNAp2e+OdwiTMMejIiIrn7hv4/layuDAtOjACXFO8oYyCXrK uN/YI64wBQBWe+nvLxiCGdLZ+IxQn/03b14TCQhQAJRdt58BBrPMFxZkZolfdAWD7Lf1 CVPUkE23Vc/8P3+y0U4SLO1SEpK0JBm0pNvYPT3wjF3dbr8ygLn8l1C3b+4d39XnfQqT qWWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=P73PzyKQxO7LFzalWzLbks2iJjAj49RGhuHcshMsQfo=; b=P2KyoUApnkqbKR+qnlg3ZxqgshAJrV6zrOxLGtRVbBUV7FhccWzHgYOE2mj69TbBge UwGFYs/qzfb1vs67mGWoiz+LkFX7pcLQrHdePG5TqEaQlk4vsTRK5wziWIPiESRBXElE ZOBEp/AK0zdcCI01+tz+s2eQoo98PfNHSfxbdY0diiHPOa6ZL9Wzry/wPPu9gE1u8qp4 a9TZePOUcN9ZJeZr5kDNfPQNKE70yM4d/woilS89ZtXmzHWBsblU5Pv4ZUOgjpTtTXaf Q3wDZ2gSWPlWlkVNscO7vSlsrL9JzMPw+WOUtSCMUmfoGJnp8Sz5JqRm6eVttIQIwQ7L corg== X-Gm-Message-State: APjAAAXNdP2UZaOD6KYexYip0/FN2DG3R34BKaMb9CmLJFz1yD5wgd2r 7IgbqAun8dPBp7eJKKAb592ZBL2X X-Google-Smtp-Source: APXvYqxaxE2vm+L4jhY4tWO3t8LvtboNtmPahLz2hSHNxQ36kqdXHNKCyJ7Mo28p9WjcTOKaggm78w== X-Received: by 2002:a1c:720a:: with SMTP id n10mr3481693wmc.0.1568904197368; Thu, 19 Sep 2019 07:43:17 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k26sm2306474wmj.33.2019.09.19.07.43.16 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 19 Sep 2019 07:43:16 -0700 (PDT) Date: Thu, 19 Sep 2019 07:43:16 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:06 GMT Message-Id: <95a3285bc6021daa236d98d7e1bbdc5c45fc73b0.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 09/11] sparse-checkout: use hashmaps for cone patterns Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The parent and recursive patterns allowed by the "cone mode" option in sparse-checkout are 
restrictive enough that we can avoid using the regex parsing. Everything is based on prefix matches, so we can use hashsets to store the prefixes from the sparse-checkout file. When checking a path, we can strip path entries from the path and check the hashset for an exact match. As a test, I created a cone-mode sparse-checkout file for the Linux repository that actually includes every file. This was constructed by taking every folder in the Linux repo and creating the pattern pairs here: /$folder/ !/$folder/*/ This resulted in a sparse-checkout file with 8,296 patterns. Running 'git read-tree -mu HEAD' on this file had the following performance: core.sparseCheckout=false: 0.21 s (0.00 s) core.sparseCheckout=true: 3.75 s (3.50 s) core.sparseCheckout=cone: 0.23 s (0.01 s) The times in parentheses above correspond to the time spent in the first clear_ce_flags() call, according to the trace2 performance traces. While this example is contrived, it demonstrates how these patterns can slow the sparse-checkout feature. Signed-off-by: Derrick Stolee --- dir.c | 173 +++++++++++++++++++++++++++-- dir.h | 27 +++++ t/t1091-sparse-checkout-builtin.sh | 11 +- 3 files changed, 202 insertions(+), 9 deletions(-) diff --git a/dir.c b/dir.c index 34972abdaf..4fc57187e9 100644 --- a/dir.c +++ b/dir.c @@ -599,6 +599,109 @@ void parse_path_pattern(const char **pattern, *patternlen = len; } +static int pl_hashmap_cmp(const void *unused_cmp_data, + const void *a, const void *b, const void *key) +{ + const struct pattern_entry *ee1 = (const struct pattern_entry *)a; + const struct pattern_entry *ee2 = (const struct pattern_entry *)b; + + size_t min_len = ee1->patternlen <= ee2->patternlen + ? 
ee1->patternlen + : ee2->patternlen; + + return strncmp(ee1->pattern, ee2->pattern, min_len); +} + +static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern *given) +{ + struct pattern_entry *translated; + char *truncated; + char *data = NULL; + + if (!pl->use_cone_patterns) + return; + + if (!strcmp(given->pattern, "/*")) + return; + + if (given->patternlen > 2 && + !strcmp(given->pattern + given->patternlen - 2, "/*")) { + if (!(given->flags & PATTERN_FLAG_NEGATIVE)) { + /* Not a cone pattern. */ + pl->use_cone_patterns = 0; + warning(_("unrecognized pattern: '%s'"), given->pattern); + goto clear_hashmaps; + } + + truncated = xstrdup(given->pattern); + truncated[given->patternlen - 2] = 0; + + translated = xmalloc(sizeof(struct pattern_entry)); + translated->pattern = truncated; + translated->patternlen = given->patternlen - 2; + hashmap_entry_init(translated, + memhash(translated->pattern, translated->patternlen)); + + if (!hashmap_get(&pl->recursive_hashmap, translated, NULL)) { + /* We did not see the "parent" included */ + warning(_("unrecognized negative pattern: '%s'"), + given->pattern); + free(truncated); + free(translated); + goto clear_hashmaps; + } + + hashmap_add(&pl->parent_hashmap, translated); + hashmap_remove(&pl->recursive_hashmap, translated, &data); + free(data); + return; + } + + if (given->flags & PATTERN_FLAG_NEGATIVE) { + warning(_("unrecognized negative pattern: '%s'"), + given->pattern); + goto clear_hashmaps; + } + + translated = xmalloc(sizeof(struct pattern_entry)); + + translated->pattern = xstrdup(given->pattern); + translated->patternlen = given->patternlen; + hashmap_entry_init(translated, + memhash(translated->pattern, translated->patternlen)); + + hashmap_add(&pl->recursive_hashmap, translated); + + if (hashmap_get(&pl->parent_hashmap, translated, NULL)) { + /* we already included this at the parent level */ + warning(_("your sparse-checkout file may have issues: pattern '%s' is repeated"), + 
given->pattern); + hashmap_remove(&pl->parent_hashmap, translated, &data); + free(data); + free(translated); + } + + return; + +clear_hashmaps: + warning(_("disabling cone pattern matching")); + hashmap_free(&pl->parent_hashmap, 1); + hashmap_free(&pl->recursive_hashmap, 1); + pl->use_cone_patterns = 0; +} + +static int hashmap_contains_path(struct hashmap *map, + struct strbuf *pattern) +{ + struct pattern_entry p; + + /* Check straight mapping */ + p.pattern = pattern->buf; + p.patternlen = pattern->len; + hashmap_entry_init(&p, memhash(p.pattern, p.patternlen)); + return !!hashmap_get(map, &p, NULL); +} + void add_pattern(const char *string, const char *base, int baselen, struct pattern_list *pl, int srcpos) { @@ -623,6 +726,8 @@ void add_pattern(const char *string, const char *base, ALLOC_GROW(pl->patterns, pl->nr + 1, pl->alloc); pl->patterns[pl->nr++] = pattern; pattern->pl = pl; + + add_pattern_to_hashsets(pl, pattern); } static int read_skip_worktree_file_from_index(const struct index_state *istate, @@ -848,6 +953,10 @@ static int add_patterns_from_buffer(char *buf, size_t size, int i, lineno = 1; char *entry; + pl->use_cone_patterns = core_sparse_checkout_cone; + hashmap_init(&pl->recursive_hashmap, pl_hashmap_cmp, NULL, 0); + hashmap_init(&pl->parent_hashmap, pl_hashmap_cmp, NULL, 0); + pl->filebuf = buf; if (skip_utf8_bom(&buf, size)) @@ -1084,16 +1193,64 @@ enum pattern_match_result path_matches_pattern_list( struct index_state *istate) { struct path_pattern *pattern; - pattern = last_matching_pattern_from_list(pathname, pathlen, basename, - dtype, pl, istate); - if (pattern) { - if (pattern->flags & PATTERN_FLAG_NEGATIVE) - return NOT_MATCHED; - else - return MATCHED; + struct strbuf parent_pathname = STRBUF_INIT; + int result = NOT_MATCHED; + const char *slash_pos; + + if (!pl->use_cone_patterns) { + pattern = last_matching_pattern_from_list(pathname, pathlen, basename, + dtype, pl, istate); + if (pattern) { + if (pattern->flags & 
PATTERN_FLAG_NEGATIVE) + return NOT_MATCHED; + else + return MATCHED; + } + + return UNDECIDED; } - return UNDECIDED; + strbuf_addch(&parent_pathname, '/'); + strbuf_add(&parent_pathname, pathname, pathlen); + + if (hashmap_contains_path(&pl->recursive_hashmap, + &parent_pathname)) { + result = MATCHED; + goto done; + } + + slash_pos = strrchr(parent_pathname.buf, '/'); + + if (slash_pos == parent_pathname.buf) { + /* include every file in root */ + result = MATCHED; + goto done; + } + + strbuf_setlen(&parent_pathname, slash_pos - parent_pathname.buf); + + if (hashmap_contains_path(&pl->parent_hashmap, &parent_pathname)) { + result = MATCHED; + goto done; + } + + while (parent_pathname.len) { + if (hashmap_contains_path(&pl->recursive_hashmap, + &parent_pathname)) { + result = UNDECIDED; + goto done; + } + + slash_pos = strrchr(parent_pathname.buf, '/'); + if (slash_pos == parent_pathname.buf) + break; + + strbuf_setlen(&parent_pathname, slash_pos - parent_pathname.buf); + } + +done: + strbuf_release(&parent_pathname); + return result; } static struct path_pattern *last_matching_pattern_from_lists( diff --git a/dir.h b/dir.h index 608696c958..bbd5bd1cc9 100644 --- a/dir.h +++ b/dir.h @@ -4,6 +4,7 @@ /* See Documentation/technical/api-directory-listing.txt */ #include "cache.h" +#include "hashmap.h" #include "strbuf.h" struct dir_entry { @@ -37,6 +38,13 @@ struct path_pattern { int srcpos; }; +/* used for hashmaps for cone patterns */ +struct pattern_entry { + struct hashmap_entry ent; + char *pattern; + size_t patternlen; +}; + /* * Each excludes file will be parsed into a fresh exclude_list which * is appended to the relevant exclude_list_group (either EXC_DIRS or @@ -55,6 +63,25 @@ struct pattern_list { const char *src; struct path_pattern **patterns; + + /* + * While scanning the excludes, we attempt to match the patterns + * with a more restricted set that allows us to use hashsets for + * matching logic, which is faster than the linear lookup in the + * 
excludes array above. If non-zero, that check succeeded. + */ + unsigned use_cone_patterns; + + /* + * Stores paths where everything starting with those paths + * is included. + */ + struct hashmap recursive_hashmap; + + /* + * Used to check single-level parents of blobs. + */ + struct hashmap parent_hashmap; }; /* diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index 9b089c98c4..f726205d21 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -143,7 +143,8 @@ test_expect_success 'set sparse-checkout using --stdin' ' test_expect_success 'cone mode: match patterns' ' git -C repo config --worktree core.sparseCheckoutCone true && rm -rf repo/a repo/folder1 repo/folder2 && - git -C repo read-tree -mu HEAD && + git -C repo read-tree -mu HEAD 2>err && + test_i18ngrep ! "disabling cone pattern matching" err && git -C repo reset --hard && ls repo >dir && cat >expect <<-EOF && @@ -154,6 +155,14 @@ test_expect_success 'cone mode: match patterns' ' test_cmp expect dir ' +test_expect_success 'cone mode: warn on bad pattern' ' + test_when_finished mv sparse-checkout repo/.git/info/ && + cp repo/.git/info/sparse-checkout . 
&& + echo "!/deep/deeper/*" >>repo/.git/info/sparse-checkout && + git -C repo read-tree -mu HEAD 2>err && + test_i18ngrep "unrecognized negative pattern" err +' + test_expect_success 'sparse-checkout disable' ' git -C repo sparse-checkout disable && test_path_is_missing repo/.git/info/sparse-checkout && From patchwork Thu Sep 19 14:43:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152751 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9C4A414DB for ; Thu, 19 Sep 2019 14:43:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 71BBE2067B for ; Thu, 19 Sep 2019 14:43:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="k4C+P8Jh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389387AbfISOnY (ORCPT ); Thu, 19 Sep 2019 10:43:24 -0400 Received: from mail-wr1-f68.google.com ([209.85.221.68]:41940 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389041AbfISOnW (ORCPT ); Thu, 19 Sep 2019 10:43:22 -0400 Received: by mail-wr1-f68.google.com with SMTP id h7so3367814wrw.8 for ; Thu, 19 Sep 2019 07:43:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ch1B1PQdv5t1YUEbo43+JNl+sht9+UErPZhpGcZTGgw=; b=k4C+P8JhdHb2DrUjMvACnRmoe4u9f7+dRUh3vzDIujBbOFMqxvIZTbXdReBIX/Iqkr zFo9PU/yk6zb4CZZOOqc3dkifqObkX8r5VEx/xxpE6yWcPK8/hEKfGq31ZhCDWEBynKu 5QkS9nUYoFnuNbTrAIobqPjJixy8/RumUbtd+/8sqJJfMlTjNLubX3OfodlZ2sh7V/9t dB4iQQj8OGe2qo3Xv2vNu1GmcqwTPf2KywmDviUcpElg7VdzrgUk/ZODFs8lURqbNOMB 
d2un8OKqlM6EQZR3l0kurzoYp9+sloALgBb7ClxrrwVHkcHI2oASVslW3b6ZAvk6wA4G N3yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ch1B1PQdv5t1YUEbo43+JNl+sht9+UErPZhpGcZTGgw=; b=SwDElhImDxQfBPkGwC1U2m0/iXHb3vbmzcwSavIoOk/IlQko7SGGxzuCDStzIWpNgV 3VA1tdGuSvwVBnvpIiKtLnL8m8B/ey/0hu2hmi7vmw0U/oAxIe/sqq9UTsoG43O6eZCT +CB5M16mu5Kcrb00YAWK0sgKMYH6rYuqyjE8FZH3eGvoBKs5GrLoN0gdmwADhVgbaLpw WRrQO1cSvp7ddLm6mYml8WfmF/u+K5R/2fKBUAnPCVq1jYYhR5vdNgZgPP309OsmgnfZ CdKHxbS8Rg8jHyAsOTvg5dOI6ji3BJ+l40dw9nBTAPlww/FOwTwpjN6cUN6REdfvRK4i a2wQ== X-Gm-Message-State: APjAAAWy7h4zC7LIWqEJbAVrY2cWIhFOBpiOUHas2A6XOTVrdOB6EIUL u13Vd45BiQVYL73Znz6qx0N6fNWW X-Google-Smtp-Source: APXvYqy/C2gA8bd+3V8TU1RiGB+jdyoZyJ1eA4llC0bRQt9RKjZLKCy5NMNMo1WHnIg1eEH4lxmMmw== X-Received: by 2002:adf:ee4a:: with SMTP id w10mr7486340wro.138.1568904198218; Thu, 19 Sep 2019 07:43:18 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d28sm13569203wrb.95.2019.09.19.07.43.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 19 Sep 2019 07:43:17 -0700 (PDT) Date: Thu, 19 Sep 2019 07:43:17 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:07 GMT Message-Id: <995c5b8e2b4b4ba36c09165de1bc4719819c3ce4.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 10/11] sparse-checkout: init and set in cone mode Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee To make the cone pattern set easy to use, update the behavior of 'git sparse-checkout [init|set]'. Add '--cone' flag to 'git sparse-checkout init' to set the config option 'core.sparseCheckoutCone=true'. 
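To illustrate the cone pattern set that these subcommands maintain, here is a minimal standalone sketch that formats the sparse-checkout lines for one recursive folder. The helper name `format_cone_patterns` and its buffer interface are hypothetical and are not the functions added by this patch; the sketch assumes a non-empty folder path with no leading or trailing slash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not part of this patch).
 * Writes the cone-mode sparse-checkout lines for one recursive folder,
 * e.g. "A/B/C", into buf. Returns the number of bytes written (without
 * the trailing NUL), or -1 if buf is too small.
 */
static int format_cone_patterns(const char *dir, char *buf, size_t cap)
{
	size_t used;
	const char *slash;
	int n;

	/* Assumption: dir is non-empty and has no leading/trailing slash. */
	assert(dir && *dir && buf);

	/* Root pair: include files directly in the root, exclude its subdirectories. */
	n = snprintf(buf, cap, "/*\n!/*/\n");
	if (n < 0 || (size_t)n >= cap)
		return -1;
	used = (size_t)n;

	/* Every leading directory becomes a parent pattern pair. */
	for (slash = strchr(dir, '/'); slash; slash = strchr(slash + 1, '/')) {
		int len = (int)(slash - dir);

		n = snprintf(buf + used, cap - used, "/%.*s/\n!/%.*s/*/\n",
			     len, dir, len, dir);
		if (n < 0 || (size_t)n >= cap - used)
			return -1;
		used += (size_t)n;
	}

	/* The folder itself is recursive, so no negative line follows it. */
	n = snprintf(buf + used, cap - used, "/%s/\n", dir);
	if (n < 0 || (size_t)n >= cap - used)
		return -1;
	used += (size_t)n;

	return (int)used;
}
```

For the folder `A/B/C`, this produces the root pair, one positive/negative pair each for `A` and `A/B`, and a final positive line for `A/B/C`, which is the same shape as the file the new test expects for `deep/deeper1/deepest`.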
When running 'git sparse-checkout set' in cone mode, a user only needs to supply a list of recursive folder matches. Git will automatically add the necessary parent matches for the leading directories. When testing 'git sparse-checkout set' in cone mode, check the error stream to ensure we do not see any errors. Specifically, we want to avoid the warning that the patterns do not match the cone-mode patterns. Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 171 +++++++++++++++++++++++++++-- dir.c | 4 +- dir.h | 3 + t/t1091-sparse-checkout-builtin.sh | 49 +++++++++ 4 files changed, 213 insertions(+), 14 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index f858f0b1b5..111cbc96d9 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -6,6 +6,7 @@ #include "repository.h" #include "run-command.h" #include "strbuf.h" +#include "string-list.h" static char const * const builtin_sparse_checkout_usage[] = { N_("git sparse-checkout [init|list|set|disable] "), @@ -74,9 +75,14 @@ static int update_working_directory(void) return result; } +#define SPARSE_CHECKOUT_NONE 0 +#define SPARSE_CHECKOUT_FULL 1 +#define SPARSE_CHECKOUT_CONE 2 + static int sc_set_config(int mode) { struct argv_array argv = ARGV_ARRAY_INIT; + struct argv_array cone_argv = ARGV_ARRAY_INIT; if (git_config_set_gently("extensions.worktreeConfig", "true")) { error(_("failed to set extensions.worktreeConfig setting")); @@ -95,9 +101,31 @@ static int sc_set_config(int mode) return 1; } + argv_array_pushl(&cone_argv, "config", "--worktree", + "core.sparseCheckoutCone", NULL); + + if (mode == SPARSE_CHECKOUT_CONE) + argv_array_push(&cone_argv, "true"); + else + argv_array_push(&cone_argv, "false"); + + if (run_command_v_opt(cone_argv.argv, RUN_GIT_CMD)) { + error(_("failed to enable core.sparseCheckoutCone")); + return 1; + } + return 0; } +static char const * const builtin_sparse_checkout_init_usage[] = { + N_("git sparse-checkout init [--cone]"), + 
NULL +}; + +static struct sparse_checkout_init_opts { + int cone_mode; +} init_opts; + static int sparse_checkout_init(int argc, const char **argv) { struct pattern_list pl; @@ -105,8 +133,21 @@ static int sparse_checkout_init(int argc, const char **argv) FILE *fp; int res; struct object_id oid; + int mode; + + static struct option builtin_sparse_checkout_init_options[] = { + OPT_BOOL(0, "cone", &init_opts.cone_mode, + N_("initialize the sparse-checkout in cone mode")), + OPT_END(), + }; - if (sc_set_config(1)) + argc = parse_options(argc, argv, NULL, + builtin_sparse_checkout_init_options, + builtin_sparse_checkout_init_usage, 0); + + mode = init_opts.cone_mode ? SPARSE_CHECKOUT_CONE : SPARSE_CHECKOUT_FULL; + + if (sc_set_config(mode)) return 1; memset(&pl, 0, sizeof(pl)); @@ -135,6 +176,72 @@ static int sparse_checkout_init(int argc, const char **argv) return update_working_directory(); } +static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *path) +{ + struct pattern_entry *e = xmalloc(sizeof(struct pattern_entry)); + e->patternlen = path->len; + e->pattern = strbuf_detach(path, NULL); + hashmap_entry_init(e, memhash(e->pattern, e->patternlen)); + + hashmap_add(&pl->recursive_hashmap, e); + + while (e->patternlen) { + char *slash = strrchr(e->pattern, '/'); + char *oldpattern = e->pattern; + size_t newlen; + + if (!slash) + break; + + newlen = slash - e->pattern; + e = xmalloc(sizeof(struct pattern_entry)); + e->patternlen = newlen; + e->pattern = xstrndup(oldpattern, newlen); + hashmap_entry_init(e, memhash(e->pattern, e->patternlen)); + + if (!hashmap_get(&pl->parent_hashmap, e, NULL)) + hashmap_add(&pl->parent_hashmap, e); + } +} + +static void write_cone_to_file(FILE *fp, struct pattern_list *pl) +{ + int i; + struct pattern_entry *entry; + struct hashmap_iter iter; + struct string_list sl = STRING_LIST_INIT_DUP; + + hashmap_iter_init(&pl->parent_hashmap, &iter); + while ((entry = hashmap_iter_next(&iter))) + string_list_insert(&sl, 
entry->pattern); + + string_list_sort(&sl); + string_list_remove_duplicates(&sl, 0); + + fprintf(fp, "/*\n!/*/\n"); + + for (i = 0; i < sl.nr; i++) { + char *pattern = sl.items[i].string; + + if (strlen(pattern)) + fprintf(fp, "/%s/\n!/%s/*/\n", pattern, pattern); + } + + string_list_clear(&sl, 0); + + hashmap_iter_init(&pl->recursive_hashmap, &iter); + while ((entry = hashmap_iter_next(&iter))) + string_list_insert(&sl, entry->pattern); + + string_list_sort(&sl); + string_list_remove_duplicates(&sl, 0); + + for (i = 0; i < sl.nr; i++) { + char *pattern = sl.items[i].string; + fprintf(fp, "/%s/\n", pattern); + } +} + static int write_patterns_and_update(struct pattern_list *pl) { char *sparse_filename; @@ -142,7 +249,12 @@ static int write_patterns_and_update(struct pattern_list *pl) sparse_filename = get_sparse_checkout_filename(); fp = fopen(sparse_filename, "w"); - write_patterns_to_file(fp, pl); + + if (core_sparse_checkout_cone) + write_cone_to_file(fp, pl); + else + write_patterns_to_file(fp, pl); + fclose(fp); free(sparse_filename); @@ -150,6 +262,24 @@ static int write_patterns_and_update(struct pattern_list *pl) return update_working_directory(); } +static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl) +{ + strbuf_trim(line); + + strbuf_trim_trailing_dir_sep(line); + + if (!line->len) + return; + + if (line->buf[0] == '/') + strbuf_remove(line, 0, 1); + + if (!line->len) + return; + + insert_recursive_pattern(pl, line); +} + static char const * const builtin_sparse_checkout_set_usage[] = { N_("git sparse-checkout set [--stdin|]"), NULL @@ -177,17 +307,34 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix) builtin_sparse_checkout_set_usage, PARSE_OPT_KEEP_UNKNOWN); - if (set_opts.use_stdin) { + if (core_sparse_checkout_cone) { struct strbuf line = STRBUF_INIT; - - while (!strbuf_getline(&line, stdin)) { - size_t len; - char *buf = strbuf_detach(&line, &len); - add_pattern(buf, buf, len, &pl, 0); + 
hashmap_init(&pl.recursive_hashmap, pl_hashmap_cmp, NULL, 0); + hashmap_init(&pl.parent_hashmap, pl_hashmap_cmp, NULL, 0); + + if (set_opts.use_stdin) { + while (!strbuf_getline(&line, stdin)) + strbuf_to_cone_pattern(&line, &pl); + } else { + for (i = 0; i < argc; i++) { + strbuf_setlen(&line, 0); + strbuf_addstr(&line, argv[i]); + strbuf_to_cone_pattern(&line, &pl); + } } } else { - for (i = 0; i < argc; i++) - add_pattern(argv[i], argv[i], strlen(argv[i]), &pl, 0); + if (set_opts.use_stdin) { + struct strbuf line = STRBUF_INIT; + + while (!strbuf_getline(&line, stdin)) { + size_t len; + char *buf = strbuf_detach(&line, &len); + add_pattern(buf, buf, len, &pl, 0); + } + } else { + for (i = 0; i < argc; i++) + add_pattern(argv[i], argv[i], strlen(argv[i]), &pl, 0); + } } return write_patterns_and_update(&pl); @@ -198,7 +345,7 @@ static int sparse_checkout_disable(int argc, const char **argv) char *sparse_filename; FILE *fp; - if (sc_set_config(1)) + if (sc_set_config(SPARSE_CHECKOUT_FULL)) die(_("failed to change config")); sparse_filename = get_sparse_checkout_filename(); @@ -212,7 +359,7 @@ static int sparse_checkout_disable(int argc, const char **argv) unlink(sparse_filename); free(sparse_filename); - return sc_set_config(0); + return sc_set_config(SPARSE_CHECKOUT_NONE); } int cmd_sparse_checkout(int argc, const char **argv, const char *prefix) diff --git a/dir.c b/dir.c index 4fc57187e9..298a4539ec 100644 --- a/dir.c +++ b/dir.c @@ -599,8 +599,8 @@ void parse_path_pattern(const char **pattern, *patternlen = len; } -static int pl_hashmap_cmp(const void *unused_cmp_data, - const void *a, const void *b, const void *key) +int pl_hashmap_cmp(const void *unused_cmp_data, + const void *a, const void *b, const void *key) { const struct pattern_entry *ee1 = (const struct pattern_entry *)a; const struct pattern_entry *ee2 = (const struct pattern_entry *)b; diff --git a/dir.h b/dir.h index bbd5bd1cc9..7c76a2d55e 100644 --- a/dir.h +++ b/dir.h @@ -296,6 +296,9 @@ int 
is_excluded(struct dir_struct *dir, struct index_state *istate, const char *name, int *dtype); +int pl_hashmap_cmp(const void *unused_cmp_data, + const void *a, const void *b, const void *key); + struct pattern_list *add_pattern_list(struct dir_struct *dir, int group_type, const char *src); int add_patterns_from_file_to_list(const char *fname, const char *base, int baselen, diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index f726205d21..b6eb02c69a 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -178,5 +178,54 @@ test_expect_success 'sparse-checkout disable' ' test_cmp expect dir ' +test_expect_success 'cone mode: init and set' ' + git -C repo sparse-checkout init --cone && + git -C repo config --list >config && + test_i18ngrep "core.sparsecheckoutcone=true" config && + ls repo >dir && + echo a >expect && + test_cmp expect dir && + git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err && + test_line_count = 0 err && + ls repo >dir && + cat >expect <<-EOF && + a + deep + EOF + ls repo/deep >dir && + cat >expect <<-EOF && + a + deeper1 + EOF + ls repo/deep/deeper1 >dir && + cat >expect <<-EOF && + a + deepest + EOF + test_cmp expect dir && + cat >expect <<-EOF && + /* + !/*/ + /deep/ + !/deep/*/ + /deep/deeper1/ + !/deep/deeper1/*/ + /deep/deeper1/deepest/ + EOF + test_cmp expect repo/.git/info/sparse-checkout && + git -C repo sparse-checkout set --stdin 2>err <<-EOF && + folder1 + folder2 + EOF + test_line_count = 0 err && + cat >expect <<-EOF && + a + folder1 + folder2 + EOF + ls repo >dir && + test_cmp expect dir +' + test_done From patchwork Thu Sep 19 14:43:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood via GitGitGadget X-Patchwork-Id: 11152749 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org 
(Postfix) with ESMTP id EFBC613BD for ; Thu, 19 Sep 2019 14:43:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CB8C720882 for ; Thu, 19 Sep 2019 14:43:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="EpN1XAMe" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389315AbfISOnX (ORCPT ); Thu, 19 Sep 2019 10:43:23 -0400 Received: from mail-wm1-f65.google.com ([209.85.128.65]:54509 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388938AbfISOnV (ORCPT ); Thu, 19 Sep 2019 10:43:21 -0400 Received: by mail-wm1-f65.google.com with SMTP id p7so4880537wmp.4 for ; Thu, 19 Sep 2019 07:43:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Ii8l9QJmOK8nEwZpzJEAoSJ7ZiZsOkJW8Ov4mayBw2M=; b=EpN1XAMeyiFCjLmCUFSzsSBNjzUmiw7WZEv1PR4QnOtANgRdyWPsVnFsAMSzWAuKmF HB5q+WGx1pvk2g6X/JFwJj94BgGuKdvxKi5AtZYTqMCKn6bhy7Sw6O91/vaxyRfU7wmK 4tWLWkK/Ehh4bp1H6fCfQxh8GbqafB3bafnnl0+auguWuX+bDfEIdUaM1/A23X8F2pmy pH4cSHoXfEvJjFAHPvxMZ4PkmZjIoIeDGRyKHdoNjoXgnvRG+U4NQrWZD+Vb7YbuyDgP aQTYlbNONb6DHAlzozLNfUA+1nU0Te2EATeUVnjM9N5LxDGh4RDkAd3r7xMwLwMS0b4K cSFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Ii8l9QJmOK8nEwZpzJEAoSJ7ZiZsOkJW8Ov4mayBw2M=; b=W/gV7tx7pwACU7uHt5U8y0kuylFm2/kyMBTreFlr1KF7Ih2zE0pZOr7L8pK+Dvs4MA 3rK3l7GKgDvlcDt9ZGu5doUsyMPlsse5j5BH6jlcnku7C3dh6QOMoqrjWpIt9Y3R1Wtd RcRqUy5AKMEuWxVhv/ve5KUU2lcGLYIZ30zKRE+A9d6F8u9XJe3ATbk4tchznrFsX/M+ J7/z0ag1I4x9SZ9YZ7TCYEJov6hRp/sYHJmKc64YzxdesJDdeI2j2Y/3IxcaRzJzElzp 
B9qkrJlYTKUZpThYmDS6JGQ8+9I79dzcS6mkd046Y8xXvBg9mhOWBJqtLGwh6ofEnyC2 RjXA== X-Gm-Message-State: APjAAAUpjI8B9CemTGpjTfodzMHqmwBylCO/7KDnZLjsqeVMgqYkuFGD 4Jqon4th9b2Y72atlyvPA56X4cZi X-Google-Smtp-Source: APXvYqzeKOj0vpVKrdarQifvAHZpJAEwkT6fmZb+vKlisoz5kcY17aBhfiiKyKT2SPuN5q58gRCBMw== X-Received: by 2002:a1c:f317:: with SMTP id q23mr3074606wmq.33.1568904199013; Thu, 19 Sep 2019 07:43:19 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m18sm13865624wrg.97.2019.09.19.07.43.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 19 Sep 2019 07:43:18 -0700 (PDT) Date: Thu, 19 Sep 2019 07:43:18 -0700 (PDT) X-Google-Original-Date: Thu, 19 Sep 2019 14:43:08 GMT Message-Id: <1d4321488ef4edbd4b19a8e26b329d0b54755bf4.1568904188.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Subject: [PATCH v2 11/11] unpack-trees: hash less in cone mode Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Junio C Hamano , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The sparse-checkout feature in "cone mode" can use the fact that the recursive patterns are "connected" to the root via parent patterns to decide if a directory is entirely contained in the sparse-checkout or entirely removed. In these cases, we can skip hashing the paths within those directories and simply set the skipworktree bit to the correct value. 
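A rough sketch of this shortcut follows. It uses hypothetical stand-in types and names (`struct entry`, `SKIP_WORKTREE_FLAG`, `apply_dir_decision`) rather than the real `cache_entry` and unpack-trees code, and it assumes the caller has already set the skip-worktree bit on every entry before matching starts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not Git's types. */
enum dir_match {
	DIR_UNDECIDED = -1,
	DIR_NOT_MATCHED = 0,
	DIR_MATCHED = 1,
	DIR_MATCHED_RECURSIVE = 2
};

#define SKIP_WORKTREE_FLAG 0x1u

struct entry {
	const char *path;
	unsigned flags;
};

/*
 * Apply a whole-directory decision to the nr entries below that
 * directory. Returns how many entries were settled without any
 * per-path lookup; 0 means the caller must fall back to checking
 * each path individually.
 */
static size_t apply_dir_decision(struct entry **entries, size_t nr,
				 enum dir_match match)
{
	size_t i;

	assert(entries || !nr);

	switch (match) {
	case DIR_MATCHED_RECURSIVE:
		/* Entirely inside the cone: clear the bit on every entry. */
		for (i = 0; i < nr; i++)
			entries[i]->flags &= ~SKIP_WORKTREE_FLAG;
		return nr; /* a count of entries, not of bytes */
	case DIR_NOT_MATCHED:
		/* Entirely outside the cone: the pre-set bit stays as is. */
		return nr;
	default:
		/* Parent-only or undecided: no bulk answer is available. */
		return 0;
	}
}
```

The point of the design is that both bulk cases cost one pass over the entry pointers at most, with no hashing of the individual paths.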
Signed-off-by: Derrick Stolee --- dir.c | 4 ++-- dir.h | 1 + unpack-trees.c | 38 +++++++++++++++++++++++--------------- 3 files changed, 26 insertions(+), 17 deletions(-) diff --git a/dir.c b/dir.c index 298a4539ec..35fd60d487 100644 --- a/dir.c +++ b/dir.c @@ -1215,7 +1215,7 @@ enum pattern_match_result path_matches_pattern_list( if (hashmap_contains_path(&pl->recursive_hashmap, &parent_pathname)) { - result = MATCHED; + result = MATCHED_RECURSIVE; goto done; } @@ -1237,7 +1237,7 @@ enum pattern_match_result path_matches_pattern_list( while (parent_pathname.len) { if (hashmap_contains_path(&pl->recursive_hashmap, &parent_pathname)) { - result = UNDECIDED; + result = MATCHED_RECURSIVE; goto done; } diff --git a/dir.h b/dir.h index 7c76a2d55e..5f410eedbb 100644 --- a/dir.h +++ b/dir.h @@ -261,6 +261,7 @@ enum pattern_match_result { UNDECIDED = -1, NOT_MATCHED = 0, MATCHED = 1, + MATCHED_RECURSIVE = 2, }; /* diff --git a/unpack-trees.c b/unpack-trees.c index 26be8f3569..43acc0ffd6 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1280,15 +1280,17 @@ static int clear_ce_flags_dir(struct index_state *istate, struct cache_entry **cache_end; int dtype = DT_DIR; int rc; - enum pattern_match_result ret; - ret = path_matches_pattern_list(prefix->buf, prefix->len, - basename, &dtype, pl, istate); + enum pattern_match_result ret, orig_ret; + orig_ret = path_matches_pattern_list(prefix->buf, prefix->len, + basename, &dtype, pl, istate); strbuf_addch(prefix, '/'); /* If undecided, use matching result of parent dir in defval */ - if (ret == UNDECIDED) + if (orig_ret == UNDECIDED) ret = default_match; + else + ret = orig_ret; for (cache_end = cache; cache_end != cache + nr; cache_end++) { struct cache_entry *ce = *cache_end; @@ -1296,17 +1298,23 @@ static int clear_ce_flags_dir(struct index_state *istate, break; } - /* - * TODO: check pl, if there are no patterns that may conflict - * with ret (iow, we know in advance the incl/excl - * decision for the entire directory), clear 
flag here without - * calling clear_ce_flags_1(). That function will call - * the expensive path_matches_pattern_list() on every entry. - */ - rc = clear_ce_flags_1(istate, cache, cache_end - cache, - prefix, - select_mask, clear_mask, - pl, ret); + if (pl->use_cone_patterns && orig_ret == MATCHED_RECURSIVE) { + struct cache_entry **ce = cache; + rc = cache_end - cache; + + while (ce < cache_end) { + (*ce)->ce_flags &= ~clear_mask; + ce++; + } + } else if (pl->use_cone_patterns && orig_ret == NOT_MATCHED) { + rc = cache_end - cache; + } else { + rc = clear_ce_flags_1(istate, cache, cache_end - cache, + prefix, + select_mask, clear_mask, + pl, ret); + } + strbuf_setlen(prefix, prefix->len - 1); return rc; }