From patchwork Tue Mar 23 13:44:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E17AC433C1 for ; Tue, 23 Mar 2021 13:45:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2A15E6199F for ; Tue, 23 Mar 2021 13:45:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230078AbhCWNow (ORCPT ); Tue, 23 Mar 2021 09:44:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231627AbhCWNoh (ORCPT ); Tue, 23 Mar 2021 09:44:37 -0400 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1D9FC061764 for ; Tue, 23 Mar 2021 06:44:33 -0700 (PDT) Received: by mail-wr1-x435.google.com with SMTP id j18so20827606wra.2 for ; Tue, 23 Mar 2021 06:44:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=kxW/evtnx6pEjTOMiCrUm0IWiPy797qr0ZBDnWZfQps=; b=Vw4X4YXP4fGhIgGH1uk+So3tDBAOj9FIocvz2N62dXtu1mWWV+AXTQHV3bZbDkp+lv dzFHTsa/ay5r4X6xLQAL6/n03Pb7O+JZY0TjWuTK2yzg49t6Te0JMVIQs5QBbxJkOVF7 
NVd2FTXxpQPBpDX95Jrq2fjUL2618uHCOzn2HQOOIGm9DQ8y7YUBvgj+wh9BvBr1iMmY WkjbrqVeFUJBLXLcN6pWxZ+JRUPiMfTQ2ujZ/tPY49JPmEHgcg+nIarvuS6TylorUhey uqTFNGnbOPu80GpnLO2O0E702tzxnlX6y3Frs9Wcw/pxhIfRS2ATy6wmQ3C4rz+jl/Ee c2zw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=kxW/evtnx6pEjTOMiCrUm0IWiPy797qr0ZBDnWZfQps=; b=lYSKNVwvanl8+UWU8GxxVu6NovWEv2aLdYHx6gUf4RX8GeOOEPcseM4jQ8aPzFd0or wQVpa5tMzDLufXqf76JyIMLMwUYzoKLaeCr09xqLv4/0eF/WcVW6AqKSvTcWKaIxP4Ut Vh1Q/0O8iPTHoR+c4X2wFq6qzWP1EQ2AMHd48QVIQC/idOYLb9MDwg+rl522tRT9iieC CGhzMAgG1yANvg6V3c7N37RYz0BBv5jIKRa05RAt0T/+QI5fRjSBQ8w/b7vS+BVOcTMb t6Bo2tkdwJBkuwszc3uE2DZirMdXvEAcwueo7hXawOAM46ilOsTNbAwCp3sIK1noefIb tYgg== X-Gm-Message-State: AOAM531FNG1a5ligd5GVYlszzrSkXj5rCFzvXwJ7CvBMv5LEt1fSmizd gET1IreAyeu2LTdamXUZm63xrCeY3O0= X-Google-Smtp-Source: ABdhPJz2pxd6cym2nWH8lwzqs/GhTKM+24jd/BwTGnfpKlgJYyigeehYthV4Lp770JKGHa68/HmjPQ== X-Received: by 2002:a5d:6945:: with SMTP id r5mr4002993wrw.367.1616507072351; Tue, 23 Mar 2021 06:44:32 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l8sm24103676wrx.83.2021.03.23.06.44.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Mar 2021 06:44:31 -0700 (PDT) Message-Id: <6426a5c60e53e30091360c00c61c9123803fe9c1.1616507069.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Mar 2021 13:44:09 +0000 Subject: [PATCH v4 01/20] sparse-index: design doc and format update Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick 
Stolee

This begins a long effort to update the index format to allow sparse
directory entries. This should result in a significant improvement to
Git commands when HEAD contains millions of files, but the user has
selected many fewer files to keep in their sparse-checkout definition.

Currently, the index format is only updated in the presence of
extensions.sparseIndex instead of increasing a file format version
number. This is temporary, and index v5 is part of the plan for future
work in this area.

The design document details many of the reasons for embarking on this
work, and also the plan for completing it safely.

Signed-off-by: Derrick Stolee
---
 Documentation/technical/index-format.txt |   7 +
 Documentation/technical/sparse-index.txt | 174 +++++++++++++++++++++++
 2 files changed, 181 insertions(+)
 create mode 100644 Documentation/technical/sparse-index.txt

diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index d363a71c37ec..3b74c05647db 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -44,6 +44,13 @@ Git index format
   localization, no special casing of directory separator '/'). Entries
   with the same name are sorted by their stage field.
 
+  An index entry typically represents a file. However, if sparse-checkout
+  is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
+  `extensions.sparseIndex` extension is enabled, then the index may
+  contain entries for directories outside of the sparse-checkout definition.
+  These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and
+  the path ends in a directory separator.
+
   32-bit ctime seconds, the last time a file's metadata changed
     this is stat(2) data
 
diff --git a/Documentation/technical/sparse-index.txt b/Documentation/technical/sparse-index.txt
new file mode 100644
index 000000000000..62f6dc225a44
--- /dev/null
+++ b/Documentation/technical/sparse-index.txt
@@ -0,0 +1,174 @@
+Git Sparse-Index Design Document
+================================
+
+The sparse-checkout feature allows users to focus a working directory on
+a subset of the files at HEAD. The cone mode patterns, enabled by
+`core.sparseCheckoutCone`, allow for very fast pattern matching to
+discover which files at HEAD belong in the sparse-checkout cone.
+
+Three important scale dimensions for a Git working directory are:
+
+* `HEAD`: How many files are present at `HEAD`?
+
+* Populated: How many files are within the sparse-checkout cone?
+
+* Modified: How many files has the user modified in the working directory?
+
+We will use big-O notation -- O(X) -- to denote how expensive certain
+operations are in terms of these dimensions.
+
+These dimensions are ordered by their magnitude: users (typically) modify
+fewer files than are populated, and we can only populate files at `HEAD`.
+
+Problems occur if there is an extreme imbalance in these dimensions. For
+example, if `HEAD` contains millions of paths but the populated set has
+only tens of thousands, then commands like `git status` and `git add` can
+be dominated by steps that require O(`HEAD`) operations instead of
+O(Populated). Primarily, the cost is in parsing and rewriting the index,
+which is filled primarily with files at `HEAD` that are marked with the
+`SKIP_WORKTREE` bit.
+
+The sparse-index intends to take these commands that read and modify the
+index from O(`HEAD`) to O(Populated). To do this, we need to modify the
+index format in a significant way: add "sparse directory" entries.
+
+With cone mode patterns, it is possible to detect when an entire
+directory will have its contents outside of the sparse-checkout definition.
+Instead of listing all of the files it contains as individual entries, a
+sparse-index contains an entry with the directory name, referencing the
+object ID of the tree at `HEAD` and marked with the `SKIP_WORKTREE` bit.
+If we need to discover the details for paths within that directory, we
+can parse trees to find that list.
+
+At time of writing, sparse-directory entries violate expectations about the
+index format and its in-memory data structure. There are many consumers in
+the codebase that expect to iterate through all of the index entries and
+see only files. In fact, these loops expect to see a reference to every
+staged file. One way to handle this is to parse trees to replace a
+sparse-directory entry with all of the files within that tree as the index
+is loaded. However, parsing trees is slower than parsing the index format,
+so that is a slower operation than if we left the index alone. The plan is
+to make all of these integrations "sparse aware" so this expansion through
+tree parsing is unnecessary and they use fewer resources than when using a
+full index.
+
+The implementation plan below follows four phases to slowly integrate with
+the sparse-index. The intention is to incrementally update Git commands to
+interact safely with the sparse-index without significant slowdowns. This
+may not always be possible, but the hope is that the primary commands that
+users need in their daily work are dramatically improved.
+
+Phase I: Format and initial speedups
+------------------------------------
+
+During this phase, Git learns to enable the sparse-index and safely parse
+one. Protections are put in place so that every consumer of the in-memory
+data structure can operate with its current assumption of every file at
+`HEAD`.
+
+At first, every index parse will call a helper method,
+`ensure_full_index()`, which scans the index for sparse-directory entries
+(pointing to trees) and replaces them with the full list of paths (with
+blob contents) by parsing tree objects. This will be slower in all cases.
+The only noticeable change in behavior will be that the serialized index
+file contains sparse-directory entries.
+
+To start, we use a new repository extension, `extensions.sparseIndex`, to
+allow inserting sparse-directory entries into indexes with file format
+versions 2, 3, and 4. This prevents Git versions that do not understand
+the sparse-index from operating on one, but it also prevents other
+operations that do not use the index at all. A new format, index v5, will
+be introduced that includes sparse-directory entries by default. It might
+also introduce other features that have been considered for improving the
+index, as well.
+
+Next, consumers of the index will be guarded against operating on a
+sparse-index by inserting calls to `ensure_full_index()` or
+`expand_index_to_path()`. After these guards are in place, we can begin
+leaving sparse-directory entries in the in-memory index structure.
+
+Even after inserting these guards, we will keep expanding sparse-indexes
+for most Git commands using the `command_requires_full_index` repository
+setting. This setting will be on by default and disabled one builtin at a
+time until we have sufficient confidence that all of the index operations
+are properly guarded.
+
+To complete this phase, the commands `git status` and `git add` will be
+integrated with the sparse-index so that they operate with O(Populated)
+performance. They will be carefully tested for operations within and
+outside the sparse-checkout definition.
+
+Phase II: Careful integrations
+------------------------------
+
+This phase focuses on ensuring that all index extensions and APIs work
+well with a sparse-index. This requires significant increases to our test
+coverage, especially for operations that interact with the working
+directory outside of the sparse-checkout definition. Some of these
+behaviors may not be the desirable ones, such as some tests already
+marked for failure in `t1092-sparse-checkout-compatibility.sh`.
+
+The index extensions that may require special integrations are:
+
+* FS Monitor
+* Untracked cache
+
+While integrating with these features, we should look for patterns that
+might lead to better APIs for interacting with the index. Coalescing
+common usage patterns into an API call can reduce the number of places
+where sparse-directories need to be handled carefully.
+
+Phase III: Important command speedups
+-------------------------------------
+
+At this point, the patterns for testing and implementing sparse-directory
+logic should be relatively stable. This phase focuses on updating some of
+the most common builtins that use the index to operate as O(Populated).
+Here is a potential list of commands that could be valuable to integrate
+at this point:
+
+* `git commit`
+* `git checkout`
+* `git merge`
+* `git rebase`
+
+Hopefully, commands such as `git merge` and `git rebase` can benefit
+instead from merge algorithms that do not use the index as a data
+structure, such as the merge-ORT strategy. As these topics mature, we
+may enable the ORT strategy by default for repositories using the
+sparse-index feature.
+
+Along with `git status` and `git add`, these commands cover the majority
+of users' interactions with the working directory. In addition, we can
+integrate with these commands:
+
+* `git grep`
+* `git rm`
+
+These have been proposed as some whose behavior could change when in a
+repo with a sparse-checkout definition. It would be good to include this
+behavior automatically when using a sparse-index. Some clarity is needed
+to make the behavior switch clear to the user.
+
+This phase is the first where parallel work might be possible without too
+many conflicts between topics.
+
+Phase IV: The long tail
+-----------------------
+
+This last phase is less a "phase" and more "the new normal" after all of
+the previous work.
+
+To start, the `command_requires_full_index` option could be removed in
+favor of expanding only when hitting an API guard.
+
+There are many Git commands that could use special attention to operate as
+O(Populated), while some might be so rare that it is acceptable to leave
+them with additional overhead when a sparse-index is present.
+
+Here are some commands that might be useful to update:
+
+* `git sparse-checkout set`
+* `git am`
+* `git clean`
+* `git stash`

From patchwork Tue Mar 23 13:44:10 2021
X-Patchwork-Id: 12157917
Message-Id: <7eabc1d0586cfd2d6526b5b6e40b7a42f2495e06.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:10 +0000
Subject: [PATCH v4 02/20] t/perf: add performance test for sparse operations
From: Derrick Stolee

Create a test script that takes the default performance test (the Git
codebase) and multiplies it by 256 using four layers of duplicated
trees of width four. This results in nearly one million blob entries in
the index. Then, we can clone this repository with sparse-checkout
patterns that demonstrate four copies of the initial repository. Each
clone will use a different index format or mode so performance can be
tested across the different options.

Note that the initial repo is stripped of submodules before doing the
copies. This preserves the expected data shape of the sparse index,
because directories containing submodules are not collapsed to a sparse
directory entry.

Run a few Git commands on these clones, especially those that use the
index (status, add, commit). Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)             0.37(0.30+0.09)
2000.3: git status (full-index-v4)             0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index
version 4. This is because the v3 index uses 108 MiB while the v4
index uses 80 MiB. Since the repeated portions of the directories are
very short (f3/f1/f2, for example) this ratio is less pronounced than in
similarly-sized real repositories.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 84 +++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)
 create mode 100755 t/perf/p2000-sparse-operations.sh

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
new file mode 100755
index 000000000000..dddd527b6330
--- /dev/null
+++ b/t/perf/p2000-sparse-operations.sh
@@ -0,0 +1,84 @@
+#!/bin/sh
+
+test_description="test performance of Git operations using the index"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+SPARSE_CONE=f2/f4/f1
+
+test_expect_success 'setup repo and indexes' '
+	git reset --hard HEAD &&
+
+	# Remove submodules from the example repo, because our
+	# duplication of the entire repo creates an unlikely data shape.
+	if git config --file .gitmodules --get-regexp "submodule.*.path" >modules
+	then
+		git rm $(awk "{print \$2}" modules) &&
+		git commit -m "remove submodules" || return 1
+	fi &&
+
+	echo bogus >a &&
+	cp a b &&
+	git add a b &&
+	git commit -m "level 0" &&
+	BLOB=$(git rev-parse HEAD:a) &&
+	OLD_COMMIT=$(git rev-parse HEAD) &&
+	OLD_TREE=$(git rev-parse HEAD^{tree}) &&
+
+	for i in $(test_seq 1 4)
+	do
+		cat >in <<-EOF &&
+			100755 blob $BLOB	a
+			040000 tree $OLD_TREE	f1
+			040000 tree $OLD_TREE	f2
+			040000 tree $OLD_TREE	f3
+			040000 tree $OLD_TREE	f4
+		EOF
+		NEW_TREE=$(git mktree >$SPARSE_CONE/a &&
+				$command
+			)
+		"
+	done
+}
+
+test_perf_on_all git status
+test_perf_on_all git add -A
+test_perf_on_all git add .
+test_perf_on_all git commit -a -m A
+
+test_done

From patchwork Tue Mar 23 13:44:11 2021
X-Patchwork-Id: 12157913
Date: Tue, 23 Mar 2021 13:44:11 +0000
Subject: [PATCH v4 03/20] t1092: clean up script quoting
From: Derrick Stolee

This test was
introduced in 19a0acc83e4 (t1092: test interesting sparse-checkout scenarios, 2021-01-23), but it contains issues with quoting that were not noticed until starting this follow-up series. The old mechanism would drop quoting such as in test_all_match git commit -m "touch README.md" The above happened to work because README.md is a file in the repository, so 'git commit -m touch REAMDE.md' would succeed by accident. Other cases included quoting for no good reason, so clean that up now. Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 8cd3e5a8d227..3725d3997e70 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -96,20 +96,20 @@ init_repos () { run_on_sparse () { ( cd sparse-checkout && - $* >../sparse-checkout-out 2>../sparse-checkout-err + "$@" >../sparse-checkout-out 2>../sparse-checkout-err ) } run_on_all () { ( cd full-checkout && - $* >../full-checkout-out 2>../full-checkout-err + "$@" >../full-checkout-out 2>../full-checkout-err ) && - run_on_sparse $* + run_on_sparse "$@" } test_all_match () { - run_on_all $* && + run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && test_cmp full-checkout-err sparse-checkout-err } @@ -119,7 +119,7 @@ test_expect_success 'status with options' ' test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && - run_on_all "touch README.md" && + run_on_all touch README.md && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -135,7 +135,7 @@ test_expect_success 'add, commit, checkout' ' write_script edit-contents <<-\EOF && echo text >>$1 EOF - run_on_all "../edit-contents README.md" && + 
run_on_all ../edit-contents README.md && test_all_match git add README.md && test_all_match git status --porcelain=v2 && @@ -144,7 +144,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents README.md" && + run_on_all ../edit-contents README.md && test_all_match git add -A && test_all_match git status --porcelain=v2 && @@ -153,7 +153,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents deep/newfile" && + run_on_all ../edit-contents deep/newfile && test_all_match git status --porcelain=v2 -uno && test_all_match git status --porcelain=v2 && @@ -186,7 +186,7 @@ test_expect_success 'diff --staged' ' write_script edit-contents <<-\EOF && echo text >>README.md EOF - run_on_all "../edit-contents" && + run_on_all ../edit-contents && test_all_match git diff && test_all_match git diff --staged && @@ -280,7 +280,7 @@ test_expect_success 'clean' ' echo bogus >>.gitignore && run_on_all cp ../.gitignore . 
&& test_all_match git add .gitignore && - test_all_match git commit -m ignore-bogus-files && + test_all_match git commit -m "ignore bogus files" && run_on_sparse mkdir folder1 && run_on_all touch folder1/bogus && From patchwork Tue Mar 23 13:44:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7599FC433E0 for ; Tue, 23 Mar 2021 13:45:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4E777619BA for ; Tue, 23 Mar 2021 13:45:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231191AbhCWNoy (ORCPT ); Tue, 23 Mar 2021 09:44:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229913AbhCWNoj (ORCPT ); Tue, 23 Mar 2021 09:44:39 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1A9EC0613DA for ; Tue, 23 Mar 2021 06:44:35 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id k8so20819644wrc.3 for ; Tue, 23 Mar 2021 06:44:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=tG7+YZdwAKze8gBAJEZBRIJJgY5B1vbHrx8V3JKD0do=; b=D6gCBjxWoX+OJDvdh8yB2oGLPBwRq3G/tv/igkrqJlVmnJCi8kshCUEAnos/amZae5 srWQV+9iLEKn5O80p6DGrHDGrZ8NA3KgKLQsf8HQTyYaGwWWf6QNblA6lctJ+QJaEOaH HgCLgOWfoBRmNFlX8xuqMA4fAR99VchhCdVgLdAYJhe3ONHnOpYlb1H+IOmQ6cQrhya3 3sSy1bDr1J+OgY1/+M/UzMpBkfsObGivXwVs1UOWkC5WDhDNa5c/q3uLyxRZQLxxhzww pl6mbR/wSoa+bkSfwy0EETq4pbPDQKSx52BLb/TzQ5Eb2cFqhej3bgNWi4eE20QPhHxc ahew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=tG7+YZdwAKze8gBAJEZBRIJJgY5B1vbHrx8V3JKD0do=; b=TX3YW5Q/98hizTOh3U5Vy+ojrKSCqhpu6ntNT0H4fhsVLt4IrthEz23WFNrp90XUWC 9ycY1wslys36rm3QCykxEQXLuNoiYM6nqx9uKdVjW64DKIWiFsKelERwtzL2N5kE6Ibs qr7xShhuPC8B5KsoagciWMq4Zx94E0tHpf6fM9eZ7MRD7tIfQsDlIeN+7Pev6kuBQZhp AA7Ru4L27PSUiVXbCKk68+FuG5VqDj+5wWqtCuQWNgu5j4VisRmUUV6iHOH9eE4LkVlQ 0iedfYM+38GXokMKy7/aPuruI87JEvRBQGfc2WHcewwZ4EGdkHIdB9L0wd6wwS81cTRc zE3A== X-Gm-Message-State: AOAM530j4udZSGuEChxuEeUtzdPL/VMXSRDuTS7FSm+XrZtmwo7g39Jj ZK5Br+oCsNkXLmHyEgXnFs7ReHP5Ku8= X-Google-Smtp-Source: ABdhPJyLVqNeVNyWQD8UqG5gizS+5K6Aaqd587nCGAs7W18dXlVSRfRJjXkByFGWxQ7GdYgrd2QMcw== X-Received: by 2002:adf:f005:: with SMTP id j5mr3963293wro.423.1616507074606; Tue, 23 Mar 2021 06:44:34 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q17sm23201715wrv.25.2021.03.23.06.44.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Mar 2021 06:44:34 -0700 (PDT) Message-Id: <03cdde7565630e43135028350092373917e6fa31.1616507069.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Mar 2021 13:44:12 +0000 Subject: [PATCH v4 04/20] sparse-index: add guard to ensure full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , 
 Derrick Stolee, SZEDER Gábor, Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.

This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.

Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.

If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.
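
As an illustration of the guard described above, here is a minimal,
self-contained sketch. It is not Git's code: the toy_* names and the
'on_disk_sparse' parameter are hypothetical stand-ins, and expansion is
reduced to flipping a flag. It only shows the control flow, assuming a
read expands the index immediately when the settings bit is set and
leaves it sparse otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Git's real structures. */
struct toy_index {
	int sparse;      /* 1 while collapsed directory entries exist */
	int expansions;  /* counts how often expansion actually ran */
};

struct toy_settings {
	unsigned command_requires_full_index:1;
};

/* Expand only when needed, so repeated calls are harmless. */
static void toy_ensure_full_index(struct toy_index *idx)
{
	if (!idx || !idx->sparse)
		return;
	idx->sparse = 0;
	idx->expansions++;
}

/*
 * Simulated read: 'on_disk_sparse' (an assumption of this sketch)
 * says which format was found on disk. The guard runs right after
 * the read, before any caller can look at the entries.
 */
static int toy_read_index(struct toy_index *idx,
			  const struct toy_settings *s,
			  int on_disk_sparse)
{
	assert(idx && s);
	idx->sparse = on_disk_sparse;
	if (s->command_requires_full_index)
		toy_ensure_full_index(idx);
	return 0;
}
```

With the bit defaulting to 1, every existing caller keeps seeing a full
index; a command that has been taught about sparse directories would
clear the bit before reading.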

Signed-off-by: Derrick Stolee
---
 Makefile        |  1 +
 repo-settings.c |  8 ++++++++
 repository.c    | 11 ++++++++++-
 repository.h    |  2 ++
 sparse-index.c  |  8 ++++++++
 sparse-index.h  |  7 +++++++
 6 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 sparse-index.c
 create mode 100644 sparse-index.h

diff --git a/Makefile b/Makefile
index dfb0f1000fa3..89b1d5374107 100644
--- a/Makefile
+++ b/Makefile
@@ -985,6 +985,7 @@ LIB_OBJS += setup.o
 LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
+LIB_OBJS += sparse-index.o
 LIB_OBJS += split-index.o
 LIB_OBJS += stable-qsort.o
 LIB_OBJS += strbuf.o
diff --git a/repo-settings.c b/repo-settings.c
index f7fff0f5ab83..d63569e4041e 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -77,4 +77,12 @@ void prepare_repo_settings(struct repository *r)
 	UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP);

 	UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT);
+
+	/*
+	 * This setting guards all index reads to require a full index
+	 * over a sparse index. After suitable guards are placed in the
+	 * codebase around uses of the index, this setting will be
+	 * removed.
+	 */
+	r->settings.command_requires_full_index = 1;
 }
diff --git a/repository.c b/repository.c
index c98298acd017..a8acae002f71 100644
--- a/repository.c
+++ b/repository.c
@@ -10,6 +10,7 @@
 #include "object.h"
 #include "lockfile.h"
 #include "submodule-config.h"
+#include "sparse-index.h"

 /* The main repository */
 static struct repository the_repo;
@@ -261,6 +262,8 @@ void repo_clear(struct repository *repo)

 int repo_read_index(struct repository *repo)
 {
+	int res;
+
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));

@@ -270,7 +273,13 @@ int repo_read_index(struct repository *repo)
 	else if (repo->index->repo != repo)
 		BUG("repo's index should point back at itself");

-	return read_index_from(repo->index, repo->index_file, repo->gitdir);
+	res = read_index_from(repo->index, repo->index_file, repo->gitdir);
+
+	prepare_repo_settings(repo);
+	if (repo->settings.command_requires_full_index)
+		ensure_full_index(repo->index);
+
+	return res;
 }

 int repo_hold_locked_index(struct repository *repo,
diff --git a/repository.h b/repository.h
index b385ca3c94b6..e06a23015697 100644
--- a/repository.h
+++ b/repository.h
@@ -41,6 +41,8 @@ struct repo_settings {
 	enum fetch_negotiation_setting fetch_negotiation_algorithm;

 	int core_multi_pack_index;
+
+	unsigned command_requires_full_index:1;
 };

 struct repository {
diff --git a/sparse-index.c b/sparse-index.c
new file mode 100644
index 000000000000..82183ead563b
--- /dev/null
+++ b/sparse-index.c
@@ -0,0 +1,8 @@
+#include "cache.h"
+#include "repository.h"
+#include "sparse-index.h"
+
+void ensure_full_index(struct index_state *istate)
+{
+	/* intentionally left blank */
+}
diff --git a/sparse-index.h b/sparse-index.h
new file mode 100644
index 000000000000..09a20d036c46
--- /dev/null
+++ b/sparse-index.h
@@ -0,0 +1,7 @@
+#ifndef SPARSE_INDEX_H__
+#define SPARSE_INDEX_H__
+
+struct index_state;
+void ensure_full_index(struct index_state *istate);
+
+#endif

From patchwork Tue Mar 23 13:44:13 2021
Content-Type: text/plain;
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157909
Message-Id: <6b3b6d86385d7d8430644e6248996ee469041c3f.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:13 +0000
Subject: [PATCH v4 05/20] sparse-index: implement ensure_full_index()
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick
Stolee

We will mark an in-memory index_state as having sparse directory entries
with the sparse_index bit. These currently cannot exist, but we will add
a mechanism for collapsing a full index to a sparse one in a later
change. That will happen at write time, so we must first allow parsing
the format before writing it.

Commands or methods that require a full index in order to operate can
call ensure_full_index() to expand that index in-memory. This requires
parsing trees using that index's repository.

Sparse directory entries have a specific 'ce_mode' value. The macro
S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this
type. This ce_mode is not possible with the existing index formats, so
we do not separately verify all properties of a sparse-directory entry,
which are:

 1. ce->ce_mode == 0040000
 2. ce->ce_flags & CE_SKIP_WORKTREE is true
 3. ce->name[ce->ce_namelen - 1] == '/' (ends in dir separator)
 4. ce->oid references a tree object.

These are all partially enforced in ensure_full_index(). Any deviation
will cause a warning at minimum or a failure in the worst case.

Signed-off-by: Derrick Stolee
---
 cache.h        | 13 ++++++-
 read-cache.c   |  9 +++++
 sparse-index.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index bb317abc91fb..136dd496c95d 100644
--- a/cache.h
+++ b/cache.h
@@ -204,6 +204,8 @@ struct cache_entry {
 #error "CE_EXTENDED_FLAGS out of range"
 #endif

+#define S_ISSPARSEDIR(m) ((m) == S_IFDIR)
+
 /* Forward structure decls */
 struct pathspec;
 struct child_process;
@@ -319,7 +321,14 @@ struct index_state {
 		 drop_cache_tree : 1,
 		 updated_workdir : 1,
 		 updated_skipworktree : 1,
-		 fsmonitor_has_run_once : 1;
+		 fsmonitor_has_run_once : 1,
+
+		 /*
+		  * sparse_index == 1 when sparse-directory
+		  * entries exist. Requires sparse-checkout
+		  * in cone mode.
+		  */
+		 sparse_index : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	struct object_id oid;
@@ -722,6 +731,8 @@ int read_index_from(struct index_state *, const char *path,
 		    const char *gitdir);
 int is_index_unborn(struct index_state *);

+void ensure_full_index(struct index_state *istate);
+
 /* For use with `write_locked_index()`. */
 #define COMMIT_LOCK		(1 << 0)
 #define SKIP_IF_UNCHANGED	(1 << 1)
diff --git a/read-cache.c b/read-cache.c
index 1e9a50c6c734..dd3980c12b53 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -101,6 +101,9 @@ static const char *alternate_index_output;

 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
+	if (S_ISSPARSEDIR(ce->ce_mode))
+		istate->sparse_index = 1;
+
 	istate->cache[nr] = ce;
 	add_name_hash(istate, ce);
 }
@@ -2273,6 +2276,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	trace2_data_intmax("index", the_repository, "read/cache_nr",
 			   istate->cache_nr);

+	if (!istate->repo)
+		istate->repo = the_repository;
+	prepare_repo_settings(istate->repo);
+	if (istate->repo->settings.command_requires_full_index)
+		ensure_full_index(istate);
+
 	return istate->cache_nr;

 unmap:
diff --git a/sparse-index.c b/sparse-index.c
index 82183ead563b..7095378a1b28 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -1,8 +1,104 @@
 #include "cache.h"
 #include "repository.h"
 #include "sparse-index.h"
+#include "tree.h"
+#include "pathspec.h"
+#include "trace2.h"
+
+static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
+{
+	ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc);
+
+	istate->cache[nr] = ce;
+	add_name_hash(istate, ce);
+}
+
+static int add_path_to_index(const struct object_id *oid,
+			     struct strbuf *base, const char *path,
+			     unsigned int mode, void *context)
+{
+	struct index_state *istate = (struct index_state *)context;
+	struct cache_entry *ce;
+	size_t len = base->len;
+
+	if (S_ISDIR(mode))
+		return READ_TREE_RECURSIVE;
+
+	strbuf_addstr(base, path);
+
+	ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
+	ce->ce_flags |= CE_SKIP_WORKTREE;
+	set_index_entry(istate, istate->cache_nr++, ce);
+
+	strbuf_setlen(base, len);
+	return 0;
+}

 void ensure_full_index(struct index_state *istate)
 {
-	/* intentionally left blank */
+	int i;
+	struct index_state *full;
+	struct strbuf base = STRBUF_INIT;
+
+	if (!istate || !istate->sparse_index)
+		return;
+
+	if (!istate->repo)
+		istate->repo = the_repository;
+
+	trace2_region_enter("index", "ensure_full_index", istate->repo);
+
+	/* initialize basics of new index */
+	full = xcalloc(1, sizeof(struct index_state));
+	memcpy(full, istate, sizeof(struct index_state));
+
+	/* then change the necessary things */
+	full->sparse_index = 0;
+	full->cache_alloc = (3 * istate->cache_alloc) / 2;
+	full->cache_nr = 0;
+	ALLOC_ARRAY(full->cache, full->cache_alloc);
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+		struct tree *tree;
+		struct pathspec ps;
+
+		if (!S_ISSPARSEDIR(ce->ce_mode)) {
+			set_index_entry(full, full->cache_nr++, ce);
+			continue;
+		}
+		if (!(ce->ce_flags & CE_SKIP_WORKTREE))
+			warning(_("index entry is a directory, but not sparse (%08x)"),
+				ce->ce_flags);
+
+		/* recursively walk into ce->name */
+		tree = lookup_tree(istate->repo, &ce->oid);
+
+		memset(&ps, 0, sizeof(ps));
+		ps.recursive = 1;
+		ps.has_wildcard = 1;
+		ps.max_depth = -1;
+
+		strbuf_setlen(&base, 0);
+		strbuf_add(&base, ce->name, strlen(ce->name));
+
+		read_tree_at(istate->repo, tree, &base, &ps,
+			     add_path_to_index, full);
+
+		/* free directory entries. full entries are re-used */
+		discard_cache_entry(ce);
+	}
+
+	/* Copy back into original index. */
+	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
+	istate->sparse_index = 0;
+	free(istate->cache);
+	istate->cache = full->cache;
+	istate->cache_nr = full->cache_nr;
+	istate->cache_alloc = full->cache_alloc;
+
+	strbuf_release(&base);
+	free(full);
+
+	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }

From patchwork Tue Mar 23 13:44:14 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157927
Message-Id: <7f67adba0498d9ff481cae31ac10be9fe228d3d1.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:14 +0000
Subject: [PATCH v4 06/20] t1092: compare sparse-checkout to sparse-index
To:
 git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

Add a new 'sparse-index' repo alongside the 'full-checkout' and
'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also
add run_on_sparse and test_sparse_match helpers. These helpers will be
used when the sparse index is implemented.

Add the GIT_TEST_SPARSE_INDEX environment variable to enable the
sparse-index by default. This can be enabled across all tests, but that
will only affect cases where the sparse-checkout feature is enabled.

Signed-off-by: Derrick Stolee
---
 t/README                                 |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++++++++++----
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/t/README b/t/README
index 593d4a4e270c..b98bc563aab5 100644
--- a/t/README
+++ b/t/README
@@ -439,6 +439,9 @@ and "sha256".
 GIT_TEST_WRITE_REV_INDEX=<boolean>, when true enables the
 'pack.writeReverseIndex' setting.

+GIT_TEST_SPARSE_INDEX=<boolean>, when true enables index writes to use the
+sparse-index format by default.
+
 Naming Tests
 ------------

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 3725d3997e70..de5d8461c993 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -7,6 +7,7 @@ test_description='compare full workdir to sparse workdir'
 test_expect_success 'setup' '
 	git init initial-repo &&
 	(
+		GIT_TEST_SPARSE_INDEX=0 &&
 		cd initial-repo &&
 		echo a >a &&
 		echo "after deep" >e &&
@@ -87,23 +88,32 @@ init_repos () {

 	cp -r initial-repo sparse-checkout &&
 	git -C sparse-checkout reset --hard &&
-	git -C sparse-checkout sparse-checkout init --cone &&
+
+	cp -r initial-repo sparse-index &&
+	git -C sparse-index reset --hard &&

 	# initialize sparse-checkout definitions
-	git -C sparse-checkout sparse-checkout set deep
+	git -C sparse-checkout sparse-checkout init --cone &&
+	git -C sparse-checkout sparse-checkout set deep &&
+	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone &&
+	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep
 }

 run_on_sparse () {
 	(
 		cd sparse-checkout &&
-		"$@" >../sparse-checkout-out 2>../sparse-checkout-err
+		GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
+	) &&
+	(
+		cd sparse-index &&
+		GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err
 	)
 }

 run_on_all () {
 	(
 		cd full-checkout &&
-		"$@" >../full-checkout-out 2>../full-checkout-err
+		GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err
 	) &&
 	run_on_sparse "$@"
 }
@@ -114,6 +124,12 @@ test_all_match () {
 	test_cmp full-checkout-err sparse-checkout-err
 }

+test_sparse_match () {
+	run_on_sparse "$@" &&
+	test_cmp sparse-checkout-out sparse-index-out &&
+	test_cmp sparse-checkout-err sparse-index-err
+}
+
 test_expect_success 'status with options' '
 	init_repos &&
 	test_all_match git status --porcelain=v2 &&

From patchwork Tue Mar 23 13:44:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157937
Message-Id: <7ebd9570b1ad81720569a770526651c62c152b9f.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:15 +0000
Subject: [PATCH v4 07/20] test-read-cache: print cache entries with --table
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

This table is helpful for discovering data in the index
to ensure it is being written correctly, especially as we build and test
the sparse-index. The table uses an output format similar to 'git
ls-tree', but should not be compared to that directly. The biggest
reason is that 'git ls-tree' includes a tree entry for every
subdirectory, even those that would not appear as a sparse directory in
a sparse-index. Further, 'git ls-tree' does not use a trailing directory
separator for its tree rows.

This does not print the stat() information for the blobs. That will be
added in a future change with another option. The tests that are added
in the next few changes care only about the object types and IDs.
However, this future need for full index information justifies the need
for this test helper over extending a user-facing feature, such as 'git
ls-files'.

To make the option parsing slightly more robust, wrap the string
comparisons in a loop adapted from test-dir-iterator.c.

Care must be taken with the final check for the 'cnt' variable. We
continue the expectation that the numerical value is the final argument.
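
The option loop and the trailing-count rule described above can be
sketched in isolation. This is an illustration only, not the helper
itself: 'toy_opts' and 'toy_parse' are hypothetical names, plain
strncmp() stands in for Git's skip_prefix() and starts_with(), and the
loop is bounded by argc rather than relying on a NULL-terminated argv.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct toy_opts {
	int table;        /* set by --table */
	const char *name; /* value of --print-and-refresh=<name>, or NULL */
	long cnt;         /* trailing numeric argument, defaults to 1 */
};

static struct toy_opts toy_parse(int argc, const char **argv)
{
	static const char prefix[] = "--print-and-refresh=";
	struct toy_opts o = { 0, NULL, 1 };

	assert(argc >= 1); /* argv[0] is the program name */

	/* Consume every leading "--" argument, whatever the order. */
	for (++argv, --argc;
	     argc > 0 && !strncmp(*argv, "--", 2);
	     ++argv, --argc) {
		if (!strncmp(*argv, prefix, strlen(prefix)))
			o.name = *argv + strlen(prefix);
		else if (!strcmp(*argv, "--table"))
			o.table = 1;
	}

	/*
	 * After the loop argv points past the options, so a remaining
	 * single argument is the count: argc == 1 and argv[0], not the
	 * argc == 2 and argv[1] that apply before any shifting.
	 */
	if (argc == 1)
		o.cnt = strtol(argv[0], NULL, 0);
	return o;
}
```

Because the loop shifts argv itself, the count check has to use the
shifted indices; that is the "care" the final paragraph refers to.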

Signed-off-by: Derrick Stolee
---
 t/helper/test-read-cache.c | 55 +++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index 244977a29bdf..6cfd8f2de71c 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -1,36 +1,71 @@
 #include "test-tool.h"
 #include "cache.h"
 #include "config.h"
+#include "blob.h"
+#include "commit.h"
+#include "tree.h"
+
+static void print_cache_entry(struct cache_entry *ce)
+{
+	const char *type;
+	printf("%06o ", ce->ce_mode & 0177777);
+
+	if (S_ISSPARSEDIR(ce->ce_mode))
+		type = tree_type;
+	else if (S_ISGITLINK(ce->ce_mode))
+		type = commit_type;
+	else
+		type = blob_type;
+
+	printf("%s %s\t%s\n",
+	       type,
+	       oid_to_hex(&ce->oid),
+	       ce->name);
+}
+
+static void print_cache(struct index_state *istate)
+{
+	int i;
+	for (i = 0; i < istate->cache_nr; i++)
+		print_cache_entry(istate->cache[i]);
+}

 int cmd__read_cache(int argc, const char **argv)
 {
+	struct repository *r = the_repository;
 	int i, cnt = 1;
 	const char *name = NULL;
+	int table = 0;

-	if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", &name)) {
-		argc--;
-		argv++;
+	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
+		if (skip_prefix(*argv, "--print-and-refresh=", &name))
+			continue;
+		if (!strcmp(*argv, "--table"))
+			table = 1;
 	}

-	if (argc == 2)
-		cnt = strtol(argv[1], NULL, 0);
+	if (argc == 1)
+		cnt = strtol(argv[0], NULL, 0);
 	setup_git_directory();
 	git_config(git_default_config, NULL);
+
 	for (i = 0; i < cnt; i++) {
-		read_cache();
+		repo_read_index(r);
 		if (name) {
 			int pos;

-			refresh_index(&the_index, REFRESH_QUIET,
+			refresh_index(r->index, REFRESH_QUIET,
 				      NULL, NULL, NULL);
-			pos = index_name_pos(&the_index, name, strlen(name));
+			pos = index_name_pos(r->index, name, strlen(name));
 			if (pos < 0)
 				die("%s not in index", name);
 			printf("%s is%s up to date\n", name,
-			       ce_uptodate(the_index.cache[pos]) ? "" : " not");
+			       ce_uptodate(r->index->cache[pos]) ? "" : " not");
 			write_file(name, "%d\n", i);
 		}
-		discard_cache();
+		if (table)
+			print_cache(r->index);
+		discard_index(r->index);
 	}
 	return 0;
 }

From patchwork Tue Mar 23 13:44:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157921
Date: Tue, 23 Mar 2021 13:44:16 +0000
Subject: [PATCH v4 08/20] test-tool: don't force full index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor, Ævar Arnfjörð
Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way. Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 13 ++++++++++++- t/t1092-sparse-checkout-compatibility.sh | 5 +++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index 6cfd8f2de71c..b52c174acc7a 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -4,6 +4,7 @@ #include "blob.h" #include "commit.h" #include "tree.h" +#include "sparse-index.h" static void print_cache_entry(struct cache_entry *ce) { @@ -35,13 +36,19 @@ int cmd__read_cache(int argc, const char **argv) struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; - int table = 0; + int table = 0, expand = 0; + + initialize_the_repository(); + prepare_repo_settings(r); + r->settings.command_requires_full_index = 0; for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { if (skip_prefix(*argv, "--print-and-refresh=", &name)) continue; if (!strcmp(*argv, "--table")) table = 1; + else if (!strcmp(*argv, "--expand")) + expand = 1; } if (argc == 1) @@ -51,6 +58,10 @@ int cmd__read_cache(int argc, const char **argv) for (i = 0; i < cnt; i++) { repo_read_index(r); + + if (expand) + ensure_full_index(r->index); + if (name) { int pos; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index de5d8461c993..a1aea141c62c 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ 
b/t/t1092-sparse-checkout-compatibility.sh @@ -130,6 +130,11 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'expanded in-memory index matches full index' ' + init_repos && + test_sparse_match test-tool read-cache --expand --table +' + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Tue Mar 23 13:44:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43D5BC433E4 for ; Tue, 23 Mar 2021 13:45:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2CF04619C0 for ; Tue, 23 Mar 2021 13:45:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231665AbhCWNpC (ORCPT ); Tue, 23 Mar 2021 09:45:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231683AbhCWNol (ORCPT ); Tue, 23 Mar 2021 09:44:41 -0400 Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99210C061574 for ; Tue, 23 Mar 2021 06:44:39 -0700 (PDT) Received: by mail-wr1-x432.google.com with SMTP id o16so20846944wrn.0 for ; Tue, 23 Mar 2021 06:44:39 -0700 
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=i+gKe8EgHfQlDyjY32XCjj4qcNKDY7Jfl7pEV2IN20w=; b=oC7Iz2bkNx8t98Ww3+tqbux9qf824cDkVPz7V3HRB/3RRxugEnhkOkFooe4tY6R//S 6EbrxMaYLVL1skKLqUTrwuQx0dLWF7SdXmcnQ4UEzSM+MVnlHxmfVJafM6EAPtkl+Jk7 MuU0KG2nr1HK8BSFoSBbdWcKsUu0dQ/0/OLSY54i2wV355nXYfEu/iflZ/ZdY+ELaNta Wo15PSkZTjinXCuBManIHiwVLJzZjR+WnEnDYFshz1Ev9YOWi6esLX6ZJaP/O3/aFH+n 1Hn0PlhHgivG9TyOWoSLMA3q2eECP5N50EgwodVOhLJAm94tLr4YKXW0pTDt4sDbjDdo f2WQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=i+gKe8EgHfQlDyjY32XCjj4qcNKDY7Jfl7pEV2IN20w=; b=AaE1hoGALDR4fmNY7utgZY/tpiAQGm45kVbWt7Br/P1eexvNM0JliQmx5BkYkhNB2b kuOcBVtRNBAQqUa9DOUkwqUDEsXgVrfE/yFuNDaLpB1Gy98BSUpAZvkJ/jsLcr8N9UD2 FDUMKvzqIUYWEIajakK07rgOEzOCN2clzXuSaoeJkIOLyaQshAztt39EoVoB3Cgyr4+R M6qDaa0tRKfUlAbgUgeLE901f7AiTCHtQw8rfcQBAPHJ9k0PkLu72IHc81YhqxyruoFf LiCTNBPUnh4a5whukCDDZ9w/mUW4Uq+BdU69DIBG0tmhnndklOBhSccq2ot3CqWnB5AU IaYQ== X-Gm-Message-State: AOAM531Nmm7A+WZQpH64Ehqwll1elbdTHVOnMBGS1mR/XsWpnDrZLa7r L/peCSs2QicxVBBgprBj9N2Vh0vs7uc= X-Google-Smtp-Source: ABdhPJw1iJA4Kz5d96Hx6LV4t524Y50ARt7KjTOT/42X1Q2dvfHydryTinm+T+ZGKgj8h3UGA0rPTg== X-Received: by 2002:a5d:6c67:: with SMTP id r7mr4060627wrz.373.1616507078445; Tue, 23 Mar 2021 06:44:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l15sm23475893wru.38.2021.03.23.06.44.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Mar 2021 06:44:38 -0700 (PDT) Message-Id: <3ddd5e794b5edc862b6047328f61cad5e6134c9f.1616507069.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Mar 2021 13:44:17 +0000 Subject: [PATCH v4 09/20] unpack-trees: ensure full index Fcc: Sent MIME-Version: 1.0 To: 
git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The next change will translate full indexes into sparse indexes at write time. The existing logic provides a way for every sparse index to be expanded to a full index at read time. However, there are cases where an index is written and then continues to be used in-memory to perform further updates. unpack_trees() is frequently called after such a write. In particular, commands like 'git reset' do this double-update of the index. Ensure that we have a full index when entering unpack_trees(), but only when command_requires_full_index is true. This is always true at the moment, but we will later relax that after unpack_trees() is updated to handle sparse directory entries. 
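The guard described above can be sketched in isolation. This is only an illustration under stated assumptions: the toy_* types and functions below are hypothetical stand-ins, not Git's real structures, and the expansion itself is reduced to clearing a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Git's real structures. */
struct toy_index {
	int sparse_index; /* 1 if sparse directory entries are present */
};

struct toy_settings {
	int command_requires_full_index;
};

/*
 * Stand-in for ensure_full_index(): a NULL or already-full index is
 * left alone. The real function would replace each sparse directory
 * entry with the blob entries it covers; here we only clear the flag.
 */
static void toy_ensure_full_index(struct toy_index *istate)
{
	if (!istate || !istate->sparse_index)
		return;
	istate->sparse_index = 0;
}

/*
 * Mirrors the shape of the guard added at the top of unpack_trees():
 * expand both indexes, but only when the command has not yet been
 * taught to handle sparse directory entries.
 */
static void toy_prepare_for_unpack(const struct toy_settings *settings,
				   struct toy_index *src,
				   struct toy_index *dst)
{
	assert(settings);
	if (settings->command_requires_full_index) {
		toy_ensure_full_index(src);
		toy_ensure_full_index(dst);
	}
}
```

With command_requires_full_index set, both indexes come out full; with it cleared, a sparse index passes through untouched, which is the relaxation the message anticipates.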
Signed-off-by: Derrick Stolee --- unpack-trees.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index f5f668f532d8..4dd99219073a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1567,6 +1567,7 @@ static int verify_absent(const struct cache_entry *, */ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o) { + struct repository *repo = the_repository; int i, ret; static struct cache_entry *dfc; struct pattern_list pl; @@ -1578,6 +1579,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options trace_performance_enter(); trace2_region_enter("unpack_trees", "unpack_trees", the_repository); + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) { + ensure_full_index(o->src_index); + ensure_full_index(o->dst_index); + } + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; if (!o->skip_sparse_checkout && !o->pl) { From patchwork Tue Mar 23 13:44:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC4C8C433E6 for ; Tue, 23 Mar 2021 13:45:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B0061619C3 for ; Tue, 23 Mar 2021 13:45:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S231645AbhCWNpB (ORCPT ); Tue, 23 Mar 2021 09:45:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231694AbhCWNom (ORCPT ); Tue, 23 Mar 2021 09:44:42 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 541D2C0613DE for ; Tue, 23 Mar 2021 06:44:40 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id p19so11089833wmq.1 for ; Tue, 23 Mar 2021 06:44:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=AkJvT4gY7lKLChaTErd6DbcmGyVraiVliS0HbBhNfCE=; b=hmsbLSsNu2AJ/dXSUVQ6sgRDsjG2Yxtl1OUCUWZY/RQMw1BTRe4dMiy00CEd/d3aHv P+/19o4bNglSPR+ZVzt35uPXk83C0jFW7RCeLXuux97CN9bs11btFuSFWhlUTIIfdJDW 0kzSJuN8aBEoUaoHGu7neWZsSmEc9WMav8bc7icDg7ntmN0uCJDFM3Mzrogot57jrfHt dX2jZbf6j7mC43coaWDufJpjsIe2TUCdHpRd7APGJZt8hv1RwssIIJ8EyCEuLaC8b1JA G7Qhhj64VLsP0hPJmddxt79Dvf6SpbxJmc1DezVHFUu5+kn1bLrADnCgFQqSedwx0B6o AdZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=AkJvT4gY7lKLChaTErd6DbcmGyVraiVliS0HbBhNfCE=; b=YOrx51vHFDL7Wx2PjwBOrp91KFyUgSf6PjK5akLAjOa4M/u9t8cH6BPZ1XJzO+9pZE jCXfRiEmHtn9siyEcEffBMdDDZYRnjvSd6++L8xO8HHY2O7Pbf6w0AysiXeu1j2GlJ98 RhBWT7OIgU/W44iNLy9M8ZpwyAZjgKGdkZ2vyP6in3AN9FhtaUvvtCMH8ESFi40I51gr NtHFROfcKbaZRnT4gyDnAg7eLrTVJN0a8IUg2BdRBzNFO5BcO8Hxr2l12yP82Vudbj+G Uad+Qq4IqpWI+pHeBiSyYORi0qECREeslGljVclu6Nt3SCa+kvlhjVmr0MIOecKttlaB Ra/Q== X-Gm-Message-State: AOAM530lfeIAjX7W6bb2AKqnrhyM108Y5bHW2DR1E5AeUQ/ARwBSYZyU rnJ2WmpQbWTkx0q2eeNhAUeRAp6tEIw= X-Google-Smtp-Source: 
ABdhPJyBFrGeELZldiBqcSNQkdyeeUPtreRZYN0rYl2npSaZMxiITzjoPl8MUKPqgdmdJwSG7+ZtYA== X-Received: by 2002:a05:600c:9:: with SMTP id g9mr3568929wmc.134.1616507079083; Tue, 23 Mar 2021 06:44:39 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y1sm2573865wmq.29.2021.03.23.06.44.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Mar 2021 06:44:38 -0700 (PDT) Message-Id: <7308c87697f179c06a6dc1abd85b64230060bc25.1616507069.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Mar 2021 13:44:18 +0000 Subject: [PATCH v4 10/20] sparse-checkout: hold pattern list in index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we modify the sparse-checkout definition, we perform index operations on a pattern_list that only exists in-memory. This allows easy backing out in case the index update fails. However, if the index write itself cares about the sparse-checkout pattern set, we need access to that in-memory copy. Place a pointer to a 'struct pattern_list' in the index so we can access this on-demand. This will be used in the next change which uses the sparse-checkout definition to filter out directories that are outside the sparse cone. 
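The ownership pattern described above, where the index borrows the in-memory pattern list only for the duration of the update, can be sketched as follows. The toy_* names are hypothetical stand-ins chosen for illustration; the real code works on 'struct index_state' and 'struct pattern_list'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Git's real structures. */
struct toy_pattern_list {
	int use_cone_patterns;
};

struct toy_index {
	struct toy_pattern_list *sparse_checkout_patterns;
};

/*
 * Stand-in for the index update and write: it succeeds (returns 0)
 * only if it can see a cone-mode pattern list through the index.
 */
static int toy_update(struct toy_index *istate)
{
	struct toy_pattern_list *pl = istate->sparse_checkout_patterns;

	return (pl && pl->use_cone_patterns) ? 0 : -1;
}

/*
 * Lend the caller's pattern list to the index for the duration of
 * the update, then clear the pointer. The caller keeps ownership,
 * so it can still discard the list if the update fails.
 */
static int toy_update_working_directory(struct toy_index *istate,
					struct toy_pattern_list *pl)
{
	int result;

	assert(istate && pl);
	istate->sparse_checkout_patterns = pl;
	result = toy_update(istate);
	istate->sparse_checkout_patterns = NULL;
	return result;
}
```

Resetting the pointer to NULL before returning keeps the index from holding a dangling reference once the caller clears and frees the list.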
Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 17 ++++++++++------- cache.h | 2 ++ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 2306a9ad98e0..e00b82af727b 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -110,6 +110,8 @@ static int update_working_directory(struct pattern_list *pl) if (is_index_unborn(r->index)) return UPDATE_SPARSITY_SUCCESS; + r->index->sparse_checkout_patterns = pl; + memset(&o, 0, sizeof(o)); o.verbose_update = isatty(2); o.update = 1; @@ -138,6 +140,7 @@ static int update_working_directory(struct pattern_list *pl) else rollback_lock_file(&lock_file); + r->index->sparse_checkout_patterns = NULL; return result; } @@ -517,19 +520,18 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) { int result; int changed_config = 0; - struct pattern_list pl; - memset(&pl, 0, sizeof(pl)); + struct pattern_list *pl = xcalloc(1, sizeof(*pl)); switch (m) { case ADD: if (core_sparse_checkout_cone) - add_patterns_cone_mode(argc, argv, &pl); + add_patterns_cone_mode(argc, argv, pl); else - add_patterns_literal(argc, argv, &pl); + add_patterns_literal(argc, argv, pl); break; case REPLACE: - add_patterns_from_input(&pl, argc, argv); + add_patterns_from_input(pl, argc, argv); break; } @@ -539,12 +541,13 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) changed_config = 1; } - result = write_patterns_and_update(&pl); + result = write_patterns_and_update(pl); if (result && changed_config) set_config(MODE_NO_PATTERNS); - clear_pattern_list(&pl); + clear_pattern_list(pl); + free(pl); return result; } diff --git a/cache.h b/cache.h index 136dd496c95d..8c4464420d0a 100644 --- a/cache.h +++ b/cache.h @@ -307,6 +307,7 @@ static inline unsigned int canon_mode(unsigned int mode) struct split_index; struct untracked_cache; struct progress; +struct pattern_list; struct index_state { struct cache_entry 
**cache; @@ -338,6 +339,7 @@ struct index_state { struct mem_pool *ce_mem_pool; struct progress *progress; struct repository *repo; + struct pattern_list *sparse_checkout_patterns; }; /* Name hashing */ From patchwork Tue Mar 23 13:44:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F46DC433F1 for ; Tue, 23 Mar 2021 13:45:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 74D95619CE for ; Tue, 23 Mar 2021 13:45:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231755AbhCWNpF (ORCPT ); Tue, 23 Mar 2021 09:45:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231702AbhCWNom (ORCPT ); Tue, 23 Mar 2021 09:44:42 -0400 Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2874FC0613DF for ; Tue, 23 Mar 2021 06:44:41 -0700 (PDT) Received: by mail-wr1-x432.google.com with SMTP id z2so20843097wrl.5 for ; Tue, 23 Mar 2021 06:44:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=9dHXyHhQyD3K8SWQ7DP6gkkoXkjgHEQ04kq5IdFubWs=; b=qiomdeqtKpBpA8QfQKUWOjn3VI9eZB9MhfVO34lY5/S8p3/+ANgA6qEQQ3NFMO2dpi jc8dhO7HEzi/GLlPMF4O2AWSVC/Mdj7LwdE2aVllAa6W4Ca1mzExO+kVD710rV+kMpyp eJis3dbbs42c2bc6+OaeCIJuhw1LjTf/Y6+WNtyMfAbnqykri9xG5fBugOv23ldIUkkQ Mf2lnXq6qC/urHufhEzeTw2PaGHdjRVEClLWKaFAsu+bVvmMhTviDf2cTIg5+w3sca1V BDoFFXOeEs8RzPr/CYW9DGIGn7vdla5r5jpr52aZEZjms/r/rIhEwhssAM/miGZ0TTb/ 9Hcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=9dHXyHhQyD3K8SWQ7DP6gkkoXkjgHEQ04kq5IdFubWs=; b=J3v77FGk0fVC7goYb4kzdoQ9sSA2NDrlX5B0v0tU6a1+lDluSA1bDrwwNRRu1lU6Np sq6y0ITdrSG868bnzQqtck3vaOu/TjYMSRI1fLZitRmcFiUji+LmDVsosEzQjdQUFuKv Anv0XxVc/9zqUTlguPB7J/WaBqeIjUYpiZaKUTS7+3RcWq3OF2oahrfGPHuJ77R+7SSu IC0TPoVQXUU8EeRauSyQylgPKctNmhsE9Q2Gpc0FePuu2uGyhtrzHqRp/1U+Tt6k6hM0 LVeCA/R4gUGPYXOJQPxqe1oICfvCmVYCuPFsnEIUP9uXSrbyzHkh+Z8PhXWszqYgHGaI tpDg== X-Gm-Message-State: AOAM533MNs4tdjdCpBxQqnUz62cwZ+BgzVMDcKwouZnLc6mT5DhiLXbN 7RkeO2mn5H4kuZishiz2mvuMkCyWz4U= X-Google-Smtp-Source: ABdhPJyYY8U7Z8e6HgrHr1JbSlvcGIJG3Q7VRTNKTrJnuVnxtTvdmTJelUg5BA4jiGz6PxSD4efb+Q== X-Received: by 2002:a5d:4002:: with SMTP id n2mr4195346wrp.148.1616507079836; Tue, 23 Mar 2021 06:44:39 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b131sm2555623wmb.34.2021.03.23.06.44.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Mar 2021 06:44:39 -0700 (PDT) Message-Id: <7c10d653ca6b03d10dbff27da459d757386bfe01.1616507069.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Mar 2021 13:44:19 +0000 Subject: [PATCH v4 11/20] sparse-index: convert from full to sparse Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , 
Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee If we have a full index, then we can convert it to a sparse index by replacing directories outside of the sparse cone with sparse directory entries. The convert_to_sparse() method does this, when the situation is appropriate. For now, we avoid converting the index to a sparse index if: 1. the index is split. 2. the index is already sparse. 3. sparse-checkout is disabled. 4. sparse-checkout does not use cone mode. Finally, we currently limit the conversion to when the GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git config will be added in a later change. The trickiest thing about this conversion is that we might not be able to mark a directory as a sparse directory just because it is outside the sparse cone. There might be unmerged files within that directory, so we need to look for those. Also, if there is some strange reason why a file is not marked with CE_SKIP_WORKTREE, then we should give up on converting that directory. There is still hope that some of its subdirectories might be able to convert to sparse, so we keep looking deeper. The conversion process is assisted by the cache-tree extension. This is calculated from the full index if it does not already exist. We then abandon the cache-tree as it no longer applies to the newly-sparse index. Thus, this cache-tree will be recalculated in every sparse-full-sparse round-trip until we integrate the cache-tree extension with the sparse index. Some Git commands use the index after writing it. For example, 'git add' will update the index, then write it to disk, then read its entries to report information. To keep the in-memory index in a full state after writing, we re-expand it to a full one after the write. 
This is wasteful for commands that only write the index and do not read from it again, but that is only the case until we make those commands "sparse aware." We can compare the behavior of the sparse-index in t1092-sparse-checkout-compatibility.sh by using GIT_TEST_SPARSE_INDEX=1 when operating on the 'sparse-index' repo. We can also compare the two sparse repos directly, such as comparing their indexes (when expanded to full in the case of the 'sparse-index' repo). We also verify that the index is actually populated with sparse directory entries. The 'checkout and reset (mixed)' test is marked for failure when comparing a sparse repo to a full repo, but we can compare the two sparse-checkout cases directly to ensure that we are not changing the behavior when using a sparse index. Signed-off-by: Derrick Stolee --- cache-tree.c | 3 + cache.h | 2 + read-cache.c | 26 ++++- sparse-index.c | 139 +++++++++++++++++++++++ sparse-index.h | 1 + t/t1092-sparse-checkout-compatibility.sh | 61 +++++++++- 6 files changed, 228 insertions(+), 4 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2fb483d3c083..5f07a39e501e 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -6,6 +6,7 @@ #include "object-store.h" #include "replace-object.h" #include "promisor-remote.h" +#include "sparse-index.h" #ifndef DEBUG_CACHE_TREE #define DEBUG_CACHE_TREE 0 @@ -442,6 +443,8 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + ensure_full_index(istate); + if (!istate->cache_tree) istate->cache_tree = cache_tree(); diff --git a/cache.h b/cache.h index 8c4464420d0a..74b43aaa2bd1 100644 --- a/cache.h +++ b/cache.h @@ -251,6 +251,8 @@ static inline unsigned int create_ce_mode(unsigned int mode) { if (S_ISLNK(mode)) return S_IFLNK; + if (S_ISSPARSEDIR(mode)) + return S_IFDIR; if (S_ISDIR(mode) || S_ISGITLINK(mode)) return S_IFGITLINK; return S_IFREG | ce_permissions(mode); diff --git a/read-cache.c b/read-cache.c index dd3980c12b53..b9c08773466c 100644 ---
a/read-cache.c +++ b/read-cache.c @@ -25,6 +25,7 @@ #include "fsmonitor.h" #include "thread-utils.h" #include "progress.h" +#include "sparse-index.h" /* Mask for the name length in ce_flags in the on-disk index */ @@ -1002,8 +1003,14 @@ int verify_path(const char *path, unsigned mode) c = *path++; if ((c == '.' && !verify_dotfile(path, mode)) || - is_dir_sep(c) || c == '\0') + is_dir_sep(c)) return 0; + /* + * allow terminating directory separators for + * sparse directory entries. + */ + if (c == '\0') + return S_ISDIR(mode); } else if (c == '\\' && protect_ntfs) { if (is_ntfs_dotgit(path)) return 0; @@ -3079,6 +3086,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l unsigned flags) { int ret; + int was_full = !istate->sparse_index; + + ret = convert_to_sparse(istate); + + if (ret) { + warning(_("failed to convert to a sparse-index")); + return ret; + } /* * TODO trace2: replace "the_repository" with the actual repo instance @@ -3090,6 +3105,9 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l trace2_region_leave_printf("index", "do_write_index", the_repository, "%s", get_lock_file_path(lock)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; if (flags & COMMIT_LOCK) @@ -3180,9 +3198,10 @@ static int write_shared_index(struct index_state *istate, struct tempfile **temp) { struct split_index *si = istate->split_index; - int ret; + int ret, was_full = !istate->sparse_index; move_cache_to_base_index(istate); + convert_to_sparse(istate); trace2_region_enter_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); @@ -3190,6 +3209,9 @@ static int write_shared_index(struct index_state *istate, trace2_region_leave_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; ret = adjust_shared_perm(get_tempfile_path(*temp)); diff --git a/sparse-index.c 
b/sparse-index.c index 7095378a1b28..619ff7c2e217 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -4,6 +4,145 @@ #include "tree.h" #include "pathspec.h" #include "trace2.h" +#include "cache-tree.h" +#include "config.h" +#include "dir.h" +#include "fsmonitor.h" + +static struct cache_entry *construct_sparse_dir_entry( + struct index_state *istate, + const char *sparse_dir, + struct cache_tree *tree) +{ + struct cache_entry *de; + + de = make_cache_entry(istate, S_IFDIR, &tree->oid, sparse_dir, 0, 0); + + de->ce_flags |= CE_SKIP_WORKTREE; + return de; +} + +/* + * Returns the number of entries "inserted" into the index. + */ +static int convert_to_sparse_rec(struct index_state *istate, + int num_converted, + int start, int end, + const char *ct_path, size_t ct_pathlen, + struct cache_tree *ct) +{ + int i, can_convert = 1; + int start_converted = num_converted; + enum pattern_match_result match; + int dtype; + struct strbuf child_path = STRBUF_INIT; + struct pattern_list *pl = istate->sparse_checkout_patterns; + + /* + * Is the current path outside of the sparse cone? + * Then check if the region can be replaced by a sparse + * directory entry (everything is sparse and merged). + */ + match = path_matches_pattern_list(ct_path, ct_pathlen, + NULL, &dtype, pl, istate); + if (match != NOT_MATCHED) + can_convert = 0; + + for (i = start; can_convert && i < end; i++) { + struct cache_entry *ce = istate->cache[i]; + + if (ce_stage(ce) || + !(ce->ce_flags & CE_SKIP_WORKTREE)) + can_convert = 0; + } + + if (can_convert) { + struct cache_entry *se; + se = construct_sparse_dir_entry(istate, ct_path, ct); + + istate->cache[num_converted++] = se; + return 1; + } + + for (i = start; i < end; ) { + int count, span, pos = -1; + const char *base, *slash; + struct cache_entry *ce = istate->cache[i]; + + /* + * Detect if this is a normal entry outside of any subtree + * entry. 
+ */ + base = ce->name + ct_pathlen; + slash = strchr(base, '/'); + + if (slash) + pos = cache_tree_subtree_pos(ct, base, slash - base); + + if (pos < 0) { + istate->cache[num_converted++] = ce; + i++; + continue; + } + + strbuf_setlen(&child_path, 0); + strbuf_add(&child_path, ce->name, slash - ce->name + 1); + + span = ct->down[pos]->cache_tree->entry_count; + count = convert_to_sparse_rec(istate, + num_converted, i, i + span, + child_path.buf, child_path.len, + ct->down[pos]->cache_tree); + num_converted += count; + i += span; + } + + strbuf_release(&child_path); + return num_converted - start_converted; +} + +int convert_to_sparse(struct index_state *istate) +{ + if (istate->split_index || istate->sparse_index || + !core_apply_sparse_checkout || !core_sparse_checkout_cone) + return 0; + + /* + * For now, only create a sparse index with the + * GIT_TEST_SPARSE_INDEX environment variable. We will relax + * this once we have a proper way to opt-in (and later still, + * opt-out). + */ + if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + return 0; + + if (!istate->sparse_checkout_patterns) { + istate->sparse_checkout_patterns = xcalloc(1, sizeof(struct pattern_list)); + if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0) + return 0; + } + + if (!istate->sparse_checkout_patterns->use_cone_patterns) { + warning(_("attempting to use sparse-index without cone mode")); + return -1; + } + + if (cache_tree_update(istate, 0)) { + warning(_("unable to update cache-tree, staying full")); + return -1; + } + + remove_fsmonitor(istate); + + trace2_region_enter("index", "convert_to_sparse", istate->repo); + istate->cache_nr = convert_to_sparse_rec(istate, + 0, 0, istate->cache_nr, + "", 0, istate->cache_tree); + istate->drop_cache_tree = 1; + istate->sparse_index = 1; + trace2_region_leave("index", "convert_to_sparse", istate->repo); + return 0; +} static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { diff --git 
a/sparse-index.h b/sparse-index.h index 09a20d036c46..64380e121d80 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -3,5 +3,6 @@ struct index_state; void ensure_full_index(struct index_state *istate); +int convert_to_sparse(struct index_state *istate); #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index a1aea141c62c..1e888d195122 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -2,6 +2,11 @@ test_description='compare full workdir to sparse workdir' +# The verify_cache_tree() check is not sparse-aware (yet). +# So, disable the check until that integration is complete. +GIT_TEST_CHECK_CACHE_TREE=0 +GIT_TEST_SPLIT_INDEX=0 + . ./test-lib.sh test_expect_success 'setup' ' @@ -121,7 +126,9 @@ run_on_all () { test_all_match () { run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && - test_cmp full-checkout-err sparse-checkout-err + test_cmp full-checkout-out sparse-index-out && + test_cmp full-checkout-err sparse-checkout-err && + test_cmp full-checkout-err sparse-index-err } test_sparse_match () { @@ -130,6 +137,38 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'sparse-index contents' ' + init_repos && + + test-tool -C sparse-index read-cache --table >cache && + for dir in folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep/deeper2 folder1 folder2 x + do + TREE=$(git 
-C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done +' + test_expect_success 'expanded in-memory index matches full index' ' init_repos && test_sparse_match test-tool read-cache --expand --table @@ -137,6 +176,7 @@ test_expect_success 'expanded in-memory index matches full index' ' test_expect_success 'status with options' ' init_repos && + test_sparse_match ls && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -273,6 +313,17 @@ test_expect_failure 'checkout and reset (mixed)' ' test_all_match git reset update-folder2 ' +# Ensure that sparse-index behaves identically to +# sparse-checkout with a full index. +test_expect_success 'checkout and reset (mixed) [sparse]' ' + init_repos && + + test_sparse_match git checkout -b reset-test update-deep && + test_sparse_match git reset deepest && + test_sparse_match git reset update-folder1 && + test_sparse_match git reset update-folder2 +' + test_expect_success 'merge' ' init_repos && @@ -309,14 +360,20 @@ test_expect_success 'clean' ' test_all_match git status --porcelain=v2 && test_all_match git clean -f && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xdf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && - test_path_is_dir sparse-checkout/folder1 + test_sparse_match test_path_is_dir folder1 ' test_done From patchwork Tue Mar 23 13:44:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12157925 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
Message-Id: <6db36f33e960d6bfd4a156efc2e070dd9c23378b.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:20 +0000
Subject: [PATCH v4 12/20] submodule: sparse-index should not collapse links
From: Derrick Stolee

A submodule is stored as a "Git link" that actually points to a commit
within a submodule. Submodules are populated or not depending on
submodule configuration, not sparse-checkout.
To ensure that the sparse-index feature integrates correctly with
submodules, we should not collapse a directory if there is a Git link
within its range.

Signed-off-by: Derrick Stolee
---
 sparse-index.c                           |  1 +
 t/t1092-sparse-checkout-compatibility.sh | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/sparse-index.c b/sparse-index.c
index 619ff7c2e217..7631f7bd00b7 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -52,6 +52,7 @@ static int convert_to_sparse_rec(struct index_state *istate,
 		struct cache_entry *ce = istate->cache[i];

 		if (ce_stage(ce) ||
+		    S_ISGITLINK(ce->ce_mode) ||
 		    !(ce->ce_flags & CE_SKIP_WORKTREE))
 			can_convert = 0;
 	}
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 1e888d195122..cba5f89b1e96 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -376,4 +376,21 @@ test_expect_success 'clean' '
 	test_sparse_match test_path_is_dir folder1
 '

+test_expect_success 'submodule handling' '
+	init_repos &&
+
+	test_all_match mkdir modules &&
+	test_all_match touch modules/a &&
+	test_all_match git add modules &&
+	test_all_match git commit -m "add modules directory" &&
+
+	run_on_all git submodule add "$(pwd)/initial-repo" modules/sub &&
+	test_all_match git commit -m "add submodule" &&
+
+	# having a submodule prevents "modules" from collapse
+	test-tool -C sparse-index read-cache --table >cache &&
+	grep "100644 blob .* modules/a" cache &&
+	grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache
+'
+
 test_done
From patchwork Tue Mar 23 13:44:21 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157929
Date: Tue, 23 Mar 2021 13:44:21 +0000
Subject: [PATCH v4 13/20] unpack-trees: allow sparse directories
From: Derrick Stolee

The index_pos_by_traverse_info() currently throws a BUG() when a
directory entry exists exactly in the index. We need to consider that it
is possible to have a directory in a sparse index as long as that entry
is itself marked with the skip-worktree bit.

The 'pos' variable is assigned a negative value if an exact match is not
found. Since a directory name can be an exact match, it is no longer an
error to have a nonnegative 'pos' value.
Signed-off-by: Derrick Stolee
---
 unpack-trees.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 4dd99219073a..0b888dab2246 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -746,9 +746,13 @@ static int index_pos_by_traverse_info(struct name_entry *names,
 	strbuf_make_traverse_path(&name, info, names->path, names->pathlen);
 	strbuf_addch(&name, '/');
 	pos = index_name_pos(o->src_index, name.buf, name.len);
-	if (pos >= 0)
-		BUG("This is a directory and should not exist in index");
-	pos = -pos - 1;
+	if (pos >= 0) {
+		if (!o->src_index->sparse_index ||
+		    !(o->src_index->cache[pos]->ce_flags & CE_SKIP_WORKTREE))
+			BUG("This is a directory and should not exist in index");
+	} else {
+		pos = -pos - 1;
+	}
 	if (pos >= o->src_index->cache_nr ||
 	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
 	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))
From patchwork Tue Mar 23 13:44:22 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157941
Message-Id: <08d9f5f3c0d126e29ac19b36c87b0c3f43ecfd4a.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:22 +0000
Subject: [PATCH v4 14/20] sparse-index: check index conversion happens
From: Derrick Stolee

Add a test case that uses test_region to ensure that we are truly
expanding a sparse index to a full one, then converting back to sparse
when writing the index. As we integrate more Git commands with the
sparse index, we will convert these commands to check that we do _not_
convert the sparse index to a full index and instead stay sparse the
entire time.
Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index cba5f89b1e96..47f983217852 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -393,4 +393,22 @@ test_expect_success 'submodule handling' '
 	grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache
 '

+test_expect_success 'sparse-index is expanded and converted back' '
+	init_repos &&
+
+	(
+		GIT_TEST_SPARSE_INDEX=1 &&
+		export GIT_TEST_SPARSE_INDEX &&
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index -c core.fsmonitor="" reset --hard &&
+		test_region index convert_to_sparse trace2.txt &&
+		test_region index ensure_full_index trace2.txt &&
+
+		rm trace2.txt &&
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index -c core.fsmonitor="" status -uno &&
+		test_region index ensure_full_index trace2.txt
+	)
+'
+
 test_done
From patchwork Tue Mar 23 13:44:23 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157935
Message-Id: <6f38cef196b0e9dca92e92399532290f39d8cece.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:23 +0000
Subject: [PATCH v4 15/20] sparse-index: create extension for compatibility
From: Derrick Stolee

Previously, we enabled the sparse index format only using
GIT_TEST_SPARSE_INDEX=1. This is not a feasible direction for users to
actually select this mode. Further, sparse directory entries are not
understood by the index formats as advertised.

We _could_ add a new index version that explicitly adds these
capabilities, but there are nuances to index formats 2, 3, and 4 that
are still valuable to select as options. Until we add index format
version 5, create a repo extension, "extensions.sparseIndex", that
specifies that the tool reading this repository must understand sparse
directory entries.

This change only encodes the extension and enables it when
GIT_TEST_SPARSE_INDEX=1. Later, we will add a more user-friendly CLI
mechanism.
Signed-off-by: Derrick Stolee
---
 Documentation/config/extensions.txt |  8 ++++++
 cache.h                             |  1 +
 repo-settings.c                     |  7 ++++++
 repository.h                        |  3 ++-
 setup.c                             |  3 +++
 sparse-index.c                      | 38 +++++++++++++++++++++++++----
 6 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdcad..c02e09af0046 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,11 @@ extensions.objectFormat::
 Note that this setting should only be set by linkgit:git-init[1] or
 linkgit:git-clone[1]. Trying to change it after initialization will not
 work and will produce hard-to-diagnose issues.
+
+extensions.sparseIndex::
+	When combined with `core.sparseCheckout=true` and
+	`core.sparseCheckoutCone=true`, the index may contain entries
+	corresponding to directories outside of the sparse-checkout
+	definition in lieu of containing each path under such directories.
+	Versions of Git that do not understand this extension do not
+	expect directory entries in the index.
diff --git a/cache.h b/cache.h
index 74b43aaa2bd1..8aede373aeb3 100644
--- a/cache.h
+++ b/cache.h
@@ -1059,6 +1059,7 @@ struct repository_format {
 	int worktree_config;
 	int is_bare;
 	int hash_algo;
+	int sparse_index;
 	char *work_tree;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
diff --git a/repo-settings.c b/repo-settings.c
index d63569e4041e..9677d50f9238 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -85,4 +85,11 @@ void prepare_repo_settings(struct repository *r)
 	 * removed.
 	 */
 	r->settings.command_requires_full_index = 1;
+
+	/*
+	 * Initialize this as off.
+	 */
+	r->settings.sparse_index = 0;
+	if (!repo_config_get_bool(r, "extensions.sparseindex", &value) && value)
+		r->settings.sparse_index = 1;
 }
diff --git a/repository.h b/repository.h
index e06a23015697..a45f7520fd9e 100644
--- a/repository.h
+++ b/repository.h
@@ -42,7 +42,8 @@ struct repo_settings {

 	int core_multi_pack_index;

-	unsigned command_requires_full_index:1;
+	unsigned command_requires_full_index:1,
+		 sparse_index:1;
 };

 struct repository {
diff --git a/setup.c b/setup.c
index c04cd25a30df..cd8394564613 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "sparseindex")) {
+		data->sparse_index = 1;
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
diff --git a/sparse-index.c b/sparse-index.c
index 7631f7bd00b7..3a6df66faeab 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -102,19 +102,47 @@ static int convert_to_sparse_rec(struct index_state *istate,
 	return num_converted - start_converted;
 }

+static int enable_sparse_index(struct repository *repo)
+{
+	const char *config_path = repo_git_path(repo, "config.worktree");
+
+	if (upgrade_repository_format(1) < 0) {
+		warning(_("unable to upgrade repository format to enable sparse-index"));
+		return -1;
+	}
+	git_config_set_in_file_gently(config_path,
+				      "extensions.sparseIndex",
+				      "true");
+
+	prepare_repo_settings(repo);
+	repo->settings.sparse_index = 1;
+	return 0;
+}
+
 int convert_to_sparse(struct index_state *istate)
 {
 	if (istate->split_index || istate->sparse_index ||
 	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;

+	if (!istate->repo)
+		istate->repo = the_repository;
+
+	/*
+	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
+	 * extensions.sparseIndex config variable to be on.
+	 */
+	if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) {
+		int err = enable_sparse_index(istate->repo);
+		if (err < 0)
+			return err;
+	}
+
 	/*
-	 * For now, only create a sparse index with the
-	 * GIT_TEST_SPARSE_INDEX environment variable. We will relax
-	 * this once we have a proper way to opt-in (and later still,
-	 * opt-out).
+	 * Only convert to sparse if extensions.sparseIndex is set.
 	 */
-	if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0))
+	prepare_repo_settings(istate->repo);
+	if (!istate->repo->settings.sparse_index)
 		return 0;

 	if (!istate->sparse_checkout_patterns) {
From patchwork Tue Mar 23 13:44:24 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157939
Message-Id: <923081e7e079f72835a8997a9234fa58eb1b37de.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:24 +0000
Subject: [PATCH v4 16/20] sparse-checkout: toggle sparse index from builtin
From: Derrick Stolee

The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.

Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extensions.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.

Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.
Signed-off-by: Derrick Stolee
---
 Documentation/git-sparse-checkout.txt    | 14 +++++++
 builtin/sparse-checkout.c                | 17 ++++++++-
 sparse-index.c                           | 37 +++++++++++++------
 sparse-index.h                           |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 47 +++++++++++++-----------
 5 files changed, 84 insertions(+), 34 deletions(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index a0eeaeb02ee3..2ff66c5a4e41 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the
 When `--cone` is provided, the `core.sparseCheckoutCone` setting is
 also set, allowing for better performance with a limited set of
 patterns (see 'CONE PATTERN SET' below).
++
+Use the `--[no-]sparse-index` option to toggle the use of the sparse
+index format. This reduces the size of the index to be more closely
+aligned with your sparse-checkout definition. This can have significant
+performance advantages for commands such as `git status` or `git add`.
+This feature is still experimental. Some commands might be slower with
+a sparse index until they are properly integrated with the feature.
++
+**WARNING:** Using a sparse index requires modifying the index in a way
+that is not completely understood by external tools. If you have trouble
+with this compatibility, then run `git sparse-checkout init --no-sparse-index`
+to rewrite your index to not be sparse. Older versions of Git will not
+understand the `sparseIndex` repository extension and may fail to interact
+with your repository until it is disabled.

 'set'::
 	Write a set of patterns to the sparse-checkout file, as given as
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index e00b82af727b..ca63e2c64e95 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -14,6 +14,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "quote.h"
+#include "sparse-index.h"

 static const char *empty_base = "";

@@ -283,12 +284,13 @@ static int set_config(enum sparse_checkout_mode mode)
 }

 static char const * const builtin_sparse_checkout_init_usage[] = {
-	N_("git sparse-checkout init [--cone]"),
+	N_("git sparse-checkout init [--cone] [--[no-]sparse-index]"),
 	NULL
 };

 static struct sparse_checkout_init_opts {
 	int cone_mode;
+	int sparse_index;
 } init_opts;

 static int sparse_checkout_init(int argc, const char **argv)
@@ -303,11 +305,15 @@ static int sparse_checkout_init(int argc, const char **argv)
 	static struct option builtin_sparse_checkout_init_options[] = {
 		OPT_BOOL(0, "cone", &init_opts.cone_mode,
 			 N_("initialize the sparse-checkout in cone mode")),
+		OPT_BOOL(0, "sparse-index", &init_opts.sparse_index,
+			 N_("toggle the use of a sparse index")),
 		OPT_END(),
 	};

 	repo_read_index(the_repository);

+	init_opts.sparse_index = -1;
+
 	argc = parse_options(argc, argv, NULL,
 			     builtin_sparse_checkout_init_options,
 			     builtin_sparse_checkout_init_usage, 0);
@@ -326,6 +332,15 @@ static int sparse_checkout_init(int argc, const char **argv)
 	sparse_filename = get_sparse_checkout_filename();
 	res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL);

+	if (init_opts.sparse_index >= 0) {
+		if (set_sparse_index_config(the_repository, init_opts.sparse_index) < 0)
+			die(_("failed to modify sparse-index config"));
+
+		/* force an index rewrite */
+		repo_read_index(the_repository);
+		the_repository->index->updated_workdir = 1;
+	}
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
diff --git a/sparse-index.c b/sparse-index.c
index 3a6df66faeab..30c1a11fd62d 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -104,23 +104,37 @@ static int convert_to_sparse_rec(struct index_state *istate,

 static int enable_sparse_index(struct repository *repo)
 {
-	const char *config_path = repo_git_path(repo, "config.worktree");
+	int res;

 	if (upgrade_repository_format(1) < 0) {
 		warning(_("unable to upgrade repository format to enable sparse-index"));
 		return -1;
 	}
-	git_config_set_in_file_gently(config_path,
-				      "extensions.sparseIndex",
-				      "true");
+	res = git_config_set_gently("extensions.sparseindex", "true");

 	prepare_repo_settings(repo);
 	repo->settings.sparse_index = 1;
-	return 0;
+	return res;
+}
+
+int set_sparse_index_config(struct repository *repo, int enable)
+{
+	int res;
+
+	if (enable)
+		return enable_sparse_index(repo);
+
+	/* Don't downgrade repository format, just remove the extension. */
+	res = git_config_set_gently("extensions.sparseindex", NULL);
+
+	prepare_repo_settings(repo);
+	repo->settings.sparse_index = 0;
+	return res;
 }

 int convert_to_sparse(struct index_state *istate)
 {
+	int test_env;
 	if (istate->split_index || istate->sparse_index ||
 	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
@@ -129,14 +143,13 @@ int convert_to_sparse(struct index_state *istate)
 		istate->repo = the_repository;

 	/*
-	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
-	 * extensions.sparseIndex config variable to be on.
+	 * If GIT_TEST_SPARSE_INDEX=1, then trigger extensions.sparseIndex
+	 * to be fully enabled. If GIT_TEST_SPARSE_INDEX=0 (set explicitly),
+	 * then purposefully disable the setting.
 	 */
-	if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) {
-		int err = enable_sparse_index(istate->repo);
-		if (err < 0)
-			return err;
-	}
+	test_env = git_env_bool("GIT_TEST_SPARSE_INDEX", -1);
+	if (test_env >= 0)
+		set_sparse_index_config(istate->repo, test_env);
 
 	/*
 	 * Only convert to sparse if extensions.sparseIndex is set.
diff --git a/sparse-index.h b/sparse-index.h
index 64380e121d80..39dcc859735e 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -5,4 +5,7 @@ struct index_state;
 void ensure_full_index(struct index_state *istate);
 int convert_to_sparse(struct index_state *istate);
 
+struct repository;
+int set_sparse_index_config(struct repository *repo, int enable);
+
 #endif
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 47f983217852..f14dc48924d2 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -6,6 +6,7 @@ test_description='compare full workdir to sparse workdir'
 # So, disable the check until that integration is complete.
 GIT_TEST_CHECK_CACHE_TREE=0
 GIT_TEST_SPLIT_INDEX=0
+GIT_TEST_SPARSE_INDEX=
 
 . ./test-lib.sh
 
@@ -100,25 +101,26 @@ init_repos () {
 	# initialize sparse-checkout definitions
 	git -C sparse-checkout sparse-checkout init --cone &&
 	git -C sparse-checkout sparse-checkout set deep &&
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone &&
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep
+	git -C sparse-index sparse-checkout init --cone --sparse-index &&
+	test_cmp_config -C sparse-index true extensions.sparseindex &&
+	git -C sparse-index sparse-checkout set deep
 }
 
 run_on_sparse () {
 	(
 		cd sparse-checkout &&
-		GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
+		"$@" >../sparse-checkout-out 2>../sparse-checkout-err
 	) &&
 	(
 		cd sparse-index &&
-		GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err
+		"$@" >../sparse-index-out 2>../sparse-index-err
 	)
 }
 
 run_on_all () {
 	(
 		cd full-checkout &&
-		GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err
+		"$@" >../full-checkout-out 2>../full-checkout-err
 	) &&
 	run_on_sparse "$@"
 }
@@ -148,7 +150,7 @@ test_expect_success 'sparse-index contents' '
 			|| return 1
 	done &&
 
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 &&
+	git -C sparse-index sparse-checkout set folder1 &&
 
 	test-tool -C sparse-index read-cache --table >cache &&
 	for dir in deep folder2 x
@@ -158,7 +160,7 @@ test_expect_success 'sparse-index contents' '
 			|| return 1
 	done &&
 
-	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 &&
+	git -C sparse-index sparse-checkout set deep/deeper1 &&
 
 	test-tool -C sparse-index read-cache --table >cache &&
 	for dir in deep/deeper2 folder1 folder2 x
@@ -166,7 +168,14 @@ test_expect_success 'sparse-index contents' '
 		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
 		grep "040000 tree $TREE	$dir/" cache \
 			|| return 1
-	done
+	done &&
+
+	# Disabling the sparse-index replaces tree entries with full ones
+	git -C sparse-index sparse-checkout init --no-sparse-index &&
+
+	test-tool -C sparse-index read-cache --table >cache &&
+	! grep "040000 tree" cache &&
+	test_sparse_match test-tool read-cache --table
 '
 
 test_expect_success 'expanded in-memory index matches full index' '
@@ -396,19 +405,15 @@ test_expect_success 'submodule handling' '
 test_expect_success 'sparse-index is expanded and converted back' '
 	init_repos &&
 
-	(
-		GIT_TEST_SPARSE_INDEX=1 &&
-		export GIT_TEST_SPARSE_INDEX &&
-		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-			git -C sparse-index -c core.fsmonitor="" reset --hard &&
-		test_region index convert_to_sparse trace2.txt &&
-		test_region index ensure_full_index trace2.txt &&
-
-		rm trace2.txt &&
-		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-			git -C sparse-index -c core.fsmonitor="" status -uno &&
-		test_region index ensure_full_index trace2.txt
-	)
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c core.fsmonitor="" reset --hard &&
+	test_region index convert_to_sparse trace2.txt &&
+	test_region index ensure_full_index trace2.txt &&
+
+	rm trace2.txt &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+		git -C sparse-index -c core.fsmonitor="" status -uno &&
+	test_region index ensure_full_index trace2.txt
 '
 
 test_done

From patchwork Tue Mar 23 13:44:25 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157943
Message-Id: <6f1ad72c390dc56f9e4a4d724369a0e1c7ac3a94.1616507069.git.gitgitgadget@gmail.com>
Date: Tue, 23 Mar 2021 13:44:25 +0000
Subject: [PATCH v4 17/20] sparse-checkout: disable sparse-index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
    Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but only on the second.
This was caught by the call to 'test-tool read-cache --table'. Fixing
this requires adjusting some assignments to core_apply_sparse_checkout
and pl.use_cone_patterns in the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c          | 10 +++++++++-
 t/t1091-sparse-checkout-builtin.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index ca63e2c64e95..585343fa1972 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -280,6 +280,9 @@ static int set_config(enum sparse_checkout_mode mode)
 				      "core.sparseCheckoutCone",
 				      mode == MODE_CONE_PATTERNS ? "true" : NULL);
 
+	if (mode == MODE_NO_PATTERNS)
+		set_sparse_index_config(the_repository, 0);
+
 	return 0;
 }
 
@@ -341,10 +344,11 @@ static int sparse_checkout_init(int argc, const char **argv)
 		the_repository->index->updated_workdir = 1;
 	}
 
+	core_apply_sparse_checkout = 1;
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
-		core_apply_sparse_checkout = 1;
 		return update_working_directory(NULL);
 	}
 
@@ -366,6 +370,7 @@ static int sparse_checkout_init(int argc, const char **argv)
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
 	strbuf_addstr(&pattern, "!/*/");
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
+	pl.use_cone_patterns = init_opts.cone_mode;
 
 	return write_patterns_and_update(&pl);
 }
@@ -632,6 +637,9 @@ static int sparse_checkout_disable(int argc, const char **argv)
 	strbuf_addstr(&match_all, "/*");
 	add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.sparse_index = 0;
+
 	if (update_working_directory(&pl))
 		die(_("error while refreshing working directory"));
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index fc64e9ed99f4..ff1ad570a255 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -205,6 +205,19 @@ test_expect_success 'sparse-checkout disable' '
 	check_files repo a deep folder1 folder2
 '
 
+test_expect_success 'sparse-index enabled and disabled' '
+	git -C repo sparse-checkout init --cone --sparse-index &&
+	test_cmp_config -C repo true extensions.sparseIndex &&
+	test-tool -C repo read-cache --table >cache &&
+	grep " tree " cache &&
+
+	git -C repo sparse-checkout disable &&
+	test-tool -C repo read-cache --table >cache &&
+	! grep " tree " cache &&
+	git -C repo config --list >config &&
+	! grep extensions.sparseindex config
+'
+
 test_expect_success 'cone mode: init and set' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo config --list >config &&

From patchwork Tue Mar 23 13:44:26 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157933
Date: Tue, 23 Mar 2021 13:44:26 +0000
Subject: [PATCH v4 18/20] cache-tree: integrate with sparse directory entries
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
    Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Signed-off-by: Derrick Stolee
---
 cache-tree.c   | 18 ++++++++++++++++++
 sparse-index.c | 10 +++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 5f07a39e501e..950a9615db8f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -256,6 +256,24 @@ static int update_one(struct cache_tree *it,
 
 	*skip_count = 0;
 
+	/*
+	 * If the first entry of this region is a sparse directory
+	 * entry corresponding exactly to 'base', then this cache_tree
+	 * struct is a "leaf" in the data structure, pointing to the
+	 * tree OID specified in the entry.
+	 */
+	if (entries > 0) {
+		const struct cache_entry *ce = cache[0];
+
+		if (S_ISSPARSEDIR(ce->ce_mode) &&
+		    ce->ce_namelen == baselen &&
+		    !strncmp(ce->name, base, baselen)) {
+			it->entry_count = 1;
+			oidcpy(&it->oid, &ce->oid);
+			return 1;
+		}
+	}
+
 	if (0 <= it->entry_count && has_object_file(&it->oid))
 		return it->entry_count;
 
diff --git a/sparse-index.c b/sparse-index.c
index 30c1a11fd62d..56313e805d9d 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -180,7 +180,11 @@ int convert_to_sparse(struct index_state *istate)
 	istate->cache_nr = convert_to_sparse_rec(istate,
 						 0, 0, istate->cache_nr,
 						 "", 0, istate->cache_tree);
-	istate->drop_cache_tree = 1;
+
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	istate->sparse_index = 1;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
@@ -281,5 +285,9 @@ void ensure_full_index(struct index_state *istate)
 	strbuf_release(&base);
 	free(full);
 
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }

From patchwork Tue Mar 23 13:44:27 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157945
Date: Tue, 23 Mar 2021 13:44:27 +0000
Subject: [PATCH v4 19/20] sparse-index: loose integration with cache_tree_verify()
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
    Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of sparse directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further,
p2000-sparse-operations.sh uses the test suite and hence this is enabled
for all tests. We need to integrate with it before we run our
performance tests with a sparse-index.

Signed-off-by: Derrick Stolee
---
 cache-tree.c                             | 19 +++++++++++++++++++
 t/t1092-sparse-checkout-compatibility.sh |  3 ---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 950a9615db8f..11bf1fcae6e1 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -808,6 +808,19 @@ int cache_tree_matches_traversal(struct cache_tree *root,
 	return 0;
 }
 
+static void verify_one_sparse(struct repository *r,
+			      struct index_state *istate,
+			      struct cache_tree *it,
+			      struct strbuf *path,
+			      int pos)
+{
+	struct cache_entry *ce = istate->cache[pos];
+
+	if (!S_ISSPARSEDIR(ce->ce_mode))
+		BUG("directory '%s' is present in index, but not sparse",
+		    path->buf);
+}
+
 static void verify_one(struct repository *r,
 		       struct index_state *istate,
 		       struct cache_tree *it,
@@ -830,6 +843,12 @@ static void verify_one(struct repository *r,
 
 	if (path->len) {
 		pos = index_name_pos(istate, path->buf, path->len);
+
+		if (pos >= 0) {
+			verify_one_sparse(r, istate, it, path, pos);
+			return;
+		}
+
 		pos = -pos - 1;
 	} else {
 		pos = 0;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index f14dc48924d2..d97bf9b64527 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2,9 +2,6 @@
 
 test_description='compare full workdir to sparse workdir'
 
-# The verify_cache_tree() check is not sparse-aware (yet).
-# So, disable the check until that integration is complete.
-GIT_TEST_CHECK_CACHE_TREE=0
 GIT_TEST_SPLIT_INDEX=0
 GIT_TEST_SPARSE_INDEX=
 

From patchwork Tue Mar 23 13:44:28 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12157947
Date: Tue, 23 Mar 2021 13:44:28 +0000
Subject: [PATCH v4 20/20] p2000: add sparse-index repos
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
    Ævar Arnfjörð Bjarmason, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point in time, the sparse-index is pure overhead in terms of CPU
time, and this is measurable in these tests:

Test                                            elapsed(user+sys)
-----------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index
formats when the sparse-index is enabled. This is primarily due to the
fact that the relative file sizes are the same, and the command time is
mostly taken up by parsing tree objects to expand the sparse index into
a full one.

With the current file layout, the index file sizes are given by this
table:

       | full index  | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.
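
As a rough check of the claim that the relative file sizes are the same
for both index versions, the ratios can be recomputed from the numbers
quoted in this message. This is only a sketch: `ratio` is a hypothetical
helper written for this note, not part of p2000-sparse-operations.sh,
and the inputs are copied by hand from the tables.

```shell
# ratio A B: print A divided by B, rounded to one decimal place.
# Hypothetical helper for illustration; not part of the Git test suite.
ratio () {
	LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Index file sizes in MiB (full index / sparse index).
v3_size_ratio=$(ratio 108 1.6)
v4_size_ratio=$(ratio 80 1.2)

# Elapsed seconds for 'git status' on v3 (sparse index / full index).
status_slowdown=$(ratio 1.40 0.59)

echo "v3 size ratio: $v3_size_ratio"
echo "v4 size ratio: $v4_size_ratio"
echo "git status slowdown (v3): $status_slowdown"
```

With these inputs the size ratios come out as 67.5 for v3 and 66.7 for
v4, and the 'git status' slowdown as 2.4, so both formats shrink by a
similar factor when the index is sparse.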

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index dddd527b6330..94513c977489 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -59,12 +59,29 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
 		git update-index --index-version=4
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
+	(
+		cd sparse-index-v3 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 3 &&
+		git update-index --index-version=3
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
+	(
+		cd sparse-index-v4 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 4 &&
+		git update-index --index-version=4
 	)
 '
 
 test_perf_on_all () {
 	command="$@"
-	for repo in full-index-v3 full-index-v4
+	for repo in full-index-v3 full-index-v4 \
+		    sparse-index-v3 sparse-index-v4
 	do
 		test_perf "$command ($repo)" "
 			(