From patchwork Tue Feb 23 20:14:10 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100833
Date: Tue, 23 Feb 2021 20:14:10 +0000
Subject: [PATCH 01/20] sparse-index: design doc and format update
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

This begins a long effort to update the index format to allow sparse directory entries.
This should result in a significant improvement to Git commands when HEAD
contains millions of files, but the user has selected far fewer files to
keep in their sparse-checkout definition.

Currently, the index format is only updated in the presence of
extensions.sparseIndex instead of increasing a file format version number.
This is temporary, and index v5 is part of the plan for future work in
this area.

The design document details many of the reasons for embarking on this
work, and also the plan for completing it safely.

Signed-off-by: Derrick Stolee
---
 Documentation/technical/index-format.txt |   7 +
 Documentation/technical/sparse-index.txt | 167 +++++++++++++++++++++++
 2 files changed, 174 insertions(+)
 create mode 100644 Documentation/technical/sparse-index.txt

diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index b633482b1bdf..387126582556 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -44,6 +44,13 @@ Git index format
   localization, no special casing of directory separator '/'). Entries
   with the same name are sorted by their stage field.
 
+  An index entry typically represents a file. However, if sparse-checkout
+  is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
+  `extensions.sparseIndex` extension is enabled, then the index may
+  contain entries for directories outside of the sparse-checkout definition.
+  These entries have mode `0040000`, include the `SKIP_WORKTREE` bit, and
+  the path ends in a directory separator.
+
   32-bit ctime seconds, the last time a file's metadata changed
     this is stat(2) data
 
diff --git a/Documentation/technical/sparse-index.txt b/Documentation/technical/sparse-index.txt
new file mode 100644
index 000000000000..9070836f0655
--- /dev/null
+++ b/Documentation/technical/sparse-index.txt
@@ -0,0 +1,167 @@
+Git Sparse-Index Design Document
+================================
+
+The sparse-checkout feature allows users to focus a working directory on
+a subset of the files at HEAD. The cone mode patterns, enabled by
+`core.sparseCheckoutCone`, allow for very fast pattern matching to
+discover which files at HEAD belong in the sparse-checkout cone.
+
+Three important scale dimensions for a Git worktree are:
+
+* `HEAD`: How many files are present at `HEAD`?
+
+* Populated: How many files are within the sparse-checkout cone?
+
+* Modified: How many files has the user modified in the working directory?
+
+We will use big-O notation -- O(X) -- to denote how expensive certain
+operations are in terms of these dimensions.
+
+These dimensions are ordered by their magnitude: users (typically) modify
+fewer files than are populated, and we can only populate files at `HEAD`.
+These dimensions are also ordered by how expensive they are per item: it
+is more expensive to detect a modified file than it is to write one that we
+know must be populated; changing `HEAD` only really requires updating the
+index.
+
+Problems occur if there is an extreme imbalance in these dimensions. For
+example, if `HEAD` contains millions of paths but the populated set has
+only tens of thousands, then commands like `git status` and `git add` can
+be dominated by work that requires O(`HEAD`) operations instead of
+O(Populated). Primarily, the cost is in parsing and rewriting the index,
+which is filled mostly with files at `HEAD` that are marked with the
+`SKIP_WORKTREE` bit.
+
+The sparse-index intends to take these commands that read and modify the
+index from O(`HEAD`) to O(Populated). To do this, we need to modify the
+index format in a significant way: add "sparse directory" entries.
+
+With cone mode patterns, it is possible to detect when an entire
+directory will have its contents outside of the sparse-checkout definition.
+Instead of listing all of the files it contains as individual entries, a
+sparse-index contains an entry with the directory name, referencing the
+object ID of the tree at `HEAD` and marked with the `SKIP_WORKTREE` bit.
+If we need to discover the details for paths within that directory, we
+can parse trees to find that list.
+
+This addition of sparse-directory entries violates expectations about the
+index format and its in-memory data structure. There are many consumers in
+the codebase that expect to iterate through all of the index entries and
+see only files. In addition, they expect to see all files at `HEAD`. One
+way to handle this is to parse trees to replace a sparse-directory entry
+with all of the files within that tree as the index is loaded. However,
+parsing trees is slower than parsing the index format, so that is a slower
+operation than if we left the index alone.
+
+The implementation plan below follows four phases to slowly integrate with
+the sparse-index. The intention is to incrementally update Git commands to
+interact safely with the sparse-index without significant slowdowns. This
+may not always be possible, but the hope is that the primary commands that
+users need in their daily work are dramatically improved.
+
+Phase I: Format and initial speedups
+------------------------------------
+
+During this phase, Git learns to enable the sparse-index and safely parse
+one. Protections are put in place so that every consumer of the in-memory
+data structure can operate with its current assumption of every file at
+`HEAD`.
+
+At first, every index parse will expand the sparse-directory entries into
+the full list of paths at `HEAD`. This will be slower in all cases. The
+only noticeable change in behavior will be that the serialized index file
+contains sparse-directory entries.
+
+To start, we use a new repository extension, `extensions.sparseIndex`, to
+allow inserting sparse-directory entries into indexes with file format
+versions 2, 3, and 4. This prevents Git versions that do not understand
+the sparse-index from operating on one, but it also prevents other
+operations that do not use the index at all. A new format, index v5, will
+be introduced that includes sparse-directory entries by default. It might
+also introduce other features that have been considered for improving the
+index, as well.
+
+Next, consumers of the index will be guarded against operating on a
+sparse-index by inserting calls to `ensure_full_index()` or
+`expand_index_to_path()`. After these guards are in place, we can begin
+leaving sparse-directory entries in the in-memory index structure.
+
+Even after inserting these guards, we will keep expanding sparse-indexes
+for most Git commands using the `command_requires_full_index` repository
+setting. This setting will be on by default and disabled one builtin at a
+time until we have sufficient confidence that all of the index operations
+are properly guarded.
+
+To complete this phase, the commands `git status` and `git add` will be
+integrated with the sparse-index so that they operate with O(Populated)
+performance. They will be carefully tested for operations within and
+outside the sparse-checkout definition.
+
+Phase II: Careful integrations
+------------------------------
+
+This phase focuses on ensuring that all index extensions and APIs work
+well with a sparse-index. This requires significant increases to our test
+coverage, especially for operations that interact with the working
+directory outside of the sparse-checkout definition. Some of these
+behaviors may not be the desirable ones, such as some tests already
+marked for failure in `t1092-sparse-checkout-compatibility.sh`.
+
+The index extensions that may require special integrations are:
+
+* FS Monitor
+* Untracked cache
+
+While integrating with these features, we should look for patterns that
+might lead to better APIs for interacting with the index. Coalescing
+common usage patterns into an API call can reduce the number of places
+where sparse-directories need to be handled carefully.
+
+Phase III: Important command speedups
+-------------------------------------
+
+At this point, the patterns for testing and implementing sparse-directory
+logic should be relatively stable. This phase focuses on updating some of
+the most common builtins that use the index to operate as O(Populated).
+Here is a potential list of commands that could be valuable to integrate
+at this point:
+
+* `git commit`
+* `git checkout`
+* `git merge`
+* `git rebase`
+
+Along with `git status` and `git add`, these commands cover the majority
+of users' interactions with the working directory. In addition, we can
+integrate with these commands:
+
+* `git grep`
+* `git rm`
+
+These have been proposed as some whose behavior could change when in a
+repo with a sparse-checkout definition. It would be good to include this
+behavior automatically when using a sparse-index. Some clarity is needed
+to make the behavior switch clear to the user.
+
+This phase is the first where parallel work might be possible without too
+many conflicts between topics.
+
+Phase IV: The long tail
+-----------------------
+
+This last phase is less a "phase" and more "the new normal" after all of
+the previous work.
+
+To start, the `command_requires_full_index` option could be removed in
+favor of expanding only when hitting an API guard.
+
+There are many Git commands that could use special attention to operate as
+O(Populated), while some might be so rare that it is acceptable to leave
+them with additional overhead when a sparse-index is present.
+
+Here are some commands that might be useful to update:
+
+* `git sparse-checkout set`
+* `git am`
+* `git clean`
+* `git stash`

From patchwork Tue Feb 23 20:14:11 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100837
Date: Tue, 23 Feb 2021 20:14:11 +0000
Subject: [PATCH 02/20] t/perf: add performance test for sparse operations
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

Create a test script that takes the default performance test (the Git
codebase) and multiplies it by 256 using four layers of duplicated trees
of width four. This results in nearly one million blob entries in the
index. Then, we can clone this repository with sparse-checkout patterns
that demonstrate four copies of the initial repository. Each clone will
use a different index format or mode so performance can be tested across
the different options.

Note that the initial repo is stripped of submodules before doing the
copies. This preserves the expected data shape of the sparse index,
because directories containing submodules are not collapsed to a sparse
directory entry.

Run a few Git commands on these clones, especially those that use the
index (status, add, commit).

Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)           0.37(0.30+0.09)
2000.3: git status (full-index-v4)           0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)           1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)           1.26(0.98+0.16)
2000.6: git add . (full-index-v3)            1.40(1.04+0.18)
2000.7: git add . (full-index-v4)            1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)   1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)   1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement for the add and
commit tests when using index version 4. This is because the v3 index
uses 108 MiB while the v4 index uses 80 MiB. Since the repeated portions
of the directories are very short (f3/f1/f2, for example) this ratio is
less pronounced than in similarly-sized real repositories.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 87 +++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100755 t/perf/p2000-sparse-operations.sh

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
new file mode 100755
index 000000000000..52597683376e
--- /dev/null
+++ b/t/perf/p2000-sparse-operations.sh
@@ -0,0 +1,87 @@
+#!/bin/sh
+
+test_description="test performance of Git operations using the index"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+SPARSE_CONE=f2/f4/f1
+
+test_expect_success 'setup repo and indexes' '
+	git reset --hard HEAD &&
+	# Remove submodules from the example repo, because our
+	# duplication of the entire repo creates an unlikely data shape.
+	git config --file .gitmodules --get-regexp "submodule.*.path" >modules &&
+	rm -f .gitmodules &&
+	git add .gitmodules &&
+	for module in $(awk "{print \$2}" modules)
+	do
+		git rm $module || return 1
+	done &&
+	git add . &&
+	git commit -m "remove submodules" &&
+
+	echo bogus >a &&
+	cp a b &&
+	git add a b &&
+	git commit -m "level 0" &&
+	BLOB=$(git rev-parse HEAD:a) &&
+	OLD_COMMIT=$(git rev-parse HEAD) &&
+	OLD_TREE=$(git rev-parse HEAD^{tree}) &&
+
+	for i in $(test_seq 1 4)
+	do
+		cat >in <<-EOF &&
+			100755 blob $BLOB	a
+			040000 tree $OLD_TREE	f1
+			040000 tree $OLD_TREE	f2
+			040000 tree $OLD_TREE	f3
+			040000 tree $OLD_TREE	f4
+		EOF
+		NEW_TREE=$(git mktree >$SPARSE_CONE/a &&
+				$command
+			)
+		"
+	done
+}
+
+test_perf_on_all git status
+test_perf_on_all git add -A
+test_perf_on_all git add .
+test_perf_on_all git commit -a -m A
+
+test_done

From patchwork Tue Feb 23 20:14:12 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100835
Message-Id: <6e783c88821e3b86f6ce976e5673dc1df8992c8f.1614111270.git.gitgitgadget@gmail.com>
Date: Tue, 23 Feb 2021 20:14:12 +0000
Subject: [PATCH 03/20] t1092: clean up script quoting
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

This test was introduced in 19a0acc83e4 (t1092: test interesting
sparse-checkout scenarios, 2021-01-23), but these issues with quoting
were not noticed until starting this follow-up series. The old mechanism
would drop quoting such as in

   test_all_match git commit -m "touch README.md"

The above happened to work because README.md is a file in the
repository, so 'git commit -m touch README.md' would succeed by
accident.

Other cases included quoting for no good reason, so clean that up now.

Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 8cd3e5a8d227..3725d3997e70 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -96,20 +96,20 @@ init_repos () {
 run_on_sparse () {
 	(
 		cd sparse-checkout &&
-		$* >../sparse-checkout-out 2>../sparse-checkout-err
+		"$@" >../sparse-checkout-out 2>../sparse-checkout-err
 	)
 }
 
 run_on_all () {
 	(
 		cd full-checkout &&
-		$* >../full-checkout-out 2>../full-checkout-err
+		"$@" >../full-checkout-out 2>../full-checkout-err
 	) &&
-	run_on_sparse $*
+	run_on_sparse "$@"
 }
 
 test_all_match () {
-	run_on_all $* &&
+	run_on_all "$@" &&
 	test_cmp full-checkout-out sparse-checkout-out &&
 	test_cmp full-checkout-err sparse-checkout-err
 }
@@ -119,7 +119,7 @@ test_expect_success 'status with options' '
 	test_all_match git status --porcelain=v2 &&
 	test_all_match git status --porcelain=v2 -z -u &&
 	test_all_match git status --porcelain=v2 -uno &&
-	run_on_all "touch README.md" &&
+	run_on_all touch README.md &&
 	test_all_match git status --porcelain=v2 &&
 	test_all_match git status --porcelain=v2 -z -u &&
 	test_all_match git status --porcelain=v2 -uno &&
@@ -135,7 +135,7 @@ test_expect_success 'add, commit, checkout' '
 	write_script edit-contents <<-\EOF &&
 	echo text >>$1
 	EOF
-	run_on_all "../edit-contents README.md" &&
+	run_on_all ../edit-contents README.md &&
 
 	test_all_match git add README.md &&
 	test_all_match git status --porcelain=v2 &&
@@ -144,7 +144,7 @@ test_expect_success 'add, commit, checkout' '
 	test_all_match git checkout HEAD~1 &&
 	test_all_match git checkout - &&
 
-	run_on_all "../edit-contents README.md" &&
+	run_on_all ../edit-contents README.md &&
 
 	test_all_match git add -A &&
 	test_all_match git status --porcelain=v2 &&
@@ -153,7 +153,7 @@ test_expect_success 'add, commit, checkout' '
 	test_all_match git checkout HEAD~1 &&
 	test_all_match git checkout - &&
 
-	run_on_all "../edit-contents deep/newfile" &&
+	run_on_all ../edit-contents deep/newfile &&
 
 	test_all_match git status --porcelain=v2 -uno &&
 	test_all_match git status --porcelain=v2 &&
@@ -186,7 +186,7 @@ test_expect_success 'diff --staged' '
 	write_script edit-contents <<-\EOF &&
 	echo text >>README.md
 	EOF
-	run_on_all "../edit-contents" &&
+	run_on_all ../edit-contents &&
 
 	test_all_match git diff &&
 	test_all_match git diff --staged &&
@@ -280,7 +280,7 @@ test_expect_success 'clean' '
 	echo bogus >>.gitignore &&
 	run_on_all cp ../.gitignore . &&
 	test_all_match git add .gitignore &&
-	test_all_match git commit -m ignore-bogus-files &&
+	test_all_match git commit -m "ignore bogus files" &&
 
 	run_on_sparse mkdir folder1 &&
 	run_on_all touch folder1/bogus &&

From patchwork Tue Feb 23 20:14:13 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100839
Message-Id: <01da4c48a1fa94188faf45ab1e23b98ecb4164d5.1614111270.git.gitgitgadget@gmail.com>
Date: Tue, 23 Feb 2021 20:14:13 +0000
Subject: [PATCH 04/20] sparse-index: add guard to ensure full index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

From: Derrick Stolee

Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.

This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.

Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.

If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.
Signed-off-by: Derrick Stolee --- Makefile | 1 + repo-settings.c | 8 ++++++++ repository.c | 11 ++++++++++- repository.h | 2 ++ sparse-index.c | 8 ++++++++ sparse-index.h | 7 +++++++ 6 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 sparse-index.c create mode 100644 sparse-index.h diff --git a/Makefile b/Makefile index 5a239cac20e3..3bf61699238d 100644 --- a/Makefile +++ b/Makefile @@ -980,6 +980,7 @@ LIB_OBJS += setup.o LIB_OBJS += shallow.o LIB_OBJS += sideband.o LIB_OBJS += sigchain.o +LIB_OBJS += sparse-index.o LIB_OBJS += split-index.o LIB_OBJS += stable-qsort.o LIB_OBJS += strbuf.o diff --git a/repo-settings.c b/repo-settings.c index f7fff0f5ab83..d63569e4041e 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -77,4 +77,12 @@ void prepare_repo_settings(struct repository *r) UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP); UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT); + + /* + * This setting guards all index reads to require a full index + * over a sparse index. After suitable guards are placed in the + * codebase around uses of the index, this setting will be + * removed. 
+ */ + r->settings.command_requires_full_index = 1; } diff --git a/repository.c b/repository.c index c98298acd017..a8acae002f71 100644 --- a/repository.c +++ b/repository.c @@ -10,6 +10,7 @@ #include "object.h" #include "lockfile.h" #include "submodule-config.h" +#include "sparse-index.h" /* The main repository */ static struct repository the_repo; @@ -261,6 +262,8 @@ void repo_clear(struct repository *repo) int repo_read_index(struct repository *repo) { + int res; + if (!repo->index) repo->index = xcalloc(1, sizeof(*repo->index)); @@ -270,7 +273,13 @@ int repo_read_index(struct repository *repo) else if (repo->index->repo != repo) BUG("repo's index should point back at itself"); - return read_index_from(repo->index, repo->index_file, repo->gitdir); + res = read_index_from(repo->index, repo->index_file, repo->gitdir); + + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) + ensure_full_index(repo->index); + + return res; } int repo_hold_locked_index(struct repository *repo, diff --git a/repository.h b/repository.h index b385ca3c94b6..e06a23015697 100644 --- a/repository.h +++ b/repository.h @@ -41,6 +41,8 @@ struct repo_settings { enum fetch_negotiation_setting fetch_negotiation_algorithm; int core_multi_pack_index; + + unsigned command_requires_full_index:1; }; struct repository { diff --git a/sparse-index.c b/sparse-index.c new file mode 100644 index 000000000000..82183ead563b --- /dev/null +++ b/sparse-index.c @@ -0,0 +1,8 @@ +#include "cache.h" +#include "repository.h" +#include "sparse-index.h" + +void ensure_full_index(struct index_state *istate) +{ + /* intentionally left blank */ +} diff --git a/sparse-index.h b/sparse-index.h new file mode 100644 index 000000000000..09a20d036c46 --- /dev/null +++ b/sparse-index.h @@ -0,0 +1,7 @@ +#ifndef SPARSE_INDEX_H__ +#define SPARSE_INDEX_H__ + +struct index_state; +void ensure_full_index(struct index_state *istate); + +#endif From patchwork Tue Feb 23 20:14:14 2021 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100841 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D389DC433DB for ; Tue, 23 Feb 2021 20:15:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A7D0C64DF2 for ; Tue, 23 Feb 2021 20:15:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234209AbhBWUPW (ORCPT ); Tue, 23 Feb 2021 15:15:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234179AbhBWUPQ (ORCPT ); Tue, 23 Feb 2021 15:15:16 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E938C061793 for ; Tue, 23 Feb 2021 12:14:35 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id d11so4495998wrj.7 for ; Tue, 23 Feb 2021 12:14:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Ci4y4YWSW1vJ2rkZPyHzHL7blfaXgDlSy/YXzS6MrKY=; b=KNvQd/o1NF0KROkeBUHcJkgvJm2SuvXLsowZhy+miZ29H6CVISIk4kqsp/Jwo+wixg /Y+tWU2VCcmgqbzF88RmaeuHzPyjYVUjwo3e9+CxSkxsaDWhcGyFUBmb9fOQahNeaJKh kdxj2hsnensUrR4uctNiVt7A7MWkGXkZeS3wB7Br+e/Yy62NA5wypgTfDlVCLi8kChnT 
fAcqwXEDjPf+yLgiHMP2Dz6XpE6EEWGWislXbeArfq1c5rUQ6DaJATtwgiSUSBqiIPzL cHDMH6DJ8g1675CkkMNDG9FlzLpFhJgt+KE6srP9p3wVUzPCZa83zeE/6TgQgd9N+V8q FGww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Ci4y4YWSW1vJ2rkZPyHzHL7blfaXgDlSy/YXzS6MrKY=; b=groyCFzZBmlJObVBE38+crNo0RiZNsyhJybp9AQhGdxVpvideWBN656DbepC32MM73 PrniyqNwkjs6UDagWqu5YnCAtYkVruFBao5kPcnDRd+YZg1Xeu3ZqftgvzKYCNlre2O6 OHVht4jOafXWbiRpyfracutgA8JtnS+Y28FFOBmJO+yu/Cx8qT1xE/ZFumCasZ8KQzaD U0VIQJTYGqG9f64scmD7DNHWc4uKc3xUfLOfNUXlRiYC94wJ0p2YqKv3WKg5kO1rc3z8 vTi0sjOsRIA4qKh1YqyTWz2V27JBoyvsWeSQYW69BNm7hV1AGw33mNFExwoxSd94Gh9L 8+RQ== X-Gm-Message-State: AOAM530Hdw5qgkabFUV878SLMBVs4KR+9SIEh2YEt34IQ/2COOg0DRYK 58fcHwvfpVguecqBXtjjYtVLZMxrbgk= X-Google-Smtp-Source: ABdhPJyyI7edW1WNxHpo+kPQVpDqIMfzoOXC/BsPDBdWja2dPQjGDntdsi8FcnkkLwApUB9Nhi1vzw== X-Received: by 2002:a5d:6b45:: with SMTP id x5mr27205212wrw.415.1614111274198; Tue, 23 Feb 2021 12:14:34 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n186sm3720037wmn.22.2021.02.23.12.14.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:33 -0800 (PST) Message-Id: <2b83989fbcd3d464a3172eeb7cfea2e06e4f3785.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:14 +0000 Subject: [PATCH 05/20] sparse-index: implement ensure_full_index() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee We will mark an in-memory index_state as having sparse directory entries with the sparse_index bit. 
These currently cannot exist, but we will add a mechanism for collapsing a full index to a sparse one in a later change. That will happen at write time, so we must first allow parsing the format before writing it. Commands or methods that require a full index in order to operate can call ensure_full_index() to expand that index in-memory. This requires parsing trees using that index's repository. Sparse directory entries have a specific 'ce_mode' value. The macro S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this type. This ce_mode is not possible with the existing index formats, so the macro checks only the mode and does not verify the other properties of a sparse-directory entry. The full set of properties is: 1. ce->ce_mode == 0040000 2. ce->ce_flags & CE_SKIP_WORKTREE is true 3. ce->name[ce->ce_namelen - 1] == '/' (ends in dir separator) 4. ce->oid references a tree object. These are only partially enforced in ensure_full_index(). Any deviation will cause a warning at minimum or a failure in the worst case. Signed-off-by: Derrick Stolee --- cache.h | 7 +++- read-cache.c | 9 +++++ sparse-index.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 109 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index d92814961405..1336c8d7435e 100644 --- a/cache.h +++ b/cache.h @@ -204,6 +204,8 @@ struct cache_entry { #error "CE_EXTENDED_FLAGS out of range" #endif +#define S_ISSPARSEDIR(m) ((m) == S_IFDIR) + /* Forward structure decls */ struct pathspec; struct child_process; @@ -319,7 +321,8 @@ struct index_state { drop_cache_tree : 1, updated_workdir : 1, updated_skipworktree : 1, - fsmonitor_has_run_once : 1; + fsmonitor_has_run_once : 1, + sparse_index : 1; struct hashmap name_hash; struct hashmap dir_hash; struct object_id oid; @@ -722,6 +725,8 @@ int read_index_from(struct index_state *, const char *path, const char *gitdir); int is_index_unborn(struct index_state *); +void ensure_full_index(struct index_state *istate); + /* For use with `write_locked_index()`. 
*/ #define COMMIT_LOCK (1 << 0) #define SKIP_IF_UNCHANGED (1 << 1) diff --git a/read-cache.c b/read-cache.c index 29144cf879e7..97dbf2434f30 100644 --- a/read-cache.c +++ b/read-cache.c @@ -101,6 +101,9 @@ static const char *alternate_index_output; static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { + if (S_ISSPARSEDIR(ce->ce_mode)) + istate->sparse_index = 1; + istate->cache[nr] = ce; add_name_hash(istate, ce); } @@ -2255,6 +2258,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) trace2_data_intmax("index", the_repository, "read/cache_nr", istate->cache_nr); + if (!istate->repo) + istate->repo = the_repository; + prepare_repo_settings(istate->repo); + if (istate->repo->settings.command_requires_full_index) + ensure_full_index(istate); + return istate->cache_nr; unmap: diff --git a/sparse-index.c b/sparse-index.c index 82183ead563b..316cb949b74b 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -1,8 +1,101 @@ #include "cache.h" #include "repository.h" #include "sparse-index.h" +#include "tree.h" +#include "pathspec.h" +#include "trace2.h" + +static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) +{ + ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc); + + istate->cache[nr] = ce; + add_name_hash(istate, ce); +} + +static int add_path_to_index(const struct object_id *oid, + struct strbuf *base, const char *path, + unsigned int mode, int stage, void *context) +{ + struct index_state *istate = (struct index_state *)context; + struct cache_entry *ce; + size_t len = base->len; + + if (S_ISDIR(mode)) + return READ_TREE_RECURSIVE; + + strbuf_addstr(base, path); + + ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0); + ce->ce_flags |= CE_SKIP_WORKTREE; + set_index_entry(istate, istate->cache_nr++, ce); + + strbuf_setlen(base, len); + return 0; +} void ensure_full_index(struct index_state *istate) { - /* intentionally left blank */ + int i; + struct 
index_state *full; + + if (!istate || !istate->sparse_index) + return; + + if (!istate->repo) + istate->repo = the_repository; + + trace2_region_enter("index", "ensure_full_index", istate->repo); + + /* initialize basics of new index */ + full = xcalloc(1, sizeof(struct index_state)); + memcpy(full, istate, sizeof(struct index_state)); + + /* then change the necessary things */ + full->sparse_index = 0; + full->cache_alloc = (3 * istate->cache_alloc) / 2; + full->cache_nr = 0; + ALLOC_ARRAY(full->cache, full->cache_alloc); + + for (i = 0; i < istate->cache_nr; i++) { + struct cache_entry *ce = istate->cache[i]; + struct tree *tree; + struct pathspec ps; + + if (!S_ISSPARSEDIR(ce->ce_mode)) { + set_index_entry(full, full->cache_nr++, ce); + continue; + } + if (!(ce->ce_flags & CE_SKIP_WORKTREE)) + warning(_("index entry is a directory, but not sparse (%08x)"), + ce->ce_flags); + + /* recursively walk into ce->name */ + tree = lookup_tree(istate->repo, &ce->oid); + + memset(&ps, 0, sizeof(ps)); + ps.recursive = 1; + ps.has_wildcard = 1; + ps.max_depth = -1; + + read_tree_recursive(istate->repo, tree, + ce->name, strlen(ce->name), + 0, &ps, + add_path_to_index, full); + + /* free directory entries. full entries are re-used */ + discard_cache_entry(ce); + } + + /* Copy back into original index. 
*/ + memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash)); + istate->sparse_index = 0; + free(istate->cache); + istate->cache = full->cache; + istate->cache_nr = full->cache_nr; + istate->cache_alloc = full->cache_alloc; + + free(full); + + trace2_region_leave("index", "ensure_full_index", istate->repo); } From patchwork Tue Feb 23 20:14:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D372C433DB for ; Tue, 23 Feb 2021 20:16:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D647464E6B for ; Tue, 23 Feb 2021 20:16:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233183AbhBWUQG (ORCPT ); Tue, 23 Feb 2021 15:16:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232698AbhBWUPy (ORCPT ); Tue, 23 Feb 2021 15:15:54 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA7E7C061794 for ; Tue, 23 Feb 2021 12:14:35 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id y17so2301579wrs.12 for ; Tue, 23 Feb 2021 12:14:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=5iigrqOwj76r5bXCKC1LYHTsGQ9spzO+EGwZyXD+2fQ=; b=c4+/uhnHgUUmVvw5ksE6RRUpcfb7Tm4wq91x4nsRm17cGjrStWvLG4hE5UjjD97zPf 7UQR6D+54jxsNe2qFih3j/7CM5HlkxpNqXIyGfkPkQaZisZaIKkG5kDFrXzmCa1qANtk 8GFIpiyCpBHTkv2SLY8Rubu8XmOJ3j5AAjSRpPKEJFpXrhHucP783Xf9HSEaPyFlK9pF iqE8XvQv8vdTRu1KDxGlF7qwct+OCP1GoGQxYjwQk/r77PR63RkW5Qf5aJ0TJmttOzhp QWb9QOmdSiepoOBt2/pjSagrLprX4S07leIBgsBKy5uueGyrMGwIQULhaFFQ7UQnnU8w VDbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=5iigrqOwj76r5bXCKC1LYHTsGQ9spzO+EGwZyXD+2fQ=; b=rUm5LSdf/aj0YK0ty4VALEchKP93PymQW+4fLhXofk871bQZRo2psOzhtnI9nWS1Po oPRRpiSFsqmUVTmM1CZhQ7tNZRj846MCs6TdWsvrBeBqVWvyb6qUycsY0boYMcnS2w0Y 4lXEneu5Ma6rrMt1Q354+Lz/10Gle13w+Q0+/LRo3D2iJe87WPAwF4nAnzuZOIEZvooN 3Q/ta31i4ZggQv6Kxu7YcHg50xlLQyDvTb+Mv/9o44hUIzB9QI/ZyX5cgICArJh9xiXa Hef2dycUYf9jOrA2M4ILKMRlN85oU0wKVPEi7nghP5mUAGcYpZTW44rr0dApbKN3/qhk kIkA== X-Gm-Message-State: AOAM532lEVve1zO7FvctpMUu8pHCfA9BIJqr6BF0kt6w/zFCA7dO33P+ UONwWgvRfy2pyVjm55W0D5I6M+kBJdM= X-Google-Smtp-Source: ABdhPJyaVmS5BXybujv4SUDtsmBWTYJ200INwQWu1qtn2z2k+oXRPc+QP1hqE2bKTdvMQPx2cbl/zA== X-Received: by 2002:a5d:6403:: with SMTP id z3mr25531148wru.391.1614111274743; Tue, 23 Feb 2021 12:14:34 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b15sm96797wmd.41.2021.02.23.12.14.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:34 -0800 (PST) Message-Id: In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:15 +0000 Subject: [PATCH 06/20] t1092: compare sparse-checkout to sparse-index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee 
, Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Add a new 'sparse-index' repo alongside the 'full-checkout' and 'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also add run_on_sparse and test_sparse_match helpers. These helpers will be used when the sparse index is implemented. Add the GIT_TEST_SPARSE_INDEX environment variable to enable the sparse-index by default. It is intended to be used across the entire test suite, although it only affects cases where the sparse-checkout feature is enabled. Signed-off-by: Derrick Stolee --- t/README | 3 +++ t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++++++++++---- 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/t/README b/t/README index 593d4a4e270c..b98bc563aab5 100644 --- a/t/README +++ b/t/README @@ -439,6 +439,9 @@ and "sha256". GIT_TEST_WRITE_REV_INDEX=, when true enables the 'pack.writeReverseIndex' setting. +GIT_TEST_SPARSE_INDEX=, when true enables index writes to use the +sparse-index format by default. 
+ Naming Tests ------------ diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 3725d3997e70..71d6f9e4c014 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -7,6 +7,7 @@ test_description='compare full workdir to sparse workdir' test_expect_success 'setup' ' git init initial-repo && ( + GIT_TEST_SPARSE_INDEX=0 && cd initial-repo && echo a >a && echo "after deep" >e && @@ -87,23 +88,32 @@ init_repos () { cp -r initial-repo sparse-checkout && git -C sparse-checkout reset --hard && - git -C sparse-checkout sparse-checkout init --cone && + + cp -r initial-repo sparse-index && + git -C sparse-index reset --hard && # initialize sparse-checkout definitions - git -C sparse-checkout sparse-checkout set deep + git -C sparse-checkout sparse-checkout init --cone && + git -C sparse-checkout sparse-checkout set deep && + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone && + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep } run_on_sparse () { ( cd sparse-checkout && - "$@" >../sparse-checkout-out 2>../sparse-checkout-err + GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err + ) && + ( + cd sparse-index && + GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err ) } run_on_all () { ( cd full-checkout && - "$@" >../full-checkout-out 2>../full-checkout-err + GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err ) && run_on_sparse "$@" } @@ -114,6 +124,12 @@ test_all_match () { test_cmp full-checkout-err sparse-checkout-err } +test_sparse_match () { + run_on_sparse "$@" && + test_cmp sparse-checkout-out sparse-index-out && + test_cmp sparse-checkout-err sparse-index-err +} + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Tue Feb 23 20:14:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7155CC433DB for ; Tue, 23 Feb 2021 20:16:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 48A0E64E6B for ; Tue, 23 Feb 2021 20:16:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233117AbhBWUQB (ORCPT ); Tue, 23 Feb 2021 15:16:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232589AbhBWUPy (ORCPT ); Tue, 23 Feb 2021 15:15:54 -0500 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DDF1C061797 for ; Tue, 23 Feb 2021 12:14:36 -0800 (PST) Received: by mail-wm1-x335.google.com with SMTP id a132so3645154wmc.0 for ; Tue, 23 Feb 2021 12:14:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=5oJwzTEAwDQe46DTGB1dKLdJIdkx0jvsjBXCc8Gy0xg=; b=P1Boaqcy0REXetEpKacwISEWT5VsagabAi+ANRdDNn+bMJ+mHngXuypOedI+WVjeDe QM/ZhCJu0v9/bTgW1aPhw3XwBZDb/Eo3l3PhoXN+58DatKcItMjNCo/atpDgotsM0/7C Y+6LYyRvVEUVCu6KHRWAY47rTM5ETyFfRGOgWIrNT/uyyW92k/Hd1ophPsOq5NeYsUfo 
B0CSOXw5GuWY2BnqFkqzu+wIiIihJxBDDe8nIWuUDo9sQmP1xFENYBHNgVAA7PNXaknl WfwHkbkRyFlVdUSfb9kp+bXa2Lc6UcNFJDJ+fngI9Pn/7083WiMC34+YnNHG0xUct4L4 Treg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=5oJwzTEAwDQe46DTGB1dKLdJIdkx0jvsjBXCc8Gy0xg=; b=Gtbi3y00zXU03d1FXMc6K1AfzVx+UhEsaUQ7oeb1se03glt5Nv8PXlH964fLPQjJ/P zk/eYPvC8rl+sr07O9Z4bJvCBvqHme/0VPAFw6vEPc+y4hieYEqKEfBJYf0bdY/P8kLK Ofzq19RSnRm/UUl4/7NimNTZhNVis9lU6H7mzK4aY0pw9n0cXcMiWl4v/z7Aln/4lj9j KlEtZaO8D3gf1UXGfqMLqVX2EAn+3vF4h0FAOhk84V64Ms2kAQZPwn3Yc+Hvic98Z5x6 lP8K+ji3c4x+zO7uW5vlRILyyieGPz5gG4/Oi3ZX0+G8yYwrj2NorEmfQpsi0iy32OOH yqqw== X-Gm-Message-State: AOAM530rk3dmKbrKHrXOlb9d/zhRr/0lit984mtApfxGXGjFrvc2EZlI uQ9co6+VTGlog+MjdRB1woJ6hag3R8Y= X-Google-Smtp-Source: ABdhPJzbtAN3Mvuu/d2DRnPdQfsqn4KtfvFKWrWDHsMpP5OYAmal9fV/UGXEbsxefRqiHHnzPlrMwA== X-Received: by 2002:a1c:1982:: with SMTP id 124mr447565wmz.84.1614111275346; Tue, 23 Feb 2021 12:14:35 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p9sm277086wmg.10.2021.02.23.12.14.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:35 -0800 (PST) Message-Id: <3d92df7a0cf9655dd34895f106cfac26ea44ad94.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:16 +0000 Subject: [PATCH 07/20] test-read-cache: print cache entries with --table Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This table is helpful for discovering data in the index to ensure it is being written correctly, especially as we build and test the sparse-index. 
This table includes an output format similar to 'git ls-tree', but should not be compared to that directly. The biggest reason is that 'git ls-tree' includes a tree entry for every subdirectory, even those that would not appear as a sparse directory in a sparse-index. Further, 'git ls-tree' does not use a trailing directory separator for its tree rows. This does not print the stat() information for the blobs. That could be added in a future change with another option. The tests that are added in the next few changes care only about the object types and IDs. To make the option parsing slightly more robust, wrap the string comparisons in a loop adapted from test-dir-iterator.c. Care must be taken with the final check for the 'cnt' variable. We continue the expectation that the numerical value is the final argument. Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 50 ++++++++++++++++++++++++++++++-------- 1 file changed, 40 insertions(+), 10 deletions(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index 244977a29bdf..e4c3492f7d3e 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -2,35 +2,65 @@ #include "cache.h" #include "config.h" +static void print_cache_entry(struct cache_entry *ce) +{ + printf("%06o ", ce->ce_mode & 0777777); + + if (S_ISSPARSEDIR(ce->ce_mode)) + printf("tree "); + else if (S_ISGITLINK(ce->ce_mode)) + printf("commit "); + else + printf("blob "); + + printf("%s\t%s\n", + oid_to_hex(&ce->oid), + ce->name); +} + +static void print_cache(struct index_state *cache) +{ + int i; + for (i = 0; i < cache->cache_nr; i++) + print_cache_entry(cache->cache[i]); +} + int cmd__read_cache(int argc, const char **argv) { + struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; + int table = 0; - if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", &name)) { - argc--; - argv++; + for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { + if 
(skip_prefix(*argv, "--print-and-refresh=", &name)) + continue; + if (!strcmp(*argv, "--table")) + table = 1; } - if (argc == 2) - cnt = strtol(argv[1], NULL, 0); + if (argc == 1) + cnt = strtol(argv[0], NULL, 0); setup_git_directory(); git_config(git_default_config, NULL); + for (i = 0; i < cnt; i++) { - read_cache(); + repo_read_index(r); if (name) { int pos; - refresh_index(&the_index, REFRESH_QUIET, + refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL); - pos = index_name_pos(&the_index, name, strlen(name)); + pos = index_name_pos(r->index, name, strlen(name)); if (pos < 0) die("%s not in index", name); printf("%s is%s up to date\n", name, - ce_uptodate(the_index.cache[pos]) ? "" : " not"); + ce_uptodate(r->index->cache[pos]) ? "" : " not"); write_file(name, "%d\n", i); } - discard_cache(); + if (table) + print_cache(r->index); + discard_index(r->index); } return 0; } From patchwork Tue Feb 23 20:14:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100859 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 819D1C433DB for ; Tue, 23 Feb 2021 20:16:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 501DB64E6B for ; Tue, 23 Feb 2021 20:16:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234224AbhBWUQn (ORCPT ); Tue, 23 Feb 2021 15:16:43 -0500 Received: 
from lindbergh.monkeyblade.net ([23.128.96.19]:47216 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233097AbhBWUP4 (ORCPT ); Tue, 23 Feb 2021 15:15:56 -0500 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A2C2C061574 for ; Tue, 23 Feb 2021 12:14:37 -0800 (PST) Received: by mail-wr1-x42e.google.com with SMTP id n4so23884889wrx.1 for ; Tue, 23 Feb 2021 12:14:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=NqEYBl1g3sJlE+XWYlzSJZj6ZnwcBPsZfNGC9XtLscY=; b=NEtYO08GkHe20xAIN8ZqH+xiRps7GTjS2PUlcnTqG+hNIWNJourNptQgRc7iARV8JH MJa5iseLfKLl3x0GZsNg00SAo9cxvFqMIdpXlgmdhZeXzE4u3L0DF2BVBbuCLwICa9/u F/fsD05qWeUsS2PVQB4k7rLU6GyXxgmIIwR3fC5oGRcJPpq/LM+RZMv1I7oWVTU0JD3Y rLAErxgdFM4c2ml4pTPWAjHyd4BcIDL119vlaFIZwydDFxQrhrzyWcnZZojg28zKv2Nc vw/KaUC6M4kcw7Osq2cwdetZbLb9oeZrFm00iuXtJCjn4+rTSJeA0MI9qJHhsF+4vR77 cSfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=NqEYBl1g3sJlE+XWYlzSJZj6ZnwcBPsZfNGC9XtLscY=; b=HCG2x635UXEmAFLGNgvMLAMdcz0r63+T+uZxHMjAE5dptB8f5cNbcd5t7UsU/OCpFE tXeT/mwYLzFJSoGiUjHEL2jOHISQA3Vxux45vFF/SwaMfEWMQfcBnp5gmjeSgM+Fftb0 OFc6lMvXiPmy2dACR7OqmiI+OBeVz7s9UO8k7H9idOH8GXPES7SIl48JPRLLx+DUTMRk GaCmKmc4otF6i8PbwQs7XCuQwt01skzByimOb7SyonlqKOJIry0oObzrN5gA5fQBnAmv pf9QdyB3K9N04DB4rR4M5nPXMD4fBavUgBP6sZsirGLyENcqsZ6uRLfIkBEbshFBwWPC Ytbw== X-Gm-Message-State: AOAM5313651RkkAiVe443ODa+3tbEO7v8Aq5EUJD73A1BESX2jZeem5+ ulyUHWEANXOL/fDVStaLH4Fzewl8QvQ= X-Google-Smtp-Source: ABdhPJz6zJ1mA5B+QuI2v8LamW1Fzm+8BNVZW6wivkp3NP6H231NPZ5ULlVMSCjaVmPEFUjbtYyVQA== X-Received: by 2002:adf:c101:: with SMTP id 
r1mr28287700wre.38.1614111276027; Tue, 23 Feb 2021 12:14:36 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s23sm3588639wmc.29.2021.02.23.12.14.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:35 -0800 (PST) Message-Id: <94373e2bfbbcdef85f3e9389e6239a44d3d0a598.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:17 +0000 Subject: [PATCH 08/20] test-tool: don't force full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way. 
Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 13 ++++++++++++- t/t1092-sparse-checkout-compatibility.sh | 5 +++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index e4c3492f7d3e..4780429dca6b 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -1,6 +1,7 @@ #include "test-tool.h" #include "cache.h" #include "config.h" +#include "sparse-index.h" static void print_cache_entry(struct cache_entry *ce) { @@ -30,13 +31,19 @@ int cmd__read_cache(int argc, const char **argv) struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; - int table = 0; + int table = 0, expand = 0; + + initialize_the_repository(); + prepare_repo_settings(r); + r->settings.command_requires_full_index = 0; for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { if (skip_prefix(*argv, "--print-and-refresh=", &name)) continue; if (!strcmp(*argv, "--table")) table = 1; + else if (!strcmp(*argv, "--expand")) + expand = 1; } if (argc == 1) @@ -46,6 +53,10 @@ int cmd__read_cache(int argc, const char **argv) for (i = 0; i < cnt; i++) { repo_read_index(r); + + if (expand) + ensure_full_index(r->index); + if (name) { int pos; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 71d6f9e4c014..4d789fe86b9d 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -130,6 +130,11 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'expanded in-memory index matches full index' ' + init_repos && + test_sparse_match test-tool read-cache --expand --table +' + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Tue Feb 23 20:14:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee 
X-Patchwork-Id: 12100857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BE43C433DB for ; Tue, 23 Feb 2021 20:16:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3AAD664E6B for ; Tue, 23 Feb 2021 20:16:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234172AbhBWUQk (ORCPT ); Tue, 23 Feb 2021 15:16:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47214 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232793AbhBWUP4 (ORCPT ); Tue, 23 Feb 2021 15:15:56 -0500 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5FFAC0617A7 for ; Tue, 23 Feb 2021 12:14:37 -0800 (PST) Received: by mail-wr1-x430.google.com with SMTP id y17so2301639wrs.12 for ; Tue, 23 Feb 2021 12:14:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=i+gKe8EgHfQlDyjY32XCjj4qcNKDY7Jfl7pEV2IN20w=; b=IM3fFnKVDfv08Ss+hkWhj8mHOoFDTaC3Q+yo/VzGhaMZLDSrjxZjZ743WpRKjm4O1o ydiRbEdYY1FmGAQQeZiy+WPi7OaG27bJg6O1QEd7BWxZrxD0R7c49hqz7BJrbnHpRDCE njTpVMKPZDDoigQRwCDD3I9h1V+A1kKLmawETLTC+ci4Vu8lNQgRoctk7QNbCs47JY9G jSiYybFMPzwiDvQmR2uDuD6SrALGZOrCZ8k8WTDpmOTkWEoEQwoBtVxSVYs/clg4Ihsi 
NriXQzqDPIldPxVDemRDaJxQ4y9kmgV7mSuUtjaSV8OJe2dbkNnpxi6YHI+GWRdgA1Op Uemw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=i+gKe8EgHfQlDyjY32XCjj4qcNKDY7Jfl7pEV2IN20w=; b=e4YaiB1dudte2KmYTmY63hLbxwBctAIRmikPjLv5H86nT1dSkLUurRzjoRtoKW6CbA MxNWmJwm4hz6XnEjdtgtrfZ9dJTImOzNQamjgHGfsJbU1HFVm65ehL7Z1QUEHbMJ1CsD 4bkEiX8u6SDEpTx7xLu1lB/Xdczf2iBIXkjn9jXXLzTjbg6ntwFKmJB/kLFwob7RFlE/ ueY31frOEkTx7MXDWZZRqJjjEAgxTYE9MxI9AzkRDq9qwXqfCdz8dv0M+HN7Q4V1MYOA ZmyTew7rdlrGpFmcw1/1z6pDsoNHkzDI/NPacwbJHH16j09eWAqVMqhVFktpITvPfApW NfAw== X-Gm-Message-State: AOAM5308fRaAiW6PT+Yfps6AIO8Ml73F3R6NUqUCyfBUxoQUOEVr5BJh ADx5f2ypp/4PTkZXHRitRmq5/LFOhZw= X-Google-Smtp-Source: ABdhPJwtmgHFNj3nx1au9FJDNWwiOYMvzr0odbX/79UgvPW5Yg8shUOdzxoN560w68g0BDoYYUqLVA== X-Received: by 2002:a5d:6089:: with SMTP id w9mr28073175wrt.412.1614111276616; Tue, 23 Feb 2021 12:14:36 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y1sm33144051wrr.41.2021.02.23.12.14.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:36 -0800 (PST) Message-Id: In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:18 +0000 Subject: [PATCH 09/20] unpack-trees: ensure full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The next change will translate full indexes into sparse indexes at write time. The existing logic provides a way for every sparse index to be expanded to a full index at read time. However, there are cases where an index is written and then continues to be used in-memory to perform further updates. unpack_trees() is frequently called after such a write. 
In particular, commands like 'git reset' do this double-update of the index. Ensure that we have a full index when entering unpack_trees(), but only when command_requires_full_index is true. This is always true at the moment, but we will later relax that after unpack_trees() is updated to handle sparse directory entries. Signed-off-by: Derrick Stolee --- unpack-trees.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index f5f668f532d8..4dd99219073a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1567,6 +1567,7 @@ static int verify_absent(const struct cache_entry *, */ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o) { + struct repository *repo = the_repository; int i, ret; static struct cache_entry *dfc; struct pattern_list pl; @@ -1578,6 +1579,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options trace_performance_enter(); trace2_region_enter("unpack_trees", "unpack_trees", the_repository); + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) { + ensure_full_index(o->src_index); + ensure_full_index(o->dst_index); + } + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; if (!o->skip_sparse_checkout && !o->pl) { From patchwork Tue Feb 23 20:14:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 21933C433DB for ; Tue, 23 Feb 2021 20:16:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D615164E6B for ; Tue, 23 Feb 2021 20:16:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234174AbhBWUQX (ORCPT ); Tue, 23 Feb 2021 15:16:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47224 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233166AbhBWUP4 (ORCPT ); Tue, 23 Feb 2021 15:15:56 -0500 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55FD3C0617A9 for ; Tue, 23 Feb 2021 12:14:38 -0800 (PST) Received: by mail-wr1-x431.google.com with SMTP id l12so23848837wry.2 for ; Tue, 23 Feb 2021 12:14:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=L6MQp2JdcxhlOlaQrA0yBx5aKzvSmmUcSuoEJrSyQ8k=; b=clkAJmySLfTcoHQEnzh18mqwNq/K129dvih0nPdk7vNLHVNfmohjQXefcNq8GdpIVn ZE9kg9KGy8aBT1W5mzoS/x4MEDzOPYqfns7OWDLO/ROLIdhGTxtHpNHff/DScRPftRGl ogQ5Qua1E9w3MuWPbElKubJF0GPBAGMShJ87fhVZkSO0kG53eHIQ/QUHr7jX48l0iXou 4XO67X9PSBSE8FDD0FvPy9wi+8Afy8UevTr+YjBi1Pih9htHY6Q4aCbPOJR2ZCXojtCO d2H10mrApTMuqB+wyr0TR+jTUBBEzs6jQ0Ylh9ZnXBkeposEhcjcuXFNiPCzglQJKFf9 QuOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=L6MQp2JdcxhlOlaQrA0yBx5aKzvSmmUcSuoEJrSyQ8k=; b=a2whkQ+YPwM1nTr4S6FL0OJOIQA9DqAOBWKQz+vpLlSm7lIe8UZ9oKEi6qmF5n26pE 8azG81wGHlvLgZVwi9znL1Oyad0vdFoofHUrGbwJT60tHeSplFMzfutxt2Pombi5kMA9 L8nvI4eE43D6rCVsn0tB+dUBBTxvqsCDN+UWydhNPxAIk027ALyssg2aHuKPYxfCnC+y 
5gdc9arjZuNqEJjBWFAaAOXbpD1oYjZWIfplybRy7abxG88PqS0IBT515x/p3SkIzrI/ /NYS3+/lJ5CVseQwL8buZRgRUvIfh7J+LITvKKmrNXEoTp4TCIu1CWKWREpuB49IwupK o8pg== X-Gm-Message-State: AOAM530ZjC632zaaLsPk0AJ+nHqntawilugUfoOXENW2pbzIBXz5S8Nr 7bYHbqH6ljbTCToSfeXEyqY+prgm29c= X-Google-Smtp-Source: ABdhPJwioOdIv4WXP2K5x5wc3LCu2xS3rrlcRgiCksV9FnPj+uJT8ZY6xO0l27mAMYexi+we8YnM/A== X-Received: by 2002:adf:fd85:: with SMTP id d5mr15777641wrr.423.1614111277132; Tue, 23 Feb 2021 12:14:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r7sm34737882wre.25.2021.02.23.12.14.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:36 -0800 (PST) Message-Id: In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:19 +0000 Subject: [PATCH 10/20] sparse-checkout: hold pattern list in index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we modify the sparse-checkout definition, we perform index operations on a pattern_list that only exists in-memory. This allows easy backing out in case the index update fails. However, if the index write itself cares about the sparse-checkout pattern set, we need access to that in-memory copy. Place a pointer to a 'struct pattern_list' in the index so we can access this on-demand. This will be used in the next change, which uses the sparse-checkout definition to filter out directories that are outside the sparse cone.
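As a rough illustration of the ownership pattern described in this message (not the code from the patch), the sketch below lends an in-memory pattern list to the index for the duration of one update and clears the pointer afterwards. The struct definitions are cut-down stand-ins, and write_index_sketch() and update_with_patterns() are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for Git's structs; only the fields this sketch needs. */
struct pattern_list {
	int use_cone_patterns;
};

struct index_state {
	struct pattern_list *sparse_checkout_patterns;
};

/*
 * Hypothetical write step: reports whether the in-memory pattern
 * list was visible to the code that writes the index.
 */
static int write_index_sketch(const struct index_state *istate)
{
	return istate->sparse_checkout_patterns != NULL;
}

/*
 * Lend the caller's pattern list to the index while the update runs,
 * then reset the pointer. The index never owns the list, so the
 * caller remains free to clear and free it afterwards.
 */
int update_with_patterns(struct index_state *istate, struct pattern_list *pl)
{
	int saw_patterns;

	assert(istate && pl);

	istate->sparse_checkout_patterns = pl;
	saw_patterns = write_index_sketch(istate);
	istate->sparse_checkout_patterns = NULL;

	return saw_patterns;
}
```

Resetting the pointer before returning mirrors the reason the message gives for keeping the list in memory: if the update fails, the caller can back out without the index holding a stale reference.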
Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 17 ++++++++++------- cache.h | 2 ++ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 2306a9ad98e0..e00b82af727b 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -110,6 +110,8 @@ static int update_working_directory(struct pattern_list *pl) if (is_index_unborn(r->index)) return UPDATE_SPARSITY_SUCCESS; + r->index->sparse_checkout_patterns = pl; + memset(&o, 0, sizeof(o)); o.verbose_update = isatty(2); o.update = 1; @@ -138,6 +140,7 @@ static int update_working_directory(struct pattern_list *pl) else rollback_lock_file(&lock_file); + r->index->sparse_checkout_patterns = NULL; return result; } @@ -517,19 +520,18 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) { int result; int changed_config = 0; - struct pattern_list pl; - memset(&pl, 0, sizeof(pl)); + struct pattern_list *pl = xcalloc(1, sizeof(*pl)); switch (m) { case ADD: if (core_sparse_checkout_cone) - add_patterns_cone_mode(argc, argv, &pl); + add_patterns_cone_mode(argc, argv, pl); else - add_patterns_literal(argc, argv, &pl); + add_patterns_literal(argc, argv, pl); break; case REPLACE: - add_patterns_from_input(&pl, argc, argv); + add_patterns_from_input(pl, argc, argv); break; } @@ -539,12 +541,13 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) changed_config = 1; } - result = write_patterns_and_update(&pl); + result = write_patterns_and_update(pl); if (result && changed_config) set_config(MODE_NO_PATTERNS); - clear_pattern_list(&pl); + clear_pattern_list(pl); + free(pl); return result; } diff --git a/cache.h b/cache.h index 1336c8d7435e..d75b352f38d3 100644 --- a/cache.h +++ b/cache.h @@ -307,6 +307,7 @@ static inline unsigned int canon_mode(unsigned int mode) struct split_index; struct untracked_cache; struct progress; +struct pattern_list; struct index_state { struct cache_entry 
**cache; @@ -332,6 +333,7 @@ struct index_state { struct mem_pool *ce_mem_pool; struct progress *progress; struct repository *repo; + struct pattern_list *sparse_checkout_patterns; }; /* Name hashing */ From patchwork Tue Feb 23 20:14:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0508AC433DB for ; Tue, 23 Feb 2021 20:16:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D1E1464E6B for ; Tue, 23 Feb 2021 20:16:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233309AbhBWUQK (ORCPT ); Tue, 23 Feb 2021 15:16:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233135AbhBWUP4 (ORCPT ); Tue, 23 Feb 2021 15:15:56 -0500 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B083C0617AA for ; Tue, 23 Feb 2021 12:14:39 -0800 (PST) Received: by mail-wm1-x32c.google.com with SMTP id v21so3619032wml.4 for ; Tue, 23 Feb 2021 12:14:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=bCBfnDFn/t9zxpFz+hrahuX1mNgD/g0N6Jrgzw0Evrs=; b=thgQGM5HpDmebNGVVJMoO89bi9+hAoWeO8guhoiRRfRwUc/HA6G2IsB6yjAbDOcl12 x0yNsmeJ4OeeoNCN+wfEQaf166ViBLvhC03mDBTYkCLTxLiLusU9fFcJ0yV4skIX/1ub ifSOtVBvySXeCWHTJLsGOKGuYIHHYzcW+qcV8u9nqlk0C3UiS2ZUcLdrGrwOKJXXxbKd siKrxFZ5WfrZLjZP02poD1WbKVhuetY7MCgZc2zYgsfW4sbmvm/jpRSprVj2iwAiFdmx Dhzdp7kUfJyhtN84+dJzIr93+KrIdhToHHE6LqMMnNWGgrRm3xx0BI2BjJuErE42n57b JZ+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=bCBfnDFn/t9zxpFz+hrahuX1mNgD/g0N6Jrgzw0Evrs=; b=aY4bkhARvbjW3myRMTuFbPPRZhdfudWKmRqFY20L3nDyDtAmISAVodIbo6tgLqi6dj jBatfWUMWr/BpOpdtuDL+RTyq30ziKWthNmFz7ClpRKVk8oY121oAC5ao/G/8DU0CFk+ nUhgKBNBrS4K7LR2u4CM5rU9LV1quviGBNH+Vat8FUQaPi4fbUSRrtQ0Hl5og1WzEhhk MLNJZGZIazeAKzcavj6KsCltUrgQRzeyjIFH8bKzo4KNpMLyLZehwmB9eWds0S/I64DR EkO0gcn0Qio+GN/EIKUCOTNCrhIUZ7rKzto6IgiymoOo/jrZ0oOj6Kzhtc/IGiXv32q0 jzVA== X-Gm-Message-State: AOAM533ddzQorAH/IZlW+fwVol2X7K2fskXSOqS6nFpQn7DF5rou5AuO 9RsOc0t0xUV6OTRR7txy7XhUANpAf50= X-Google-Smtp-Source: ABdhPJxQwf6UcKTNz0dF1q2CmqlcQXQiqX0DF54LUyfAbFmFnHiGA75c3i3A4rslStr2r04ybvxQdw== X-Received: by 2002:a05:600c:3551:: with SMTP id i17mr442597wmq.92.1614111277788; Tue, 23 Feb 2021 12:14:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l22sm28361027wrb.4.2021.02.23.12.14.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:37 -0800 (PST) Message-Id: In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:20 +0000 Subject: [PATCH 11/20] sparse-index: convert from full to sparse Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee 
From: Derrick Stolee If we have a full index, then we can convert it to a sparse index by replacing directories outside of the sparse cone with sparse directory entries. The convert_to_sparse() method does this, when the situation is appropriate. For now, we avoid converting the index to a sparse index if: 1. the index is split. 2. the index is already sparse. 3. sparse-checkout is disabled. 4. sparse-checkout does not use cone mode. Finally, we currently limit the conversion to when the GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git config will be added in a later change. The trickiest thing about this conversion is that we might not be able to mark a directory as a sparse directory just because it is outside the sparse cone. There might be unmerged files within that directory, so we need to look for those. Also, if there is some strange reason why a file is not marked with CE_SKIP_WORKTREE, then we should give up on converting that directory. There is still hope that some of its subdirectories might be able to convert to sparse, so we keep looking deeper. The conversion process is assisted by the cache-tree extension. This is calculated from the full index if it does not already exist. We then abandon the cache-tree as it no longer applies to the newly-sparse index. Thus, this cache-tree will be recalculated in every sparse-full-sparse round-trip until we integrate the cache-tree extension with the sparse index. Some Git commands use the index after writing it. For example, 'git add' will update the index, then write it to disk, then read its entries to report information. To keep the in-memory index in a full state after writing, we re-expand it to a full one after the write. This is wasteful for commands that only write the index and do not read from it again, but that is only the case until we make those commands "sparse aware." 
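The collapsing rules above can be sketched in a self-contained form. This is a simplified model under stated assumptions, not the implementation in sparse-index.c: it finds directory spans by path prefix instead of using the cache-tree extension, it treats the cone as a single directory string, and struct entry, in_cone() and convert_to_sparse_sketch() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified index entry (assumption); not Git's struct cache_entry. */
struct entry {
	char path[64];
	int stage;          /* nonzero means unmerged */
	int skip_worktree;  /* stand-in for CE_SKIP_WORKTREE */
	int is_sparse_dir;  /* set on entries produced by collapsing */
};

/*
 * Toy cone test (assumption): a directory is inside the cone if it
 * is a parent of the cone directory or lies at or below it.
 */
static int in_cone(const char *dir, const char *cone)
{
	return !strncmp(cone, dir, strlen(dir)) ||
	       !strncmp(dir, cone, strlen(cone));
}

/* Compacts e[start..end) into e[out..]; returns the entries written. */
static int collapse_rec(struct entry *e, int out, int start, int end,
			const char *dir, const char *cone)
{
	size_t dirlen = strlen(dir);
	int i, n = 0, can_convert = !in_cone(dir, cone);

	/* An unmerged or non-skip-worktree entry blocks the collapse. */
	for (i = start; can_convert && i < end; i++)
		if (e[i].stage || !e[i].skip_worktree)
			can_convert = 0;

	if (can_convert) {
		struct entry se = { "", 0, 1, 1 };
		snprintf(se.path, sizeof(se.path), "%s", dir);
		e[out] = se;
		return 1;
	}

	for (i = start; i < end; ) {
		const char *slash = strchr(e[i].path + dirlen, '/');
		char child[64];
		size_t childlen;
		int j;

		if (!slash) {
			/* A file directly inside dir is kept as-is. */
			e[out + n++] = e[i++];
			continue;
		}

		childlen = slash - e[i].path + 1;
		memcpy(child, e[i].path, childlen);
		child[childlen] = '\0';

		/* Find the end of the child directory's span. */
		for (j = i; j < end && !strncmp(e[j].path, child, childlen); j++)
			;

		/* Keep looking deeper even if dir itself cannot collapse. */
		n += collapse_rec(e, out + n, i, j, child, cone);
		i = j;
	}
	return n;
}

/* Entries must be sorted by path; returns the new entry count. */
int convert_to_sparse_sketch(struct entry *e, int nr, const char *cone)
{
	assert(e && cone);
	return collapse_rec(e, 0, 0, nr, "", cone);
}
```

With a cone of "deep/deeper1/", a clean "folder1/" collapses to one entry, while a directory that holds an unmerged file keeps its individual entries.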
We can compare the behavior of the sparse-index in t1092-sparse-checkout-compatibility.sh by using GIT_TEST_SPARSE_INDEX=1 when operating on the 'sparse-index' repo. We can also compare the two sparse repos directly, such as comparing their indexes (when expanded to full in the case of the 'sparse-index' repo). We also verify that the index is actually populated with sparse directory entries. The 'checkout and reset (mixed)' test is marked for failure when comparing a sparse repo to a full repo, but we can compare the two sparse-checkout cases directly to ensure that we are not changing the behavior when using a sparse index. Signed-off-by: Derrick Stolee --- cache-tree.c | 3 + cache.h | 2 + read-cache.c | 26 ++++- sparse-index.c | 139 +++++++++++++++++++++++ sparse-index.h | 1 + t/t1092-sparse-checkout-compatibility.sh | 61 +++++++++- 6 files changed, 227 insertions(+), 5 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2fb483d3c083..5f07a39e501e 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -6,6 +6,7 @@ #include "object-store.h" #include "replace-object.h" #include "promisor-remote.h" +#include "sparse-index.h" #ifndef DEBUG_CACHE_TREE #define DEBUG_CACHE_TREE 0 @@ -442,6 +443,8 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + ensure_full_index(istate); + if (!istate->cache_tree) istate->cache_tree = cache_tree(); diff --git a/cache.h b/cache.h index d75b352f38d3..e8b7d3b4fb33 100644 --- a/cache.h +++ b/cache.h @@ -251,6 +251,8 @@ static inline unsigned int create_ce_mode(unsigned int mode) { if (S_ISLNK(mode)) return S_IFLNK; + if (mode == S_IFDIR) + return S_IFDIR; if (S_ISDIR(mode) || S_ISGITLINK(mode)) return S_IFGITLINK; return S_IFREG | ce_permissions(mode); diff --git a/read-cache.c b/read-cache.c index 97dbf2434f30..67acbf202f4e 100644 --- a/read-cache.c +++ b/read-cache.c @@ -25,6 +25,7 @@ #include "fsmonitor.h" #include "thread-utils.h" #include "progress.h" +#include "sparse-index.h" /* Mask for the name
length in ce_flags in the on-disk index */ @@ -1002,8 +1003,14 @@ int verify_path(const char *path, unsigned mode) c = *path++; if ((c == '.' && !verify_dotfile(path, mode)) || - is_dir_sep(c) || c == '\0') + is_dir_sep(c)) return 0; + /* + * allow terminating directory separators for + * sparse directory entries. + */ + if (c == '\0') + return S_ISDIR(mode); } else if (c == '\\' && protect_ntfs) { if (is_ntfs_dotgit(path)) return 0; @@ -3061,6 +3068,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l unsigned flags) { int ret; + int was_full = !istate->sparse_index; + + ret = convert_to_sparse(istate); + + if (ret) { + warning(_("failed to convert to a sparse-index")); + return ret; + } /* * TODO trace2: replace "the_repository" with the actual repo instance @@ -3072,6 +3087,9 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l trace2_region_leave_printf("index", "do_write_index", the_repository, "%s", get_lock_file_path(lock)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; if (flags & COMMIT_LOCK) @@ -3162,9 +3180,10 @@ static int write_shared_index(struct index_state *istate, struct tempfile **temp) { struct split_index *si = istate->split_index; - int ret; + int ret, was_full = !istate->sparse_index; move_cache_to_base_index(istate); + convert_to_sparse(istate); trace2_region_enter_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); @@ -3172,6 +3191,9 @@ static int write_shared_index(struct index_state *istate, trace2_region_leave_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; ret = adjust_shared_perm(get_tempfile_path(*temp)); diff --git a/sparse-index.c b/sparse-index.c index 316cb949b74b..cb1f85635fbc 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -4,6 +4,145 @@ #include "tree.h" #include "pathspec.h" #include "trace2.h"
+#include "cache-tree.h" +#include "config.h" +#include "dir.h" +#include "fsmonitor.h" + +static struct cache_entry *construct_sparse_dir_entry( + struct index_state *istate, + const char *sparse_dir, + struct cache_tree *tree) +{ + struct cache_entry *de; + + de = make_cache_entry(istate, S_IFDIR, &tree->oid, sparse_dir, 0, 0); + + de->ce_flags |= CE_SKIP_WORKTREE; + return de; +} + +/* + * Returns the number of entries "inserted" into the index. + */ +static int convert_to_sparse_rec(struct index_state *istate, + int num_converted, + int start, int end, + const char *ct_path, size_t ct_pathlen, + struct cache_tree *ct) +{ + int i, can_convert = 1; + int start_converted = num_converted; + enum pattern_match_result match; + int dtype; + struct strbuf child_path = STRBUF_INIT; + struct pattern_list *pl = istate->sparse_checkout_patterns; + + /* + * Is the current path outside of the sparse cone? + * Then check if the region can be replaced by a sparse + * directory entry (everything is sparse and merged). + */ + match = path_matches_pattern_list(ct_path, ct_pathlen, + NULL, &dtype, pl, istate); + if (match != NOT_MATCHED) + can_convert = 0; + + for (i = start; can_convert && i < end; i++) { + struct cache_entry *ce = istate->cache[i]; + + if (ce_stage(ce) || + !(ce->ce_flags & CE_SKIP_WORKTREE)) + can_convert = 0; + } + + if (can_convert) { + struct cache_entry *se; + se = construct_sparse_dir_entry(istate, ct_path, ct); + + istate->cache[num_converted++] = se; + return 1; + } + + for (i = start; i < end; ) { + int count, span, pos = -1; + const char *base, *slash; + struct cache_entry *ce = istate->cache[i]; + + /* + * Detect if this is a normal entry outside of any subtree + * entry.
+ */ + base = ce->name + ct_pathlen; + slash = strchr(base, '/'); + + if (slash) + pos = cache_tree_subtree_pos(ct, base, slash - base); + + if (pos < 0) { + istate->cache[num_converted++] = ce; + i++; + continue; + } + + strbuf_setlen(&child_path, 0); + strbuf_add(&child_path, ce->name, slash - ce->name + 1); + + span = ct->down[pos]->cache_tree->entry_count; + count = convert_to_sparse_rec(istate, + num_converted, i, i + span, + child_path.buf, child_path.len, + ct->down[pos]->cache_tree); + num_converted += count; + i += span; + } + + strbuf_release(&child_path); + return num_converted - start_converted; +} + +int convert_to_sparse(struct index_state *istate) +{ + if (istate->split_index || istate->sparse_index || + !core_apply_sparse_checkout || !core_sparse_checkout_cone) + return 0; + + /* + * For now, only create a sparse index with the + * GIT_TEST_SPARSE_INDEX environment variable. We will relax + * this once we have a proper way to opt-in (and later still, + * opt-out). + */ + if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + return 0; + + if (!istate->sparse_checkout_patterns) { + istate->sparse_checkout_patterns = xcalloc(1, sizeof(struct pattern_list)); + if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0) + return 0; + } + + if (!istate->sparse_checkout_patterns->use_cone_patterns) { + warning(_("attempting to use sparse-index without cone mode")); + return -1; + } + + if (cache_tree_update(istate, 0)) { + warning(_("unable to update cache-tree, staying full")); + return -1; + } + + remove_fsmonitor(istate); + + trace2_region_enter("index", "convert_to_sparse", istate->repo); + istate->cache_nr = convert_to_sparse_rec(istate, + 0, 0, istate->cache_nr, + "", 0, istate->cache_tree); + istate->drop_cache_tree = 1; + istate->sparse_index = 1; + trace2_region_leave("index", "convert_to_sparse", istate->repo); + return 0; +} static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { diff --git 
a/sparse-index.h b/sparse-index.h index 09a20d036c46..64380e121d80 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -3,5 +3,6 @@ struct index_state; void ensure_full_index(struct index_state *istate); +int convert_to_sparse(struct index_state *istate); #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 4d789fe86b9d..ca87033d30b0 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -2,6 +2,9 @@ test_description='compare full workdir to sparse workdir' +GIT_TEST_CHECK_CACHE_TREE=0 +GIT_TEST_SPLIT_INDEX=0 + . ./test-lib.sh test_expect_success 'setup' ' @@ -121,15 +124,49 @@ run_on_all () { test_all_match () { run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && - test_cmp full-checkout-err sparse-checkout-err + test_cmp full-checkout-out sparse-index-out && + test_cmp full-checkout-err sparse-checkout-err && + test_cmp full-checkout-err sparse-index-err } test_sparse_match () { - run_on_sparse $* && + run_on_sparse "$@" && test_cmp sparse-checkout-out sparse-index-out && test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'sparse-index contents' ' + init_repos && + + test-tool -C sparse-index read-cache --table >cache && + for dir in folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep/deeper2 folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE 
$dir/" cache \ + || return 1 + done +' + test_expect_success 'expanded in-memory index matches full index' ' init_repos && test_sparse_match test-tool read-cache --expand --table @@ -137,6 +174,7 @@ test_expect_success 'expanded in-memory index matches full index' ' test_expect_success 'status with options' ' init_repos && + test_sparse_match ls && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -273,6 +311,17 @@ test_expect_failure 'checkout and reset (mixed)' ' test_all_match git reset update-folder2 ' +# Ensure that sparse-index behaves identically to +# sparse-checkout with a full index. +test_expect_success 'checkout and reset (mixed) [sparse]' ' + init_repos && + + test_sparse_match git checkout -b reset-test update-deep && + test_sparse_match git reset deepest && + test_sparse_match git reset update-folder1 && + test_sparse_match git reset update-folder2 +' + test_expect_success 'merge' ' init_repos && @@ -309,14 +358,20 @@ test_expect_success 'clean' ' test_all_match git status --porcelain=v2 && test_all_match git clean -f && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xdf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && - test_path_is_dir sparse-checkout/folder1 + test_sparse_match test_path_is_dir folder1 ' test_done From patchwork Tue Feb 23 20:14:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ACFB4C433E0 for ; Tue, 23 Feb 2021 20:17:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 80B8064E6B for ; Tue, 23 Feb 2021 20:17:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234214AbhBWUQ5 (ORCPT ); Tue, 23 Feb 2021 15:16:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47232 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233252AbhBWUP5 (ORCPT ); Tue, 23 Feb 2021 15:15:57 -0500 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB0C0C0617AB for ; Tue, 23 Feb 2021 12:14:39 -0800 (PST) Received: by mail-wm1-x32d.google.com with SMTP id v21so3619055wml.4 for ; Tue, 23 Feb 2021 12:14:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=CZ13xCDSSkPH/BzGBaU8yW0L08/FfbyWyFIW94sM8ZY=; b=Vm7dg/jghfBBDAilG9YD/7Z2erfmJZjBVnahMLSR36quY8qi0D2W4XtA06XyNWYuqT EvqReuFSv353jSli7Pdw89Q8eB0JvhW0vCMc44k9NYdu2c4lWotyixj29SjGJhHFPUSt csPCnlcCoD7ziQlpqlEDGNL26iQLjC1jN4suEyQsq4vMvIJyvlyl6CFTRUFLbZFYqcNk fiz9U6MFgaMBkyRElRFItwCQ8vjxJqmx08t9VoBaTG+lRjTKanzhU3jb1m/52TqfDucR 9E9/0vNXVqWRNWFbN2dkrrzlj4lFrE+PZh/UX3OPUYt2/bxoenlAkTArPxAHjjOMKd9T DAHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=CZ13xCDSSkPH/BzGBaU8yW0L08/FfbyWyFIW94sM8ZY=; b=OjrIufa19dIRa9w0RoDF7xsLJ/MTnkKpS7ozmVSQXUaq1rzl/aIFIZ8GdnT/7MjQYD p6OU4+yxpD7tFs6/M6PYYEzjzvuiHn9tPpKZAWi3nNz8Uhz4E0/va8DaS1J7qkPIyucZ jlWQyRfLoe539/uxRIeTpio8Nqr5WU+ktrVPvRzhQG5XrjbbNXVi9LlmmqnFFBr9fVxO Boqluuq74wuTBslUaqfv071/d5Ew4QJcPaKjmTga3co18AA3hwOH9GKI2swlRTIOAg+3 OA+WhDQQ34NLulEpGsOXnG79PGwXoungLHbYtypMLy9QMXVuhCCXbmPY7iZ6p/PlbXtV 8R8w== X-Gm-Message-State: AOAM53304ThcyIsM1tsDinrIkk3upDBcaiOJelx0Ojatrb0cNVCuTkPg c2nkKnoE0NCZlSb0k2Tpl+UsR1SPgBo= X-Google-Smtp-Source: ABdhPJwfd1mWlVBXWMeScimQYdEQhRPQfIs2wfl87Pagz0r1hQ4jTRqrkKz8CgPeX6eFHOrUs2JTNg== X-Received: by 2002:a05:600c:4f46:: with SMTP id m6mr429308wmq.154.1614111278566; Tue, 23 Feb 2021 12:14:38 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r17sm33937227wrx.82.2021.02.23.12.14.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Feb 2021 12:14:38 -0800 (PST) Message-Id: <4405a9115c3b65119d7411025a17f0a9fb0cbd1c.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:21 +0000 Subject: [PATCH 12/20] submodule: sparse-index should not collapse links Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee A submodule is stored as a "Git link" that actually points to a commit within a submodule. Submodules are populated or not depending on submodule configuration, not sparse-checkout. To ensure that the sparse-index feature integrates correctly with submodules, we should not collapse a directory if there is a Git link within its range. 
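A minimal sketch of the resulting range check follows, assuming simplified entries rather than Git's struct cache_entry. The mode constants mirror Git's S_IFMT and S_IFGITLINK values, while struct entry_sketch and can_collapse_range() are hypothetical names chosen for this illustration.

```c
#include <assert.h>

/* Same numeric values Git uses for S_IFMT and S_IFGITLINK. */
#define SKETCH_IFMT      0170000
#define SKETCH_IFGITLINK 0160000

/* Simplified index entry (assumption); only the fields the check reads. */
struct entry_sketch {
	unsigned mode;
	int stage;          /* nonzero means unmerged */
	int skip_worktree;  /* stand-in for CE_SKIP_WORKTREE */
};

/*
 * Returns 1 if every entry in e[start..end) may be folded into a
 * sparse directory entry. An unmerged entry, a Git link (submodule),
 * or an entry without the skip-worktree bit blocks the collapse.
 */
int can_collapse_range(const struct entry_sketch *e, int start, int end)
{
	int i;

	assert(e && start <= end);

	for (i = start; i < end; i++) {
		if (e[i].stage ||
		    (e[i].mode & SKETCH_IFMT) == SKETCH_IFGITLINK ||
		    !e[i].skip_worktree)
			return 0;
	}
	return 1;
}
```

A directory whose range holds only regular skip-worktree files passes the check; adding a single 160000 entry to the same range makes it fail, which keeps the enclosing directory expanded.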
Signed-off-by: Derrick Stolee --- sparse-index.c | 1 + t/t1092-sparse-checkout-compatibility.sh | 17 +++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/sparse-index.c b/sparse-index.c index cb1f85635fbc..14029fafc750 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -52,6 +52,7 @@ static int convert_to_sparse_rec(struct index_state *istate, struct cache_entry *ce = istate->cache[i]; if (ce_stage(ce) || + S_ISGITLINK(ce->ce_mode) || !(ce->ce_flags & CE_SKIP_WORKTREE)) can_convert = 0; } diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index ca87033d30b0..b38fab6455d9 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -374,4 +374,21 @@ test_expect_success 'clean' ' test_sparse_match test_path_is_dir folder1 ' +test_expect_success 'submodule handling' ' + init_repos && + + test_all_match mkdir modules && + test_all_match touch modules/a && + test_all_match git add modules && + test_all_match git commit -m "add modules directory" && + + run_on_all git submodule add "$(pwd)/initial-repo" modules/sub && + test_all_match git commit -m "add submodule" && + + # having a submodule prevents "modules" from collapse + test-tool -C sparse-index read-cache --table >cache && + grep "100644 blob .* modules/a" cache && + grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache +' + test_done From patchwork Tue Feb 23 20:14:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100861
Message-Id: In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:22 +0000 Subject: [PATCH 13/20] unpack-trees: allow sparse directories Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee index_pos_by_traverse_info() currently calls BUG() when a directory entry exists exactly in the index. In a sparse index, a directory entry can legitimately appear in the index as long as that entry is itself marked with the skip-worktree bit. The negation of the 'pos' variable must therefore happen only when 'pos' starts out negative. When the index is full, the behavior is identical to before.
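The conditional negation described above depends on the lookup convention visible in the diff: a non-negative result is an exact match, and a negative result encodes the insertion point as `-(insertion point) - 1`. The sketch below shows that convention on a plain sorted array of strings. The `sketch_` functions are hypothetical stand-ins, and `strcmp()` ordering is a simplification assumed for the example.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for a name lookup in a sorted array.
 * Returns the position of an exact match, or -(insertion point) - 1
 * when the name is absent.
 */
int sketch_name_pos(const char **names, int nr, const char *name)
{
	int lo = 0, hi = nr;

	assert(nr >= 0);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * Return the first position at or after 'name'. An exact match is kept
 * as-is; only a negative result is decoded, mirroring the
 * "negate only when negative" rule.
 */
int sketch_first_pos(const char **names, int nr, const char *name)
{
	int pos = sketch_name_pos(names, nr, name);

	if (pos < 0)
		pos = -pos - 1;
	return pos;
}
```

Decoding unconditionally would turn an exact match at position 1 into -2, which is why the patch moves the negation into the `else` branch once exact directory matches become legal.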
Signed-off-by: Derrick Stolee --- unpack-trees.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index 4dd99219073a..b324eec2a5d1 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -746,9 +746,12 @@ static int index_pos_by_traverse_info(struct name_entry *names, strbuf_make_traverse_path(&name, info, names->path, names->pathlen); strbuf_addch(&name, '/'); pos = index_name_pos(o->src_index, name.buf, name.len); - if (pos >= 0) - BUG("This is a directory and should not exist in index"); - pos = -pos - 1; + if (pos >= 0) { + if (!o->src_index->sparse_index || + !(o->src_index->cache[pos]->ce_flags & CE_SKIP_WORKTREE)) + BUG("This is a directory and should not exist in index"); + } else + pos = -pos - 1; if (pos >= o->src_index->cache_nr || !starts_with(o->src_index->cache[pos]->name, name.buf) || (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf))) From patchwork Tue Feb 23 20:14:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100843
Message-Id: <7d4627574bb8dc3e3a6d0ebd62dc2855ed61a904.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:23 +0000 Subject: [PATCH 14/20] sparse-index: check index conversion happens Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Add a test case that uses test_region to ensure that we are truly expanding a sparse index to a full one, then converting back to sparse when writing the index. As we integrate more Git commands with the sparse index, we will convert these commands to check that we do _not_ convert the sparse index to a full index and instead stay sparse the entire time.
Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index b38fab6455d9..bfc9e28ef0e1 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -391,4 +391,22 @@ test_expect_success 'submodule handling' ' grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache ' +test_expect_success 'sparse-index is expanded and converted back' ' + init_repos && + + ( + GIT_TEST_SPARSE_INDEX=1 && + export GIT_TEST_SPARSE_INDEX && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" reset --hard && + test_region index convert_to_sparse trace2.txt && + test_region index ensure_full_index trace2.txt && + + rm trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" status -uno && + test_region index ensure_full_index trace2.txt + ) +' + test_done From patchwork Tue Feb 23 20:14:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100865
Message-Id: <564503f7878475f220024def9e3bf20e4d518436.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:24 +0000 Subject: [PATCH 15/20] sparse-index: create extension for compatibility Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Previously, we enabled the sparse index format only using GIT_TEST_SPARSE_INDEX=1. This is not a feasible way for users to select this mode. Further, sparse directory entries are not understood by the index formats as advertised. We _could_ add a new index version that explicitly adds these capabilities, but there are nuances to index formats 2, 3, and 4 that are still valuable to select as options. For now, create a repo extension, "extensions.sparseIndex", that specifies that the tool reading this repository must understand sparse directory entries. This change only encodes the extension and enables it when GIT_TEST_SPARSE_INDEX=1. Later, we will add a more user-friendly CLI mechanism.
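A repository extension acts as a gate: a reader that recognizes the key records the capability, and a reader that does not recognize it reports the key as unknown so the caller can refuse to proceed. The sketch below illustrates that gate for the new key only. It is an assumption-laden illustration, not setup.c: the `sketch_` names and the reduced format struct are hypothetical, and the key is assumed to arrive already lowercased.

```c
#include <assert.h>
#include <string.h>

enum sketch_result { SKETCH_EXT_OK, SKETCH_EXT_UNKNOWN };

/* Hypothetical, reduced repository format record. */
struct sketch_format {
	int sparse_index;
};

/*
 * Recognize "extensions.sparseindex" and record it in 'fmt'.
 * Any other key is reported as unknown; what the caller does with an
 * unknown extension (warn, refuse, ignore) is outside this sketch.
 */
enum sketch_result sketch_handle_extension(const char *var,
					   struct sketch_format *fmt)
{
	const char *prefix = "extensions.";
	size_t len = strlen(prefix);

	assert(var && fmt);
	if (strncmp(var, prefix, len))
		return SKETCH_EXT_UNKNOWN;
	if (!strcmp(var + len, "sparseindex")) {
		fmt->sparse_index = 1;
		return SKETCH_EXT_OK;
	}
	return SKETCH_EXT_UNKNOWN;
}
```

Using an extension rather than a new index version keeps the choice of index format 2, 3, or 4 independent of whether sparse directory entries are allowed.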
Signed-off-by: Derrick Stolee --- Documentation/config/extensions.txt | 7 ++++++ cache.h | 1 + repo-settings.c | 7 ++++++ repository.h | 3 ++- setup.c | 3 +++ sparse-index.c | 38 +++++++++++++++++++++++++---- 6 files changed, 53 insertions(+), 6 deletions(-) diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt index 4e23d73cdcad..5c86b3648732 100644 --- a/Documentation/config/extensions.txt +++ b/Documentation/config/extensions.txt @@ -6,3 +6,10 @@ extensions.objectFormat:: Note that this setting should only be set by linkgit:git-init[1] or linkgit:git-clone[1]. Trying to change it after initialization will not work and will produce hard-to-diagnose issues. + +extensions.sparseIndex:: + When combined with `core.sparseCheckout=true` and + `core.sparseCheckoutCone=true`, the index may contain entries + corresponding to directories outside of the sparse-checkout + definition. Versions of Git that do not understand this extension + do not expect directory entries in the index. diff --git a/cache.h b/cache.h index e8b7d3b4fb33..eea61fba7568 100644 --- a/cache.h +++ b/cache.h @@ -1053,6 +1053,7 @@ struct repository_format { int worktree_config; int is_bare; int hash_algo; + int sparse_index; char *work_tree; struct string_list unknown_extensions; struct string_list v1_only_extensions; diff --git a/repo-settings.c b/repo-settings.c index d63569e4041e..9677d50f9238 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -85,4 +85,11 @@ void prepare_repo_settings(struct repository *r) * removed. */ r->settings.command_requires_full_index = 1; + + /* + * Initialize this as off. 
+ */ + r->settings.sparse_index = 0; + if (!repo_config_get_bool(r, "extensions.sparseindex", &value) && value) + r->settings.sparse_index = 1; } diff --git a/repository.h b/repository.h index e06a23015697..a45f7520fd9e 100644 --- a/repository.h +++ b/repository.h @@ -42,7 +42,8 @@ struct repo_settings { int core_multi_pack_index; - unsigned command_requires_full_index:1; + unsigned command_requires_full_index:1, + sparse_index:1; }; struct repository { diff --git a/setup.c b/setup.c index c04cd25a30df..cd8394564613 100644 --- a/setup.c +++ b/setup.c @@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var, return error("invalid value for 'extensions.objectformat'"); data->hash_algo = format; return EXTENSION_OK; + } else if (!strcmp(ext, "sparseindex")) { + data->sparse_index = 1; + return EXTENSION_OK; } return EXTENSION_UNKNOWN; } diff --git a/sparse-index.c b/sparse-index.c index 14029fafc750..97b0d0c57857 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -102,19 +102,47 @@ static int convert_to_sparse_rec(struct index_state *istate, return num_converted - start_converted; } +static int enable_sparse_index(struct repository *repo) +{ + const char *config_path = repo_git_path(repo, "config.worktree"); + + if (upgrade_repository_format(1) < 0) { + warning(_("unable to upgrade repository format to enable sparse-index")); + return -1; + } + git_config_set_in_file_gently(config_path, + "extensions.sparseIndex", + "true"); + + prepare_repo_settings(repo); + repo->settings.sparse_index = 1; + return 0; +} + int convert_to_sparse(struct index_state *istate) { if (istate->split_index || istate->sparse_index || !core_apply_sparse_checkout || !core_sparse_checkout_cone) return 0; + if (!istate->repo) + istate->repo = the_repository; + + /* + * The GIT_TEST_SPARSE_INDEX environment variable triggers the + * extensions.sparseIndex config variable to be on. 
+ */ + if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) { + int err = enable_sparse_index(istate->repo); + if (err < 0) + return err; + } + /* - * For now, only create a sparse index with the - * GIT_TEST_SPARSE_INDEX environment variable. We will relax - * this once we have a proper way to opt-in (and later still, - * opt-out). + * Only convert to sparse if extensions.sparseIndex is set. */ - if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + prepare_repo_settings(istate->repo); + if (!istate->repo->settings.sparse_index) return 0; if (!istate->sparse_checkout_patterns) { From patchwork Tue Feb 23 20:14:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12100845
Message-Id:
<6d6b230e3318007150aebefebc16dfb8b9b6c401.1614111270.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 23 Feb 2021 20:14:25 +0000 Subject: [PATCH 16/20] sparse-checkout: toggle sparse index from builtin Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The sparse index extension is used to signal that index writes should be in sparse mode. Until now, it was only enabled using GIT_TEST_SPARSE_INDEX=1. Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that specifies whether the sparse index should be used. It also updates the index to use the correct format, either way. Add a warning in the documentation that the use of a repository extension might reduce compatibility with third-party tools. 'git sparse-checkout init' already sets extensions.worktreeConfig, which places most sparse-checkout users outside of the scope of most third-party tools. Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of GIT_TEST_SPARSE_INDEX=1. Signed-off-by: Derrick Stolee --- Documentation/git-sparse-checkout.txt | 14 +++++++++ builtin/sparse-checkout.c | 17 ++++++++++- sparse-index.c | 37 +++++++++++++++-------- sparse-index.h | 3 ++ t/t1092-sparse-checkout-compatibility.sh | 38 +++++++++++------------- 5 files changed, 76 insertions(+), 33 deletions(-) diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index a0eeaeb02ee3..b51b8450cfd9 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the When `--cone` is provided, the `core.sparseCheckoutCone` setting is also set, allowing for better performance with a limited set of patterns (see 'CONE PATTERN SET' below).
++ +Use the `--[no-]sparse-index` option to toggle the use of the sparse +index format. This reduces the size of the index to be more closely +aligned with your sparse-checkout definition. This can have significant +performance advantages for commands such as `git status` or `git add`. +This feature is still experimental. Some commands might be slower with +a sparse index until they are properly integrated with the feature. ++ +**WARNING:** Using a sparse index requires modifying the index in a way +that is not completely understood by other tools. Enabling sparse index +enables the `extensions.sparseIndex` config value, which might cause +other tools to stop working with your repository. If you have trouble with +this compatibility, then run `git sparse-checkout init --no-sparse-index` to +remove this config and rewrite your index to not be sparse. 'set':: Write a set of patterns to the sparse-checkout file, as given as diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index e00b82af727b..ca63e2c64e95 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -14,6 +14,7 @@ #include "unpack-trees.h" #include "wt-status.h" #include "quote.h" +#include "sparse-index.h" static const char *empty_base = ""; @@ -283,12 +284,13 @@ static int set_config(enum sparse_checkout_mode mode) } static char const * const builtin_sparse_checkout_init_usage[] = { - N_("git sparse-checkout init [--cone]"), + N_("git sparse-checkout init [--cone] [--[no-]sparse-index]"), NULL }; static struct sparse_checkout_init_opts { int cone_mode; + int sparse_index; } init_opts; static int sparse_checkout_init(int argc, const char **argv) @@ -303,11 +305,15 @@ static int sparse_checkout_init(int argc, const char **argv) static struct option builtin_sparse_checkout_init_options[] = { OPT_BOOL(0, "cone", &init_opts.cone_mode, N_("initialize the sparse-checkout in cone mode")), + OPT_BOOL(0, "sparse-index", &init_opts.sparse_index, + N_("toggle the use of a sparse
index")), OPT_END(), }; repo_read_index(the_repository); + init_opts.sparse_index = -1; + argc = parse_options(argc, argv, NULL, builtin_sparse_checkout_init_options, builtin_sparse_checkout_init_usage, 0); @@ -326,6 +332,15 @@ static int sparse_checkout_init(int argc, const char **argv) sparse_filename = get_sparse_checkout_filename(); res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL); + if (init_opts.sparse_index >= 0) { + if (set_sparse_index_config(the_repository, init_opts.sparse_index) < 0) + die(_("failed to modify sparse-index config")); + + /* force an index rewrite */ + repo_read_index(the_repository); + the_repository->index->updated_workdir = 1; + } + /* If we already have a sparse-checkout file, use it. */ if (res >= 0) { free(sparse_filename); diff --git a/sparse-index.c b/sparse-index.c index 97b0d0c57857..a991c5331e9e 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -104,23 +104,37 @@ static int convert_to_sparse_rec(struct index_state *istate, static int enable_sparse_index(struct repository *repo) { - const char *config_path = repo_git_path(repo, "config.worktree"); + int res; if (upgrade_repository_format(1) < 0) { warning(_("unable to upgrade repository format to enable sparse-index")); return -1; } - git_config_set_in_file_gently(config_path, - "extensions.sparseIndex", - "true"); + res = git_config_set_gently("extensions.sparseindex", "true"); prepare_repo_settings(repo); repo->settings.sparse_index = 1; - return 0; + return res; +} + +int set_sparse_index_config(struct repository *repo, int enable) +{ + int res; + + if (enable) + return enable_sparse_index(repo); + + /* Don't downgrade repository format, just remove the extension. 
*/ + res = git_config_set_gently("extensions.sparseindex", NULL); + + prepare_repo_settings(repo); + repo->settings.sparse_index = 0; + return res; } int convert_to_sparse(struct index_state *istate) { + int test_env; if (istate->split_index || istate->sparse_index || !core_apply_sparse_checkout || !core_sparse_checkout_cone) return 0; @@ -129,14 +143,13 @@ int convert_to_sparse(struct index_state *istate) istate->repo = the_repository; /* - * The GIT_TEST_SPARSE_INDEX environment variable triggers the - * extensions.sparseIndex config variable to be on. + * If GIT_TEST_SPARSE_INDEX=1, then trigger extensions.sparseIndex + * to be fully enabled. If GIT_TEST_SPARSE_INDEX=0 (set explicitly), + * then purposefully disable the setting. */ - if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) { - int err = enable_sparse_index(istate->repo); - if (err < 0) - return err; - } + test_env = git_env_bool("GIT_TEST_SPARSE_INDEX", -1); + if (test_env >= 0) + set_sparse_index_config(istate->repo, test_env); /* * Only convert to sparse if extensions.sparseIndex is set. diff --git a/sparse-index.h b/sparse-index.h index 64380e121d80..39dcc859735e 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -5,4 +5,7 @@ struct index_state; void ensure_full_index(struct index_state *istate); int convert_to_sparse(struct index_state *istate); +struct repository; +int set_sparse_index_config(struct repository *repo, int enable); + #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index bfc9e28ef0e1..9c2bc4d25f66 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -4,6 +4,7 @@ test_description='compare full workdir to sparse workdir' GIT_TEST_CHECK_CACHE_TREE=0 GIT_TEST_SPLIT_INDEX=0 +GIT_TEST_SPARSE_INDEX= . 
./test-lib.sh @@ -98,25 +99,26 @@ init_repos () { # initialize sparse-checkout definitions git -C sparse-checkout sparse-checkout init --cone && git -C sparse-checkout sparse-checkout set deep && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep + git -C sparse-index sparse-checkout init --cone --sparse-index && + test_cmp_config -C sparse-index true extensions.sparseindex && + git -C sparse-index sparse-checkout set deep } run_on_sparse () { ( cd sparse-checkout && - GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err + "$@" >../sparse-checkout-out 2>../sparse-checkout-err ) && ( cd sparse-index && - GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err + "$@" >../sparse-index-out 2>../sparse-index-err ) } run_on_all () { ( cd full-checkout && - GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err + "$@" >../full-checkout-out 2>../full-checkout-err ) && run_on_sparse "$@" } @@ -146,7 +148,7 @@ test_expect_success 'sparse-index contents' ' || return 1 done && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + git -C sparse-index sparse-checkout set folder1 && test-tool -C sparse-index read-cache --table >cache && for dir in deep folder2 x @@ -156,7 +158,7 @@ test_expect_success 'sparse-index contents' ' || return 1 done && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + git -C sparse-index sparse-checkout set deep/deeper1 && test-tool -C sparse-index read-cache --table >cache && for dir in deep/deeper2 folder1 folder2 x @@ -394,19 +396,15 @@ test_expect_success 'submodule handling' ' test_expect_success 'sparse-index is expanded and converted back' ' init_repos && - ( - GIT_TEST_SPARSE_INDEX=1 && - export GIT_TEST_SPARSE_INDEX && - GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index -c core.fsmonitor="" 
reset --hard && - test_region index convert_to_sparse trace2.txt && - test_region index ensure_full_index trace2.txt && - - rm trace2.txt && - GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index -c core.fsmonitor="" status -uno && - test_region index ensure_full_index trace2.txt - ) + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" reset --hard && + test_region index convert_to_sparse trace2.txt && + test_region index ensure_full_index trace2.txt && + + rm trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" status -uno && + test_region index ensure_full_index trace2.txt ' test_done
From patchwork Tue Feb 23 20:14:26 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100867
Date: Tue, 23 Feb 2021 20:14:26 +0000
Subject: [PATCH 17/20] sparse-checkout: disable sparse-index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee
From: Derrick Stolee

We use 'git sparse-checkout init --cone --sparse-index' to enable the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
written on the first run, only on a second one. This was caught by the
call to 'test-tool read-cache --table'. Fixing this requires adjusting
some assignments to core_apply_sparse_checkout and pl.use_cone_patterns
in the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c | 10 +++++++++-
 t/t1091-sparse-checkout-builtin.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index ca63e2c64e95..585343fa1972 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -280,6 +280,9 @@ static int set_config(enum sparse_checkout_mode mode) "core.sparseCheckoutCone", mode == MODE_CONE_PATTERNS ?
"true" : NULL); + if (mode == MODE_NO_PATTERNS) + set_sparse_index_config(the_repository, 0); + return 0; } @@ -341,10 +344,11 @@ static int sparse_checkout_init(int argc, const char **argv) the_repository->index->updated_workdir = 1; } + core_apply_sparse_checkout = 1; + /* If we already have a sparse-checkout file, use it. */ if (res >= 0) { free(sparse_filename); - core_apply_sparse_checkout = 1; return update_working_directory(NULL); } @@ -366,6 +370,7 @@ static int sparse_checkout_init(int argc, const char **argv) add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0); strbuf_addstr(&pattern, "!/*/"); add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0); + pl.use_cone_patterns = init_opts.cone_mode; return write_patterns_and_update(&pl); } @@ -632,6 +637,9 @@ static int sparse_checkout_disable(int argc, const char **argv) strbuf_addstr(&match_all, "/*"); add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0); + prepare_repo_settings(the_repository); + the_repository->settings.sparse_index = 0; + if (update_working_directory(&pl)) die(_("error while refreshing working directory")); diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh index fc64e9ed99f4..ff1ad570a255 100755 --- a/t/t1091-sparse-checkout-builtin.sh +++ b/t/t1091-sparse-checkout-builtin.sh @@ -205,6 +205,19 @@ test_expect_success 'sparse-checkout disable' ' check_files repo a deep folder1 folder2 ' +test_expect_success 'sparse-index enabled and disabled' ' + git -C repo sparse-checkout init --cone --sparse-index && + test_cmp_config -C repo true extensions.sparseIndex && + test-tool -C repo read-cache --table >cache && + grep " tree " cache && + + git -C repo sparse-checkout disable && + test-tool -C repo read-cache --table >cache && + ! grep " tree " cache && + git -C repo config --list >config && + ! 
grep extensions.sparseindex config +' + test_expect_success 'cone mode: init and set' ' git -C repo sparse-checkout init --cone && git -C repo config --list >config &&
From patchwork Tue Feb 23 20:14:27 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100869
Date: Tue, 23 Feb 2021 20:14:27 +0000
Subject: [PATCH 18/20] cache-tree: integrate with sparse directory entries
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee
From: Derrick Stolee

The cache-tree
extension was previously disabled with sparse indexes. However, the cache-tree is an important performance feature for commands like 'git status' and 'git add'. Integrate it with sparse directory entries. When writing a sparse index, completely clear and recalculate the cache tree. By starting from scratch, the only integration necessary is to check if we hit a sparse directory entry and create a leaf of the cache-tree that has an entry_count of one and no subtrees. Signed-off-by: Derrick Stolee --- cache-tree.c | 18 ++++++++++++++++++ sparse-index.c | 10 +++++++++- 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 5f07a39e501e..950a9615db8f 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -256,6 +256,24 @@ static int update_one(struct cache_tree *it, *skip_count = 0; + /* + * If the first entry of this region is a sparse directory + * entry corresponding exactly to 'base', then this cache_tree + * struct is a "leaf" in the data structure, pointing to the + * tree OID specified in the entry. 
+ */ + if (entries > 0) { + const struct cache_entry *ce = cache[0]; + + if (S_ISSPARSEDIR(ce->ce_mode) && + ce->ce_namelen == baselen && + !strncmp(ce->name, base, baselen)) { + it->entry_count = 1; + oidcpy(&it->oid, &ce->oid); + return 1; + } + } + if (0 <= it->entry_count && has_object_file(&it->oid)) return it->entry_count; diff --git a/sparse-index.c b/sparse-index.c index a991c5331e9e..e541f251b37a 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -180,7 +180,11 @@ int convert_to_sparse(struct index_state *istate) istate->cache_nr = convert_to_sparse_rec(istate, 0, 0, istate->cache_nr, "", 0, istate->cache_tree); - istate->drop_cache_tree = 1; + + /* Clear and recompute the cache-tree */ + cache_tree_free(&istate->cache_tree); + cache_tree_update(istate, 0); + istate->sparse_index = 1; trace2_region_leave("index", "convert_to_sparse", istate->repo); return 0; @@ -278,5 +282,9 @@ void ensure_full_index(struct index_state *istate) free(full); + /* Clear and recompute the cache-tree */ + cache_tree_free(&istate->cache_tree); + cache_tree_update(istate, 0); + trace2_region_leave("index", "ensure_full_index", istate->repo); }
From patchwork Tue Feb 23 20:14:28 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100873
Message-Id: <2be4981fe6987db02b7694b099b1026f2a6defba.1614111270.git.gitgitgadget@gmail.com>
Date: Tue, 23 Feb 2021 20:14:28 +0000
Subject: [PATCH 19/20] sparse-index: loose integration with cache_tree_verify()
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee
From: Derrick Stolee

The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of sparse directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further,
p2000-sparse-operations.sh uses the test suite and hence this is enabled
for all tests. We need to integrate with it before we run our
performance tests with a sparse-index.
Signed-off-by: Derrick Stolee
---
 cache-tree.c | 19 +++++++++++++++++++
 t/t1092-sparse-checkout-compatibility.sh | 1 -
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c index 950a9615db8f..11bf1fcae6e1 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -808,6 +808,19 @@ int cache_tree_matches_traversal(struct cache_tree *root, return 0; } +static void verify_one_sparse(struct repository *r, + struct index_state *istate, + struct cache_tree *it, + struct strbuf *path, + int pos) +{ + struct cache_entry *ce = istate->cache[pos]; + + if (!S_ISSPARSEDIR(ce->ce_mode)) + BUG("directory '%s' is present in index, but not sparse", + path->buf); +} + static void verify_one(struct repository *r, struct index_state *istate, struct cache_tree *it, @@ -830,6 +843,12 @@ static void verify_one(struct repository *r, if (path->len) { pos = index_name_pos(istate, path->buf, path->len); + + if (pos >= 0) { + verify_one_sparse(r, istate, it, path, pos); + return; + } + pos = -pos - 1; } else { pos = 0; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 9c2bc4d25f66..c2624176c2e0 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -2,7 +2,6 @@ test_description='compare full workdir to sparse workdir' -GIT_TEST_CHECK_CACHE_TREE=0 GIT_TEST_SPLIT_INDEX=0 GIT_TEST_SPARSE_INDEX=
From patchwork Tue Feb 23 20:14:29 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12100871
Date: Tue, 23 Feb 2021 20:14:29 +0000
Subject: [PATCH 20/20] p2000: add sparse-index repos
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Derrick Stolee
From: Derrick Stolee

p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point, the sparse-index is pure overhead in terms of CPU time,
and this is measurable in these tests:

Test
---------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index
formats when the sparse-index is enabled. This is primarily because the
relative file sizes are the same, and the command time is mostly taken
up by parsing tree objects to expand the sparse index into a full one.

With the current file layout, the index file sizes are given by this
table:

       | full index  | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.
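
To put the numbers above in relative terms, here is a minimal sketch
that divides pairs of figures taken from the two tables. The ratio()
helper is hypothetical (it is not part of p2000-sparse-operations.sh),
and it only assumes a POSIX shell with awk available.

```shell
#!/bin/sh
# Hypothetical helper, not part of the test suite:
# print the quotient a / b rounded to two decimals.
ratio () {
	LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Elapsed time, sparse-index-v3 relative to full-index-v3
ratio 1.40 0.59   # git status
ratio 3.01 2.47   # git commit -a -m A

# Index file size, full index relative to sparse index
ratio 108 1.6     # v3
ratio 80 1.2      # v4
```

Read this way, 'git status' takes a bit more than twice as long with
the sparse index, while the index file itself is well over an order of
magnitude smaller in both v3 and v4.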
Signed-off-by: Derrick Stolee --- t/perf/p2000-sparse-operations.sh | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh index 52597683376e..f9c7f3c6e27e 100755 --- a/t/perf/p2000-sparse-operations.sh +++ b/t/perf/p2000-sparse-operations.sh @@ -62,12 +62,29 @@ test_expect_success 'setup repo and indexes' ' git sparse-checkout set $SPARSE_CONE && git config index.version 4 && git update-index --index-version=4 + ) && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 && + ( + cd sparse-index-v3 && + git sparse-checkout init --cone --sparse-index && + git sparse-checkout set $SPARSE_CONE && + git config index.version 3 && + git update-index --index-version=3 + ) && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 && + ( + cd sparse-index-v4 && + git sparse-checkout init --cone --sparse-index && + git sparse-checkout set $SPARSE_CONE && + git config index.version 4 && + git update-index --index-version=4 ) ' test_perf_on_all () { command="$@" - for repo in full-index-v3 full-index-v4 + for repo in full-index-v3 full-index-v4 \ + sparse-index-v3 sparse-index-v4 do test_perf "$command ($repo)" " (