From patchwork Tue Mar 16 16:42:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142935
Message-Id: <62ac13945bec13270e0898126756c3f947ae264b.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:42:44 +0000
Subject: [PATCH v3 01/20] sparse-index: design doc and format update
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
 jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
 Ævar Arnfjörð Bjarmason
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

This begins a long effort to update the index format to allow sparse
directory entries. This should result in a significant improvement to
Git commands when HEAD contains millions of files, but the user has
selected many fewer files to keep in their sparse-checkout definition.

Currently, the index format is only updated in the presence of
extensions.sparseIndex instead of increasing a file format version
number. This is temporary, and index v5 is part of the plan for future
work in this area.

The design document details many of the reasons for embarking on this
work, and also the plan for completing it safely.

Signed-off-by: Derrick Stolee
---
 Documentation/technical/index-format.txt |   7 +
 Documentation/technical/sparse-index.txt | 173 +++++++++++++++++++++++
 2 files changed, 180 insertions(+)
 create mode 100644 Documentation/technical/sparse-index.txt

diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index d363a71c37ec..cc548eaa0e97 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -44,6 +44,13 @@ Git index format
   localization, no special casing of directory separator '/'). Entries
   with the same name are sorted by their stage field.
 
+  An index entry typically represents a file. However, if sparse-checkout
+  is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
+  `extensions.sparseIndex` extension is enabled, then the index may
+  contain entries for directories outside of the sparse-checkout definition.
+  These entries have mode `0040000`, include the `SKIP_WORKTREE` bit, and
+  the path ends in a directory separator.
+
   32-bit ctime seconds, the last time a file's metadata changed
     this is stat(2) data
 
diff --git a/Documentation/technical/sparse-index.txt b/Documentation/technical/sparse-index.txt
new file mode 100644
index 000000000000..aa116406a016
--- /dev/null
+++ b/Documentation/technical/sparse-index.txt
@@ -0,0 +1,173 @@
+Git Sparse-Index Design Document
+================================
+
+The sparse-checkout feature allows users to focus a working directory on
+a subset of the files at HEAD. The cone mode patterns, enabled by
+`core.sparseCheckoutCone`, allow for very fast pattern matching to
+discover which files at HEAD belong in the sparse-checkout cone.
+
+Three important scale dimensions for a Git worktree are:
+
+* `HEAD`: How many files are present at `HEAD`?
+
+* Populated: How many files are within the sparse-checkout cone?
+
+* Modified: How many files has the user modified in the working directory?
+
+We will use big-O notation -- O(X) -- to denote how expensive certain
+operations are in terms of these dimensions.
+
+These dimensions are ordered by their magnitude: users (typically) modify
+fewer files than are populated, and we can only populate files at `HEAD`.
+These dimensions are also ordered by how expensive they are per item: it
+is more expensive to detect a modified file than it is to write one that we
+know must be populated; changing `HEAD` only really requires updating the
+index.
+
+Problems occur if there is an extreme imbalance in these dimensions. For
+example, if `HEAD` contains millions of paths but the populated set has
+only tens of thousands, then commands like `git status` and `git add` can
+be dominated by operations that require O(`HEAD`) operations instead of
+O(Populated). Primarily, the cost is in parsing and rewriting the index,
+which is filled primarily with files at `HEAD` that are marked with the
+`SKIP_WORKTREE` bit.
+
+The sparse-index intends to take these commands that read and modify the
+index from O(`HEAD`) to O(Populated). To do this, we need to modify the
+index format in a significant way: add "sparse directory" entries.
+
+With cone mode patterns, it is possible to detect when an entire
+directory will have its contents outside of the sparse-checkout definition.
+Instead of listing all of the files it contains as individual entries, a
+sparse-index contains an entry with the directory name, referencing the
+object ID of the tree at `HEAD` and marked with the `SKIP_WORKTREE` bit.
+If we need to discover the details for paths within that directory, we
+can parse trees to find that list.
+
+At time of writing, sparse-directory entries violate expectations about the
+index format and its in-memory data structure. There are many consumers in
+the codebase that expect to iterate through all of the index entries and
+see only files. In addition, they expect to see all files at `HEAD`. One
+way to handle this is to parse trees to replace a sparse-directory entry
+with all of the files within that tree as the index is loaded. However,
+parsing trees is slower than parsing the index format, so that is a slower
+operation than if we left the index alone.
+
+The implementation plan below follows four phases to slowly integrate with
+the sparse-index. The intention is to incrementally update Git commands to
+interact safely with the sparse-index without significant slowdowns. This
+may not always be possible, but the hope is that the primary commands that
+users need in their daily work are dramatically improved.
+
+Phase I: Format and initial speedups
+------------------------------------
+
+During this phase, Git learns to enable the sparse-index and safely parse
+one. Protections are put in place so that every consumer of the in-memory
+data structure can operate with its current assumption of every file at
+`HEAD`.
+
+At first, every index parse will expand the sparse-directory entries into
+the full list of paths at `HEAD`. This will be slower in all cases. The
+only noticeable change in behavior will be that the serialized index file
+contains sparse-directory entries.
+
+To start, we use a new repository extension, `extensions.sparseIndex`, to
+allow inserting sparse-directory entries into indexes with file format
+versions 2, 3, and 4. This prevents Git versions that do not understand
+the sparse-index from operating on one, but it also prevents other
+operations that do not use the index at all. A new format, index v5, will
+be introduced that includes sparse-directory entries by default. It might
+also introduce other features that have been considered for improving the
+index, as well.
+
+Next, consumers of the index will be guarded against operating on a
+sparse-index by inserting calls to `ensure_full_index()` or
+`expand_index_to_path()`. After these guards are in place, we can begin
+leaving sparse-directory entries in the in-memory index structure.
+
+Even after inserting these guards, we will keep expanding sparse-indexes
+for most Git commands using the `command_requires_full_index` repository
+setting. This setting will be on by default and disabled one builtin at a
+time until we have sufficient confidence that all of the index operations
+are properly guarded.
+
+To complete this phase, the commands `git status` and `git add` will be
+integrated with the sparse-index so that they operate with O(Populated)
+performance. They will be carefully tested for operations within and
+outside the sparse-checkout definition.
+
+Phase II: Careful integrations
+------------------------------
+
+This phase focuses on ensuring that all index extensions and APIs work
+well with a sparse-index. This requires significant increases to our test
+coverage, especially for operations that interact with the working
+directory outside of the sparse-checkout definition. Some of these
+behaviors may not be the desirable ones, such as some tests already
+marked for failure in `t1092-sparse-checkout-compatibility.sh`.
+
+The index extensions that may require special integrations are:
+
+* FS Monitor
+* Untracked cache
+
+While integrating with these features, we should look for patterns that
+might lead to better APIs for interacting with the index. Coalescing
+common usage patterns into an API call can reduce the number of places
+where sparse-directories need to be handled carefully.
+
+Phase III: Important command speedups
+-------------------------------------
+
+At this point, the patterns for testing and implementing sparse-directory
+logic should be relatively stable. This phase focuses on updating some of
+the most common builtins that use the index to operate as O(Populated).
+Here is a potential list of commands that could be valuable to integrate
+at this point:
+
+* `git commit`
+* `git checkout`
+* `git merge`
+* `git rebase`
+
+Hopefully, commands such as `git merge` and `git rebase` can benefit
+instead from merge algorithms that do not use the index as a data
+structure, such as the merge-ORT strategy. As these topics mature, we
+may enable the ORT strategy by default for repositories using the
+sparse-index feature.
+
+Along with `git status` and `git add`, these commands cover the majority
+of users' interactions with the working directory. In addition, we can
+integrate with these commands:
+
+* `git grep`
+* `git rm`
+
+These have been proposed as some whose behavior could change when in a
+repo with a sparse-checkout definition. It would be good to include this
+behavior automatically when using a sparse-index. Some clarity is needed
+to make the behavior switch clear to the user.
+
+This phase is the first where parallel work might be possible without too
+many conflicts between topics.
+
+Phase IV: The long tail
+-----------------------
+
+This last phase is less a "phase" and more "the new normal" after all of
+the previous work.
+
+To start, the `command_requires_full_index` option could be removed in
+favor of expanding only when hitting an API guard.
+
+There are many Git commands that could use special attention to operate as
+O(Populated), while some might be so rare that it is acceptable to leave
+them with additional overhead when a sparse-index is present.
+
+Here are some commands that might be useful to update:
+
+* `git sparse-checkout set`
+* `git am`
+* `git clean`
+* `git stash`

From patchwork Tue Mar 16 16:42:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142939
Date: Tue, 16 Mar 2021 16:42:45 +0000
Subject: [PATCH v3 02/20] t/perf: add performance test for sparse operations
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
 jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
 Ævar Arnfjörð Bjarmason
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

Create a test script that takes the default performance test (the Git
codebase) and multiplies it by 256 using four layers of duplicated
trees of width four. This results in nearly one million blob entries in
the index. Then, we can clone this repository with sparse-checkout
patterns that demonstrate four copies of the initial repository. Each
clone will use a different index format or mode so performance can be
tested across the different options.

Note that the initial repo is stripped of submodules before doing the
copies. This preserves the expected data shape of the sparse index,
because directories containing submodules are not collapsed to a sparse
directory entry.

Run a few Git commands on these clones, especially those that use the
index (status, add, commit).

Here are the results on my Linux machine:

Test
--------------------------------------------------------------
2000.2: git status (full-index-v3)             0.37(0.30+0.09)
2000.3: git status (full-index-v4)             0.39(0.32+0.10)
2000.4: git add -A (full-index-v3)             1.42(1.06+0.20)
2000.5: git add -A (full-index-v4)             1.26(0.98+0.16)
2000.6: git add . (full-index-v3)              1.40(1.04+0.18)
2000.7: git add . (full-index-v4)              1.26(0.98+0.17)
2000.8: git commit -a -m A (full-index-v3)     1.42(1.11+0.16)
2000.9: git commit -a -m A (full-index-v4)     1.33(1.08+0.16)

It is perhaps noteworthy that there is an improvement when using index
version 4. This is because the v3 index uses 108 MiB while the v4 index
uses 80 MiB. Since the repeated portions of the directories are very
short (f3/f1/f2, for example) this ratio is less pronounced than in
similarly-sized real repositories.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 85 +++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100755 t/perf/p2000-sparse-operations.sh

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
new file mode 100755
index 000000000000..2fbc81b22119
--- /dev/null
+++ b/t/perf/p2000-sparse-operations.sh
@@ -0,0 +1,85 @@
+#!/bin/sh
+
+test_description="test performance of Git operations using the index"
+
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+SPARSE_CONE=f2/f4/f1
+
+test_expect_success 'setup repo and indexes' '
+	git reset --hard HEAD &&
+	# Remove submodules from the example repo, because our
+	# duplication of the entire repo creates an unlikely data shape.
+	git config --file .gitmodules --get-regexp "submodule.*.path" >modules &&
+	git rm -f .gitmodules &&
+	for module in $(awk "{print \$2}" modules)
+	do
+		git rm $module || return 1
+	done &&
+	git commit -m "remove submodules" &&
+
+	echo bogus >a &&
+	cp a b &&
+	git add a b &&
+	git commit -m "level 0" &&
+	BLOB=$(git rev-parse HEAD:a) &&
+	OLD_COMMIT=$(git rev-parse HEAD) &&
+	OLD_TREE=$(git rev-parse HEAD^{tree}) &&
+
+	for i in $(test_seq 1 4)
+	do
+		cat >in <<-EOF &&
+			100755 blob $BLOB	a
+			040000 tree $OLD_TREE	f1
+			040000 tree $OLD_TREE	f2
+			040000 tree $OLD_TREE	f3
+			040000 tree $OLD_TREE	f4
+		EOF
+		NEW_TREE=$(git mktree >$SPARSE_CONE/a &&
+				$command
+			)
+		"
+	done
+}
+
+test_perf_on_all git status
+test_perf_on_all git add -A
+test_perf_on_all git add .
+test_perf_on_all git commit -a -m A
+
+test_done

From patchwork Tue Mar 16 16:42:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142937
Date: Tue, 16 Mar 2021 16:42:46 +0000
Subject: [PATCH v3 03/20] t1092: clean up script quoting
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
 jrnieder@gmail.com, Martin Ågren, Derrick Stolee, SZEDER Gábor,
 Ævar Arnfjörð Bjarmason
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

This test was
introduced in 19a0acc83e4 (t1092: test interesting sparse-checkout scenarios, 2021-01-23), but these issues with quoting were not noticed until starting this follow-up series. The old mechanism would drop quoting such as in test_all_match git commit -m "touch README.md" The above happened to work because README.md is a file in the repository, so 'git commit -m touch REAMDE.md' would succeed by accident. Other cases included quoting for no good reason, so clean that up now. Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 8cd3e5a8d227..3725d3997e70 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -96,20 +96,20 @@ init_repos () { run_on_sparse () { ( cd sparse-checkout && - $* >../sparse-checkout-out 2>../sparse-checkout-err + "$@" >../sparse-checkout-out 2>../sparse-checkout-err ) } run_on_all () { ( cd full-checkout && - $* >../full-checkout-out 2>../full-checkout-err + "$@" >../full-checkout-out 2>../full-checkout-err ) && - run_on_sparse $* + run_on_sparse "$@" } test_all_match () { - run_on_all $* && + run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && test_cmp full-checkout-err sparse-checkout-err } @@ -119,7 +119,7 @@ test_expect_success 'status with options' ' test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && - run_on_all "touch README.md" && + run_on_all touch README.md && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -135,7 +135,7 @@ test_expect_success 'add, commit, checkout' ' write_script edit-contents <<-\EOF && echo text >>$1 EOF - run_on_all "../edit-contents README.md" && + run_on_all 
../edit-contents README.md && test_all_match git add README.md && test_all_match git status --porcelain=v2 && @@ -144,7 +144,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents README.md" && + run_on_all ../edit-contents README.md && test_all_match git add -A && test_all_match git status --porcelain=v2 && @@ -153,7 +153,7 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout HEAD~1 && test_all_match git checkout - && - run_on_all "../edit-contents deep/newfile" && + run_on_all ../edit-contents deep/newfile && test_all_match git status --porcelain=v2 -uno && test_all_match git status --porcelain=v2 && @@ -186,7 +186,7 @@ test_expect_success 'diff --staged' ' write_script edit-contents <<-\EOF && echo text >>README.md EOF - run_on_all "../edit-contents" && + run_on_all ../edit-contents && test_all_match git diff && test_all_match git diff --staged && @@ -280,7 +280,7 @@ test_expect_success 'clean' ' echo bogus >>.gitignore && run_on_all cp ../.gitignore . 
&&
 	test_all_match git add .gitignore &&
-	test_all_match git commit -m ignore-bogus-files &&
+	test_all_match git commit -m "ignore bogus files" &&
 	run_on_sparse mkdir folder1 &&
 	run_on_all touch folder1/bogus &&

From patchwork Tue Mar 16 16:42:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142945
Message-Id: <4472118cf903d8dbcfdefe68af9166711d4ad6c6.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:42:47 +0000
Subject: [PATCH v3 04/20] sparse-index: add guard to ensure full index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee,
    SZEDER Gábor, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.

This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.

Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.

If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.
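
The guard described above can be pictured with a small standalone
sketch. Everything prefixed with toy_ is a hypothetical stand-in
invented for illustration, not Git's real structs or functions; the
sketch only models "expand on read while the setting is on".

```c
/*
 * Toy stand-ins (hypothetical, not Git's real structs) for the
 * setting and the index state named in the message above.
 */
struct toy_settings {
	unsigned command_requires_full_index:1;
};

struct toy_index {
	unsigned sparse_index:1;	/* set while sparse directories exist */
	int expand_calls;		/* counts real expansions, for illustration */
};

/* Expand only when needed; an already-full index is left untouched. */
static void toy_ensure_full_index(struct toy_index *istate)
{
	if (!istate || !istate->sparse_index)
		return;
	istate->sparse_index = 0;
	istate->expand_calls++;
}

/* Reading applies the guard right after the on-disk data is parsed. */
static void toy_read_index(struct toy_index *istate,
			   const struct toy_settings *settings,
			   int on_disk_is_sparse)
{
	istate->sparse_index = on_disk_is_sparse ? 1 : 0;
	if (settings->command_requires_full_index)
		toy_ensure_full_index(istate);
}
```

With the setting on, a sparse read comes back expanded. Once a command
clears the setting, the same read leaves the index sparse, which is the
state later integrations are expected to handle themselves.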

Signed-off-by: Derrick Stolee
---
 Makefile        |  1 +
 repo-settings.c |  8 ++++++++
 repository.c    | 11 ++++++++++-
 repository.h    |  2 ++
 sparse-index.c  |  8 ++++++++
 sparse-index.h  |  7 +++++++
 6 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 sparse-index.c
 create mode 100644 sparse-index.h

diff --git a/Makefile b/Makefile
index dfb0f1000fa3..89b1d5374107 100644
--- a/Makefile
+++ b/Makefile
@@ -985,6 +985,7 @@ LIB_OBJS += setup.o
 LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
+LIB_OBJS += sparse-index.o
 LIB_OBJS += split-index.o
 LIB_OBJS += stable-qsort.o
 LIB_OBJS += strbuf.o
diff --git a/repo-settings.c b/repo-settings.c
index f7fff0f5ab83..d63569e4041e 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -77,4 +77,12 @@ void prepare_repo_settings(struct repository *r)
 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP);
 
 	UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT);
+
+	/*
+	 * This setting guards all index reads to require a full index
+	 * over a sparse index. After suitable guards are placed in the
+	 * codebase around uses of the index, this setting will be
+	 * removed.
+	 */
+	r->settings.command_requires_full_index = 1;
 }
diff --git a/repository.c b/repository.c
index c98298acd017..a8acae002f71 100644
--- a/repository.c
+++ b/repository.c
@@ -10,6 +10,7 @@
 #include "object.h"
 #include "lockfile.h"
 #include "submodule-config.h"
+#include "sparse-index.h"
 
 /* The main repository */
 static struct repository the_repo;
@@ -261,6 +262,8 @@ void repo_clear(struct repository *repo)
 
 int repo_read_index(struct repository *repo)
 {
+	int res;
+
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
 
@@ -270,7 +273,13 @@ int repo_read_index(struct repository *repo)
 	else if (repo->index->repo != repo)
 		BUG("repo's index should point back at itself");
 
-	return read_index_from(repo->index, repo->index_file, repo->gitdir);
+	res = read_index_from(repo->index, repo->index_file, repo->gitdir);
+
+	prepare_repo_settings(repo);
+	if (repo->settings.command_requires_full_index)
+		ensure_full_index(repo->index);
+
+	return res;
 }
 
 int repo_hold_locked_index(struct repository *repo,
diff --git a/repository.h b/repository.h
index b385ca3c94b6..e06a23015697 100644
--- a/repository.h
+++ b/repository.h
@@ -41,6 +41,8 @@ struct repo_settings {
 	enum fetch_negotiation_setting fetch_negotiation_algorithm;
 
 	int core_multi_pack_index;
+
+	unsigned command_requires_full_index:1;
 };
 
 struct repository {
diff --git a/sparse-index.c b/sparse-index.c
new file mode 100644
index 000000000000..82183ead563b
--- /dev/null
+++ b/sparse-index.c
@@ -0,0 +1,8 @@
+#include "cache.h"
+#include "repository.h"
+#include "sparse-index.h"
+
+void ensure_full_index(struct index_state *istate)
+{
+	/* intentionally left blank */
+}
diff --git a/sparse-index.h b/sparse-index.h
new file mode 100644
index 000000000000..09a20d036c46
--- /dev/null
+++ b/sparse-index.h
@@ -0,0 +1,7 @@
+#ifndef SPARSE_INDEX_H__
+#define SPARSE_INDEX_H__
+
+struct index_state;
+void ensure_full_index(struct index_state *istate);
+
+#endif

From patchwork Tue Mar 16 16:42:48 2021
Content-Type: text/plain;
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142951
Message-Id: <99292cdbaae488101d1c247ab94dc4b3b04d0311.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:42:48 +0000
Subject: [PATCH v3 05/20] sparse-index: implement ensure_full_index()
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee,
    SZEDER Gábor, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

We will mark an in-memory index_state as having sparse directory
entries with the sparse_index bit. These currently cannot exist, but we
will add a mechanism for collapsing a full index to a sparse one in a
later change. That will happen at write time, so we must first allow
parsing the format before writing it.

Commands or methods that require a full index in order to operate can
call ensure_full_index() to expand that index in-memory. This requires
parsing trees using that index's repository.

Sparse directory entries have a specific 'ce_mode' value. The macro
S_ISSPARSEDIR(ce->ce_mode) can check if a cache_entry 'ce' has this
type. This ce_mode is not possible with the existing index formats, so
the macro does not also verify the other properties of a
sparse-directory entry. The full set of properties is:

 1. ce->ce_mode == 0040000
 2. ce->ce_flags & CE_SKIP_WORKTREE is true
 3. ce->name[ce->ce_namelen - 1] == '/' (ends in dir separator)
 4. ce->oid references a tree object.

These are only partially enforced in ensure_full_index(). Any deviation
will cause a warning at minimum or a failure in the worst case.

Signed-off-by: Derrick Stolee
---
 cache.h        | 13 ++++++-
 read-cache.c   |  9 +++++
 sparse-index.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index c2f8a8eadf67..abb00a068e5d 100644
--- a/cache.h
+++ b/cache.h
@@ -204,6 +204,8 @@ struct cache_entry {
 #error "CE_EXTENDED_FLAGS out of range"
 #endif
 
+#define S_ISSPARSEDIR(m) ((m) == S_IFDIR)
+
 /* Forward structure decls */
 struct pathspec;
 struct child_process;
@@ -319,7 +321,14 @@ struct index_state {
 		 drop_cache_tree : 1,
 		 updated_workdir : 1,
 		 updated_skipworktree : 1,
-		 fsmonitor_has_run_once : 1;
+		 fsmonitor_has_run_once : 1,
+
+		 /*
+		  * sparse_index == 1 when sparse-directory
+		  * entries exist. Requires sparse-checkout
+		  * in cone mode.
+		  */
+		 sparse_index : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	struct object_id oid;
@@ -722,6 +731,8 @@ int read_index_from(struct index_state *, const char *path,
 		    const char *gitdir);
 int is_index_unborn(struct index_state *);
 
+void ensure_full_index(struct index_state *istate);
+
 /* For use with `write_locked_index()`. */
 #define COMMIT_LOCK		(1 << 0)
 #define SKIP_IF_UNCHANGED	(1 << 1)
diff --git a/read-cache.c b/read-cache.c
index 1e9a50c6c734..dd3980c12b53 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -101,6 +101,9 @@ static const char *alternate_index_output;
 
 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
+	if (S_ISSPARSEDIR(ce->ce_mode))
+		istate->sparse_index = 1;
+
 	istate->cache[nr] = ce;
 	add_name_hash(istate, ce);
 }
@@ -2273,6 +2276,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	trace2_data_intmax("index", the_repository, "read/cache_nr",
 			   istate->cache_nr);
 
+	if (!istate->repo)
+		istate->repo = the_repository;
+	prepare_repo_settings(istate->repo);
+	if (istate->repo->settings.command_requires_full_index)
+		ensure_full_index(istate);
+
 	return istate->cache_nr;
 
 unmap:
diff --git a/sparse-index.c b/sparse-index.c
index 82183ead563b..7095378a1b28 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -1,8 +1,104 @@
 #include "cache.h"
 #include "repository.h"
 #include "sparse-index.h"
+#include "tree.h"
+#include "pathspec.h"
+#include "trace2.h"
+
+static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
+{
+	ALLOC_GROW(istate->cache, nr + 1, istate->cache_alloc);
+
+	istate->cache[nr] = ce;
+	add_name_hash(istate, ce);
+}
+
+static int add_path_to_index(const struct object_id *oid,
+				struct strbuf *base, const char *path,
+				unsigned int mode, void *context)
+{
+	struct index_state *istate = (struct index_state *)context;
+	struct cache_entry *ce;
+	size_t len = base->len;
+
+	if (S_ISDIR(mode))
+		return READ_TREE_RECURSIVE;
+
+	strbuf_addstr(base, path);
+
+	ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
+	ce->ce_flags |= CE_SKIP_WORKTREE;
+	set_index_entry(istate, istate->cache_nr++, ce);
+
+	strbuf_setlen(base, len);
+	return 0;
+}
 
 void ensure_full_index(struct index_state *istate)
 {
-	/* intentionally left blank */
+	int i;
+	struct index_state *full;
+	struct strbuf base = STRBUF_INIT;
+
+	if (!istate || !istate->sparse_index)
+		return;
+
+	if (!istate->repo)
+		istate->repo = the_repository;
+
+	trace2_region_enter("index", "ensure_full_index", istate->repo);
+
+	/* initialize basics of new index */
+	full = xcalloc(1, sizeof(struct index_state));
+	memcpy(full, istate, sizeof(struct index_state));
+
+	/* then change the necessary things */
+	full->sparse_index = 0;
+	full->cache_alloc = (3 * istate->cache_alloc) / 2;
+	full->cache_nr = 0;
+	ALLOC_ARRAY(full->cache, full->cache_alloc);
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+		struct tree *tree;
+		struct pathspec ps;
+
+		if (!S_ISSPARSEDIR(ce->ce_mode)) {
+			set_index_entry(full, full->cache_nr++, ce);
+			continue;
+		}
+		if (!(ce->ce_flags & CE_SKIP_WORKTREE))
+			warning(_("index entry is a directory, but not sparse (%08x)"),
+				ce->ce_flags);
+
+		/* recursively walk into ce->name */
+		tree = lookup_tree(istate->repo, &ce->oid);
+
+		memset(&ps, 0, sizeof(ps));
+		ps.recursive = 1;
+		ps.has_wildcard = 1;
+		ps.max_depth = -1;
+
+		strbuf_setlen(&base, 0);
+		strbuf_add(&base, ce->name, strlen(ce->name));
+
+		read_tree_at(istate->repo, tree, &base, &ps,
+			     add_path_to_index, full);
+
+		/* free directory entries. full entries are re-used */
+		discard_cache_entry(ce);
+	}
+
+	/* Copy back into original index. */
+	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
+	istate->sparse_index = 0;
+	free(istate->cache);
+	istate->cache = full->cache;
+	istate->cache_nr = full->cache_nr;
+	istate->cache_alloc = full->cache_alloc;
+
+	strbuf_release(&base);
+	free(full);
+
+	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }

From patchwork Tue Mar 16 16:42:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142943
Date: Tue, 16 Mar 2021 16:42:49 +0000
Subject: [PATCH v3 06/20] t1092: compare sparse-checkout to sparse-index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee,
    SZEDER Gábor, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

Add a new 'sparse-index' repo alongside the 'full-checkout' and
'sparse-checkout' repos in t1092-sparse-checkout-compatibility.sh. Also
add run_on_sparse and test_sparse_match helpers. These helpers will be
used when the sparse index is implemented.

Add the GIT_TEST_SPARSE_INDEX environment variable to enable the
sparse-index by default. This can be enabled across all tests, but that
will only affect cases where the sparse-checkout feature is enabled.

Signed-off-by: Derrick Stolee
---
 t/README                                 |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++++++++++----
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/t/README b/t/README
index 593d4a4e270c..b98bc563aab5 100644
--- a/t/README
+++ b/t/README
@@ -439,6 +439,9 @@ and "sha256".
 GIT_TEST_WRITE_REV_INDEX=<boolean>, when true enables the
 'pack.writeReverseIndex' setting.
 
+GIT_TEST_SPARSE_INDEX=<boolean>, when true enables index writes to use the
+sparse-index format by default.
+
 Naming Tests
 ------------
 
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 3725d3997e70..de5d8461c993 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -7,6 +7,7 @@ test_description='compare full workdir to sparse workdir'
 test_expect_success 'setup' '
 	git init initial-repo &&
 	(
+		GIT_TEST_SPARSE_INDEX=0 &&
 		cd initial-repo &&
 		echo a >a &&
 		echo "after deep" >e &&
@@ -87,23 +88,32 @@ init_repos () {
 
 	cp -r initial-repo sparse-checkout &&
 	git -C sparse-checkout reset --hard &&
-	git -C sparse-checkout sparse-checkout init --cone &&
+
+	cp -r initial-repo sparse-index &&
+	git -C sparse-index reset --hard &&
 
 	# initialize sparse-checkout definitions
-	git -C sparse-checkout sparse-checkout set deep
+	git -C sparse-checkout sparse-checkout init --cone &&
+	git -C sparse-checkout sparse-checkout set deep &&
+	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone &&
+	GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep
 }
 
 run_on_sparse () {
 	(
 		cd sparse-checkout &&
-		"$@" >../sparse-checkout-out 2>../sparse-checkout-err
+		GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err
+	) &&
+	(
+		cd sparse-index &&
+		GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err
 	)
 }
 
 run_on_all () {
 	(
 		cd full-checkout &&
-		"$@" >../full-checkout-out 2>../full-checkout-err
+		GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err
 	) &&
 	run_on_sparse "$@"
 }
@@ -114,6 +124,12 @@ test_all_match () {
 	test_cmp full-checkout-err sparse-checkout-err
 }
 
+test_sparse_match () {
+	run_on_sparse "$@" &&
+	test_cmp sparse-checkout-out sparse-index-out &&
+	test_cmp sparse-checkout-err sparse-index-err
+}
+
 test_expect_success 'status with options' '
 	init_repos &&
 	test_all_match git status --porcelain=v2 &&

From patchwork Tue Mar 16 16:42:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142949
Date: Tue, 16 Mar 2021 16:42:50 +0000
Subject: [PATCH v3 07/20] test-read-cache: print cache entries with --table
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee,
    SZEDER Gábor, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

This table is helpful for discovering data in the index to ensure it is
being written correctly, especially as we build and test the
sparse-index. This table uses an output format similar to 'git
ls-tree', but should not be compared to that directly. The biggest
reason is that 'git ls-tree' includes a tree entry for every
subdirectory, even those that would not appear as a sparse directory in
a sparse-index. Further, 'git ls-tree' does not use a trailing
directory separator for its tree rows.

This does not print the stat() information for the blobs. That could be
added in a future change with another option. The tests that are added
in the next few changes care only about the object types and IDs.

To make the option parsing slightly more robust, wrap the string
comparisons in a loop adapted from test-dir-iterator.c.

Care must be taken with the final check for the 'cnt' variable. We
continue the expectation that the numerical value is the final argument.

Signed-off-by: Derrick Stolee
---
 t/helper/test-read-cache.c | 55 +++++++++++++++++++++++++++++++-------
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index 244977a29bdf..6cfd8f2de71c 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -1,36 +1,71 @@
 #include "test-tool.h"
 #include "cache.h"
 #include "config.h"
+#include "blob.h"
+#include "commit.h"
+#include "tree.h"
+
+static void print_cache_entry(struct cache_entry *ce)
+{
+	const char *type;
+	printf("%06o ", ce->ce_mode & 0177777);
+
+	if (S_ISSPARSEDIR(ce->ce_mode))
+		type = tree_type;
+	else if (S_ISGITLINK(ce->ce_mode))
+		type = commit_type;
+	else
+		type = blob_type;
+
+	printf("%s %s\t%s\n",
+	       type,
+	       oid_to_hex(&ce->oid),
+	       ce->name);
+}
+
+static void print_cache(struct index_state *istate)
+{
+	int i;
+	for (i = 0; i < istate->cache_nr; i++)
+		print_cache_entry(istate->cache[i]);
+}
 
 int cmd__read_cache(int argc, const char **argv)
 {
+	struct repository *r = the_repository;
 	int i, cnt = 1;
 	const char *name = NULL;
+	int table = 0;
 
-	if (argc > 1 && skip_prefix(argv[1], "--print-and-refresh=", &name)) {
-		argc--;
-		argv++;
+	for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) {
+		if (skip_prefix(*argv, "--print-and-refresh=", &name))
+			continue;
+		if (!strcmp(*argv, "--table"))
+			table = 1;
 	}
 
-	if (argc == 2)
-		cnt = strtol(argv[1], NULL, 0);
+	if (argc == 1)
+		cnt = strtol(argv[0], NULL, 0);
 	setup_git_directory();
 	git_config(git_default_config, NULL);
+
 	for (i = 0; i < cnt; i++) {
-		read_cache();
+		repo_read_index(r);
 		if (name) {
 			int pos;
 
-			refresh_index(&the_index, REFRESH_QUIET,
+			refresh_index(r->index, REFRESH_QUIET,
 				      NULL, NULL, NULL);
-			pos = index_name_pos(&the_index, name, strlen(name));
+			pos = index_name_pos(r->index, name, strlen(name));
 			if (pos < 0)
 				die("%s not in index", name);
 			printf("%s is%s up to date\n", name,
-			       ce_uptodate(the_index.cache[pos]) ? "" : " not");
+			       ce_uptodate(r->index->cache[pos]) ? "" : " not");
 			write_file(name, "%d\n", i);
 		}
-		discard_cache();
+		if (table)
+			print_cache(r->index);
+		discard_index(r->index);
 	}
 	return 0;
 }

From patchwork Tue Mar 16 16:42:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142941
Date: Tue, 16 Mar 2021 16:42:51 +0000
Subject: [PATCH v3 08/20] test-tool: don't force full index
To: git@vger.kernel.org
Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com,
    jrnieder@gmail.com, Martin Ågren, Derrick Stolee,
    SZEDER Gábor, Ævar Arnfjörð Bjarmason
X-Mailing-List: git@vger.kernel.org
From: Derrick Stolee

We will use 'test-tool read-cache --table' to check that a sparse
index is written as part of init_repos. Since we will no longer always
expand a sparse index into a full index, add an '--expand' parameter
that adds a call to ensure_full_index() so we can compare a sparse index
directly against a full index, or at least what the in-memory index
looks like when expanded in this way.
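
As a rough standalone illustration of how the '--table' and '--expand'
flags and the trailing count fit together, here is a minimal sketch of
the parsing loop. The names toy_opts, toy_starts_with() and toy_parse()
are hypothetical and exist only for this example; the helper itself
uses Git's starts_with() and skip_prefix().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed flags; not part of test-tool. */
struct toy_opts {
	int table;
	int expand;
	long cnt;
};

/* Local stand-in for Git's starts_with() helper. */
static int toy_starts_with(const char *str, const char *prefix)
{
	return !strncmp(str, prefix, strlen(prefix));
}

/*
 * Consume leading "--" options, then treat a single remaining
 * argument as the repeat count. argv[0] is the command name and
 * argv must be NULL-terminated, as it is for main().
 */
static void toy_parse(int argc, const char **argv, struct toy_opts *opts)
{
	opts->table = 0;
	opts->expand = 0;
	opts->cnt = 1;

	for (++argv, --argc; *argv && toy_starts_with(*argv, "--"); ++argv, --argc) {
		if (!strcmp(*argv, "--table"))
			opts->table = 1;
		else if (!strcmp(*argv, "--expand"))
			opts->expand = 1;
	}

	/* After the loop, argv points at the first non-option argument. */
	if (argc == 1)
		opts->cnt = strtol(argv[0], NULL, 0);
}
```

Because the loop stops at the first argument that does not begin with
"--", the count is only recognized when it comes last, which matches
the expectation stated for the 'cnt' variable.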
Signed-off-by: Derrick Stolee --- t/helper/test-read-cache.c | 13 ++++++++++++- t/t1092-sparse-checkout-compatibility.sh | 5 +++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c index 6cfd8f2de71c..b52c174acc7a 100644 --- a/t/helper/test-read-cache.c +++ b/t/helper/test-read-cache.c @@ -4,6 +4,7 @@ #include "blob.h" #include "commit.h" #include "tree.h" +#include "sparse-index.h" static void print_cache_entry(struct cache_entry *ce) { @@ -35,13 +36,19 @@ int cmd__read_cache(int argc, const char **argv) struct repository *r = the_repository; int i, cnt = 1; const char *name = NULL; - int table = 0; + int table = 0, expand = 0; + + initialize_the_repository(); + prepare_repo_settings(r); + r->settings.command_requires_full_index = 0; for (++argv, --argc; *argv && starts_with(*argv, "--"); ++argv, --argc) { if (skip_prefix(*argv, "--print-and-refresh=", &name)) continue; if (!strcmp(*argv, "--table")) table = 1; + else if (!strcmp(*argv, "--expand")) + expand = 1; } if (argc == 1) @@ -51,6 +58,10 @@ int cmd__read_cache(int argc, const char **argv) for (i = 0; i < cnt; i++) { repo_read_index(r); + + if (expand) + ensure_full_index(r->index); + if (name) { int pos; diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index de5d8461c993..a1aea141c62c 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -130,6 +130,11 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'expanded in-memory index matches full index' ' + init_repos && + test_sparse_match test-tool read-cache --expand --table +' + test_expect_success 'status with options' ' init_repos && test_all_match git status --porcelain=v2 && From patchwork Tue Mar 16 16:42:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee 
X-Patchwork-Id: 12142947 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6791C43331 for ; Tue, 16 Mar 2021 16:44:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B053A6511A for ; Tue, 16 Mar 2021 16:44:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238928AbhCPQn6 (ORCPT ); Tue, 16 Mar 2021 12:43:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52608 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238873AbhCPQnN (ORCPT ); Tue, 16 Mar 2021 12:43:13 -0400 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC64DC06174A for ; Tue, 16 Mar 2021 09:43:12 -0700 (PDT) Received: by mail-wm1-x336.google.com with SMTP id n11-20020a05600c4f8bb029010e5cf86347so4136598wmq.1 for ; Tue, 16 Mar 2021 09:43:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=PmQCSE6X5jEApHHEsKgETjtkB8Y7U5YQIJz9rmXLIWU=; b=Z0FDi4hNeAat94XYBIiqRbyyXkaFVnQR87Ka0bbh/kGI9o+AXUbIdHn1xTITUs+Pzm 9bElf4ScvmITOkb9Ft6oekvHBAdBTYgPOpUVLU+SBTRZHoW/o+oxAT9ScFXpFCvJy/JJ eoLJWhgjpToJQpiOEKjXxJGFRGrGYsmipM8U+aBPWdE30Sl614LWJXosTU2vQyE3yKjC MX7Owa3ChGlJMbSCQ8Nj5aOH48i2l8iJICAf7DINgu3fFP9Av+9yny1bedn2OwDV729j 
P8/d+GMtF1ZT8bx8EQXPJYXyYwmnJoTEiDmiER/bH1t3W9wHB4f1KvtFH3qh1fnbPFnr hicg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=PmQCSE6X5jEApHHEsKgETjtkB8Y7U5YQIJz9rmXLIWU=; b=a+VA25Q+7FkNLz9qYt6x5mrUIjwshoMi/4JsLdSSvDYaR/LPUayQY14JxgE+F44L4l wkb1Vq5oanTMaPdNMVYIIXktWLbSEIhCUDwsOqp44ifrpXxQSrWG4JfBs6CxFV36ajkM BMN3JqXLpTaaXTXC2m9/RFSpWh/9WTIEQXgqJtJu7oHlJVEX6SuT0CEBzN/DzwA9XWxJ 9b/mToWoQDK4sbNITYOj7xMJRdev7c3JDVx9tyrhocCa7ncBKvszvrkCFpW8gwze26jU lMdbq4Y/alHzvoC3DsYK5/j/ksV1OIGtBtqhIQu41LidsXeiES9n8B96R/BNUceNXO5l XiAQ== X-Gm-Message-State: AOAM532nykPqCjBU2FjxomxTpv0Z2TcG6tLJwi68N+hdEUDMkEiAkzig xMA9ceCnmGhRZ3Q7GqEjOevuq1a4pbE= X-Google-Smtp-Source: ABdhPJyrQb3Ovg+1c741oBFCJMaSfExF6eGyYAZH0IlnjnzjphAwIrTCrXOc4yHw0bp3F2U1xd75AA== X-Received: by 2002:a7b:cb99:: with SMTP id m25mr516946wmi.64.1615912991594; Tue, 16 Mar 2021 09:43:11 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c9sm8022wml.42.2021.03.16.09.43.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 16 Mar 2021 09:43:11 -0700 (PDT) Message-Id: <4780076a50df8c4db73c04baa95d8654fd04f38b.1615912983.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 16 Mar 2021 16:42:52 +0000 Subject: [PATCH v3 09/20] unpack-trees: ensure full index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The next change will translate full indexes into sparse indexes at write time. 
The existing logic provides a way for every sparse index to be expanded to a full index at read time. However, there are cases where an index is written and then continues to be used in-memory to perform further updates. unpack_trees() is frequently called after such a write. In particular, commands like 'git reset' do this double-update of the index. Ensure that we have a full index when entering unpack_trees(), but only when command_requires_full_index is true. This is always true at the moment, but we will later relax that after unpack_trees() is updated to handle sparse directory entries. Signed-off-by: Derrick Stolee --- unpack-trees.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index eb8fcda31ba7..2da3e5ec77a1 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1570,6 +1570,7 @@ static int verify_absent(const struct cache_entry *, */ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o) { + struct repository *repo = the_repository; int i, ret; static struct cache_entry *dfc; struct pattern_list pl; @@ -1581,6 +1582,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options trace_performance_enter(); trace2_region_enter("unpack_trees", "unpack_trees", the_repository); + prepare_repo_settings(repo); + if (repo->settings.command_requires_full_index) { + ensure_full_index(o->src_index); + ensure_full_index(o->dst_index); + } + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; if (!o->skip_sparse_checkout && !o->pl) { From patchwork Tue Mar 16 16:42:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12142957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CEB7C4321A for ; Tue, 16 Mar 2021 16:44:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 28AC865111 for ; Tue, 16 Mar 2021 16:44:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238938AbhCPQoD (ORCPT ); Tue, 16 Mar 2021 12:44:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52616 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238876AbhCPQnO (ORCPT ); Tue, 16 Mar 2021 12:43:14 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DD54C06175F for ; Tue, 16 Mar 2021 09:43:13 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id j2so10799157wrx.9 for ; Tue, 16 Mar 2021 09:43:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=v9Eykv/Bcp1PY3+/HvFDzf+sslGlfcSnKU+nMpjHnlk=; b=u96eDHj9wH5DRbz5ZpOVz61UErKfVK7Tdivb+OARyqlNtO4YgF7dgJ9uZx19ntt5s/ SXvI5q9PFJZbXnvq86laOl8VdHzFlwOu3Qkiae00nHXlMTWdF8WHrARTBFjmvbKc5sB+ zB5PYFXB0OoKlzVqoWd0mLXFjCO42cB4w1rb0gvPvQpNxuRyLJLnPVx5LgDRuvcZVyU4 qhwvx8zSmdRY+t9kAJD6mD/4pwkTEy5xizFLiVWih2Ug2HmeCsZKn+iaJBJO+1sxgXp7 fgHjUNM71j0p56LJa+y2xFk+2nOCkXkOZOWY931tRfLLuxzqcgk1Amy1o39BYXXrDjsp 7l8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=v9Eykv/Bcp1PY3+/HvFDzf+sslGlfcSnKU+nMpjHnlk=; b=jW+YSwaGT+Fp9qEWAAP30XLXqcdpkaVmTdVnuFiUpmB6eMRWHHoA9IXXuweZpbgcmM ONVCASkeecSIIX0eHdl58loEjxuO3uRXQGDLMoQe5JulLUHaUxU3hlB8sGMwdhBgUe/Y sPXjWGAuiUOyhRuo/oTyt3BVJWo3DjlYoq5hKj4cKlL0mB9Q020hLpoXBOay59cY2CIo b7mpWd0GF9+jfvMUC4b6lM9ZzAyjArcguWL5EZ6opdM4chWj977h2UqAootv+9VSu6e7 PB3osTC82+p0uHUAWeNE25SY0LaJwvqfG8EE4sJnHJdljbe8qcGKSeLMomckAiAEGFg+ iBmA== X-Gm-Message-State: AOAM5334MbRCNk1jewmVAX6b02BDxtF1RXvuomO1Xdi8otBaZUbKQCyT PkUeYFss9CJCa3PagxOn4DiNG9sCRnY= X-Google-Smtp-Source: ABdhPJwMvqVjpZ5vvrRSVOI4Mg8F7d9+XKa9IO2jfg//tpC6GHC6B6ASfhsoJziFypO2FnYzyIIajQ== X-Received: by 2002:a5d:68cd:: with SMTP id p13mr6071608wrw.247.1615912992364; Tue, 16 Mar 2021 09:43:12 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z2sm26824109wrm.0.2021.03.16.09.43.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 16 Mar 2021 09:43:11 -0700 (PDT) Message-Id: <33fdba2b8cfdf3b7d003989a0ba1264ae8f9bb99.1615912983.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 16 Mar 2021 16:42:53 +0000 Subject: [PATCH v3 10/20] sparse-checkout: hold pattern list in index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we modify the sparse-checkout definition, we perform index operations on a pattern_list that only exists in-memory. This allows easy backing out in case the index update fails. However, if the index write itself cares about the sparse-checkout pattern set, we need access to that in-memory copy. 
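One way to give the index write access to that in-memory copy is to lend it to the index for the duration of the update. The following is a minimal sketch with simplified, hypothetical stand-in types (`write_index_sketch()` and `update_with_patterns()` are illustrative names, not Git functions):

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Git's structures. */
struct pattern_list { int use_cone_patterns; };
struct index_state { struct pattern_list *sparse_checkout_patterns; };

/* Stand-in for an index write that wants to consult the patterns. */
static int write_index_sketch(const struct index_state *istate)
{
	return istate->sparse_checkout_patterns != NULL;
}

/*
 * Lend the caller-owned list to the index while the update runs, then
 * clear the pointer so the index never refers to memory it does not own.
 * Returns nonzero if the writer could see the patterns.
 */
static int update_with_patterns(struct index_state *istate,
				struct pattern_list *pl)
{
	int saw_patterns;

	istate->sparse_checkout_patterns = pl;
	saw_patterns = write_index_sketch(istate);
	istate->sparse_checkout_patterns = NULL;
	return saw_patterns;
}
```

Clearing the pointer before returning keeps ownership with the caller, which can still discard the list if the update fails.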
Place a pointer to a 'struct pattern_list' in the index so we can access this on-demand. This will be used in the next change which uses the sparse-checkout definition to filter out directories that are outside the sparse cone. Signed-off-by: Derrick Stolee --- builtin/sparse-checkout.c | 17 ++++++++++------- cache.h | 2 ++ 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 2306a9ad98e0..e00b82af727b 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -110,6 +110,8 @@ static int update_working_directory(struct pattern_list *pl) if (is_index_unborn(r->index)) return UPDATE_SPARSITY_SUCCESS; + r->index->sparse_checkout_patterns = pl; + memset(&o, 0, sizeof(o)); o.verbose_update = isatty(2); o.update = 1; @@ -138,6 +140,7 @@ static int update_working_directory(struct pattern_list *pl) else rollback_lock_file(&lock_file); + r->index->sparse_checkout_patterns = NULL; return result; } @@ -517,19 +520,18 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) { int result; int changed_config = 0; - struct pattern_list pl; - memset(&pl, 0, sizeof(pl)); + struct pattern_list *pl = xcalloc(1, sizeof(*pl)); switch (m) { case ADD: if (core_sparse_checkout_cone) - add_patterns_cone_mode(argc, argv, &pl); + add_patterns_cone_mode(argc, argv, pl); else - add_patterns_literal(argc, argv, &pl); + add_patterns_literal(argc, argv, pl); break; case REPLACE: - add_patterns_from_input(&pl, argc, argv); + add_patterns_from_input(pl, argc, argv); break; } @@ -539,12 +541,13 @@ static int modify_pattern_list(int argc, const char **argv, enum modify_type m) changed_config = 1; } - result = write_patterns_and_update(&pl); + result = write_patterns_and_update(pl); if (result && changed_config) set_config(MODE_NO_PATTERNS); - clear_pattern_list(&pl); + clear_pattern_list(pl); + free(pl); return result; } diff --git a/cache.h b/cache.h index abb00a068e5d..759ca92e2ecc 100644 
--- a/cache.h +++ b/cache.h @@ -307,6 +307,7 @@ static inline unsigned int canon_mode(unsigned int mode) struct split_index; struct untracked_cache; struct progress; +struct pattern_list; struct index_state { struct cache_entry **cache; @@ -338,6 +339,7 @@ struct index_state { struct mem_pool *ce_mem_pool; struct progress *progress; struct repository *repo; + struct pattern_list *sparse_checkout_patterns; }; /* Name hashing */ From patchwork Tue Mar 16 16:42:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12142959 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2833EC4360C for ; Tue, 16 Mar 2021 16:44:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1331E6510E for ; Tue, 16 Mar 2021 16:44:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238934AbhCPQoC (ORCPT ); Tue, 16 Mar 2021 12:44:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238887AbhCPQnP (ORCPT ); Tue, 16 Mar 2021 12:43:15 -0400 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6471CC061764 for ; Tue, 16 Mar 2021 09:43:14 -0700 (PDT) Received: by mail-wr1-x42c.google.com with SMTP id 
k8so7773786wrc.3 for ; Tue, 16 Mar 2021 09:43:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=i3iZTPPMokU9WlivKjKXvx7UdnmO55ak3J4wMWYsu38=; b=D6aOf8WB9XVfg/ie5Tz53Tin+0krcZLBJRes7rHtTnTN7fP7u1TKX5vjg4oZOMlza4 t2Fgz2PagjQq+IdghpfHDX3cmEphiRMCxhKOkxwtS+YXy2cIFxzJDPvR4iRQEigM9STu 7QGM3ikM2CKemt/sbz5Ka1uhDTzK+nBD7FVaQ4FytsVdZMJkBsJYSr3k7zrRAVcJHYhu x1eJ70gk8tJkmchlqbJgombZtTFwwTwzYJrph1Nnvhdhy/jAQjYzO1uNDSOQbGnY1xNQ IYAgawTdrNtX2v8lf23/lIuxK0ytcnjcpDZsrE9YDYPUlKh5QZPmwdKksePmhoP+aBl8 pV9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=i3iZTPPMokU9WlivKjKXvx7UdnmO55ak3J4wMWYsu38=; b=V6bj8bUIM60QovfCvSFDRkUNHI90hhxtGzns+QXE6KvSXwf1VaiMWvonNJTKMueHre tKUAzpbuAcZN3ln+iCApGCQx8uCoNq/KlNpqDZIyE4EcqtQ/7O10haqAisbZPeLJ/L7W boY5S2rxMmBmXDjiyhj+40GykUbFm52kEGRoDrCsGR26v9Odx5NoVjg5tQhzA0yJPHRH CiF/Vn966L02Hai0WJO2nxSLbcMPAcWQ13+67Hgi7Bc66W2HpY4sOqRSAUvzXKoPAu70 1L1cTxwzB/WrkoYIo6I9dzqaxMkXjiloWpNjnsjMPl55kUz4tw+vR2R+geNd8SimG0cC x9Tw== X-Gm-Message-State: AOAM530RcmYHCg2iKJXbOy5MX9DsB5xcGKqiAeXmdOEcmPUT9CEeUtPX ekJsZ7MXllFFZJBF/XLI3mf1EwfdyxE= X-Google-Smtp-Source: ABdhPJxolYd0yP+VFTNDWPLOrgX3f25u+Jfiq+dxN+AEsJWhzvLqsuTFH3YipC6EjIo7gNmYBUOmHQ== X-Received: by 2002:adf:e8c9:: with SMTP id k9mr5853970wrn.315.1615912993079; Tue, 16 Mar 2021 09:43:13 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u9sm11752wmc.38.2021.03.16.09.43.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 16 Mar 2021 09:43:12 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 16 Mar 2021 16:42:54 +0000 Subject: [PATCH v3 11/20] sparse-index: convert from full to sparse Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: 
newren@gmail.com, gitster@pobox.com, pclouds@gmail.com, jrnieder@gmail.com, Martin =?utf-8?b?w4VncmVu?= , Derrick Stolee , SZEDER =?utf-8?b?R8OhYm9y?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsA==?= Bjarmason , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee If we have a full index, then we can convert it to a sparse index by replacing directories outside of the sparse cone with sparse directory entries. The convert_to_sparse() method does this, when the situation is appropriate. For now, we avoid converting the index to a sparse index if: 1. the index is split. 2. the index is already sparse. 3. sparse-checkout is disabled. 4. sparse-checkout does not use cone mode. Finally, we currently limit the conversion to when the GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git config will be added in a later change. The trickiest thing about this conversion is that we might not be able to mark a directory as a sparse directory just because it is outside the sparse cone. There might be unmerged files within that directory, so we need to look for those. Also, if there is some strange reason why a file is not marked with CE_SKIP_WORKTREE, then we should give up on converting that directory. There is still hope that some of its subdirectories might be able to convert to sparse, so we keep looking deeper. The conversion process is assisted by the cache-tree extension. This is calculated from the full index if it does not already exist. We then abandon the cache-tree as it no longer applies to the newly-sparse index. Thus, this cache-tree will be recalculated in every sparse-full-sparse round-trip until we integrate the cache-tree extension with the sparse index. Some Git commands use the index after writing it. For example, 'git add' will update the index, then write it to disk, then read its entries to report information. 
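A command that keeps using the index after the write needs it back in full form. Here is a minimal sketch of that convert, write, re-expand sequence, assuming a simplified, hypothetical `struct index_sketch` in place of Git's real index state:

```c
/* Hypothetical, simplified index state; not Git's struct index_state. */
struct index_sketch {
	int sparse;         /* nonzero while the in-memory index is sparse */
	int writes;         /* number of times the index was written */
	int written_sparse; /* whether the last write saw a sparse index */
};

static void to_sparse(struct index_sketch *idx) { idx->sparse = 1; }
static void to_full(struct index_sketch *idx) { idx->sparse = 0; }

/*
 * Convert to sparse for the write, then restore the full form if the
 * caller handed us a full index, so later reads see every entry.
 */
static void write_index_round_trip(struct index_sketch *idx)
{
	int was_full = !idx->sparse;

	to_sparse(idx);
	idx->written_sparse = idx->sparse; /* the "write" happens here */
	idx->writes++;
	if (was_full)
		to_full(idx);
}
```

An index that was already sparse on entry stays sparse, so callers that can work with the sparse form pay no expansion cost.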
To keep the in-memory index in a full state after writing, we re-expand it to a full one after the write. This is wasteful for commands that only write the index and do not read from it again, but that is only the case until we make those commands "sparse aware." We can compare the behavior of the sparse-index in t1092-sparse-checkout-compatibility.sh by using GIT_TEST_SPARSE_INDEX=1 when operating on the 'sparse-index' repo. We can also compare the two sparse repos directly, such as comparing their indexes (when expanded to full in the case of the 'sparse-index' repo). We also verify that the index is actually populated with sparse directory entries. The 'checkout and reset (mixed)' test is marked for failure when comparing a sparse repo to a full repo, but we can compare the two sparse-checkout cases directly to ensure that we are not changing the behavior when using a sparse index. Signed-off-by: Derrick Stolee --- cache-tree.c | 3 + cache.h | 2 + read-cache.c | 26 ++++- sparse-index.c | 139 +++++++++++++++++++++++ sparse-index.h | 1 + t/t1092-sparse-checkout-compatibility.sh | 61 +++++++++- 6 files changed, 228 insertions(+), 4 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 2fb483d3c083..5f07a39e501e 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -6,6 +6,7 @@ #include "object-store.h" #include "replace-object.h" #include "promisor-remote.h" +#include "sparse-index.h" #ifndef DEBUG_CACHE_TREE #define DEBUG_CACHE_TREE 0 @@ -442,6 +443,8 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; + ensure_full_index(istate); + if (!istate->cache_tree) istate->cache_tree = cache_tree(); diff --git a/cache.h b/cache.h index 759ca92e2ecc..69a32146cd77 100644 --- a/cache.h +++ b/cache.h @@ -251,6 +251,8 @@ static inline unsigned int create_ce_mode(unsigned int mode) { if (S_ISLNK(mode)) return S_IFLNK; + if (mode == S_IFDIR) + return S_IFDIR; if (S_ISDIR(mode) || S_ISGITLINK(mode)) return S_IFGITLINK; return S_IFREG | 
ce_permissions(mode); diff --git a/read-cache.c b/read-cache.c index dd3980c12b53..b9c08773466c 100644 --- a/read-cache.c +++ b/read-cache.c @@ -25,6 +25,7 @@ #include "fsmonitor.h" #include "thread-utils.h" #include "progress.h" +#include "sparse-index.h" /* Mask for the name length in ce_flags in the on-disk index */ @@ -1002,8 +1003,14 @@ int verify_path(const char *path, unsigned mode) c = *path++; if ((c == '.' && !verify_dotfile(path, mode)) || - is_dir_sep(c) || c == '\0') + is_dir_sep(c)) return 0; + /* + * allow terminating directory separators for + * sparse directory entries. + */ + if (c == '\0') + return S_ISDIR(mode); } else if (c == '\\' && protect_ntfs) { if (is_ntfs_dotgit(path)) return 0; @@ -3079,6 +3086,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l unsigned flags) { int ret; + int was_full = !istate->sparse_index; + + ret = convert_to_sparse(istate); + + if (ret) { + warning(_("failed to convert to a sparse-index")); + return ret; + } /* * TODO trace2: replace "the_repository" with the actual repo instance @@ -3090,6 +3105,9 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l trace2_region_leave_printf("index", "do_write_index", the_repository, "%s", get_lock_file_path(lock)); + if (was_full) + ensure_full_index(istate); + if (ret) return ret; if (flags & COMMIT_LOCK) @@ -3180,9 +3198,10 @@ static int write_shared_index(struct index_state *istate, struct tempfile **temp) { struct split_index *si = istate->split_index; - int ret; + int ret, was_full = !istate->sparse_index; move_cache_to_base_index(istate); + convert_to_sparse(istate); trace2_region_enter_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); @@ -3190,6 +3209,9 @@ static int write_shared_index(struct index_state *istate, trace2_region_leave_printf("index", "shared/do_write_index", the_repository, "%s", get_tempfile_path(*temp)); + if (was_full) + ensure_full_index(istate); 
+ if (ret) return ret; ret = adjust_shared_perm(get_tempfile_path(*temp)); diff --git a/sparse-index.c b/sparse-index.c index 7095378a1b28..619ff7c2e217 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -4,6 +4,145 @@ #include "tree.h" #include "pathspec.h" #include "trace2.h" +#include "cache-tree.h" +#include "config.h" +#include "dir.h" +#include "fsmonitor.h" + +static struct cache_entry *construct_sparse_dir_entry( + struct index_state *istate, + const char *sparse_dir, + struct cache_tree *tree) +{ + struct cache_entry *de; + + de = make_cache_entry(istate, S_IFDIR, &tree->oid, sparse_dir, 0, 0); + + de->ce_flags |= CE_SKIP_WORKTREE; + return de; +} + +/* + * Returns the number of entries "inserted" into the index. + */ +static int convert_to_sparse_rec(struct index_state *istate, + int num_converted, + int start, int end, + const char *ct_path, size_t ct_pathlen, + struct cache_tree *ct) +{ + int i, can_convert = 1; + int start_converted = num_converted; + enum pattern_match_result match; + int dtype; + struct strbuf child_path = STRBUF_INIT; + struct pattern_list *pl = istate->sparse_checkout_patterns; + + /* + * Is the current path outside of the sparse cone? + * Then check if the region can be replaced by a sparse + * directory entry (everything is sparse and merged). + */ + match = path_matches_pattern_list(ct_path, ct_pathlen, + NULL, &dtype, pl, istate); + if (match != NOT_MATCHED) + can_convert = 0; + + for (i = start; can_convert && i < end; i++) { + struct cache_entry *ce = istate->cache[i]; + + if (ce_stage(ce) || + !(ce->ce_flags & CE_SKIP_WORKTREE)) + can_convert = 0; + } + + if (can_convert) { + struct cache_entry *se; + se = construct_sparse_dir_entry(istate, ct_path, ct); + + istate->cache[num_converted++] = se; + return 1; + } + + for (i = start; i < end; ) { + int count, span, pos = -1; + const char *base, *slash; + struct cache_entry *ce = istate->cache[i]; + + /* + * Detect if this is a normal entry outside of any subtree + * entry. 
+ */ + base = ce->name + ct_pathlen; + slash = strchr(base, '/'); + + if (slash) + pos = cache_tree_subtree_pos(ct, base, slash - base); + + if (pos < 0) { + istate->cache[num_converted++] = ce; + i++; + continue; + } + + strbuf_setlen(&child_path, 0); + strbuf_add(&child_path, ce->name, slash - ce->name + 1); + + span = ct->down[pos]->cache_tree->entry_count; + count = convert_to_sparse_rec(istate, + num_converted, i, i + span, + child_path.buf, child_path.len, + ct->down[pos]->cache_tree); + num_converted += count; + i += span; + } + + strbuf_release(&child_path); + return num_converted - start_converted; +} + +int convert_to_sparse(struct index_state *istate) +{ + if (istate->split_index || istate->sparse_index || + !core_apply_sparse_checkout || !core_sparse_checkout_cone) + return 0; + + /* + * For now, only create a sparse index with the + * GIT_TEST_SPARSE_INDEX environment variable. We will relax + * this once we have a proper way to opt-in (and later still, + * opt-out). + */ + if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) + return 0; + + if (!istate->sparse_checkout_patterns) { + istate->sparse_checkout_patterns = xcalloc(1, sizeof(struct pattern_list)); + if (get_sparse_checkout_patterns(istate->sparse_checkout_patterns) < 0) + return 0; + } + + if (!istate->sparse_checkout_patterns->use_cone_patterns) { + warning(_("attempting to use sparse-index without cone mode")); + return -1; + } + + if (cache_tree_update(istate, 0)) { + warning(_("unable to update cache-tree, staying full")); + return -1; + } + + remove_fsmonitor(istate); + + trace2_region_enter("index", "convert_to_sparse", istate->repo); + istate->cache_nr = convert_to_sparse_rec(istate, + 0, 0, istate->cache_nr, + "", 0, istate->cache_tree); + istate->drop_cache_tree = 1; + istate->sparse_index = 1; + trace2_region_leave("index", "convert_to_sparse", istate->repo); + return 0; +} static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) { diff --git 
a/sparse-index.h b/sparse-index.h index 09a20d036c46..64380e121d80 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -3,5 +3,6 @@ struct index_state; void ensure_full_index(struct index_state *istate); +int convert_to_sparse(struct index_state *istate); #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index a1aea141c62c..1e888d195122 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -2,6 +2,11 @@ test_description='compare full workdir to sparse workdir' +# The verify_cache_tree() check is not sparse-aware (yet). +# So, disable the check until that integration is complete. +GIT_TEST_CHECK_CACHE_TREE=0 +GIT_TEST_SPLIT_INDEX=0 + . ./test-lib.sh test_expect_success 'setup' ' @@ -121,7 +126,9 @@ run_on_all () { test_all_match () { run_on_all "$@" && test_cmp full-checkout-out sparse-checkout-out && - test_cmp full-checkout-err sparse-checkout-err + test_cmp full-checkout-out sparse-index-out && + test_cmp full-checkout-err sparse-checkout-err && + test_cmp full-checkout-err sparse-index-err } test_sparse_match () { @@ -130,6 +137,38 @@ test_sparse_match () { test_cmp sparse-checkout-err sparse-index-err } +test_expect_success 'sparse-index contents' ' + init_repos && + + test-tool -C sparse-index read-cache --table >cache && + for dir in folder1 folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep folder2 x + do + TREE=$(git -C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done && + + GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + + test-tool -C sparse-index read-cache --table >cache && + for dir in deep/deeper2 folder1 folder2 x + do + TREE=$(git 
-C sparse-index rev-parse HEAD:$dir) && + grep "040000 tree $TREE $dir/" cache \ + || return 1 + done +' + test_expect_success 'expanded in-memory index matches full index' ' init_repos && test_sparse_match test-tool read-cache --expand --table @@ -137,6 +176,7 @@ test_expect_success 'expanded in-memory index matches full index' ' test_expect_success 'status with options' ' init_repos && + test_sparse_match ls && test_all_match git status --porcelain=v2 && test_all_match git status --porcelain=v2 -z -u && test_all_match git status --porcelain=v2 -uno && @@ -273,6 +313,17 @@ test_expect_failure 'checkout and reset (mixed)' ' test_all_match git reset update-folder2 ' +# Ensure that sparse-index behaves identically to +# sparse-checkout with a full index. +test_expect_success 'checkout and reset (mixed) [sparse]' ' + init_repos && + + test_sparse_match git checkout -b reset-test update-deep && + test_sparse_match git reset deepest && + test_sparse_match git reset update-folder1 && + test_sparse_match git reset update-folder2 +' + test_expect_success 'merge' ' init_repos && @@ -309,14 +360,20 @@ test_expect_success 'clean' ' test_all_match git status --porcelain=v2 && test_all_match git clean -f && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && test_all_match git clean -xdf && test_all_match git status --porcelain=v2 && + test_sparse_match ls && + test_sparse_match ls folder1 && - test_path_is_dir sparse-checkout/folder1 + test_sparse_match test_path_is_dir folder1 ' test_done From patchwork Tue Mar 16 16:42:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12142955 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15B75C432C3 for ; Tue, 16 Mar 2021 16:44:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E639B65111 for ; Tue, 16 Mar 2021 16:44:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238929AbhCPQn7 (ORCPT ); Tue, 16 Mar 2021 12:43:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238892AbhCPQnP (ORCPT ); Tue, 16 Mar 2021 12:43:15 -0400 Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0C9DC061762 for ; Tue, 16 Mar 2021 09:43:14 -0700 (PDT) Received: by mail-wm1-x330.google.com with SMTP id r10-20020a05600c35cab029010c946c95easo1841066wmq.4 for ; Tue, 16 Mar 2021 09:43:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=d57sN1iwbvl34esKzbNWg9awcmZQjE9yfjMflF8q+sQ=; b=V2TSPi8IlodrET6lbbThOOSI7/AKHMnKK8J78IkO68eOA3JzCqrokp3ZdmHaWnWhDL zUq4W7kBDQ2DFGB0uKdC/nN/lmx768TkxrCVzGhZXEVEYl/MjjallCI3OOpdkEoKyNwX tKhqISXVFbeeM+i5cQBM9QAvd6okbGUBZGtUigXXOq4hpWaeOaLWsoFl7ORIOWkav00v ILEVB5kffSxFEBg87LggddmwI4GZBOW6DdPm368Xf6pPWg2sZOiTuDZVNPoUP2ygVQ+u 061gd5d7XXVpFup6zXLPbELFS1pBe4IXJaTTZER6wQjrj27I0i45yqFg2RbhtkKVWDj8 2pPQ== 
Date: Tue, 16 Mar 2021 16:42:55 +0000
Subject: [PATCH v3 12/20] submodule: sparse-index should not collapse links
From: Derrick Stolee

A submodule is stored as a "Git link" that actually points to a commit
within a submodule. Submodules are populated or not depending on
submodule configuration, not sparse-checkout.
To ensure that the sparse-index feature integrates correctly with
submodules, we should not collapse a directory if there is a Git link
within its range.

Signed-off-by: Derrick Stolee
---
 sparse-index.c                           |  1 +
 t/t1092-sparse-checkout-compatibility.sh | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/sparse-index.c b/sparse-index.c
index 619ff7c2e217..7631f7bd00b7 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -52,6 +52,7 @@ static int convert_to_sparse_rec(struct index_state *istate,
 		struct cache_entry *ce = istate->cache[i];
 
 		if (ce_stage(ce) ||
+		    S_ISGITLINK(ce->ce_mode) ||
 		    !(ce->ce_flags & CE_SKIP_WORKTREE))
 			can_convert = 0;
 	}
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 1e888d195122..cba5f89b1e96 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -376,4 +376,21 @@ test_expect_success 'clean' '
 	test_sparse_match test_path_is_dir folder1
 '
 
+test_expect_success 'submodule handling' '
+	init_repos &&
+
+	test_all_match mkdir modules &&
+	test_all_match touch modules/a &&
+	test_all_match git add modules &&
+	test_all_match git commit -m "add modules directory" &&
+
+	run_on_all git submodule add "$(pwd)/initial-repo" modules/sub &&
+	test_all_match git commit -m "add submodule" &&
+
+	# having a submodule prevents "modules" from collapse
+	test-tool -C sparse-index read-cache --table >cache &&
+	grep "100644 blob .* modules/a" cache &&
+	grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache
+'
+
 test_done

From patchwork Tue Mar 16 16:42:56 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142953
Message-Id: <4000c5cdd4cf6008358a02d1b0244b24e61b3e3e.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:42:56 +0000
Subject: [PATCH v3 13/20] unpack-trees: allow sparse directories
From: Derrick Stolee

The index_pos_by_traverse_info() currently throws a BUG() when a
directory entry exists exactly in the index. We need to consider that it
is possible to have a directory in a sparse index as long as that entry
is itself marked with the skip-worktree bit.

The 'pos' variable is assigned a negative value if an exact match is not
found.
Since a directory name can be an exact match, it is no longer an
error to have a nonnegative 'pos' value.

Signed-off-by: Derrick Stolee
---
 unpack-trees.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 2da3e5ec77a1..e81d82d72d89 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -749,9 +749,12 @@ static int index_pos_by_traverse_info(struct name_entry *names,
 	strbuf_make_traverse_path(&name, info, names->path, names->pathlen);
 	strbuf_addch(&name, '/');
 	pos = index_name_pos(o->src_index, name.buf, name.len);
-	if (pos >= 0)
-		BUG("This is a directory and should not exist in index");
-	pos = -pos - 1;
+	if (pos >= 0) {
+		if (!o->src_index->sparse_index ||
+		    !(o->src_index->cache[pos]->ce_flags & CE_SKIP_WORKTREE))
+			BUG("This is a directory and should not exist in index");
+	} else
+		pos = -pos - 1;
 	if (pos >= o->src_index->cache_nr ||
 	    !starts_with(o->src_index->cache[pos]->name, name.buf) ||
 	    (pos > 0 && starts_with(o->src_index->cache[pos-1]->name, name.buf)))

From patchwork Tue Mar 16 16:42:57 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142961
Message-Id: <1a2be38b2ca71974d19f78d4391f16145d5a9ba2.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:42:57 +0000
Subject: [PATCH v3 14/20] sparse-index: check index conversion happens
From: Derrick Stolee

Add a test case that uses test_region to ensure that we are truly
expanding a sparse index to a full one, then converting back to sparse
when writing the index. As we integrate more Git commands with the
sparse index, we will convert these commands to check that we do _not_
convert the sparse index to a full index and instead stay sparse the
entire time.

Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index cba5f89b1e96..47f983217852 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -393,4 +393,22 @@ test_expect_success 'submodule handling' '
 	grep "160000 commit $(git -C initial-repo rev-parse HEAD) modules/sub" cache
 '
 
+test_expect_success 'sparse-index is expanded and converted back' '
+	init_repos &&
+
+	(
+		GIT_TEST_SPARSE_INDEX=1 &&
+		export GIT_TEST_SPARSE_INDEX &&
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index -c core.fsmonitor="" reset --hard &&
+		test_region index convert_to_sparse trace2.txt &&
+		test_region index ensure_full_index trace2.txt &&
+
+		rm trace2.txt &&
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index -c core.fsmonitor="" status -uno &&
+		test_region index ensure_full_index trace2.txt
+	)
+'
+
 test_done

From patchwork Tue Mar 16 16:42:58 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142963
Date: Tue, 16 Mar 2021 16:42:58 +0000
Subject: [PATCH v3 15/20] sparse-index: create extension for compatibility
From: Derrick Stolee

Previously, we enabled the sparse index format only using
GIT_TEST_SPARSE_INDEX=1. This is not a feasible direction for users to
actually select this mode. Further, sparse directory entries are not
understood by the index formats as advertised.

We _could_ add a new index version that explicitly adds these
capabilities, but there are nuances to index formats 2, 3, and 4 that
are still valuable to select as options. Until we add index format
version 5, create a repo extension, "extensions.sparseIndex", that
specifies that the tool reading this repository must understand sparse
directory entries.

This change only encodes the extension and enables it when
GIT_TEST_SPARSE_INDEX=1. Later, we will add a more user-friendly CLI
mechanism.
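
As a rough illustration of the gating described above (a sketch, not the
code in this patch), the decision to write a sparse index can be read as a
predicate over a handful of settings. The names `sketch_settings` and
`sketch_should_convert` are hypothetical stand-ins for the real index and
repository state, and the GIT_TEST_SPARSE_INDEX override is deliberately
left out:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state consulted before converting. */
struct sketch_settings {
	bool split_index;           /* index uses the split-index format */
	bool already_sparse;        /* index is already in sparse form */
	bool apply_sparse_checkout; /* core.sparseCheckout is true */
	bool cone_mode;             /* core.sparseCheckoutCone is true */
	bool ext_sparse_index;      /* extensions.sparseIndex is true */
};

/*
 * Return true only when nothing rules out conversion and the
 * repository extension has been set.
 */
bool sketch_should_convert(const struct sketch_settings *s)
{
	if (s->split_index || s->already_sparse)
		return false;
	if (!s->apply_sparse_checkout || !s->cone_mode)
		return false;
	return s->ext_sparse_index;
}
```

Under these assumptions, a cone-mode sparse-checkout without the extension
keeps a full index, which is the compatibility behavior older tools expect.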

Signed-off-by: Derrick Stolee
---
 Documentation/config/extensions.txt |  8 ++++++
 cache.h                             |  1 +
 repo-settings.c                     |  7 ++++++
 repository.h                        |  3 ++-
 setup.c                             |  3 +++
 sparse-index.c                      | 38 +++++++++++++++++++++++++----
 6 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/extensions.txt b/Documentation/config/extensions.txt
index 4e23d73cdcad..c02e09af0046 100644
--- a/Documentation/config/extensions.txt
+++ b/Documentation/config/extensions.txt
@@ -6,3 +6,11 @@ extensions.objectFormat::
 	Note that this setting should only be set by linkgit:git-init[1] or
 	linkgit:git-clone[1]. Trying to change it after initialization will not
 	work and will produce hard-to-diagnose issues.
+
+extensions.sparseIndex::
+	When combined with `core.sparseCheckout=true` and
+	`core.sparseCheckoutCone=true`, the index may contain entries
+	corresponding to directories outside of the sparse-checkout
+	definition in lieu of containing each path under such directories.
+	Versions of Git that do not understand this extension do not
+	expect directory entries in the index.
diff --git a/cache.h b/cache.h
index 69a32146cd77..4ca6cd7f782c 100644
--- a/cache.h
+++ b/cache.h
@@ -1059,6 +1059,7 @@ struct repository_format {
 	int worktree_config;
 	int is_bare;
 	int hash_algo;
+	int sparse_index;
 	char *work_tree;
 	struct string_list unknown_extensions;
 	struct string_list v1_only_extensions;
diff --git a/repo-settings.c b/repo-settings.c
index d63569e4041e..9677d50f9238 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -85,4 +85,11 @@ void prepare_repo_settings(struct repository *r)
 	 * removed.
 	 */
 	r->settings.command_requires_full_index = 1;
+
+	/*
+	 * Initialize this as off.
+	 */
+	r->settings.sparse_index = 0;
+	if (!repo_config_get_bool(r, "extensions.sparseindex", &value) && value)
+		r->settings.sparse_index = 1;
 }
diff --git a/repository.h b/repository.h
index e06a23015697..a45f7520fd9e 100644
--- a/repository.h
+++ b/repository.h
@@ -42,7 +42,8 @@ struct repo_settings {
 
 	int core_multi_pack_index;
 
-	unsigned command_requires_full_index:1;
+	unsigned command_requires_full_index:1,
+		 sparse_index:1;
 };
 
 struct repository {
diff --git a/setup.c b/setup.c
index c04cd25a30df..cd8394564613 100644
--- a/setup.c
+++ b/setup.c
@@ -500,6 +500,9 @@ static enum extension_result handle_extension(const char *var,
 			return error("invalid value for 'extensions.objectformat'");
 		data->hash_algo = format;
 		return EXTENSION_OK;
+	} else if (!strcmp(ext, "sparseindex")) {
+		data->sparse_index = 1;
+		return EXTENSION_OK;
 	}
 	return EXTENSION_UNKNOWN;
 }
diff --git a/sparse-index.c b/sparse-index.c
index 7631f7bd00b7..3a6df66faeab 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -102,19 +102,47 @@ static int convert_to_sparse_rec(struct index_state *istate,
 	return num_converted - start_converted;
 }
 
+static int enable_sparse_index(struct repository *repo)
+{
+	const char *config_path = repo_git_path(repo, "config.worktree");
+
+	if (upgrade_repository_format(1) < 0) {
+		warning(_("unable to upgrade repository format to enable sparse-index"));
+		return -1;
+	}
+	git_config_set_in_file_gently(config_path,
+				      "extensions.sparseIndex",
+				      "true");
+
+	prepare_repo_settings(repo);
+	repo->settings.sparse_index = 1;
+	return 0;
+}
+
 int convert_to_sparse(struct index_state *istate)
 {
 	if (istate->split_index || istate->sparse_index ||
 	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
 
+	if (!istate->repo)
+		istate->repo = the_repository;
+
+	/*
+	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
+	 * extensions.sparseIndex config variable to be on.
+	 */
+	if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) {
+		int err = enable_sparse_index(istate->repo);
+		if (err < 0)
+			return err;
+	}
+
 	/*
-	 * For now, only create a sparse index with the
-	 * GIT_TEST_SPARSE_INDEX environment variable. We will relax
-	 * this once we have a proper way to opt-in (and later still,
-	 * opt-out).
+	 * Only convert to sparse if extensions.sparseIndex is set.
 	 */
-	if (!git_env_bool("GIT_TEST_SPARSE_INDEX", 0))
+	prepare_repo_settings(istate->repo);
+	if (!istate->repo->settings.sparse_index)
 		return 0;
 
 	if (!istate->sparse_checkout_patterns) {

From patchwork Tue Mar 16 16:42:59 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142967
Date: Tue, 16 Mar 2021 16:42:59 +0000
Subject: [PATCH v3 16/20] sparse-checkout: toggle sparse index from builtin
From: Derrick Stolee

The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.

Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extensions.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.

Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.
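
The option is effectively tri-state: not given, '--sparse-index', or
'--no-sparse-index'. A minimal sketch of that mapping follows, assuming a
sentinel of -1 for "not given"; the enum and function names are
hypothetical and only illustrate the idea:

```c
/* Hypothetical outcome of interpreting the option value. */
enum sketch_action {
	SKETCH_LEAVE_ALONE, /* option absent: do not touch the extension */
	SKETCH_ENABLE,      /* --sparse-index: set the extension */
	SKETCH_DISABLE      /* --no-sparse-index: remove the extension */
};

/*
 * opt is assumed to be -1 when neither form of the option was given,
 * 0 for --no-sparse-index and 1 for --sparse-index.
 */
enum sketch_action sketch_sparse_index_action(int opt)
{
	if (opt < 0)
		return SKETCH_LEAVE_ALONE;
	return opt ? SKETCH_ENABLE : SKETCH_DISABLE;
}
```

Keeping "not given" distinct from "disable" means a plain
'git sparse-checkout init --cone' leaves an existing choice untouched.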

Signed-off-by: Derrick Stolee
---
 Documentation/git-sparse-checkout.txt    | 14 +++++++
 builtin/sparse-checkout.c                | 17 ++++++++-
 sparse-index.c                           | 37 +++++++++++++------
 sparse-index.h                           |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 47 +++++++++++++-----------
 5 files changed, 84 insertions(+), 34 deletions(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index a0eeaeb02ee3..2ff66c5a4e41 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -45,6 +45,20 @@ To avoid interfering with other worktrees, it first enables the
 When `--cone` is provided, the `core.sparseCheckoutCone` setting is
 also set, allowing for better performance with a limited set of
 patterns (see 'CONE PATTERN SET' below).
++
+Use the `--[no-]sparse-index` option to toggle the use of the sparse
+index format. This reduces the size of the index to be more closely
+aligned with your sparse-checkout definition. This can have significant
+performance advantages for commands such as `git status` or `git add`.
+This feature is still experimental. Some commands might be slower with
+a sparse index until they are properly integrated with the feature.
++
+**WARNING:** Using a sparse index requires modifying the index in a way
+that is not completely understood by external tools. If you have trouble
+with this compatibility, then run `git sparse-checkout init --no-sparse-index`
+to rewrite your index to not be sparse. Older versions of Git will not
+understand the `sparseIndex` repository extension and may fail to interact
+with your repository until it is disabled.
 
 'set'::
 	Write a set of patterns to the sparse-checkout file, as given as
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index e00b82af727b..ca63e2c64e95 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -14,6 +14,7 @@
 #include "unpack-trees.h"
 #include "wt-status.h"
 #include "quote.h"
+#include "sparse-index.h"
 
 static const char *empty_base = "";
 
@@ -283,12 +284,13 @@ static int set_config(enum sparse_checkout_mode mode)
 }
 
 static char const * const builtin_sparse_checkout_init_usage[] = {
-	N_("git sparse-checkout init [--cone]"),
+	N_("git sparse-checkout init [--cone] [--[no-]sparse-index]"),
 	NULL
 };
 
 static struct sparse_checkout_init_opts {
 	int cone_mode;
+	int sparse_index;
 } init_opts;
 
 static int sparse_checkout_init(int argc, const char **argv)
@@ -303,11 +305,15 @@ static int sparse_checkout_init(int argc, const char **argv)
 	static struct option builtin_sparse_checkout_init_options[] = {
 		OPT_BOOL(0, "cone", &init_opts.cone_mode,
 			 N_("initialize the sparse-checkout in cone mode")),
+		OPT_BOOL(0, "sparse-index", &init_opts.sparse_index,
+			 N_("toggle the use of a sparse index")),
 		OPT_END(),
 	};
 
 	repo_read_index(the_repository);
 
+	init_opts.sparse_index = -1;
+
 	argc = parse_options(argc, argv, NULL,
 			     builtin_sparse_checkout_init_options,
 			     builtin_sparse_checkout_init_usage, 0);
@@ -326,6 +332,15 @@ static int sparse_checkout_init(int argc, const char **argv)
 	sparse_filename = get_sparse_checkout_filename();
 	res = add_patterns_from_file_to_list(sparse_filename, "", 0, &pl, NULL);
 
+	if (init_opts.sparse_index >= 0) {
+		if (set_sparse_index_config(the_repository, init_opts.sparse_index) < 0)
+			die(_("failed to modify sparse-index config"));
+
+		/* force an index rewrite */
+		repo_read_index(the_repository);
+		the_repository->index->updated_workdir = 1;
+	}
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
diff --git a/sparse-index.c b/sparse-index.c
index 3a6df66faeab..30c1a11fd62d 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -104,23 +104,37 @@ static int convert_to_sparse_rec(struct index_state *istate,
 
 static int enable_sparse_index(struct repository *repo)
 {
-	const char *config_path = repo_git_path(repo, "config.worktree");
+	int res;
 
 	if (upgrade_repository_format(1) < 0) {
 		warning(_("unable to upgrade repository format to enable sparse-index"));
 		return -1;
 	}
-	git_config_set_in_file_gently(config_path,
-				      "extensions.sparseIndex",
-				      "true");
+	res = git_config_set_gently("extensions.sparseindex", "true");
 
 	prepare_repo_settings(repo);
 	repo->settings.sparse_index = 1;
-	return 0;
+	return res;
+}
+
+int set_sparse_index_config(struct repository *repo, int enable)
+{
+	int res;
+
+	if (enable)
+		return enable_sparse_index(repo);
+
+	/* Don't downgrade repository format, just remove the extension. */
+	res = git_config_set_gently("extensions.sparseindex", NULL);
+
+	prepare_repo_settings(repo);
+	repo->settings.sparse_index = 0;
+	return res;
 }
 
 int convert_to_sparse(struct index_state *istate)
 {
+	int test_env;
 	if (istate->split_index || istate->sparse_index ||
 	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
 		return 0;
@@ -129,14 +143,13 @@ int convert_to_sparse(struct index_state *istate)
 		istate->repo = the_repository;
 
 	/*
-	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
-	 * extensions.sparseIndex config variable to be on.
+	 * If GIT_TEST_SPARSE_INDEX=1, then trigger extensions.sparseIndex
+	 * to be fully enabled. If GIT_TEST_SPARSE_INDEX=0 (set explicitly),
+	 * then purposefully disable the setting.
*/ - if (git_env_bool("GIT_TEST_SPARSE_INDEX", 0)) { - int err = enable_sparse_index(istate->repo); - if (err < 0) - return err; - } + test_env = git_env_bool("GIT_TEST_SPARSE_INDEX", -1); + if (test_env >= 0) + set_sparse_index_config(istate->repo, test_env); /* * Only convert to sparse if extensions.sparseIndex is set. diff --git a/sparse-index.h b/sparse-index.h index 64380e121d80..39dcc859735e 100644 --- a/sparse-index.h +++ b/sparse-index.h @@ -5,4 +5,7 @@ struct index_state; void ensure_full_index(struct index_state *istate); int convert_to_sparse(struct index_state *istate); +struct repository; +int set_sparse_index_config(struct repository *repo, int enable); + #endif diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 47f983217852..f14dc48924d2 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -6,6 +6,7 @@ test_description='compare full workdir to sparse workdir' # So, disable the check until that integration is complete. GIT_TEST_CHECK_CACHE_TREE=0 GIT_TEST_SPLIT_INDEX=0 +GIT_TEST_SPARSE_INDEX= . 
./test-lib.sh @@ -100,25 +101,26 @@ init_repos () { # initialize sparse-checkout definitions git -C sparse-checkout sparse-checkout init --cone && git -C sparse-checkout sparse-checkout set deep && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout init --cone && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep + git -C sparse-index sparse-checkout init --cone --sparse-index && + test_cmp_config -C sparse-index true extensions.sparseindex && + git -C sparse-index sparse-checkout set deep } run_on_sparse () { ( cd sparse-checkout && - GIT_TEST_SPARSE_INDEX=0 "$@" >../sparse-checkout-out 2>../sparse-checkout-err + "$@" >../sparse-checkout-out 2>../sparse-checkout-err ) && ( cd sparse-index && - GIT_TEST_SPARSE_INDEX=1 "$@" >../sparse-index-out 2>../sparse-index-err + "$@" >../sparse-index-out 2>../sparse-index-err ) } run_on_all () { ( cd full-checkout && - GIT_TEST_SPARSE_INDEX=0 "$@" >../full-checkout-out 2>../full-checkout-err + "$@" >../full-checkout-out 2>../full-checkout-err ) && run_on_sparse "$@" } @@ -148,7 +150,7 @@ test_expect_success 'sparse-index contents' ' || return 1 done && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set folder1 && + git -C sparse-index sparse-checkout set folder1 && test-tool -C sparse-index read-cache --table >cache && for dir in deep folder2 x @@ -158,7 +160,7 @@ test_expect_success 'sparse-index contents' ' || return 1 done && - GIT_TEST_SPARSE_INDEX=1 git -C sparse-index sparse-checkout set deep/deeper1 && + git -C sparse-index sparse-checkout set deep/deeper1 && test-tool -C sparse-index read-cache --table >cache && for dir in deep/deeper2 folder1 folder2 x @@ -166,7 +168,14 @@ test_expect_success 'sparse-index contents' ' TREE=$(git -C sparse-index rev-parse HEAD:$dir) && grep "040000 tree $TREE $dir/" cache \ || return 1 - done + done && + + # Disabling the sparse-index removes tree entries with full ones + git -C sparse-index sparse-checkout init --no-sparse-index && + + 
test-tool -C sparse-index read-cache --table >cache && + ! grep "040000 tree" cache && + test_sparse_match test-tool read-cache --table ' test_expect_success 'expanded in-memory index matches full index' ' @@ -396,19 +405,15 @@ test_expect_success 'submodule handling' ' test_expect_success 'sparse-index is expanded and converted back' ' init_repos && - ( - GIT_TEST_SPARSE_INDEX=1 && - export GIT_TEST_SPARSE_INDEX && - GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index -c core.fsmonitor="" reset --hard && - test_region index convert_to_sparse trace2.txt && - test_region index ensure_full_index trace2.txt && - - rm trace2.txt && - GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index -c core.fsmonitor="" status -uno && - test_region index ensure_full_index trace2.txt - ) + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" reset --hard && + test_region index convert_to_sparse trace2.txt && + test_region index ensure_full_index trace2.txt && + + rm trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ + git -C sparse-index -c core.fsmonitor="" status -uno && + test_region index ensure_full_index trace2.txt ' test_done From patchwork Tue Mar 16 16:43:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12142965 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
Message-Id: <598557f90a2a6d2a8656ff60179adcd260074771.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:43:00 +0000
Subject: [PATCH v3 17/20] sparse-checkout: disable sparse-index
From: Derrick Stolee

From: Derrick Stolee

We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c          | 10 +++++++++-
 t/t1091-sparse-checkout-builtin.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index ca63e2c64e95..585343fa1972 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -280,6 +280,9 @@ static int set_config(enum sparse_checkout_mode mode)
 				      "core.sparseCheckoutCone",
 				      mode == MODE_CONE_PATTERNS ? "true" : NULL);

+	if (mode == MODE_NO_PATTERNS)
+		set_sparse_index_config(the_repository, 0);
+
 	return 0;
 }

@@ -341,10 +344,11 @@ static int sparse_checkout_init(int argc, const char **argv)
 		the_repository->index->updated_workdir = 1;
 	}

+	core_apply_sparse_checkout = 1;
+
 	/* If we already have a sparse-checkout file, use it. */
 	if (res >= 0) {
 		free(sparse_filename);
-		core_apply_sparse_checkout = 1;
 		return update_working_directory(NULL);
 	}

@@ -366,6 +370,7 @@ static int sparse_checkout_init(int argc, const char **argv)
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
 	strbuf_addstr(&pattern, "!/*/");
 	add_pattern(strbuf_detach(&pattern, NULL), empty_base, 0, &pl, 0);
+	pl.use_cone_patterns = init_opts.cone_mode;

 	return write_patterns_and_update(&pl);
 }
@@ -632,6 +637,9 @@ static int sparse_checkout_disable(int argc, const char **argv)
 	strbuf_addstr(&match_all, "/*");
 	add_pattern(strbuf_detach(&match_all, NULL), empty_base, 0, &pl, 0);

+	prepare_repo_settings(the_repository);
+	the_repository->settings.sparse_index = 0;
+
 	if (update_working_directory(&pl))
 		die(_("error while refreshing working directory"));

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index fc64e9ed99f4..ff1ad570a255 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -205,6 +205,19 @@ test_expect_success 'sparse-checkout disable' '
 	check_files repo a deep folder1 folder2
 '

+test_expect_success 'sparse-index enabled and disabled' '
+	git -C repo sparse-checkout init --cone --sparse-index &&
+	test_cmp_config -C repo true extensions.sparseIndex &&
+	test-tool -C repo read-cache --table >cache &&
+	grep " tree " cache &&
+
+	git -C repo sparse-checkout disable &&
+	test-tool -C repo read-cache --table >cache &&
+	! grep " tree " cache &&
+	git -C repo config --list >config &&
+	! grep extensions.sparseindex config
+'
+
 test_expect_success 'cone mode: init and set' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo config --list >config &&

From patchwork Tue Mar 16 16:43:01 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142969
Date: Tue, 16 Mar 2021 16:43:01 +0000
Subject: [PATCH v3 18/20] cache-tree: integrate with sparse directory entries
From: Derrick Stolee

From: Derrick Stolee

The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Signed-off-by: Derrick Stolee
---
 cache-tree.c   | 18 ++++++++++++++++++
 sparse-index.c | 10 +++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 5f07a39e501e..950a9615db8f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -256,6 +256,24 @@ static int update_one(struct cache_tree *it,

 	*skip_count = 0;

+	/*
+	 * If the first entry of this region is a sparse directory
+	 * entry corresponding exactly to 'base', then this cache_tree
+	 * struct is a "leaf" in the data structure, pointing to the
+	 * tree OID specified in the entry.
+	 */
+	if (entries > 0) {
+		const struct cache_entry *ce = cache[0];
+
+		if (S_ISSPARSEDIR(ce->ce_mode) &&
+		    ce->ce_namelen == baselen &&
+		    !strncmp(ce->name, base, baselen)) {
+			it->entry_count = 1;
+			oidcpy(&it->oid, &ce->oid);
+			return 1;
+		}
+	}
+
 	if (0 <= it->entry_count && has_object_file(&it->oid))
 		return it->entry_count;

diff --git a/sparse-index.c b/sparse-index.c
index 30c1a11fd62d..56313e805d9d 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -180,7 +180,11 @@ int convert_to_sparse(struct index_state *istate)
 	istate->cache_nr = convert_to_sparse_rec(istate,
 						 0, 0, istate->cache_nr,
 						 "", 0, istate->cache_tree);
-	istate->drop_cache_tree = 1;
+
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	istate->sparse_index = 1;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
@@ -281,5 +285,9 @@ void ensure_full_index(struct index_state *istate)
 	strbuf_release(&base);
 	free(full);

+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }

From patchwork Tue Mar 16 16:43:02 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142971
Message-Id: <6fdd9323c14ea42f805eaa1ace525d268fc3438c.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:43:02 +0000
Subject: [PATCH v3 19/20] sparse-index: loose integration with cache_tree_verify()
From: Derrick Stolee

From: Derrick Stolee

The cache_tree_verify() method is run when GIT_TEST_CHECK_CACHE_TREE
is enabled, which it is by default in the test suite. The logic must
be adjusted for the presence of these directory entries.

For now, leave the test as a simple check for whether the directory
entry is sparse. Do not go any further until needed.

This allows us to re-enable GIT_TEST_CHECK_CACHE_TREE in
t1092-sparse-checkout-compatibility.sh. Further, p2000-sparse-operations.sh
uses the test suite and hence this is enabled for all tests. We need to
integrate with it before we run our performance tests with a
sparse-index.

Signed-off-by: Derrick Stolee
---
 cache-tree.c                             | 19 +++++++++++++++++++
 t/t1092-sparse-checkout-compatibility.sh |  3 ---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 950a9615db8f..11bf1fcae6e1 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -808,6 +808,19 @@ int cache_tree_matches_traversal(struct cache_tree *root,
 	return 0;
 }

+static void verify_one_sparse(struct repository *r,
+			      struct index_state *istate,
+			      struct cache_tree *it,
+			      struct strbuf *path,
+			      int pos)
+{
+	struct cache_entry *ce = istate->cache[pos];
+
+	if (!S_ISSPARSEDIR(ce->ce_mode))
+		BUG("directory '%s' is present in index, but not sparse",
+		    path->buf);
+}
+
 static void verify_one(struct repository *r,
 		       struct index_state *istate,
 		       struct cache_tree *it,
@@ -830,6 +843,12 @@ static void verify_one(struct repository *r,

 	if (path->len) {
 		pos = index_name_pos(istate, path->buf, path->len);
+
+		if (pos >= 0) {
+			verify_one_sparse(r, istate, it, path, pos);
+			return;
+		}
+
 		pos = -pos - 1;
 	} else {
 		pos = 0;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index f14dc48924d2..d97bf9b64527 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -2,9 +2,6 @@

 test_description='compare full workdir to sparse workdir'

-# The verify_cache_tree() check is not sparse-aware (yet).
-# So, disable the check until that integration is complete.
-GIT_TEST_CHECK_CACHE_TREE=0
 GIT_TEST_SPLIT_INDEX=0
 GIT_TEST_SPARSE_INDEX=


From patchwork Tue Mar 16 16:43:03 2021
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12142973
Message-Id: <3db06ac46dd5c61e83d7fc4747615d616fdbbdda.1615912983.git.gitgitgadget@gmail.com>
Date: Tue, 16 Mar 2021 16:43:03 +0000
Subject: [PATCH v3 20/20] p2000: add sparse-index repos
From: Derrick Stolee

From: Derrick Stolee

p2000-sparse-operations.sh compares different Git commands in
repositories with many files at HEAD but using sparse-checkout to focus
on a small portion of those files.

Add extra copies of the repository that use the sparse-index format so
we can track how that affects the performance of different commands.

At this point in time, the sparse-index is 100% overhead from the CPU
front, and this is measurable in these tests:

Test
---------------------------------------------------------------
2000.2: git status (full-index-v3)              0.59(0.51+0.12)
2000.3: git status (full-index-v4)              0.59(0.52+0.11)
2000.4: git status (sparse-index-v3)            1.40(1.32+0.12)
2000.5: git status (sparse-index-v4)            1.41(1.36+0.08)
2000.6: git add -A (full-index-v3)              2.32(1.97+0.19)
2000.7: git add -A (full-index-v4)              2.17(1.92+0.14)
2000.8: git add -A (sparse-index-v3)            2.31(2.21+0.15)
2000.9: git add -A (sparse-index-v4)            2.30(2.20+0.13)
2000.10: git add . (full-index-v3)              2.39(2.02+0.20)
2000.11: git add . (full-index-v4)              2.20(1.94+0.16)
2000.12: git add . (sparse-index-v3)            2.36(2.27+0.12)
2000.13: git add . (sparse-index-v4)            2.33(2.21+0.16)
2000.14: git commit -a -m A (full-index-v3)     2.47(2.12+0.20)
2000.15: git commit -a -m A (full-index-v4)     2.26(2.00+0.17)
2000.16: git commit -a -m A (sparse-index-v3)   3.01(2.92+0.16)
2000.17: git commit -a -m A (sparse-index-v4)   3.01(2.94+0.15)

Note that there is very little difference between the v3 and v4 index
formats when the sparse-index is enabled. This is primarily due to the
fact that the relative file sizes are the same, and the command time is
mostly taken up by parsing tree objects to expand the sparse index into
a full one.
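
To make the overhead in the timing table above easier to read, the
wall-clock times can be expressed as sparse/full ratios. The following
is only a rough sketch: the `slowdown` helper is a hypothetical name
used for illustration (it is not part of t/perf), and the inputs are
the v3 wall-clock values copied from the table.

```shell
#!/bin/sh
# slowdown FULL SPARSE: print SPARSE/FULL rounded to two decimals.
# Hypothetical helper for illustration only; not part of the test suite.
slowdown () {
	awk -v full="$1" -v sparse="$2" 'BEGIN { printf "%.2f\n", sparse / full }'
}

# Wall-clock seconds for the v3 rows of the timing table.
echo "git status:         $(slowdown 0.59 1.40)x"
echo "git add -A:         $(slowdown 2.32 2.31)x"
echo "git add .:          $(slowdown 2.39 2.36)x"
echo "git commit -a -m A: $(slowdown 2.47 3.01)x"
```

Under these assumptions, 'git status' and 'git commit -a' show the
largest ratios (roughly 2.4x and 1.2x), while both 'git add' forms stay
close to 1x.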

With the current file layout, the index file sizes are given by this
table:

       |  full index | sparse index |
       +-------------+--------------+
    v3 |     108 MiB |      1.6 MiB |
    v4 |      80 MiB |      1.2 MiB |

Future updates will improve the performance of Git commands when the
index is sparse.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 2fbc81b22119..e527316e66d6 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -60,12 +60,29 @@ test_expect_success 'setup repo and indexes' '
 		git sparse-checkout set $SPARSE_CONE &&
 		git config index.version 4 &&
 		git update-index --index-version=4
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 &&
+	(
+		cd sparse-index-v3 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 3 &&
+		git update-index --index-version=3
+	) &&
+	git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 &&
+	(
+		cd sparse-index-v4 &&
+		git sparse-checkout init --cone --sparse-index &&
+		git sparse-checkout set $SPARSE_CONE &&
+		git config index.version 4 &&
+		git update-index --index-version=4
 	)
 '

 test_perf_on_all () {
 	command="$@"
-	for repo in full-index-v3 full-index-v4
+	for repo in full-index-v3 full-index-v4 \
+		    sparse-index-v3 sparse-index-v4
 	do
 		test_perf "$command ($repo)" "
 			(