From patchwork Thu May 19 17:52:29 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855865
Date: Thu, 19 May 2022 17:52:29 +0000
Subject: [PATCH v2 01/10] t1092: refactor 'sparse-index contents' test
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.
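
As a rough illustration of the check this helper performs, the sketch
below runs the same two loops against a hand-written listing instead of
a real repository. The listing contents, the placeholder object IDs, and
the names has_sparse_dir and check_layout are assumptions made for this
example; they are not part of t1092.

```shell
#!/bin/sh
# Sketch only: this listing imitates `git ls-files --sparse --stage` output.
# The object IDs are placeholders, not real trees or blobs.
cache=$(mktemp)
trap 'rm -f "$cache"' EXIT
{
	printf '100644 %s 0\ta\n' 1111111111111111111111111111111111111111
	printf '040000 %s 0\tdeep/\n' 2222222222222222222222222222222222222222
	printf '100644 %s 0\tfolder1/a\n' 3333333333333333333333333333333333333333
} >"$cache"

# has_sparse_dir <dir> (hypothetical helper): succeed when <dir> is listed
# as a sparse directory entry, i.e. mode 040000 and a trailing slash.
has_sparse_dir () {
	awk -F '\t' -v dir="$1/" '
		$1 ~ /^040000 / && $2 == dir { found = 1 }
		END { exit !found }
	' "$cache"
}

# check_layout "<cone dirs>" "<sparse dirs>" (hypothetical helper): mirrors
# the two loops of the test helper without touching a repository.
check_layout () {
	cone_dirs=$1
	sparse_dirs=$2
	for dir in $sparse_dirs
	do
		has_sparse_dir "$dir" || return 1
	done
	for dir in $cone_dirs
	do
		! has_sparse_dir "$dir" || return 1
	done
}

check_layout "folder1" "deep" && echo "layout matches"
check_layout "deep" "folder1" || echo "layout differs"
```

The real helper takes its listing from 'git -C sparse-index ls-files
--sparse --stage' and also matches the tree object ID reported by
rev-parse, which this sketch leaves out.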

Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 53 ++++++++++++++++--------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 93bcfd20bbc..e7c0ae9b953 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -205,36 +205,53 @@ test_sparse_unstaged () {
 	done
 }
 
-test_expect_success 'sparse-index contents' '
-	init_repos &&
-
+# Usage: test_sparse_checkout_set " ... " " ... "
+# Verifies that "git sparse-checkout set ... " succeeds and
+# leaves the sparse index in a state where ... are sparse
+# directories (and ... are not).
+test_sparse_checkout_set () {
+	CONE_DIRS=$1 &&
+	SPARSE_DIRS=$2 &&
+	git -C sparse-index sparse-checkout set $CONE_DIRS &&
 	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in folder1 folder2 x
+
+	# Check that the directories outside of the sparse-checkout cone
+	# have sparse directory entries.
+	for dir in $SPARSE_DIRS
 	do
 		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
 		grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
 	done &&
 
-	git -C sparse-index sparse-checkout set folder1 &&
-
-	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in deep folder2 x
+	# Check that the directories in the sparse-checkout cone
+	# are not sparse directory entries.
+	for dir in $CONE_DIRS
 	do
 		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 $TREE 0	$dir/" cache \
+		! grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
-	done &&
+	done
+}
 
-	git -C sparse-index sparse-checkout set deep/deeper1 &&
+test_expect_success 'sparse-index contents' '
+	init_repos &&
 
-	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in deep/deeper2 folder1 folder2 x
-	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 $TREE 0	$dir/" cache \
-			|| return 1
-	done &&
+	# Remove deep, add three other directories.
+	test_sparse_checkout_set \
+		"folder1 folder2 x" \
+		"before deep" &&
+
+	# Remove folder1, add deep
+	test_sparse_checkout_set \
+		"deep folder2 x" \
+		"before folder1" &&
+
+	# Replace deep with deep/deeper2 (dropping deep/deeper1)
+	# Add folder1
+	test_sparse_checkout_set \
+		"deep/deeper2 folder1 folder2 x" \
+		"before deep/deeper1" &&
 
 	# Disabling the sparse-index replaces tree entries with full ones
 	git -C sparse-index sparse-checkout init --no-sparse-index &&

From patchwork Thu May 19 17:52:30 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855861
Message-Id: <5030eeecf4f8e2dd65bc055d6a720c7d67015b1e.1652982758.git.gitgitgadget@gmail.com>
Date: Thu, 19 May 2022 17:52:30 +0000
Subject: [PATCH v2 02/10] t1092: stress test 'git sparse-checkout set'
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The 'sparse-index contents' test checks that the sparse
index has the correct set of sparse directories after modifying the
cone mode patterns using 'git sparse-checkout set'. Add to the coverage
here by adding more complicated scenarios that were not previously
tested.

In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:

1. Add the --skip-checks argument to the 'set' command to avoid
   failures when passing paths that do not exist at HEAD.

2. When looking for the non-existence of sparse directories for the
   paths in $CONE_DIRS, allow the rev-parse command to fail because
   the path does not exist at HEAD.

This allows us to add some interesting test cases.

Helped-by: Victoria Dye
Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index e7c0ae9b953..785820f9fd5 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -212,7 +212,7 @@ test_sparse_unstaged () {
 test_sparse_checkout_set () {
 	CONE_DIRS=$1 &&
 	SPARSE_DIRS=$2 &&
-	git -C sparse-index sparse-checkout set $CONE_DIRS &&
+	git -C sparse-index sparse-checkout set --skip-checks $CONE_DIRS &&
 	git -C sparse-index ls-files --sparse --stage >cache &&
 
 	# Check that the directories outside of the sparse-checkout cone
@@ -228,7 +228,9 @@ test_sparse_checkout_set () {
 	# are not sparse directory entries.
 	for dir in $CONE_DIRS
 	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
+		# Allow TREE to not exist because
+		# $dir does not exist at HEAD.
+		TREE=$(git -C sparse-index rev-parse HEAD:$dir) ||
 		! grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
 	done
@@ -253,6 +255,19 @@ test_expect_success 'sparse-index contents' '
 		"deep/deeper2 folder1 folder2 x" \
 		"before deep/deeper1" &&
 
+	# Replace deep/deeper2 with deep/deeper1
+	# Replace folder1 with folder1/0/0
+	# Replace folder2 with non-existent folder2/2/3
+	# Add non-existent "bogus"
+	test_sparse_checkout_set \
+		"bogus deep/deeper1 folder1/0/0 folder2/2/3 x" \
+		"before deep/deeper2 folder2/0" &&
+
+	# Drop down to only files at root
+	test_sparse_checkout_set \
+		"" \
+		"before deep folder1 folder2 x" &&
+
 	# Disabling the sparse-index replaces tree entries with full ones
 	git -C sparse-index sparse-checkout init --no-sparse-index &&
 	test_sparse_match git ls-files --stage --sparse

From patchwork Thu May 19 17:52:31 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855862
Date: Thu, 19 May 2022 17:52:31 +0000
Subject: [PATCH v2 03/10] sparse-index: create expand_to_pattern_list()
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

This is the first change in a series to allow modifying the
sparse-checkout pattern set without expanding a sparse index to a full
one in the process. Here, we focus on the problem of expanding the
pattern set through a command like 'git sparse-checkout add ' which
needs to create new index entries for the paths now being written to the
worktree.

To achieve this, we need to be able to replace sparse directory entries
with their contained files and subdirectories. Once this is complete,
other code paths can discover those cache entries and write the
corresponding files to disk before committing the index.

We already have logic in ensure_full_index() that expands the index
entries, so we will use that as our base. Create a new method,
expand_to_pattern_list(), which takes a pattern list, but for now mostly
ignores it. The current implementation is only correct when the pattern
list is NULL as that does the same as ensure_full_index(). In fact,
ensure_full_index() is converted to a shim over
expand_to_pattern_list().

A future update will actually implement expand_to_pattern_list() to its
full capabilities. For now, it is created and documented.

Signed-off-by: Derrick Stolee
---
 sparse-index.c | 35 ++++++++++++++++++++++++++++++++---
 sparse-index.h | 14 ++++++++++++++
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..2a06ef58051 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -248,19 +248,41 @@ static int add_path_to_index(const struct object_id *oid,
 	return 0;
 }
 
-void ensure_full_index(struct index_state *istate)
+void expand_to_pattern_list(struct index_state *istate,
+			    struct pattern_list *pl)
 {
 	int i;
 	struct index_state *full;
 	struct strbuf base = STRBUF_INIT;
 
+	/*
+	 * If the index is already full, then keep it full. We will convert
+	 * it to a sparse index on write, if possible.
+	 */
 	if (!istate || !istate->sparse_index)
 		return;
 
+	/*
+	 * If our index is sparse, but our new pattern set does not use
+	 * cone mode patterns, then we need to expand the index before we
+	 * continue. A NULL pattern set indicates a full expansion to a
+	 * full index.
+	 */
+	if (pl && !pl->use_cone_patterns)
+		pl = NULL;
+
 	if (!istate->repo)
 		istate->repo = the_repository;
 
-	trace2_region_enter("index", "ensure_full_index", istate->repo);
+	/*
+	 * A NULL pattern set indicates we are expanding to a full index, so
+	 * we use a special region name that indicates the full expansion.
+	 * This is used by test cases, but also helps to differentiate the
+	 * two cases.
+	 */
+	trace2_region_enter("index",
+			    pl ? "expand_to_pattern_list" : "ensure_full_index",
+			    istate->repo);
 
 	/* initialize basics of new index */
 	full = xcalloc(1, sizeof(struct index_state));
@@ -322,7 +344,14 @@ void ensure_full_index(struct index_state *istate)
 	cache_tree_free(&istate->cache_tree);
 	cache_tree_update(istate, 0);
 
-	trace2_region_leave("index", "ensure_full_index", istate->repo);
+	trace2_region_leave("index",
+			    pl ? "expand_to_pattern_list" : "ensure_full_index",
+			    istate->repo);
+}
+
+void ensure_full_index(struct index_state *istate)
+{
+	expand_to_pattern_list(istate, NULL);
 }
 
 void ensure_correct_sparsity(struct index_state *istate)
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..037b541f49d 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -23,4 +23,18 @@ void expand_to_path(struct index_state *istate,
 struct repository;
 int set_sparse_index_config(struct repository *repo, int enable);
 
+struct pattern_list;
+
+/**
+ * Scan the given index and compare its entries to the given pattern list.
+ * If the index is sparse and the pattern list uses cone mode patterns,
+ * then modify the index to contain all of the file entries within that
+ * new pattern list. This expands sparse directories only as far as needed.
+ *
+ * If the pattern list is NULL or does not use cone mode patterns, then the
+ * index is expanded to a full index.
+ */
+void expand_to_pattern_list(struct index_state *istate,
+			    struct pattern_list *pl);
+
 #endif

From patchwork Thu May 19 17:52:32 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855864
Message-Id: <269c206c331bb43006678beaa20832a75754c3df.1652982758.git.gitgitgadget@gmail.com>
Date: Thu, 19 May 2022 17:52:32 +0000
Subject: [PATCH v2 04/10] sparse-index: introduce partially-sparse indexes
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

A future change will present a temporary, in-memory mode where the index
can contain sparse directory entries without being completely collapsed
to the smallest possible set of sparse directories. This will be
necessary for modifying the sparse-checkout definition while using a
sparse index.

For now, convert the single-bit member 'sparse_index' in 'struct
index_state' to be an 'enum sparse_index_mode' with three modes:

* COMPLETELY_FULL (0): No sparse directories exist.

* COLLAPSED (1): Sparse directories may exist. Files outside the
  sparse-checkout cone are reduced to sparse directory entries whenever
  possible.

* PARTIALLY_SPARSE (2): Sparse directories may exist. Some file entries
  outside the sparse-checkout cone may exist. Running
  convert_to_sparse() may further reduce those files to sparse directory
  entries.

The main reason to store this extra information is to allow
convert_to_sparse() to short-circuit when the index is already in
COLLAPSED mode but to actually do the necessary work when in
PARTIALLY_SPARSE mode.

The PARTIALLY_SPARSE mode will be used in an upcoming change.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c |  2 +-
 cache.h                   | 32 ++++++++++++++++++++++++--------
 read-cache.c              |  6 +++---
 sparse-index.c            |  6 +++---
 4 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 0217d44c5b1..88eea069ad4 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -128,7 +128,7 @@ static void clean_tracked_sparse_directories(struct repository *r)
 	 * sparse index will not delete directories that contain
 	 * conflicted entries or submodules.
 	 */
-	if (!r->index->sparse_index) {
+	if (r->index->sparse_index == COMPLETELY_FULL) {
 		/*
 		 * If something, such as a merge conflict or other concern,
 		 * prevents us from converting to a sparse index, then do
diff --git a/cache.h b/cache.h
index 6226f6a8a53..2d067aca2fd 100644
--- a/cache.h
+++ b/cache.h
@@ -310,6 +310,28 @@ struct untracked_cache;
 struct progress;
 struct pattern_list;
 
+enum sparse_index_mode {
+	/*
+	 * COMPLETELY_FULL: there are no sparse directories
+	 * in the index at all.
+	 */
+	COMPLETELY_FULL = 0,
+
+	/*
+	 * COLLAPSED: the index has already been collapsed to sparse
+	 * directories wherever possible.
+	 */
+	COLLAPSED = 1,
+
+	/*
+	 * PARTIALLY_SPARSE: the sparse directories that exist are
+	 * outside the sparse-checkout boundary, but it is possible
+	 * that some file entries could collapse to sparse directory
+	 * entries.
+	 */
+	PARTIALLY_SPARSE = 2,
+};
+
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int version;
@@ -323,14 +345,8 @@ struct index_state {
 		 drop_cache_tree : 1,
 		 updated_workdir : 1,
 		 updated_skipworktree : 1,
-		 fsmonitor_has_run_once : 1,
-
-		 /*
-		  * sparse_index == 1 when sparse-directory
-		  * entries exist. Requires sparse-checkout
-		  * in cone mode.
-		  */
-		 sparse_index : 1;
+		 fsmonitor_has_run_once : 1;
+	enum sparse_index_mode sparse_index;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	struct object_id oid;
diff --git a/read-cache.c b/read-cache.c
index 4df97e185e9..cb9b33169fd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -112,7 +112,7 @@ static const char *alternate_index_output;
 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
 	if (S_ISSPARSEDIR(ce->ce_mode))
-		istate->sparse_index = 1;
+		istate->sparse_index = COLLAPSED;
 
 	istate->cache[nr] = ce;
 	add_name_hash(istate, ce);
@@ -1856,7 +1856,7 @@ static int read_index_extension(struct index_state *istate,
 		break;
 	case CACHE_EXT_SPARSE_DIRECTORIES:
 		/* no content, only an indicator */
-		istate->sparse_index = 1;
+		istate->sparse_index = COLLAPSED;
 		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
@@ -3149,7 +3149,7 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 				 unsigned flags)
 {
 	int ret;
-	int was_full = !istate->sparse_index;
+	int was_full = istate->sparse_index == COMPLETELY_FULL;
 
 	ret = convert_to_sparse(istate, 0);
 
diff --git a/sparse-index.c b/sparse-index.c
index 2a06ef58051..c2cd3bdb614 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -173,7 +173,7 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	 * If the index is already sparse, empty, or otherwise
 	 * cannot be converted to sparse, do not convert.
 	 */
-	if (istate->sparse_index || !istate->cache_nr ||
+	if (istate->sparse_index == COLLAPSED || !istate->cache_nr ||
 	    !is_sparse_index_allowed(istate, flags))
 		return 0;
 
@@ -214,7 +214,7 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	FREE_AND_NULL(istate->fsmonitor_dirty);
 	FREE_AND_NULL(istate->fsmonitor_last_update);
 
-	istate->sparse_index = 1;
+	istate->sparse_index = COLLAPSED;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
 }
@@ -259,7 +259,7 @@ void expand_to_pattern_list(struct index_state *istate,
 	 * If the index is already full, then keep it full. We will convert
 	 * it to a sparse index on write, if possible.
 	 */
-	if (!istate || !istate->sparse_index)
+	if (!istate || istate->sparse_index == COMPLETELY_FULL)
 		return;
 
 	/*

From patchwork Thu May 19 17:52:33 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855866
Date: Thu, 19 May 2022 17:52:33 +0000
Subject: [PATCH v2 05/10] cache-tree: implement cache_tree_find_path()
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

Given a 'struct cache_tree', it may be beneficial to navigate directly
to the node within it that corresponds to a given path name. Create
cache_tree_find_path() for this purpose. It returns NULL when no such
path exists.

The implementation is adapted from do_invalidate_path() which does a
similar search but also modifies the nodes it finds along the way. This
new method is not currently used, but will be in an upcoming change.

Signed-off-by: Derrick Stolee
---
 cache-tree.c | 24 ++++++++++++++++++++++++
 cache-tree.h |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 6752f69d515..23893a7b113 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -100,6 +100,30 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *it, const char *path)
 	return find_subtree(it, path, pathlen, 1);
 }
 
+struct cache_tree *cache_tree_find_path(struct cache_tree *it, const char *path)
+{
+	const char *slash;
+	int namelen;
+	struct cache_tree_sub *down;
+
+	if (!it)
+		return NULL;
+	slash = strchrnul(path, '/');
+	namelen = slash - path;
+	it->entry_count = -1;
+	if (!*slash) {
+		int pos;
+		pos = cache_tree_subtree_pos(it, path, namelen);
+		if (0 <= pos)
+			return it->down[pos]->cache_tree;
+		return NULL;
+	}
+	down = find_subtree(it, path, namelen, 0);
+	if (down)
+		return cache_tree_find_path(down->cache_tree, slash + 1);
+	return NULL;
+}
+
 static int do_invalidate_path(struct cache_tree *it, const char *path)
 {
 	/* a/b/c
diff --git a/cache-tree.h b/cache-tree.h
index 8efeccebfc9..f75f8e74dcd 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -29,6 +29,8 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 
 int cache_tree_subtree_pos(struct cache_tree *it, const char *path, int pathlen);
 
+struct cache_tree *cache_tree_find_path(struct cache_tree *it, const char *path);
+
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
 

From patchwork Thu May 19 17:52:34 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855868
Date: Thu, 19 May 2022 17:52:34 +0000
Subject: [PATCH v2 06/10] sparse-checkout: --no-sparse-index needs a full index
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com,
    Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

When the --no-sparse-index option is supplied, the sparse-checkout
builtin should explicitly ask to expand a sparse index to a full one.
This is currently done implicitly due to the command_requires_full_index
protection, but that will be removed in an upcoming change.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 88eea069ad4..cbff6ad00b0 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -413,6 +413,9 @@ static int update_modes(int *cone_mode, int *sparse_index)
 		/* force an index rewrite */
 		repo_read_index(the_repository);
 		the_repository->index->updated_workdir = 1;
+
+		if (!*sparse_index)
+			ensure_full_index(the_repository->index);
 	}
 
 	return 0;

From patchwork Thu May 19 17:52:35 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855867
Message-Id: <346c56bf2560c5a89850ef4f8a58fbe17cde10fc.1652982759.git.gitgitgadget@gmail.com>
Date: Thu, 19 May 2022 17:52:35 +0000
Subject: [PATCH v2 07/10] sparse-index: partially expand directories
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

The expand_to_pattern_list() method expands sparse directory entries to
their list of contained files when either the pattern list is NULL or
the directory is contained in the new pattern list's cone mode patterns.

It is possible that the pattern list has a recursive match with a
directory 'A/B/C/' and so an existing sparse directory 'A/B/' would need
to be expanded. If there exists a directory 'A/B/D/', then that
directory should not be expanded and instead we can create a sparse
directory.

To implement this, we plug into the add_path_to_index() callback for the
call to read_tree_at(). Since we now need access to both the index we
are writing and the pattern list we are comparing, create a 'struct
modify_index_context' to use as a data transfer object. It is important
that we use the given pattern list since we will use this pattern list
to change the sparse-checkout patterns and cannot use
istate->sparse_checkout_patterns.

Signed-off-by: Derrick Stolee
---
 sparse-index.c | 46 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 39 insertions(+), 7 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index c2cd3bdb614..73b82e5017b 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -9,6 +9,11 @@
 #include "dir.h"
 #include "fsmonitor.h"
 
+struct modify_index_context {
+	struct index_state *write;
+	struct pattern_list *pl;
+};
+
 static struct cache_entry *construct_sparse_dir_entry(
 				struct index_state *istate,
 				const char *sparse_dir,
@@ -231,18 +236,41 @@ static int add_path_to_index(const struct object_id *oid,
 			     struct strbuf *base, const char *path,
 			     unsigned int mode, void *context)
 {
-	struct index_state *istate = (struct index_state *)context;
+	struct modify_index_context *ctx = (struct modify_index_context *)context;
 	struct cache_entry *ce;
 	size_t len = base->len;
 
-	if (S_ISDIR(mode))
-		return READ_TREE_RECURSIVE;
+	if (S_ISDIR(mode)) {
+		int dtype;
+		size_t baselen = base->len;
+		if (!ctx->pl)
+			return READ_TREE_RECURSIVE;
 
-	strbuf_addstr(base, path);
+		/*
+		 * Have we expanded to a point outside of the sparse-checkout?
+		 */
+		strbuf_addstr(base, path);
+		strbuf_add(base, "/-", 2);
+
+		if (path_matches_pattern_list(base->buf, base->len,
+					      NULL, &dtype,
+					      ctx->pl, ctx->write)) {
+			strbuf_setlen(base, baselen);
+			return READ_TREE_RECURSIVE;
+		}
 
-	ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
+		/*
+		 * The path "{base}{path}/" is a sparse directory. Create the correct
+		 * name for inserting the entry into the index.
+		 */
+		strbuf_setlen(base, base->len - 1);
+	} else {
+		strbuf_addstr(base, path);
+	}
+
+	ce = make_cache_entry(ctx->write, mode, oid, base->buf, 0, 0);
 	ce->ce_flags |= CE_SKIP_WORKTREE | CE_EXTENDED;
-	set_index_entry(istate, istate->cache_nr++, ce);
+	set_index_entry(ctx->write, ctx->write->cache_nr++, ce);
 
 	strbuf_setlen(base, len);
 	return 0;
@@ -254,6 +282,7 @@ void expand_to_pattern_list(struct index_state *istate,
 	int i;
 	struct index_state *full;
 	struct strbuf base = STRBUF_INIT;
+	struct modify_index_context ctx;
 
 	/*
 	 * If the index is already full, then keep it full. We will convert
@@ -294,6 +323,9 @@ void expand_to_pattern_list(struct index_state *istate,
 	full->cache_nr = 0;
 	ALLOC_ARRAY(full->cache, full->cache_alloc);
 
+	ctx.write = full;
+	ctx.pl = pl;
+
 	for (i = 0; i < istate->cache_nr; i++) {
 		struct cache_entry *ce = istate->cache[i];
 		struct tree *tree;
@@ -319,7 +351,7 @@ void expand_to_pattern_list(struct index_state *istate,
 		strbuf_add(&base, ce->name, strlen(ce->name));
 
 		read_tree_at(istate->repo, tree, &base, &ps,
-			     add_path_to_index, full);
+			     add_path_to_index, &ctx);
 
 		/* free directory entries. full entries are re-used */
 		discard_cache_entry(ce);

From patchwork Thu May 19 17:52:36 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855871
Date: Thu, 19 May 2022 17:52:36 +0000
Subject: [PATCH v2 08/10] sparse-index: complete partial expansion
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

To complete the implementation of expand_to_pattern_list(), we need to
detect when a sparse directory entry should remain sparse. This avoids a
full expansion, so we now need to use the PARTIALLY_SPARSE mode to
indicate this state.

There still are no callers to this method, but we will add one in the
next change.
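
To make the keep-or-expand decision concrete, here is a minimal,
self-contained sketch. It is not the code from this series:
toy_should_expand() and has_prefix() are hypothetical helpers, and a
plain prefix comparison against a single recursive cone directory stands
in for the real path_matches_pattern_list() logic. Under that
assumption, a sparse directory is expanded when it is an ancestor of the
cone directory or lies inside it; otherwise it stays a sparse directory
entry.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: does `s` start with `prefix`? */
static int has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Toy stand-in for the cone-mode check (an assumption, not git's API).
 * Both arguments are directory names with a trailing slash, so a plain
 * prefix test respects path component boundaries.
 *
 * Returns 1 if the sparse directory `dir` must be expanded for the
 * recursive cone directory `cone`, and 0 if it can remain sparse.
 */
int toy_should_expand(const char *dir, const char *cone)
{
	assert(dir && cone);
	/* `dir` is an ancestor of (or equal to) the cone directory ... */
	if (has_prefix(cone, dir))
		return 1;
	/* ... or `dir` lies somewhere inside the cone directory. */
	return has_prefix(dir, cone);
}
```

With the cone 'A/B/C/', the sparse directory 'A/B/' must be expanded,
while 'A/B/D/' can remain sparse, which mirrors the example given in the
previous patch.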

Signed-off-by: Derrick Stolee
---
 sparse-index.c | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 73b82e5017b..a65169030a2 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -297,8 +297,24 @@ void expand_to_pattern_list(struct index_state *istate,
 	 * continue. A NULL pattern set indicates a full expansion to a
 	 * full index.
 	 */
-	if (pl && !pl->use_cone_patterns)
+	if (pl && !pl->use_cone_patterns) {
 		pl = NULL;
+	} else {
+		/*
+		 * We might contract file entries into sparse-directory
+		 * entries, and for that we will need the cache tree to
+		 * be recomputed.
+		 */
+		cache_tree_free(&istate->cache_tree);
+
+		/*
+		 * If there is a problem creating the cache tree, then we
+		 * need to expand to a full index since we cannot satisfy
+		 * the current request as a sparse index.
+		 */
+		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+			pl = NULL;
+	}
 
 	if (!istate->repo)
 		istate->repo = the_repository;
@@ -317,8 +333,14 @@ void expand_to_pattern_list(struct index_state *istate,
 	full = xcalloc(1, sizeof(struct index_state));
 	memcpy(full, istate, sizeof(struct index_state));
 
+	/*
+	 * This slightly-misnamed 'full' index might still be sparse if we
+	 * are only modifying the list of sparse directories. This hinges
+	 * on whether we have a non-NULL pattern list.
+	 */
+	full->sparse_index = pl ? PARTIALLY_SPARSE : COMPLETELY_FULL;
+
 	/* then change the necessary things */
-	full->sparse_index = 0;
 	full->cache_alloc = (3 * istate->cache_alloc) / 2;
 	full->cache_nr = 0;
 	ALLOC_ARRAY(full->cache, full->cache_alloc);
@@ -330,11 +352,22 @@ void expand_to_pattern_list(struct index_state *istate,
 		struct cache_entry *ce = istate->cache[i];
 		struct tree *tree;
 		struct pathspec ps;
+		int dtype;
 
 		if (!S_ISSPARSEDIR(ce->ce_mode)) {
 			set_index_entry(full, full->cache_nr++, ce);
 			continue;
 		}
+
+		/* We now have a sparse directory entry. Should we expand? */
+		if (pl &&
+		    path_matches_pattern_list(ce->name, ce->ce_namelen,
+					      NULL, &dtype,
+					      pl, istate) == NOT_MATCHED) {
+			set_index_entry(full, full->cache_nr++, ce);
+			continue;
+		}
+
 		if (!(ce->ce_flags & CE_SKIP_WORKTREE))
 			warning(_("index entry is a directory, but not sparse (%08x)"),
 				ce->ce_flags);
@@ -360,7 +393,7 @@ void expand_to_pattern_list(struct index_state *istate,
 	/* Copy back into original index. */
 	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
 	memcpy(&istate->dir_hash, &full->dir_hash, sizeof(full->dir_hash));
-	istate->sparse_index = 0;
+	istate->sparse_index = pl ? PARTIALLY_SPARSE : COMPLETELY_FULL;
 	free(istate->cache);
 	istate->cache = full->cache;
 	istate->cache_nr = full->cache_nr;
@@ -374,7 +407,7 @@ void expand_to_pattern_list(struct index_state *istate,
 
 	/* Clear and recompute the cache-tree */
 	cache_tree_free(&istate->cache_tree);
-	cache_tree_update(istate, 0);
+	cache_tree_update(istate, WRITE_TREE_MISSING_OK);
 
 	trace2_region_leave("index",
 			    pl ? "expand_to_pattern_list" : "ensure_full_index",

From patchwork Thu May 19 17:52:37 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855869
Message-Id: <089ab086f584054a7b2bb8a2868a01c545e38d27.1652982759.git.gitgitgadget@gmail.com>
Date: Thu, 19 May 2022 17:52:37 +0000
Subject: [PATCH v2 09/10] p2000: add test for 'git sparse-checkout [add|set]'
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

The sparse-checkout builtin is almost completely integrated with the
sparse index, allowing the sparse-checkout boundary to be modified
without expanding a sparse index to a full one. Add a test to
p2000-sparse-operations.sh that adds a directory to the sparse-checkout
definition, then removes it.

Using both operations is important to ensure that the operation is doing
the same work in each repetition as well as leaving the test repo in a
good state for later tests.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 382716cfca9..ce5cfac5714 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -110,6 +110,7 @@ test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
 test_perf_on_all git checkout -f -
+test_perf_on_all "git sparse-checkout add f2/f3/f1 && git sparse-checkout set $SPARSE_CONE"
 test_perf_on_all git reset
 test_perf_on_all git reset --hard
 test_perf_on_all git reset -- does-not-exist

From patchwork Thu May 19 17:52:38 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12855870
Date: Thu, 19 May 2022 17:52:38 +0000
Subject: [PATCH v2 10/10] sparse-checkout: integrate with sparse index
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.

Insert a call to expand_to_pattern_list() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.

This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.

We can see the improved performance in the p2000 test script:

  Test                           HEAD~1            HEAD
  ------------------------------------------------------------------------
  2000.24: git ... (sparse-v3)   2.14(1.55+0.58)   1.57(1.03+0.53) -26.6%
  2000.25: git ... (sparse-v4)   2.20(1.62+0.57)   1.58(0.98+0.59) -28.2%

These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time shows how important it is
to complete this sparse index integration.
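
As a sanity check on the last column, the percentages can be recomputed
from the rounded wall-clock times in the table. This is only an
illustrative sketch: percent_change() is a hypothetical helper and not
part of the perf suite.

```c
#include <assert.h>

/*
 * Relative change from `before` to `after`, in percent.
 * A negative result means `after` is faster than `before`.
 */
double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

percent_change(2.14, 1.57) is roughly -26.6 and
percent_change(2.20, 1.58) is roughly -28.2, in line with the table.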

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c                |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++
 unpack-trees.c                           |  4 ++++
 3 files changed, 32 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index cbff6ad00b0..0157b292b36 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	if (argc > 0) {
 		if (!strcmp(argv[0], "list"))
 			return sparse_checkout_list(argc, argv);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 785820f9fd5..73f4cf47314 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1584,6 +1584,31 @@ test_expect_success 'ls-files' '
 	ensure_not_expanded ls-files --sparse
 '
 
+test_expect_success 'sparse index is not expanded: sparse-checkout' '
+	init_repos &&
+
+	ensure_not_expanded sparse-checkout set deep/deeper2 &&
+	ensure_not_expanded sparse-checkout set deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set deep &&
+	ensure_not_expanded sparse-checkout add folder1 &&
+	ensure_not_expanded sparse-checkout set deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set folder2 &&
+
+	# Demonstrate that the checks that "folder1/a" is a file
+	# do not cause a sparse-index expansion (since it is in the
+	# sparse-checkout cone).
+	echo >>sparse-index/folder2/a &&
+	git -C sparse-index add folder2/a &&
+
+	ensure_not_expanded sparse-checkout add folder1 &&
+
+	# Skip checks here, since deep/deeper1 is inside a sparse directory
+	# that must be expanded to check whether `deep/deeper1` is a file
+	# or not.
+	ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '
diff --git a/unpack-trees.c b/unpack-trees.c
index 7f528d35cc2..9745e0dfc34 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -18,6 +18,7 @@
 #include "promisor-remote.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "sparse-index.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 			goto skip_sparse_checkout;
 	}
 
+	/* Expand sparse directories as needed */
+	expand_to_pattern_list(o->src_index, o->pl);
+
 	/* Set NEW_SKIP_WORKTREE on existing entries. */
 	mark_all_ce_unused(o->src_index);
 	mark_new_skip_worktree(o->pl, o->src_index, 0,