From patchwork Mon May 23 13:48:37 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859099
Date: Mon, 23 May 2022 13:48:37 +0000
Subject: [PATCH v3 01/10] t1092: refactor 'sparse-index contents' test
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.

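As a rough illustration of what the helper looks for (a sketch, not code
from this series): a sparse directory entry in 'git ls-files --sparse
--stage' output has tree mode 040000 and a path ending in '/'. The
function name is_sparse_dir_entry() is hypothetical, and unlike the grep
in the test it does not compare the tree OID.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): returns true when `line`, one
 * line of `git ls-files --sparse --stage` output, is a sparse directory
 * entry for `dir`. Such lines have the shape
 *   "040000 <oid> 0\t<dir>/"
 * The OID is deliberately ignored in this sketch.
 */
bool is_sparse_dir_entry(const char *line, const char *dir)
{
	const char *path;
	size_t dirlen = strlen(dir);

	/* Sparse directory entries are recorded with tree mode 040000. */
	if (strncmp(line, "040000 ", 7) != 0)
		return false;

	/* The path follows the tab after "<mode> <oid> <stage>". */
	path = strchr(line, '\t');
	if (!path)
		return false;
	path++;

	/* The path must be exactly "<dir>/". */
	return strncmp(path, dir, dirlen) == 0 &&
	       path[dirlen] == '/' && path[dirlen + 1] == '\0';
}
```

A caller would run this over each line of the saved 'cache' file and
expect a match for every out-of-cone directory and none for the in-cone
ones.
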
Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 53 ++++++++++++++++--------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 93bcfd20bbc..e7c0ae9b953 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -205,36 +205,53 @@ test_sparse_unstaged () {
 	done
 }
 
-test_expect_success 'sparse-index contents' '
-	init_repos &&
-
+# Usage: test_sparse_checkout_set " ... " " ... "
+# Verifies that "git sparse-checkout set ... " succeeds and
+# leaves the sparse index in a state where ... are sparse
+# directories (and ... are not).
+test_sparse_checkout_set () {
+	CONE_DIRS=$1 &&
+	SPARSE_DIRS=$2 &&
+	git -C sparse-index sparse-checkout set $CONE_DIRS &&
 	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in folder1 folder2 x
+
+	# Check that the directories outside of the sparse-checkout cone
+	# have sparse directory entries.
+	for dir in $SPARSE_DIRS
 	do
 		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
 		grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
 	done &&
 
-	git -C sparse-index sparse-checkout set folder1 &&
-
-	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in deep folder2 x
+	# Check that the directories in the sparse-checkout cone
+	# are not sparse directory entries.
+	for dir in $CONE_DIRS
 	do
 		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 $TREE 0	$dir/" cache \
+		! grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
-	done &&
+	done
+}
 
-	git -C sparse-index sparse-checkout set deep/deeper1 &&
+test_expect_success 'sparse-index contents' '
+	init_repos &&
 
-	git -C sparse-index ls-files --sparse --stage >cache &&
-	for dir in deep/deeper2 folder1 folder2 x
-	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 $TREE 0	$dir/" cache \
-			|| return 1
-	done &&
+	# Remove deep, add three other directories.
+	test_sparse_checkout_set \
+		"folder1 folder2 x" \
+		"before deep" &&
+
+	# Remove folder1, add deep
+	test_sparse_checkout_set \
+		"deep folder2 x" \
+		"before folder1" &&
+
+	# Replace deep with deep/deeper2 (dropping deep/deeper1)
+	# Add folder1
+	test_sparse_checkout_set \
+		"deep/deeper2 folder1 folder2 x" \
+		"before deep/deeper1" &&
 
 	# Disabling the sparse-index replaces tree entries with full ones
 	git -C sparse-index sparse-checkout init --no-sparse-index &&

From patchwork Mon May 23 13:48:38 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859102
Message-Id: <5030eeecf4f8e2dd65bc055d6a720c7d67015b1e.1653313726.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:38 +0000
Subject: [PATCH v3 02/10] t1092: stress test 'git sparse-checkout set'
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The 'sparse-index contents' test checks that the sparse
index has the correct set of sparse directories in the index after
modifying the cone mode patterns using 'git sparse-checkout set'. Add to
the coverage here by adding more complicated scenarios that were not
previously tested.

In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:

1. Add the --skip-checks argument to the 'set' command to avoid failures
   when passing paths that do not exist at HEAD.

2. When looking for the non-existence of sparse directories for the paths
   in $CONE_DIRS, allow the rev-parse command to fail because the path
   does not exist at HEAD.

This allows us to add some interesting test cases.

Helped-by: Victoria Dye
Signed-off-by: Derrick Stolee
---
 t/t1092-sparse-checkout-compatibility.sh | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index e7c0ae9b953..785820f9fd5 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -212,7 +212,7 @@ test_sparse_unstaged () {
 test_sparse_checkout_set () {
 	CONE_DIRS=$1 &&
 	SPARSE_DIRS=$2 &&
-	git -C sparse-index sparse-checkout set $CONE_DIRS &&
+	git -C sparse-index sparse-checkout set --skip-checks $CONE_DIRS &&
 	git -C sparse-index ls-files --sparse --stage >cache &&
 
 	# Check that the directories outside of the sparse-checkout cone
@@ -228,7 +228,9 @@ test_sparse_checkout_set () {
 	# are not sparse directory entries.
 	for dir in $CONE_DIRS
 	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
+		# Allow TREE to not exist because
+		# $dir does not exist at HEAD.
+		TREE=$(git -C sparse-index rev-parse HEAD:$dir) ||
 		! grep "040000 $TREE 0	$dir/" cache \
 			|| return 1
 	done
@@ -253,6 +255,19 @@ test_expect_success 'sparse-index contents' '
 		"deep/deeper2 folder1 folder2 x" \
 		"before deep/deeper1" &&
 
+	# Replace deep/deeper2 with deep/deeper1
+	# Replace folder1 with folder1/0/0
+	# Replace folder2 with non-existent folder2/2/3
+	# Add non-existent "bogus"
+	test_sparse_checkout_set \
+		"bogus deep/deeper1 folder1/0/0 folder2/2/3 x" \
+		"before deep/deeper2 folder2/0" &&
+
+	# Drop down to only files at root
+	test_sparse_checkout_set \
+		"" \
+		"before deep folder1 folder2 x" &&
+
 	# Disabling the sparse-index replaces tree entries with full ones
 	git -C sparse-index sparse-checkout init --no-sparse-index &&
 	test_sparse_match git ls-files --stage --sparse

From patchwork Mon May 23 13:48:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859101
Message-Id: <44b0549a2882df07a5b7c96a637c8f0d2e0d9798.1653313726.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:39 +0000
Subject: [PATCH v3 03/10] sparse-index: create expand_index()
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

This is the first change in a series to allow modifying the
sparse-checkout pattern set without expanding a sparse index to a full
one in the process. Here, we focus on the problem of expanding the
pattern set through a command like 'git sparse-checkout add' which needs
to create new index entries for the paths now being written to the
worktree.

To achieve this, we need to be able to replace sparse directory entries
with their contained files and subdirectories. Once this is complete,
other code paths can discover those cache entries and write the
corresponding files to disk before committing the index.

We already have logic in ensure_full_index() that expands the index
entries, so we will use that as our base. Create a new method,
expand_index(), which takes a pattern list, but for now mostly ignores
it. The current implementation is only correct when the pattern list is
NULL as that does the same as ensure_full_index(). In fact,
ensure_full_index() is converted to a shim over expand_index().

A future update will actually implement expand_index() to its full
capabilities. For now, it is created and documented.

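The arrangement described above, a generalized entry point with the old
function kept as a shim, can be sketched in isolation. This is only an
illustration under stated assumptions: the toy_* types and functions are
hypothetical stand-ins, not Git's 'struct index_state' or 'struct
pattern_list', and the sketch always expands fully, as the patch does at
this stage.

```c
#include <stddef.h>

/* Toy stand-ins for Git's structures; the field names are illustrative. */
struct toy_pattern_list {
	int use_cone_patterns;
};

struct toy_index {
	int sparse;         /* nonzero while sparse directory entries exist */
	const char *region; /* trace region name chosen by the last expansion */
};

/*
 * Generalized entry point: a NULL or non-cone pattern list requests a
 * full expansion.
 */
void toy_expand_index(struct toy_index *idx, struct toy_pattern_list *pl)
{
	if (!idx || !idx->sparse)
		return; /* already full: nothing to do */

	if (pl && !pl->use_cone_patterns)
		pl = NULL; /* non-cone patterns force a full expansion */

	/* Distinguish the two cases by name, as the patch does for trace2. */
	idx->region = pl ? "expand_index" : "ensure_full_index";

	/* This sketch ignores the pattern list and always expands fully. */
	idx->sparse = 0;
}

/* The old API becomes a thin shim over the new one. */
void toy_ensure_full_index(struct toy_index *idx)
{
	toy_expand_index(idx, NULL);
}
```

Keeping the old name as a one-line wrapper means existing callers need no
changes while new callers can start passing a pattern list.
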
Signed-off-by: Derrick Stolee
---
 sparse-index.c | 32 +++++++++++++++++++++++++++++---
 sparse-index.h | 13 +++++++++++++
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..a11b5cf1314 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -248,19 +248,40 @@ static int add_path_to_index(const struct object_id *oid,
 	return 0;
 }
 
-void ensure_full_index(struct index_state *istate)
+void expand_index(struct index_state *istate, struct pattern_list *pl)
 {
 	int i;
 	struct index_state *full;
 	struct strbuf base = STRBUF_INIT;
+	const char *tr_region;
 
+	/*
+	 * If the index is already full, then keep it full. We will convert
+	 * it to a sparse index on write, if possible.
+	 */
 	if (!istate || !istate->sparse_index)
 		return;
 
+	/*
+	 * If our index is sparse, but our new pattern set does not use
+	 * cone mode patterns, then we need to expand the index before we
+	 * continue. A NULL pattern set indicates a full expansion to a
+	 * full index.
+	 */
+	if (pl && !pl->use_cone_patterns)
+		pl = NULL;
+
 	if (!istate->repo)
 		istate->repo = the_repository;
 
-	trace2_region_enter("index", "ensure_full_index", istate->repo);
+	/*
+	 * A NULL pattern set indicates we are expanding to a full index, so
+	 * we use a special region name that indicates the full expansion.
+	 * This is used by test cases, but also helps to differentiate the
+	 * two cases.
+	 */
+	tr_region = pl ? "expand_index" : "ensure_full_index";
+	trace2_region_enter("index", tr_region, istate->repo);
 
 	/* initialize basics of new index */
 	full = xcalloc(1, sizeof(struct index_state));
@@ -322,7 +343,12 @@ void ensure_full_index(struct index_state *istate)
 	cache_tree_free(&istate->cache_tree);
 	cache_tree_update(istate, 0);
 
-	trace2_region_leave("index", "ensure_full_index", istate->repo);
+	trace2_region_leave("index", tr_region, istate->repo);
+}
+
+void ensure_full_index(struct index_state *istate)
+{
+	expand_index(istate, NULL);
 }
 
 void ensure_correct_sparsity(struct index_state *istate)
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..b1f2cdbb164 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -23,4 +23,17 @@ void expand_to_path(struct index_state *istate,
 struct repository;
 int set_sparse_index_config(struct repository *repo, int enable);
 
+struct pattern_list;
+
+/**
+ * Scan the given index and compare its entries to the given pattern list.
+ * If the index is sparse and the pattern list uses cone mode patterns,
+ * then modify the index to contain all of the file entries within that
+ * new pattern list. This expands sparse directories only as far as needed.
+ *
+ * If the pattern list is NULL or does not use cone mode patterns, then the
+ * index is expanded to a full index.
+ */
+void expand_index(struct index_state *istate, struct pattern_list *pl);
+
 #endif

From patchwork Mon May 23 13:48:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859100
Message-Id: <8a459d6c67d2cd2cb6f5e6a7d2ea1bff29adb883.1653313726.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:40 +0000
Subject: [PATCH v3 04/10] sparse-index: introduce partially-sparse indexes
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

A future change will present a temporary, in-memory mode where the index
can contain sparse directory entries without being completely collapsed
to the smallest possible set of sparse directories. This will be
necessary for modifying the sparse-checkout definition while using a
sparse index.

For now, convert the single-bit member 'sparse_index' in 'struct
index_state' to be an 'enum sparse_index_mode' with three modes:

* INDEX_EXPANDED (0): No sparse directories exist. This is always the
  case for repositories that do not use cone-mode sparse-checkout.

* INDEX_COLLAPSED: Sparse directories may exist. Files outside the
  sparse-checkout cone are reduced to sparse directory entries whenever
  possible.

* INDEX_PARTIALLY_SPARSE: Sparse directories may exist. Some file
  entries outside the sparse-checkout cone may exist. Running
  convert_to_sparse() may further reduce those files to sparse directory
  entries.

The main reason to store this extra information is to allow
convert_to_sparse() to short-circuit when the index is already in
INDEX_COLLAPSED mode but to actually do the necessary work when in
INDEX_PARTIALLY_SPARSE mode.

The INDEX_PARTIALLY_SPARSE mode will be used in an upcoming change.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c |  2 +-
 cache.h                   | 33 +++++++++++++++++++++++++--------
 read-cache.c              |  6 +++---
 sparse-index.c            |  6 +++---
 4 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 0217d44c5b1..5b054400bf3 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -128,7 +128,7 @@ static void clean_tracked_sparse_directories(struct repository *r)
 	 * sparse index will not delete directories that contain
 	 * conflicted entries or submodules.
 	 */
-	if (!r->index->sparse_index) {
+	if (r->index->sparse_index == INDEX_EXPANDED) {
 		/*
 		 * If something, such as a merge conflict or other concern,
 		 * prevents us from converting to a sparse index, then do
diff --git a/cache.h b/cache.h
index 6226f6a8a53..e171ce882a2 100644
--- a/cache.h
+++ b/cache.h
@@ -310,6 +310,29 @@ struct untracked_cache;
 struct progress;
 struct pattern_list;
 
+enum sparse_index_mode {
+	/*
+	 * There are no sparse directories in the index at all.
+	 *
+	 * Repositories that don't use cone-mode sparse-checkout will
+	 * always have their indexes in this mode.
+	 */
+	INDEX_EXPANDED = 0,
+
+	/*
+	 * The index has already been collapsed to sparse directories
+	 * wherever possible.
+	 */
+	INDEX_COLLAPSED,
+
+	/*
+	 * The sparse directories that exist are outside the
+	 * sparse-checkout boundary, but it is possible that some file
+	 * entries could collapse to sparse directory entries.
+	 */
+	INDEX_PARTIALLY_SPARSE,
+};
+
 struct index_state {
 	struct cache_entry **cache;
 	unsigned int version;
@@ -323,14 +346,8 @@ struct index_state {
 		 drop_cache_tree : 1,
 		 updated_workdir : 1,
 		 updated_skipworktree : 1,
-		 fsmonitor_has_run_once : 1,
-
-		 /*
-		  * sparse_index == 1 when sparse-directory
-		  * entries exist. Requires sparse-checkout
-		  * in cone mode.
-		  */
-		 sparse_index : 1;
+		 fsmonitor_has_run_once : 1;
+	enum sparse_index_mode sparse_index;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	struct object_id oid;
diff --git a/read-cache.c b/read-cache.c
index 4df97e185e9..b236042eee1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -112,7 +112,7 @@ static const char *alternate_index_output;
 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
 	if (S_ISSPARSEDIR(ce->ce_mode))
-		istate->sparse_index = 1;
+		istate->sparse_index = INDEX_COLLAPSED;
 
 	istate->cache[nr] = ce;
 	add_name_hash(istate, ce);
@@ -1856,7 +1856,7 @@ static int read_index_extension(struct index_state *istate,
 		break;
 	case CACHE_EXT_SPARSE_DIRECTORIES:
 		/* no content, only an indicator */
-		istate->sparse_index = 1;
+		istate->sparse_index = INDEX_COLLAPSED;
 		break;
 	default:
 		if (*ext < 'A' || 'Z' < *ext)
@@ -3149,7 +3149,7 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 				 unsigned flags)
 {
 	int ret;
-	int was_full = !istate->sparse_index;
+	int was_full = istate->sparse_index == INDEX_EXPANDED;
 
 	ret = convert_to_sparse(istate, 0);
 
diff --git a/sparse-index.c b/sparse-index.c
index a11b5cf1314..7848910c154 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -173,7 +173,7 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	 * If the index is already sparse, empty, or otherwise
 	 * cannot be converted to sparse, do not convert.
 	 */
-	if (istate->sparse_index || !istate->cache_nr ||
+	if (istate->sparse_index == INDEX_COLLAPSED || !istate->cache_nr ||
 	    !is_sparse_index_allowed(istate, flags))
 		return 0;
 
@@ -214,7 +214,7 @@ int convert_to_sparse(struct index_state *istate, int flags)
 	FREE_AND_NULL(istate->fsmonitor_dirty);
 	FREE_AND_NULL(istate->fsmonitor_last_update);
 
-	istate->sparse_index = 1;
+	istate->sparse_index = INDEX_COLLAPSED;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
 }
@@ -259,7 +259,7 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	 * If the index is already full, then keep it full. We will convert
 	 * it to a sparse index on write, if possible.
 	 */
-	if (!istate || !istate->sparse_index)
+	if (!istate || istate->sparse_index == INDEX_EXPANDED)
 		return;
 
 	/*

From patchwork Mon May 23 13:48:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859103
Message-Id: <9103584ed75cdc3bb3c2afd87df53161e0eef9b1.1653313727.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:41 +0000
Subject: [PATCH v3 05/10] cache-tree: implement cache_tree_find_path()
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

Given a 'struct cache_tree', it may be beneficial to navigate directly
to the node within it that corresponds to a given path name. Create
cache_tree_find_path() for this purpose. It returns NULL when no such
path exists.

The implementation is adapted from do_invalidate_path() which does a
similar search but also modifies the nodes it finds along the way. The
method could be implemented simply using tail-recursion, but this while
loop does the same thing.

This new method is not currently used, but will be in an upcoming
change.

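The kind of lookup described above can be sketched on a toy tree. This
is an illustration only, not the code in this patch: 'struct toy_node',
toy_child() and toy_find_path() are hypothetical names, the layout is
not that of 'struct cache_tree', and the sketch uses plain strchr()
instead of Git's helpers. It walks one path component per loop
iteration, returns NULL as soon as a component is missing, and leaves
every node untouched.

```c
#include <stddef.h>
#include <string.h>

/* Toy tree node; Git's 'struct cache_tree' is laid out differently. */
struct toy_node {
	const char *name;
	struct toy_node **children;
	size_t nr;
};

/* Find the direct child of `node` named by the first `len` bytes of `name`. */
static struct toy_node *toy_child(struct toy_node *node,
				  const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < node->nr; i++) {
		const char *child = node->children[i]->name;

		if (strlen(child) == len && !strncmp(child, name, len))
			return node->children[i];
	}
	return NULL;
}

/*
 * Walk from `root` to the node for a slash-separated `path`, one
 * component per iteration. Returns NULL if any component is missing.
 */
struct toy_node *toy_find_path(struct toy_node *root, const char *path)
{
	struct toy_node *node = root;

	while (node && *path) {
		const char *slash = strchr(path, '/');
		size_t len = slash ? (size_t)(slash - path) : strlen(path);

		node = toy_child(node, path, len);
		path += len;
		if (*path == '/')
			path++; /* step over the separator */
	}
	return node;
}
```

For a tree holding a/b/c, looking up "a/b" yields the node for b, while
"a/x" yields NULL.
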

Helped-by: Junio C Hamano
Signed-off-by: Derrick Stolee
---
 cache-tree.c | 27 +++++++++++++++++++++++++++
 cache-tree.h |  2 ++
 2 files changed, 29 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index 6752f69d515..f42db920d10 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -100,6 +100,33 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *it, const char *path)
 	return find_subtree(it, path, pathlen, 1);
 }
 
+struct cache_tree *cache_tree_find_path(struct cache_tree *it, const char *path)
+{
+	const char *slash;
+	int namelen;
+	struct cache_tree_sub it_sub = {
+		.cache_tree = it,
+	};
+	struct cache_tree_sub *down = &it_sub;
+
+	while (down) {
+		slash = strchrnul(path, '/');
+		namelen = slash - path;
+		down->cache_tree->entry_count = -1;
+		if (!*slash) {
+			int pos;
+			pos = cache_tree_subtree_pos(down->cache_tree, path, namelen);
+			if (0 <= pos)
+				return down->cache_tree->down[pos]->cache_tree;
+			return NULL;
+		}
+		down = find_subtree(it, path, namelen, 0);
+		path = slash + 1;
+	}
+
+	return NULL;
+}
+
 static int do_invalidate_path(struct cache_tree *it, const char *path)
 {
 	/* a/b/c
diff --git a/cache-tree.h b/cache-tree.h
index 8efeccebfc9..f75f8e74dcd 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -29,6 +29,8 @@ struct cache_tree_sub *cache_tree_sub(struct cache_tree *, const char *);
 
 int cache_tree_subtree_pos(struct cache_tree *it, const char *path, int pathlen);
 
+struct cache_tree *cache_tree_find_path(struct cache_tree *it, const char *path);
+
 void cache_tree_write(struct strbuf *, struct cache_tree *root);
 struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
 

From patchwork Mon May 23 13:48:42 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859105
Message-Id: <75ecb579a9f69c6dbe5d88ecec0d8e7e15c02efb.1653313727.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:42 +0000
Subject: [PATCH v3 06/10] sparse-checkout: --no-sparse-index needs a full index
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

When the --no-sparse-index option is supplied, the sparse-checkout
builtin should explicitly ask to expand a sparse index to a full one.
This is currently done implicitly due to the command_requires_full_index
protection, but that will be removed in an upcoming change.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 5b054400bf3..1db51c3fd72 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -413,6 +413,9 @@ static int update_modes(int *cone_mode, int *sparse_index)
 		/* force an index rewrite */
 		repo_read_index(the_repository);
 		the_repository->index->updated_workdir = 1;
+
+		if (!*sparse_index)
+			ensure_full_index(the_repository->index);
 	}
 
 	return 0;

From patchwork Mon May 23 13:48:43 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859104
Message-Id: <310e59d9f0e9cf6b88ced10c80e982606fe7632a.1653313727.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:43 +0000
Subject: [PATCH v3 07/10] sparse-index: partially expand directories
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The expand_index() method expands sparse directory entries to their list
of contained files when either the pattern list is NULL or the directory
is contained in the new pattern list's cone mode patterns.

It is possible that the pattern list has a recursive match with a
directory 'A/B/C/' and so an existing sparse directory 'A/B/' would need
to be expanded. If there exists a directory 'A/B/D/', then that directory
should not be expanded and instead we can create a sparse directory.

To implement this, we plug into the add_path_to_index() callback for the
call to read_tree_at(). Since we now need access to both the index we are
writing and the pattern list we are comparing, create a 'struct
modify_index_context' to use as a data transfer object. It is important
that we use the given pattern list since we will use this pattern list to
change the sparse-checkout patterns and cannot use
istate->sparse_checkout_patterns.

Signed-off-by: Derrick Stolee
---
 sparse-index.c | 57 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 50 insertions(+), 7 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index 7848910c154..a881f851810 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -9,6 +9,11 @@
 #include "dir.h"
 #include "fsmonitor.h"
 
+struct modify_index_context {
+	struct index_state *write;
+	struct pattern_list *pl;
+};
+
 static struct cache_entry *construct_sparse_dir_entry(
 				struct index_state *istate,
 				const char *sparse_dir,
@@ -231,18 +236,52 @@ static int add_path_to_index(const struct object_id *oid,
 			     struct strbuf *base, const char *path,
 			     unsigned int mode, void *context)
 {
-	struct index_state *istate = (struct index_state *)context;
+	struct modify_index_context *ctx = (struct modify_index_context *)context;
 	struct cache_entry *ce;
 	size_t len = base->len;
 
-	if (S_ISDIR(mode))
-		return READ_TREE_RECURSIVE;
+	if (S_ISDIR(mode)) {
+		int dtype;
+		size_t baselen = base->len;
+		if (!ctx->pl)
+			return READ_TREE_RECURSIVE;
 
-	strbuf_addstr(base, path);
+		/*
+		 * Have we expanded to a point outside of the sparse-checkout?
+		 *
+		 * Artificially pad the path name with a slash "/" to
+		 * indicate it as a directory, and add an arbitrary file
+		 * name ("-") so we can consider base->buf as a file name
+		 * to match against the cone-mode patterns.
+		 *
+		 * If we compared just "path", then we would expand more
+		 * than we should. Since every file at root is always
+		 * included, we would expand every directory at root at
+		 * least one level deep instead of using sparse directory
+		 * entries.
+		 */
+		strbuf_addstr(base, path);
+		strbuf_add(base, "/-", 2);
+
+		if (path_matches_pattern_list(base->buf, base->len,
+					      NULL, &dtype,
+					      ctx->pl, ctx->write)) {
+			strbuf_setlen(base, baselen);
+			return READ_TREE_RECURSIVE;
+		}
 
-	ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
+		/*
+		 * The path "{base}{path}/" is a sparse directory. Create the correct
+		 * name for inserting the entry into the index.
+		 */
+		strbuf_setlen(base, base->len - 1);
+	} else {
+		strbuf_addstr(base, path);
+	}
+
+	ce = make_cache_entry(ctx->write, mode, oid, base->buf, 0, 0);
 	ce->ce_flags |= CE_SKIP_WORKTREE | CE_EXTENDED;
-	set_index_entry(istate, istate->cache_nr++, ce);
+	set_index_entry(ctx->write, ctx->write->cache_nr++, ce);
 
 	strbuf_setlen(base, len);
 	return 0;
@@ -254,6 +293,7 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	struct index_state *full;
 	struct strbuf base = STRBUF_INIT;
 	const char *tr_region;
+	struct modify_index_context ctx;
 
 	/*
 	 * If the index is already full, then keep it full. We will convert
@@ -293,6 +333,9 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	full->cache_nr = 0;
 	ALLOC_ARRAY(full->cache, full->cache_alloc);
 
+	ctx.write = full;
+	ctx.pl = pl;
+
 	for (i = 0; i < istate->cache_nr; i++) {
 		struct cache_entry *ce = istate->cache[i];
 		struct tree *tree;
@@ -318,7 +361,7 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 		strbuf_add(&base, ce->name, strlen(ce->name));
 
 		read_tree_at(istate->repo, tree, &base, &ps,
-			     add_path_to_index, full);
+			     add_path_to_index, &ctx);
 
 		/* free directory entries. full entries are re-used */
 		discard_cache_entry(ce);

From patchwork Mon May 23 13:48:44 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859107
Date: Mon, 23 May 2022 13:48:44 +0000
Subject: [PATCH v3 08/10] sparse-index: complete partial expansion
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

To complete the implementation of expand_index(), we need to detect when
a sparse directory entry should remain sparse. This avoids a full
expansion, so we now need to use the INDEX_PARTIALLY_SPARSE mode to
indicate this state.

There are still no callers to this method, but we will add one in an
upcoming change.

Signed-off-by: Derrick Stolee
---
 sparse-index.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/sparse-index.c b/sparse-index.c
index a881f851810..2c0a18380f1 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -308,8 +308,24 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	 * continue. A NULL pattern set indicates a full expansion to a
 	 * full index.
 	 */
-	if (pl && !pl->use_cone_patterns)
+	if (pl && !pl->use_cone_patterns) {
 		pl = NULL;
+	} else {
+		/*
+		 * We might contract file entries into sparse-directory
+		 * entries, and for that we will need the cache tree to
+		 * be recomputed.
+		 */
+		cache_tree_free(&istate->cache_tree);
+
+		/*
+		 * If there is a problem creating the cache tree, then we
+		 * need to expand to a full index since we cannot satisfy
+		 * the current request as a sparse index.
+		 */
+		if (cache_tree_update(istate, 0))
+			pl = NULL;
+	}
 
 	if (!istate->repo)
 		istate->repo = the_repository;
@@ -327,8 +343,14 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	full = xcalloc(1, sizeof(struct index_state));
 	memcpy(full, istate, sizeof(struct index_state));
 
+	/*
+	 * This slightly-misnamed 'full' index might still be sparse if we
+	 * are only modifying the list of sparse directories. This hinges
+	 * on whether we have a non-NULL pattern list.
+	 */
+	full->sparse_index = pl ? INDEX_PARTIALLY_SPARSE : INDEX_EXPANDED;
+
 	/* then change the necessary things */
-	full->sparse_index = 0;
 	full->cache_alloc = (3 * istate->cache_alloc) / 2;
 	full->cache_nr = 0;
 	ALLOC_ARRAY(full->cache, full->cache_alloc);
@@ -340,11 +362,22 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 		struct cache_entry *ce = istate->cache[i];
 		struct tree *tree;
 		struct pathspec ps;
+		int dtype;
 
 		if (!S_ISSPARSEDIR(ce->ce_mode)) {
 			set_index_entry(full, full->cache_nr++, ce);
 			continue;
 		}
+
+		/* We now have a sparse directory entry. Should we expand? */
+		if (pl &&
+		    path_matches_pattern_list(ce->name, ce->ce_namelen,
+					      NULL, &dtype,
+					      pl, istate) == NOT_MATCHED) {
+			set_index_entry(full, full->cache_nr++, ce);
+			continue;
+		}
+
 		if (!(ce->ce_flags & CE_SKIP_WORKTREE))
 			warning(_("index entry is a directory, but not sparse (%08x)"),
 				ce->ce_flags);
@@ -370,7 +403,7 @@ void expand_index(struct index_state *istate, struct pattern_list *pl)
 	/* Copy back into original index. */
 	memcpy(&istate->name_hash, &full->name_hash, sizeof(full->name_hash));
 	memcpy(&istate->dir_hash, &full->dir_hash, sizeof(full->dir_hash));
-	istate->sparse_index = 0;
+	istate->sparse_index = pl ? INDEX_PARTIALLY_SPARSE : INDEX_EXPANDED;
 	free(istate->cache);
 	istate->cache = full->cache;
 	istate->cache_nr = full->cache_nr;

From patchwork Mon May 23 13:48:45 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859106
Date: Mon, 23 May 2022 13:48:45 +0000
Subject: [PATCH v3 09/10] p2000: add test for 'git sparse-checkout [add|set]'
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The sparse-checkout builtin is almost completely integrated with the
sparse index, allowing the sparse-checkout boundary to be modified
without expanding a sparse index to a full one. Add a test to
p2000-sparse-operations.sh that adds a directory to the sparse-checkout
definition, then removes it.

Using both operations is important to ensure that the operation is doing
the same work in each repetition as well as leaving the test repo in a
good state for later tests.

Signed-off-by: Derrick Stolee
---
 t/perf/p2000-sparse-operations.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 382716cfca9..ce5cfac5714 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -110,6 +110,7 @@ test_perf_on_all git add -A
 test_perf_on_all git add .
 test_perf_on_all git commit -a -m A
 test_perf_on_all git checkout -f -
+test_perf_on_all "git sparse-checkout add f2/f3/f1 && git sparse-checkout set $SPARSE_CONE"
 test_perf_on_all git reset
 test_perf_on_all git reset --hard
 test_perf_on_all git reset -- does-not-exist

From patchwork Mon May 23 13:48:46 2022
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 12859108
Message-Id: <11728619120a2d616daab57c57c66e47052f5c89.1653313727.git.gitgitgadget@gmail.com>
Date: Mon, 23 May 2022 13:48:46 +0000
Subject: [PATCH v3 10/10] sparse-checkout: integrate with sparse index
To: git@vger.kernel.org
Cc: gitster@pobox.com, vdye@github.com, shaoxuan.yuan02@gmail.com, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.

Insert a call to expand_index() that expands sparse directories that are
within the new pattern list, but only far enough that every necessary
file path now exists as a cache entry. The remaining logic within
update_sparsity() will modify the SKIP_WORKTREE bits appropriately.

This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.

We can see the improved performance in the p2000 test script:

Test                           HEAD~1            HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3)   2.14(1.55+0.58)   1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4)   2.20(1.62+0.57)   1.58(0.98+0.59) -28.2%

These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.

Signed-off-by: Derrick Stolee
---
 builtin/sparse-checkout.c                |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++
 unpack-trees.c                           |  4 ++++
 3 files changed, 32 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 1db51c3fd72..67d1d146de5 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	if (argc > 0) {
 		if (!strcmp(argv[0], "list"))
 			return sparse_checkout_list(argc, argv);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 785820f9fd5..73f4cf47314 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1584,6 +1584,31 @@ test_expect_success 'ls-files' '
 	ensure_not_expanded ls-files --sparse
 '
 
+test_expect_success 'sparse index is not expanded: sparse-checkout' '
+	init_repos &&
+
+	ensure_not_expanded sparse-checkout set deep/deeper2 &&
+	ensure_not_expanded sparse-checkout set deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set deep &&
+	ensure_not_expanded sparse-checkout add folder1 &&
+	ensure_not_expanded sparse-checkout set deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set folder2 &&
+
+	# Demonstrate that the checks that "folder1/a" is a file
+	# do not cause a sparse-index expansion (since it is in the
+	# sparse-checkout cone).
+	echo >>sparse-index/folder2/a &&
+	git -C sparse-index add folder2/a &&
+
+	ensure_not_expanded sparse-checkout add folder1 &&
+
+	# Skip checks here, since deep/deeper1 is inside a sparse directory
+	# that must be expanded to check whether `deep/deeper1` is a file
+	# or not.
+	ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 &&
+	ensure_not_expanded sparse-checkout set
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '
diff --git a/unpack-trees.c b/unpack-trees.c
index 7f528d35cc2..8908b27c03e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -18,6 +18,7 @@
 #include "promisor-remote.h"
 #include "entry.h"
 #include "parallel-checkout.h"
+#include "sparse-index.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
 			goto skip_sparse_checkout;
 	}
 
+	/* Expand sparse directories as needed */
+	expand_index(o->src_index, o->pl);
+
 	/* Set NEW_SKIP_WORKTREE on existing entries. */
 	mark_all_ce_unused(o->src_index);
 	mark_new_skip_worktree(o->pl, o->src_index, 0,
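
The "/-" padding that add_path_to_index() applies in patch 07/10 can be
illustrated with a toy model of cone-mode matching. This is only a
sketch under stated assumptions: toy_cone_matches_file() and
toy_should_expand_dir() are hypothetical helpers, not Git functions, and
the matching rule used here (a file is in the cone when its parent
directory is inside, or is an ancestor of, one of the recursive
directories) is a simplification chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 when 's' begins with 'prefix'. */
static int starts_with_str(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

/*
 * Toy cone-mode match for a FILE path. 'cone' lists the recursive
 * directories, each with a trailing slash (for example "A/B/C/").
 * Files directly inside an ancestor of such a directory also match,
 * which is why every file at the root matches.
 */
static int toy_cone_matches_file(const char *const *cone, int nr,
				 const char *file)
{
	char parent[256];
	const char *slash = strrchr(file, '/');
	size_t len = slash ? (size_t)(slash - file) + 1 : 0;
	int i;

	assert(len < sizeof(parent));
	memcpy(parent, file, len);
	parent[len] = '\0';

	for (i = 0; i < nr; i++) {
		if (starts_with_str(parent, cone[i]) ||
		    starts_with_str(cone[i], parent))
			return 1;
	}
	return 0;
}

/*
 * Decide whether the directory "{base}{name}" must be expanded. The
 * directory name is padded with "/-" so that it is tested as if it were
 * a file inside that directory, mirroring the comment in the patch.
 */
static int toy_should_expand_dir(const char *const *cone, int nr,
				 const char *base, const char *name)
{
	char probe[256];
	int n = snprintf(probe, sizeof(probe), "%s%s/-", base, name);

	assert(n > 0 && (size_t)n < sizeof(probe));
	return toy_cone_matches_file(cone, nr, probe);
}
```

With a cone containing only "A/B/C/", the sketch expands "A", "A/B" and
"A/B/C" but keeps "A/B/D" and a root-level "X" as sparse directories.
Testing the bare name "X" as a file would match, because files at the
root are always included; that is the over-expansion the padding avoids.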