From patchwork Tue Jul 20 20:14:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B00BBC636C9 for ; Tue, 20 Jul 2021 20:41:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A4816100C for ; Tue, 20 Jul 2021 20:41:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231404AbhGTT7r (ORCPT ); Tue, 20 Jul 2021 15:59:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232370AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2019EC0613DE for ; Tue, 20 Jul 2021 13:14:44 -0700 (PDT) Received: by mail-wr1-x42f.google.com with SMTP id u1so27364134wrs.1 for ; Tue, 20 Jul 2021 13:14:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=f/KiyvI/Q1CXlsOXJtPYGGM2ZxsZJqz3uYZNz5JOQt8=; b=KTUfTqvTbkoIT9iYR7z15ScxKMPvMx7sMf1j9jHil//K4Wh/lnKYHC0tKhi5Kkf4ST U9xNuzDIuODAzD0uK4QiGt5+nd16flNPcTWVvnYqp6yAy9dizUe4/TfhRLVNN+QF4nIe KhigVcW4Yx2nwAuFcEboCCKIwWidqHDENpzFxSoA8ORNG45FFajtP2GS5W3qknS1DPn2 daKiX4nxFbJE3qGY8hX7VyJB5XOrZNG0DU4Vnh6f5nPX9SsyjJgMEI53Tgvl874BLo/R /Lv7W2LMYoBhzEm+uxlj2lF9RcLC2GO2vBmUxX+CPYLUvt3rZLkYFnwsNhHAP5aVb8wG Q51A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=f/KiyvI/Q1CXlsOXJtPYGGM2ZxsZJqz3uYZNz5JOQt8=; b=Ac6bLKut8ptCbaNrXNQmfnfLekLP3HQukUQTNSvcGwT9O9289rbPV13TxwVFSauJYI oWtywIxXMhrc++ocDjbUrg8LHYe/QguslsfWWuw+OFA5EjZDjOkvyXpahqUrPr01QH9+ QsFaYtEWKOzSbYar/8uNEqzV1oxsLEEVNCjuIou8PE+k3WDZuMvrWa62WabrK/Cs9OIl ZueTIhdVJ1DC1+GkyIHtb6SW+V6uPoW7m1V6cKPInOPtX+m7URNU5hqsUkP6E7XERNpw BY9FJfrq6RwS98iU79MGUioUtsHiwd6J5FE9TtWUc/6Fvt5Q1alfisxrH4j7BU3kYEOM +3bQ== X-Gm-Message-State: AOAM530/crWzJ3lZztBUQbrl3OL4Lcn7Z4FgIZ3xuA/3TBoOGeOkas4i t1o9pm+gar/XZHcMp7deNKZhq1T9Ji0= X-Google-Smtp-Source: ABdhPJx4ospaESpkYCvQ8r7euFo1hI3cN7VJ+S4Xypg9lYAg28kY0V7sBWr5cil6kNANK/V1QXQezw== X-Received: by 2002:adf:facf:: with SMTP id a15mr37969348wrs.39.1626812082711; Tue, 20 Jul 2021 13:14:42 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q19sm3146620wmq.38.2021.07.20.13.14.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:42 -0700 (PDT) Message-Id: <6e74958f5900855cc0c3b6484b2b2a7732293e6c.1626812081.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:35 +0000 Subject: [PATCH v2 1/7] p2000: add 'git checkout -' test and decrease depth Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we increase our list of commands to test in p2000-sparse-operations.sh, we will want to have a slightly smaller test repository. Reduce the size by a factor of four by reducing the depth of the step that creates a big index around a moderately-sized repository. Also add a step to run 'git checkout -' on repeat. This requires having a previous location in the reflog, so add that to the initialization steps. Signed-off-by: Derrick Stolee --- t/perf/p2000-sparse-operations.sh | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh index 94513c97748..f7f8c012103 100755 --- a/t/perf/p2000-sparse-operations.sh +++ b/t/perf/p2000-sparse-operations.sh @@ -6,7 +6,7 @@ test_description="test performance of Git operations using the index" test_perf_default_repo -SPARSE_CONE=f2/f4/f1 +SPARSE_CONE=f2/f4 test_expect_success 'setup repo and indexes' ' git reset --hard HEAD && @@ -27,7 +27,7 @@ test_expect_success 'setup repo and indexes' ' OLD_COMMIT=$(git rev-parse HEAD) && OLD_TREE=$(git rev-parse HEAD^{tree}) && - for i in $(test_seq 1 4) + for i in $(test_seq 1 3) do cat >in <<-EOF && 100755 blob $BLOB a @@ -43,14 +43,23 @@ test_expect_success 'setup repo and indexes' ' done && git sparse-checkout init --cone && - git branch -f wide $OLD_COMMIT && + git sparse-checkout set $SPARSE_CONE && + git checkout -b wide $OLD_COMMIT && + + for l2 in f1 f2 f3 f4 + do + echo more bogus >>$SPARSE_CONE/$l2/a && + git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1 + done && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 && ( cd full-index-v3 && git sparse-checkout init --cone && git sparse-checkout set $SPARSE_CONE && git config index.version 3 && - git update-index --index-version=3 + git update-index --index-version=3 && + git checkout HEAD~4 ) && git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 && ( @@ -58,7 +67,8 @@ test_expect_success 'setup repo and indexes' ' git sparse-checkout init --cone && git sparse-checkout set $SPARSE_CONE && git config index.version 4 && - git update-index --index-version=4 + git update-index --index-version=4 && + git checkout HEAD~4 ) && git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 && ( @@ -66,7 +76,8 @@ test_expect_success 'setup repo and indexes' ' git sparse-checkout init --cone --sparse-index && git sparse-checkout set $SPARSE_CONE && git config index.version 3 && - git update-index --index-version=3 + git update-index --index-version=3 && + git checkout HEAD~4 ) && git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 && ( @@ -74,7 +85,8 @@ test_expect_success 'setup repo and indexes' ' git sparse-checkout init --cone --sparse-index && git sparse-checkout set $SPARSE_CONE && git config index.version 4 && - git update-index --index-version=4 + git update-index --index-version=4 && + git checkout HEAD~4 ) ' @@ -97,5 +109,6 @@ test_perf_on_all git status test_perf_on_all git add -A test_perf_on_all git add . test_perf_on_all git commit -a -m A +test_perf_on_all git checkout -f - test_done From patchwork Tue Jul 20 20:14:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7636FC07E9B for ; Tue, 20 Jul 2021 20:40:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 535A860FF3 for ; Tue, 20 Jul 2021 20:40:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229797AbhGTT7L (ORCPT ); Tue, 20 Jul 2021 15:59:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231655AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5576C0613DF for ; Tue, 20 Jul 2021 13:14:44 -0700 (PDT) Received: by mail-wr1-x429.google.com with SMTP id k4so27297890wrc.8 for ; Tue, 20 Jul 2021 13:14:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=CjCwvYNfaftKNON/5Qnsv7Bb6epxLgdnNa3qG/hBIJI=; b=g4v/yXOGprvv3FFNnBbZT6eSuz8b9tq2Hx9XhoqRRaChyXtx+T02SZ3g08L4+Wktuk 9yCt1eC3dtDqnq1bfPFhEndu9NsBHQQXjsP/m+WyRWiLpQsC3ZQmViW/u8q6TJm78vBW f5q1VojsQTVOkHtUUQ3VJ6C59XMBmk4bqXmdAIV2WrJEiynDOH3KprknszsThBAD4FZz WQOFM+C+YfYw/HkGzsScdpjLjKx0XzFlejM2azcnJbDl1xh2ORDfvfLAaf/LHByXKDIb fkjcVGaeuFppB5mY+w4l5WbF8mbtHz0NMegwYX8w6TSH/r3dYT4P1K+xxaD5rao1q+uN 3Crg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=CjCwvYNfaftKNON/5Qnsv7Bb6epxLgdnNa3qG/hBIJI=; b=cDM66SejQOa47lFLsF53aKJ/REkbhQYYoVOzaYELTQHxY8JkUJeYT3YF1xo/zgsuVB qZgGx5VN3QsgTdD7qrwDJiGw+LD5jQvWzhK7NfLHZvvx+g0GBy0No5rdB+gdCyk+tT8v PUsl892bp37HkK7J1C2DpYVkHPEDT7loV95UajZjX/T/ljaqwdUSIHxAcHvNQrmtbH3F B3JmFvRQoYhqRGow3llVAenidsIZBMp+VcqVnMB69UARxJSZbPQdmN04pUEP/+vSLXlu R0TlX5Fl1v2T5erir5HNj1jS2IRbuHPF13i8mmD6+3a4rYPjBeP3ZhDSpcoZWYCieABV dDKw== X-Gm-Message-State: AOAM5330CNXF8AsbUixMHgZ3v2xd9rla/C/xrnEuY/8nb++dstWtTQqd T1Alz6/U07gHdZEJfFdKYgaJ2bKxJ58= X-Google-Smtp-Source: ABdhPJyCDqidatuc3Yf9cMYzsNVbZgd1aFh3LQu3pUvrMsOuM+7yEbyjT+lRQ85cjCn8T1y46Kl2HQ== X-Received: by 2002:adf:c849:: with SMTP id e9mr38581865wrh.348.1626812083364; Tue, 20 Jul 2021 13:14:43 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l14sm23528123wrs.22.2021.07.20.13.14.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:43 -0700 (PDT) Message-Id: <3e1d03c41be9701c2cd518afc6ebc2797bca4398.1626812081.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:36 +0000 Subject: [PATCH v2 2/7] p2000: compress repo names Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee By using shorter names for the test repos, we will get a slightly more compressed performance summary without comprimising clarity. Signed-off-by: Derrick Stolee --- t/perf/p2000-sparse-operations.sh | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh index f7f8c012103..597626276fb 100755 --- a/t/perf/p2000-sparse-operations.sh +++ b/t/perf/p2000-sparse-operations.sh @@ -52,36 +52,36 @@ test_expect_success 'setup repo and indexes' ' git commit -a -m "edit $SPARSE_CONE/$l2/a" || return 1 done && - git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v3 && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v3 && ( - cd full-index-v3 && + cd full-v3 && git sparse-checkout init --cone && git sparse-checkout set $SPARSE_CONE && git config index.version 3 && git update-index --index-version=3 && git checkout HEAD~4 ) && - git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-index-v4 && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . full-v4 && ( - cd full-index-v4 && + cd full-v4 && git sparse-checkout init --cone && git sparse-checkout set $SPARSE_CONE && git config index.version 4 && git update-index --index-version=4 && git checkout HEAD~4 ) && - git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v3 && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v3 && ( - cd sparse-index-v3 && + cd sparse-v3 && git sparse-checkout init --cone --sparse-index && git sparse-checkout set $SPARSE_CONE && git config index.version 3 && git update-index --index-version=3 && git checkout HEAD~4 ) && - git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-index-v4 && + git -c core.sparseCheckoutCone=true clone --branch=wide --sparse . sparse-v4 && ( - cd sparse-index-v4 && + cd sparse-v4 && git sparse-checkout init --cone --sparse-index && git sparse-checkout set $SPARSE_CONE && git config index.version 4 && @@ -92,8 +92,8 @@ test_expect_success 'setup repo and indexes' ' test_perf_on_all () { command="$@" - for repo in full-index-v3 full-index-v4 \ - sparse-index-v3 sparse-index-v4 + for repo in full-v3 full-v4 \ + sparse-v3 sparse-v4 do test_perf "$command ($repo)" " ( From patchwork Tue Jul 20 20:14:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78C89C07E9B for ; Tue, 20 Jul 2021 20:42:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 64BAB60FEA for ; Tue, 20 Jul 2021 20:42:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234508AbhGTUBa (ORCPT ); Tue, 20 Jul 2021 16:01:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232473AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69E77C0613E0 for ; Tue, 20 Jul 2021 13:14:45 -0700 (PDT) Received: by mail-wm1-x333.google.com with SMTP id o30-20020a05600c511eb029022e0571d1a0so241528wms.5 for ; Tue, 20 Jul 2021 13:14:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=RvvnRwo1YQanCRaIT8C+DPSrYl16RvceDnNq1bSckRM=; b=F53EutLrNah01quYiYNnT61j9H6Cx5FJ+pmx2GDwZmoRsBfoRIUMgasVA0HAYu1gf2 AnxCok12NpWb6esClgEKLYnbRaf0zbZI+CbziaVsB1uaDJg/QB0PxvdxWVk808DHOhOh PjbF+GGkS4gQm/OStkjDCHSmkFc39p/XJMg4JjPfg4c3iQnrsoE0Mcr2XyL0uAisJ/E2 M52V852VzSb4VURVsfFuL3G6tk34rDRbv1Z8UwrSiih+oLZkpjkeiBgJXaV4I0EqZcCo yQacKpnruwH8zHKmIbZmlf+e8w54/5kU/prQd7KDzuHMkjhfkUInf2a8OPPNBC/BbYyI EDHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=RvvnRwo1YQanCRaIT8C+DPSrYl16RvceDnNq1bSckRM=; b=Y/yl5g/7yTz9US81LWoJ1VJ4cXtlBpNT3pDNGsNZ42bs+CsGO0rcMx0D/8IxzusJzX SkAOONuHOQ6zGcX28umpRXfMkQ8zkyPrJinPV2XH78hw+YaIksmd81Rrmu2urDicP44u rc0Rcjfc11z4l+Xkl3wZyWh5e+hW76grpEDnZJPQWFBr0hp3d7PD5nsL4ii+tXhBTG/n aWlt3L/J+umckTjTD+BjBGERF7JxRzm4UQgYVKevGjUuAvyPaF/PQZzhPhCQxbdZjo4b 7KAm2WD7YSPktpJA5mUVS6Iw3KfCWAREJz0pIulMPnG/FWD1Hur4WV6qHfRqdUdllZpN rCDQ== X-Gm-Message-State: AOAM530ApisiTDWg3V1zvt0df7sf2aJ8OA3UR4jyyLT22mAotIgBeRJI DXO3yFDHx/6rgnwwCHNtmdssni4hoGc= X-Google-Smtp-Source: ABdhPJyGqPtA5oJLcVCw7Bsvc99aVJ/rYg44ThYMGz2QbQaavJWoVZpqSX0zd1XRp9u8wS+N1E0Uow== X-Received: by 2002:a05:600c:3510:: with SMTP id h16mr784921wmq.65.1626812083983; Tue, 20 Jul 2021 13:14:43 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h15sm23831061wrq.88.2021.07.20.13.14.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:43 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:37 +0000 Subject: [PATCH v2 3/7] commit: integrate with sparse-index Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Update 'git commit' to allow using the sparse-index in memory without expanding to a full one. The only place that had an ensure_full_index() call was in cache_tree_update(). The recursive algorithm for update_one() was already updated in 2de37c536 (cache-tree: integrate with sparse directory entries, 2021-03-03) to handle sparse directory entries in the index. Most of this change involves testing different command-line options that allow specifying which on-disk changes should be included in the commit. This includes no options (only take currently-staged changes), -a (take all tracked changes), and --include (take a list of specific changes). To simplify testing that these options do not expand the index, update the test that previously verified that 'git status' does not expand the index with a helper method, ensure_not_expanded(). This allows 'git commit' to operate much faster when the sparse-checkout cone is much smaller than the full list of files at HEAD. Here are the relevant lines from p2000-sparse-operations.sh: Test HEAD~1 HEAD ---------------------------------------------------------------------------------- 2000.14: git commit -a -m A (full-v3) 0.35(0.26+0.06) 0.36(0.28+0.07) +2.9% 2000.15: git commit -a -m A (full-v4) 0.32(0.26+0.05) 0.34(0.28+0.06) +6.3% 2000.16: git commit -a -m A (sparse-v3) 0.63(0.59+0.06) 0.04(0.05+0.05) -93.7% 2000.17: git commit -a -m A (sparse-v4) 0.64(0.59+0.08) 0.04(0.04+0.04) -93.8% It is important to compare the full-index case to the sparse-index case, so the improvement for index version v4 is actually an 88% improvement in this synthetic example. In a real repository with over two million files at HEAD and 60,000 files in the sparse-checkout definition, the time for 'git commit -a' went from 2.61 seconds to 134ms. I compared this to the result if the index only contained the paths in the sparse-checkout definition and found the theoretical optimum to be 120ms, so the out-of-cone paths only add a 12% overhead. Signed-off-by: Derrick Stolee --- builtin/commit.c | 3 ++ cache-tree.c | 2 - t/t1092-sparse-checkout-compatibility.sh | 47 ++++++++++++++++++++++-- 3 files changed, 46 insertions(+), 6 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 12f51db158a..0bc64892505 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1682,6 +1682,9 @@ int cmd_commit(int argc, const char **argv, const char *prefix) if (argc == 2 && !strcmp(argv[1], "-h")) usage_with_options(builtin_commit_usage, builtin_commit_options); + prepare_repo_settings(the_repository); + the_repository->settings.command_requires_full_index = 0; + status_init_config(&s, git_commit_config); s.commit_template = 1; status_format = STATUS_FORMAT_NONE; /* Ignore status.short */ diff --git a/cache-tree.c b/cache-tree.c index 45e58666afc..577b18d8811 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -461,8 +461,6 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; - ensure_full_index(istate); - if (!istate->cache_tree) istate->cache_tree = cache_tree(); diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index cabbd42e339..d3e34d0acac 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -262,6 +262,34 @@ test_expect_success 'add, commit, checkout' ' test_all_match git checkout - ' +test_expect_success 'commit including unstaged changes' ' + init_repos && + + write_script edit-file <<-\EOF && + echo $1 >$2 + EOF + + run_on_all ../edit-file 1 a && + run_on_all ../edit-file 1 deep/a && + + test_all_match git commit -m "-a" -a && + test_all_match git status --porcelain=v2 && + + run_on_all ../edit-file 2 a && + run_on_all ../edit-file 2 deep/a && + + test_all_match git commit -m "--include" --include deep/a && + test_all_match git status --porcelain=v2 && + test_all_match git commit -m "--include" --include a && + test_all_match git status --porcelain=v2 && + + run_on_all ../edit-file 3 a && + run_on_all ../edit-file 3 deep/a && + + test_all_match git commit -m "--amend" -a --amend && + test_all_match git status --porcelain=v2 +' + test_expect_success 'status/add: outside sparse cone' ' init_repos && @@ -514,14 +542,25 @@ test_expect_success 'sparse-index is expanded and converted back' ' test_region index ensure_full_index trace2.txt ' -test_expect_success 'sparse-index is not expanded' ' - init_repos && - +ensure_not_expanded () { rm -f trace2.txt && echo >>sparse-index/untracked.txt && GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \ - git -C sparse-index status && + git -C sparse-index "$@" && test_region ! index ensure_full_index trace2.txt +} + +test_expect_success 'sparse-index is not expanded' ' + init_repos && + + ensure_not_expanded status && + ensure_not_expanded commit --allow-empty -m empty && + echo >>sparse-index/a && + ensure_not_expanded commit -a -m a && + echo >>sparse-index/a && + ensure_not_expanded commit --include a -m a && + echo >>sparse-index/deep/deeper1/a && + ensure_not_expanded commit --include deep/deeper1/a -m deeper ' # NEEDSWORK: a sparse-checkout behaves differently from a full checkout From patchwork Tue Jul 20 20:14:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A815C636C8 for ; Tue, 20 Jul 2021 20:42:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B77F6113B for ; Tue, 20 Jul 2021 20:42:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234778AbhGTUBk (ORCPT ); Tue, 20 Jul 2021 16:01:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232902AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 089E6C0613E1 for ; Tue, 20 Jul 2021 13:14:46 -0700 (PDT) Received: by mail-wr1-x435.google.com with SMTP id l7so27295272wrv.7 for ; Tue, 20 Jul 2021 13:14:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=lX+phGtUUNbbXIoUjtKs9N0xtQbw6A/3yCrL+FgxfnM=; b=uhbF/3/j61wrXVptx8S4N/CffDfyayRf/ENbepe46XzeOzbtej65dFCX8KDJNNfNHf DBU2USJaS17aaXQQPfsQfPvMr6PZGzaRARrWa3OLPL1um0MjDg9VDYcpbOcRnMbZVEn7 WNYDR6i4XKyPMMEGhIS9wi8b9ZGARYCtFy8+PVF23MDWcgmJ1LN9xS5UPJxeDsTNjz5C G8R0SiBhaPXdD7HfLu/tisZPxKXDNIu7U1JIzSJDqSObXGfvg5f/5P5AgJZvyRvJtfFD pdVWZsgy9ZCN5f46nWRCTVSGaW1rnhf6FgBvMdKFotkN/qLGIzkoqPisdMPvo1EBiWrG Co1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=lX+phGtUUNbbXIoUjtKs9N0xtQbw6A/3yCrL+FgxfnM=; b=ZuJcKE5wRpO4hYAqZSxJg8sHBeHsYl83rNp6t/FKDI0KpV+HrwqUS4Qj8w6dAumEZt QM6uBZl1P6QUiuZ01MuU9+0r2bEfuJ9pYqzOxXoT+pHeM4GgxF/+y1CTKjihU7ty769r d74l9kJRSOKuwISujI7hdFA0GlK+dOEe7u9w1f/eVzVoGoGQKmx48JqEQcowqG5Dqa7o KtjgeqFAlEWOuf0iRgRprYo2fmiWELwqr5lf9z/vt+CJgN68caIcinw/5nmS8TIsUOfX 9b31bydCdolEZOalrkBdr1kJKkVNccyrWysk19baa3lUr2etOlTDoS/LorGMNf5Tolk/ 6FnA== X-Gm-Message-State: AOAM533PkloZZN6Eodm4XDnLvdlKuKFoH1UgJlqY0bCXF8pCBuPg1JSr N5JOzwFjh6er1SJ80wBINUPXsRPvF8g= X-Google-Smtp-Source: ABdhPJxrujm9eiRhfiIaq0kHMLpbALCwluVU2aCOmEturjYl0Fd09hyTIoTbmnL71cAyJN8iX55mBQ== X-Received: by 2002:a5d:4fd2:: with SMTP id h18mr36822579wrw.289.1626812084635; Tue, 20 Jul 2021 13:14:44 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 129sm20950673wmz.26.2021.07.20.13.14.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:44 -0700 (PDT) Message-Id: <65e79b8037ca0268363db9ff967c1342a76485c5.1626812081.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:38 +0000 Subject: [PATCH v2 4/7] sparse-index: recompute cache-tree Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee When some commands run with command_requires_full_index=1, then the index can get in a state where the in-memory cache tree is actually equal to the sparse index's cache tree instead of the full one. This results in incorrect entry_count values. By clearing the cache tree before converting to sparse, we avoid this issue. Signed-off-by: Derrick Stolee --- sparse-index.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sparse-index.c b/sparse-index.c index 53c8f711ccc..c6b4feec413 100644 --- a/sparse-index.c +++ b/sparse-index.c @@ -170,6 +170,8 @@ int convert_to_sparse(struct index_state *istate) if (index_has_unmerged_entries(istate)) return 0; + /* Clear and recompute the cache-tree */ + cache_tree_free(&istate->cache_tree); if (cache_tree_update(istate, 0)) { warning(_("unable to update cache-tree, staying full")); return -1; From patchwork Tue Jul 20 20:14:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C35ABC636CA for ; Tue, 20 Jul 2021 20:41:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AAF7E61009 for ; Tue, 20 Jul 2021 20:41:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233937AbhGTUAf (ORCPT ); Tue, 20 Jul 2021 16:00:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36348 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232764AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D4D6C0613E2 for ; Tue, 20 Jul 2021 13:14:46 -0700 (PDT) Received: by mail-wm1-x336.google.com with SMTP id n4so84353wms.1 for ; Tue, 20 Jul 2021 13:14:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=emMgPNDmsinyWNQuDS25z5uXhezmqYrHYu56d0gVkXA=; b=dXuinKtA5HSTeOHsMeZEzix45r5ncAbgWj83GrbQib/fcuHGtOcvEnaXIiDF7sUWfg Bf2p5S1eOWK4E/JU5I/xig3nPYHUix40mAvqCJHwAofqY85sfN5PlSFXBOZBOLRItsbi WEmv6SiVizzIre4mfJyKqFibX6oz3CGM97OGXGH/NXfqp9+L6Fw/9kwDPKXSsuHmukEW K4Ik2VWHg4LrViFKTIQUCBK9cO2p4a/IciopezFR1dDZSsIjtmrPpZZkPOzbiUH3i3Qs 6L9Kfz1S2wy4rhxXlHDL5AWHFTA3ZPyZ0yGY0ekSqIquIVeVHamewuwAn1Qlj1B4scjs G8FA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=emMgPNDmsinyWNQuDS25z5uXhezmqYrHYu56d0gVkXA=; b=lUZdxYUUTmdbbgU8whlqgnuQuMuZdJ90PGm2hbvjBsC1ODYvdiwk013n1g9Q2wiW4u dQPzO+8CnCcF9HrNdBBMzo1LK4G0/MTDJYApABenbbwPfxpgrMKmcfAR9yH32vS9q7ya pNbuBcbY9MzPbORfTOnAB/tlBGR+RkLdyS9Fg8Xrd3W4G24yACoZxAr68AGYbdndys74 CuIspcjHsofyL1TcfFjTbkmnN5GQxtVTA97LMRgzwKfLcLcMMN5obDUWRWjvkekdALdq os8Xg4DLVUGguZNm/ksJixp2TUiNQIJAJcdj+cGKYBQoJ9+f8UTxcFsNgL7cIuoxIxi5 +ZtA== X-Gm-Message-State: AOAM533GUIxIGnrYkBQh1ZZMq5q335phihVtLxjbalm8W+2KKX6Y30BP FKAg7lOitZb449pAV5XGG5Em1v7Ir/k= X-Google-Smtp-Source: ABdhPJxN3/tsecDoUsHNWjAYWMNG8eS9wlRkogvDSx8NyYTguaDLxidHhtp19vPaZCs4JCO/rODK7g== X-Received: by 2002:a05:600c:35c1:: with SMTP id r1mr33787931wmq.0.1626812085230; Tue, 20 Jul 2021 13:14:45 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o6sm24270953wry.91.2021.07.20.13.14.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:44 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:39 +0000 Subject: [PATCH v2 5/7] checkout: stop expanding sparse indexes Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Previous changes did the necessary improvements to unpack-trees.c and diff-lib.c in order to modify a sparse index based on its comparision with a tree. The only remaining work is to remove some ensure_full_index() calls and add tests that verify that the index is not expanded in our interesting cases. Include 'switch' and 'restore' in these tests, as they share a base implementation with 'checkout'. Here are the relevant performance results from p2000-sparse-operations.sh: Test HEAD~1 HEAD -------------------------------------------------------------------------------- 2000.18: git checkout -f - (full-v3) 0.49(0.43+0.03) 0.47(0.39+0.05) -4.1% 2000.19: git checkout -f - (full-v4) 0.45(0.37+0.06) 0.42(0.37+0.05) -6.7% 2000.20: git checkout -f - (sparse-v3) 0.76(0.71+0.07) 0.04(0.03+0.04) -94.7% 2000.21: git checkout -f - (sparse-v4) 0.75(0.72+0.04) 0.05(0.06+0.04) -93.3% It is important to compare the full index case to the sparse index case, as the previous results for the sparse index were inflated by the index expansion. For index v4, this is an 88% improvement. On an internal repository with over two million paths at HEAD and a sparse-checkout definition containing ~60,000 of those paths, 'git checkout' went from 3.5s to 297ms with this change. The theoretical optimum where only those ~60,000 paths exist was 275ms, so the extra sparse directory entries contribute a 22ms overhead. Signed-off-by: Derrick Stolee --- builtin/checkout.c | 8 +++----- t/t1092-sparse-checkout-compatibility.sh | 10 +++++++++- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index f4cd7747d35..b5d477919a7 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -378,9 +378,6 @@ static int checkout_worktree(const struct checkout_opts *opts, if (pc_workers > 1) init_parallel_checkout(); - /* TODO: audit for interaction with sparse-index. */ - ensure_full_index(&the_index); - for (pos = 0; pos < active_nr; pos++) { struct cache_entry *ce = active_cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -530,8 +527,6 @@ static int checkout_paths(const struct checkout_opts *opts, * Make sure all pathspecs participated in locating the paths * to be checked out. */ - /* TODO: audit for interaction with sparse-index. */ - ensure_full_index(&the_index); for (pos = 0; pos < active_nr; pos++) if (opts->overlay_mode) mark_ce_for_checkout_overlay(active_cache[pos], @@ -1593,6 +1588,9 @@ static int checkout_main(int argc, const char **argv, const char *prefix, git_config(git_checkout_config, opts); + prepare_repo_settings(the_repository); + the_repository->settings.command_requires_full_index = 0; + opts->track = BRANCH_TRACK_UNSPECIFIED; if (!opts->accept_pathspec && !opts->accept_ref) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index d3e34d0acac..fde3b41aba8 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -560,7 +560,15 @@ test_expect_success 'sparse-index is not expanded' ' echo >>sparse-index/a && ensure_not_expanded commit --include a -m a && echo >>sparse-index/deep/deeper1/a && - ensure_not_expanded commit --include deep/deeper1/a -m deeper + ensure_not_expanded commit --include deep/deeper1/a -m deeper && + ensure_not_expanded checkout rename-out-to-out && + ensure_not_expanded checkout - && + ensure_not_expanded switch rename-out-to-out && + ensure_not_expanded switch - && + git -C sparse-index reset --hard && + ensure_not_expanded checkout rename-out-to-out -- deep/deeper1 && + git -C sparse-index reset --hard && + ensure_not_expanded restore -s rename-out-to-out -- deep/deeper1 ' # NEEDSWORK: a sparse-checkout behaves differently from a full checkout From patchwork Tue Jul 20 20:14:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 211D9C636C9 for ; Tue, 20 Jul 2021 20:42:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0363F61009 for ; Tue, 20 Jul 2021 20:42:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234157AbhGTUBH (ORCPT ); Tue, 20 Jul 2021 16:01:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232986AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D321C0613E3 for ; Tue, 20 Jul 2021 13:14:47 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id l6so102639wmq.0 for ; Tue, 20 Jul 2021 13:14:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=5TJJQI9b+HCPdsHvWdKITCwanVgfq0C2ILsyop3lM3o=; b=n9h/q0go8DPtKOWVpua+v1FPzs1beUhCnBIMgu42hCuId/BEHJX2akPOuNi7mLDTmN 3rymWEMCjQ5Ek3UDvVZb/Lcj5f9kBhK2A1aIK4M5XNIAsaAnS/+XEfwdwRTMRmBcOa2Z C7xK4v7KtN+tFKw0LDRbCzDxDjvd8eN2cxuuXkCBTfl6BUKSvpxzDCoVhD7ta99Y3aE2 Bob+vyVr6o4NK1ddoBuSiFiQLRd56PqbjmGBVYYa1uFF4wzPBbnw6/q9CC4O6zOkN7Wk Pm4yD17kh1KTchgJUijO1aHX8LKLvjNvfPh1qvTXae5kO6sLjZGjh6FtptZbGXxq7ySd Q6KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=5TJJQI9b+HCPdsHvWdKITCwanVgfq0C2ILsyop3lM3o=; b=Cr+ICzShMLmh/d6cqhqJzRJJN/fWWvceblvCN1GrvU+GTavpXdrrS1KflYTicHGBhg XdpfZywnBwv8vHaj6VIufE5gRVB/75O+80efTsMgkQoHJBQdlk8s640uHM5CsWwebVaR KFQDzoT7a0xSzQnyoZjGyEfGVhSJylqDKtT38sSv/z+F9G/UVaE0qWViv04IkOYihwV5 Bv6MmObp7fXNBeMH10zJB6zn6tRv9YZN8LU/HONcxP99TDNfHyU7ptX3rG1DQK2UeeCL H/6pvFD61YDtcjDPm2iJyeJFHqbGJhWOF65omD/FKCEFzNrNyMPlT/HUmTelwhWgDSYm Q4Xw== X-Gm-Message-State: AOAM533boDwLIGI9fTY608gj84x8jI/yK0RY2rt+1Pc7lAKQqLITxNYr zscGg0ZfuriZi6tVNrWhvoucAEJhtbg= X-Google-Smtp-Source: ABdhPJxBKk7gM7+fsfl5jw99ZDkQ9pQk+mnugTGy3oBVKYNGFDwqX0X0ne/73TRk2hvwsaC5zk60uQ== X-Received: by 2002:a7b:ca45:: with SMTP id m5mr33600781wml.46.1626812085811; Tue, 20 Jul 2021 13:14:45 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q19sm3182915wmc.44.2021.07.20.13.14.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:45 -0700 (PDT) Message-Id: <4b801c854fb7891a6530121281cff3f67451b283.1626812081.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:40 +0000 Subject: [PATCH v2 6/7] t1092: document bad 'git checkout' behavior Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Add new branches to the test repo that demonstrate directory/file conflicts in different ways. Since the directory 'folder1/' has adjacent files 'folder1-', 'folder1.txt', and 'folder10' it causes searches for 'folder1/' to land in a different place in the index than a search for 'folder1'. This causes a change in behavior when working with the df-conflict-1 and df-conflict-2 branches, whose only difference is that the first uses 'folder1' as the conflict and the other uses 'folder2' which does not have these adjacent files. We can extend two tests that compare the behavior across different 'git checkout' commands, and we see already that the behavior will be different in some cases and not in others. The difference between the two test loops is that one uses 'git reset --hard' between iterations. Further, we isolate the behavior of creating a staged change within a directory and then checking out a branch where that directory is replaced with a file. A full checkout behaves differently across these two cases, while a sparse-checkout cone behaves consistently. In both cases, the behavior is wrong. In one case, the staged change is dropped entirely. The other case the staged change is kept, replacing the file at that location, but none of the other files in the directory are kept. Likely, the correct behavior in this case is to reject the checkout and report the conflict, leaving HEAD in its previous location. None of the cases behave this way currently. Use comments to demonstrate that the tested behavior is only a documentation of the current, incorrect behavior to ensure we do not _accidentally_ change it. Instead, we would prefer to change it on purpose with a future change. At this point, the sparse-index does not handle these 'git checkout' commands correctly. Or rather, it _does_ reject the 'git checkout' when we have the staged change, but for the wrong reason. It also rejects the 'git checkout' commands when there is no staged change and we want to replace a directory with a file. A fix for that unstaged case will follow in the next change, but that will make the sparse-index agree with the full checkout case in these documented incorrect behaviors. Helped-by: Elijah Newren Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 142 ++++++++++++++++++++++- 1 file changed, 140 insertions(+), 2 deletions(-) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index fde3b41aba8..79b4a8ce199 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -95,6 +95,25 @@ test_expect_success 'setup' ' git add . && git commit -m "rename deep/deeper1/... to folder1/..." && + git checkout -b df-conflict-1 base && + rm -rf folder1 && + echo content >folder1 && + git add . && + git commit -m "dir to file" && + + git checkout -b df-conflict-2 base && + rm -rf folder2 && + echo content >folder2 && + git add . && + git commit -m "dir to file" && + + git checkout -b fd-conflict base && + rm a && + mkdir a && + echo content >a/a && + git add . && + git commit -m "file to dir" && + git checkout -b deepest base && echo "updated deepest" >deep/deeper1/deepest/a && git commit -a -m "update deepest" && @@ -358,10 +377,16 @@ test_expect_success 'diff --staged' ' test_all_match git diff --staged ' +# NEEDSWORK: sparse-checkout behaves differently from full-checkout when +# running this test with 'df-conflict-2' after 'df-conflict-1'. test_expect_success 'diff with renames and conflicts' ' init_repos && - for branch in rename-out-to-out rename-out-to-in rename-in-to-out + for branch in rename-out-to-out \ + rename-out-to-in \ + rename-in-to-out \ + df-conflict-1 \ + fd-conflict do test_all_match git checkout rename-base && test_all_match git checkout $branch -- . && @@ -371,10 +396,15 @@ test_expect_success 'diff with renames and conflicts' ' done ' +# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file +# conflict such as when checking out df-conflict-1 and df-conflict2. test_expect_success 'diff with directory/file conflicts' ' init_repos && - for branch in rename-out-to-out rename-out-to-in rename-in-to-out + for branch in rename-out-to-out \ + rename-out-to-in \ + rename-in-to-out \ + fd-conflict do git -C full-checkout reset --hard && test_sparse_match git reset --hard && @@ -606,4 +636,112 @@ test_expect_success 'add everything with deep new file' ' test_all_match git status --porcelain=v2 ' +# NEEDSWORK: 'git checkout' behaves incorrectly in the case of +# directory/file conflicts, even without sparse-checkout. Use this +# test only as a documentation of the incorrect behavior, not a +# measure of how it _should_ behave. +test_expect_success 'checkout behaves oddly with df-conflict-1' ' + init_repos && + + test_sparse_match git sparse-checkout disable && + + write_script edit-content <<-\EOF && + echo content >>folder1/larger-content + git add folder1 + EOF + + run_on_all ../edit-content && + test_all_match git status --porcelain=v2 && + + git -C sparse-checkout sparse-checkout init --cone && + git -C sparse-index sparse-checkout init --cone --sparse-index && + + test_all_match git status --porcelain=v2 && + + # This checkout command should fail, because we have a staged + # change to folder1/larger-content, but the destination changes + # folder1 to a file. + git -C full-checkout checkout df-conflict-1 \ + 1>full-checkout-out \ + 2>full-checkout-err && + git -C sparse-checkout checkout df-conflict-1 \ + 1>sparse-checkout-out \ + 2>sparse-checkout-err && + + # NEEDSWORK: the sparse-index case refuses to change HEAD here, + # but for the wrong reason. + test_must_fail git -C sparse-index checkout df-conflict-1 \ + 1>sparse-index-out \ + 2>sparse-index-err && + + # Instead, the checkout deletes the folder1 file and adds the + # folder1/larger-content file, leaving all other paths that were + # in folder1/ as deleted (without any warning). + cat >expect <<-EOF && + D folder1 + A folder1/larger-content + EOF + test_cmp expect full-checkout-out && + test_cmp expect sparse-checkout-out && + + # stderr: Switched to branch df-conflict-1 + test_cmp full-checkout-err sparse-checkout-err +' + +# NEEDSWORK: 'git checkout' behaves incorrectly in the case of +# directory/file conflicts, even without sparse-checkout. Use this +# test only as a documentation of the incorrect behavior, not a +# measure of how it _should_ behave. +test_expect_success 'checkout behaves oddly with df-conflict-2' ' + init_repos && + + test_sparse_match git sparse-checkout disable && + + write_script edit-content <<-\EOF && + echo content >>folder2/larger-content + git add folder2 + EOF + + run_on_all ../edit-content && + test_all_match git status --porcelain=v2 && + + git -C sparse-checkout sparse-checkout init --cone && + git -C sparse-index sparse-checkout init --cone --sparse-index && + + test_all_match git status --porcelain=v2 && + + # This checkout command should fail, because we have a staged + # change to folder1/larger-content, but the destination changes + # folder1 to a file. + git -C full-checkout checkout df-conflict-2 \ + 1>full-checkout-out \ + 2>full-checkout-err && + git -C sparse-checkout checkout df-conflict-2 \ + 1>sparse-checkout-out \ + 2>sparse-checkout-err && + + # NEEDSWORK: the sparse-index case refuses to change HEAD + # here, but for the wrong reason. + test_must_fail git -C sparse-index checkout df-conflict-2 \ + 1>sparse-index-out \ + 2>sparse-index-err && + + # The full checkout deviates from the df-conflict-1 case here! + # It drops the change to folder1/larger-content and leaves the + # folder1 path as-is on disk. + test_must_be_empty full-checkout-out && + + # In the sparse-checkout case, the checkout deletes the folder1 + # file and adds the folder1/larger-content file, leaving all other + # paths that were in folder1/ as deleted (without any warning). + cat >expect <<-EOF && + D folder2 + A folder2/larger-content + EOF + test_cmp expect sparse-checkout-out && + + # Switched to branch df-conflict-1 + test_cmp full-checkout-err sparse-checkout-err +' + test_done From patchwork Tue Jul 20 20:14:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 12389225 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DB2BC07E95 for ; Tue, 20 Jul 2021 20:42:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 46EE460FF3 for ; Tue, 20 Jul 2021 20:42:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234365AbhGTUBP (ORCPT ); Tue, 20 Jul 2021 16:01:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233013AbhGTTeg (ORCPT ); Tue, 20 Jul 2021 15:34:36 -0400 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2DE0C0613E4 for ; Tue, 20 Jul 2021 13:14:47 -0700 (PDT) Received: by mail-wr1-x430.google.com with SMTP id d2so27369940wrn.0 for ; Tue, 20 Jul 2021 13:14:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=V5tQXeDD5Ip25tVwsuX5B5DgwU8YJNO0LVj8v6MyCLw=; b=QK8poySLxDwdZRdy/2iVWDWsF96i7hW1CrSrKdF/tAMk+hbOhZWmeSqTq9t9Roqkr/ F/BHhdFZg/GJ7aL0Krtckjrq1QCYH+2pLd+NceAJs0++DId3kZ1M+xrui1tXEq+I8eQO 3tEXj7tVYC90irhdiiI+IBCW8vhYaKXOxvqzCcEtLFkCT9kEcuI3KYIa7ffffQOmqv9p VaNDSGYBU8YfkAbOD4o4wW6bPG35E7HfjUv8WheKzUBDxBKZkIXiZEQpcUySRBi+6NHK rSpJMEwI7Hsnac23boqVClwPvviMANwOqMz41Lysw5Z9Opv6u56hYPR+lGyL8+JdHnHo nhSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=V5tQXeDD5Ip25tVwsuX5B5DgwU8YJNO0LVj8v6MyCLw=; b=SHxF4lUo3XYutF/ypSk+yVEflUa20KbRAANpkYHN8seNa6GCvQ1zhlK7LFofCkgxbk zdcMeixvDHe3LEYMQiQR/lAtExBwL05DKv5iZesVVk0+skG5b0MXni9LmasFI+Z/COat 3Z65I9goeUxCmzTTg2pgv/xvI9EQdJX8BGBLz5GCgx6JWooRxTRXG3EESkc4BPS2QfYa HbbAh+M1RXXCw2uKS1gLYtJIzH38ELCUvfS2XZujG8Zcce9Ro/pKibgSeW7M0tQneLj0 5g7A5Ar97FFHmwKJdS6mgIWGasSOnkcq/PpOQOjFLmruIMhFjk5tyy/HMz0rTiPDpl++ cRaw== X-Gm-Message-State: AOAM532W1IfOG2wzATpNQ6ArjG80lIMnRfaVw3dbo5lBcBxprqoIIK7C OWfdObT0z892t2uudZ1q43hWoSbmbt4= X-Google-Smtp-Source: ABdhPJxstizRdWRiA696OoVsfMpYlpmHj8zXwKZJ+QZ6hnDPVQFdOwOEyCiWEXzr+6bM37NJJvOdjg== X-Received: by 2002:adf:e805:: with SMTP id o5mr5665784wrm.321.1626812086382; Tue, 20 Jul 2021 13:14:46 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e6sm28823090wrg.18.2021.07.20.13.14.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 13:14:46 -0700 (PDT) Message-Id: <71e301501c88399711a1bf8515d1747e92cfbb9b.1626812081.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 20:14:41 +0000 Subject: [PATCH v2 7/7] unpack-trees: resolve sparse-directory/file conflicts Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, matheus.bernardino@usp.br, stolee@gmail.com, Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee When running unpack_trees() with a sparse index, we attempt to operate on the index without expanding the sparse directory entries. Thus, we operate by manipulating entire directories and passing them to the unpack function. In the case of the 'git checkout' command, this is the twoway_merge() function. There are several cases in twoway_merge() that handle different situations. One new one to add is the case of a directory/file conflict where the directory is sparse. Before the sparse index, such a conflict would appear as a list of file additions and deletions. Now, twoway_merge() initializes 'current', 'oldtree', and 'newtree' from src[0], src[1], and src[2], then sets 'oldtree' to NULL because it is equal to the df_conflict_entry. The way to determine that we have a directory/file conflict is to test that 'current' and 'newtree' disagree on being sparse directory entries. When we are in this case, we want to resolve the situation by calling merged_entry(). This allows replacing the 'current' entry with the 'newtree' entry. This is important for cases where we want to run 'git checkout' across the conflict and have the new HEAD represent the new file type at that path. The first NEEDSWORK comment dropped in t1092 demonstrates this necessary behavior. However, we still are in a confusing state when 'current' corresponds to a staged change within a sparse directory that is not present at HEAD. This should be atypical, because it requires adding a change outside of the sparse-checkout cone, but it is possible. Since we are unable to determine that this is a staged change within twoway_merge(), we cannot add a case to reject the merge at this point. I believe this is due to the use of df_conflict_entry in the place of 'oldtree' instead of using the valud at HEAD, which would provide some perspective to this decision. Any change that would allow this differentiation for staged entries would need to involve information further up in unpack_trees(). That work should be done, sometime, because we are further confusing the behavior of a directory/file conflict when staging a change in the directory. The two cases 'checkout behaves oddly with df-conflict-?' in t1092 demonstrate that even without a sparse-checkout, Git is not consistent in its behavior. Neither of the two options seems correct, either. This change makes the sparse-index behave differently than the typcial sparse-checkout case, but it does match the full checkout behavior in the df-conflict-2 case. Signed-off-by: Derrick Stolee --- t/t1092-sparse-checkout-compatibility.sh | 24 ++++++++++++------------ unpack-trees.c | 11 +++++++++++ 2 files changed, 23 insertions(+), 12 deletions(-) diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh index 79b4a8ce199..91e30d6ec22 100755 --- a/t/t1092-sparse-checkout-compatibility.sh +++ b/t/t1092-sparse-checkout-compatibility.sh @@ -396,14 +396,14 @@ test_expect_success 'diff with renames and conflicts' ' done ' -# NEEDSWORK: the sparse-index fails to move HEAD across a directory/file -# conflict such as when checking out df-conflict-1 and df-conflict2. test_expect_success 'diff with directory/file conflicts' ' init_repos && for branch in rename-out-to-out \ rename-out-to-in \ rename-in-to-out \ + df-conflict-1 \ + df-conflict-2 \ fd-conflict do git -C full-checkout reset --hard && @@ -667,10 +667,7 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' ' git -C sparse-checkout checkout df-conflict-1 \ 1>sparse-checkout-out \ 2>sparse-checkout-err && - - # NEEDSWORK: the sparse-index case refuses to change HEAD here, - # but for the wrong reason. - test_must_fail git -C sparse-index checkout df-conflict-1 \ + git -C sparse-index checkout df-conflict-1 \ 1>sparse-index-out \ 2>sparse-index-err && @@ -684,7 +681,11 @@ test_expect_success 'checkout behaves oddly with df-conflict-1' ' test_cmp expect full-checkout-out && test_cmp expect sparse-checkout-out && + # The sparse-index reports no output + test_must_be_empty sparse-index-out && + # stderr: Switched to branch df-conflict-1 + test_cmp full-checkout-err sparse-checkout-err && test_cmp full-checkout-err sparse-checkout-err ' @@ -719,17 +720,15 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' ' git -C sparse-checkout checkout df-conflict-2 \ 1>sparse-checkout-out \ 2>sparse-checkout-err && - - # NEEDSWORK: the sparse-index case refuses to change HEAD - # here, but for the wrong reason. - test_must_fail git -C sparse-index checkout df-conflict-2 \ + git -C sparse-index checkout df-conflict-2 \ 1>sparse-index-out \ 2>sparse-index-err && # The full checkout deviates from the df-conflict-1 case here! # It drops the change to folder1/larger-content and leaves the - # folder1 path as-is on disk. + # folder1 path as-is on disk. The sparse-index behaves the same. test_must_be_empty full-checkout-out && + test_must_be_empty sparse-index-out && # In the sparse-checkout case, the checkout deletes the folder1 # file and adds the folder1/larger-content file, leaving all other @@ -741,7 +740,8 @@ test_expect_success 'checkout behaves oddly with df-conflict-2' ' test_cmp expect sparse-checkout-out && # Switched to branch df-conflict-1 - test_cmp full-checkout-err sparse-checkout-err + test_cmp full-checkout-err sparse-checkout-err && + test_cmp full-checkout-err sparse-index-err ' test_done diff --git a/unpack-trees.c b/unpack-trees.c index 0a5135ab397..309c1352f5d 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -2619,6 +2619,17 @@ int twoway_merge(const struct cache_entry * const *src, same(current, oldtree) && !same(current, newtree)) { /* 20 or 21 */ return merged_entry(newtree, current, o); + } else if (current && !oldtree && newtree && + S_ISSPARSEDIR(current->ce_mode) != S_ISSPARSEDIR(newtree->ce_mode) && + ce_stage(current) == 0) { + /* + * This case is a directory/file conflict across the sparse-index + * boundary. When we are changing from one path to another via + * 'git checkout', then we want to replace one entry with another + * via merged_entry(). If there are staged changes, then we should + * reject the merge instead. + */ + return merged_entry(newtree, current, o); } else return reject_merge(current, o); }