From patchwork Thu Jul 15 00:45:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12378211
Message-Id: <0d1d0f180a35aaf355683909987731f1e0f4539b.1626309924.git.gitgitgadget@gmail.com>
Date: Thu, 15 Jul 2021 00:45:21 +0000
Subject: [PATCH v3 1/4] diff: correct warning message when renameLimit exceeded
To: git@vger.kernel.org
Cc: Bagas Sanjaya, Elijah Newren, Eric Sunshine, Derrick Stolee, Jeff King, Ævar Arnfjörð Bjarmason
From: Elijah Newren

The warning when quadratic rename detection was skipped referred to "inexact
rename detection". For years, the only linear portion of rename detection
was looking for exact renames, so "inexact rename detection" was an
accurate way to refer to the quadratic portion of rename detection.
However, that changed with commit bd24aa2f97a0 (diffcore-rename: guide
inexact rename detection based on basenames, 2021-02-14). Let's instead
use the term "exhaustive rename detection" to refer to the quadratic
portion.

Signed-off-by: Elijah Newren
---
 diff.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 52c791574b7..2454e34cf6d 100644
--- a/diff.c
+++ b/diff.c
@@ -6284,7 +6284,7 @@ static int is_summary_empty(const struct diff_queue_struct *q)
 }

 static const char rename_limit_warning[] =
-N_("inexact rename detection was skipped due to too many files.");
+N_("exhaustive rename detection was skipped due to too many files.");

 static const char degrade_cc_to_c_warning[] =
 N_("only found copies from modified paths due to too many files.");

From patchwork Thu Jul 15 00:45:22 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12378213
Message-Id: <193385d7ca11f5af74b88164062a9220956a5aaf.1626309924.git.gitgitgadget@gmail.com>
Date: Thu, 15 Jul 2021 00:45:22 +0000
Subject: [PATCH v3 2/4] doc: clarify documentation for rename/copy limits
To: git@vger.kernel.org
Cc: Bagas Sanjaya, Elijah Newren, Eric Sunshine, Derrick Stolee, Jeff King, Ævar Arnfjörð Bjarmason
From: Elijah Newren

A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection. The following two commits each
introduced an exception to this:

    9027f53cb505 (Do linear-time/space rename logic for exact renames,
                  2007-10-25)
    bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based
                  on basenames, 2021-02-14)

(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection. So, for copy detection, the documentation was
closer to correct.)

Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm. While at it, also note the
default values for all these settings.

Signed-off-by: Elijah Newren
---
 Documentation/config/diff.txt  |  7 ++++---
 Documentation/config/merge.txt | 10 ++++++----
 Documentation/diff-options.txt | 15 ++++++++++-----
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
index 2d3331f55c2..d1b5cfa3542 100644
--- a/Documentation/config/diff.txt
+++ b/Documentation/config/diff.txt
@@ -118,9 +118,10 @@ diff.orderFile::
 	relative to the top of the working tree.

 diff.renameLimit::
-	The number of files to consider when performing the copy/rename
-	detection; equivalent to the 'git diff' option `-l`. This setting
-	has no effect if rename detection is turned off.
+	The number of files to consider in the exhaustive portion of
+	copy/rename detection; equivalent to the 'git diff' option
+	`-l`. If not set, the default value is currently 400. This
+	setting has no effect if rename detection is turned off.

 diff.renames::
 	Whether and how Git detects renames. If set to "false",
diff --git a/Documentation/config/merge.txt b/Documentation/config/merge.txt
index 6b66c83eabe..7cd6d7883b6 100644
--- a/Documentation/config/merge.txt
+++ b/Documentation/config/merge.txt
@@ -33,10 +33,12 @@ merge.verifySignatures::
 include::fmt-merge-msg.txt[]

 merge.renameLimit::
-	The number of files to consider when performing rename detection
-	during a merge; if not specified, defaults to the value of
-	diff.renameLimit. This setting has no effect if rename detection
-	is turned off.
+	The number of files to consider in the exhaustive portion of
+	rename detection during a merge. If not specified, defaults
+	to the value of diff.renameLimit. If neither
+	merge.renameLimit nor diff.renameLimit are specified,
+	currently defaults to 1000. This setting has no effect if
+	rename detection is turned off.

 merge.renames::
 	Whether Git detects renames. If set to "false", rename detection
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 32e6dee5ac3..58acfff9289 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -588,11 +588,16 @@ When used together with `-B`, omit also the preimage in the deletion part
 of a delete/create pair.

 -l::
-	The `-M` and `-C` options require O(n^2) processing time where n
-	is the number of potential rename/copy targets. This
-	option prevents rename/copy detection from running if
-	the number of rename/copy targets exceeds the specified
-	number.
+	The `-M` and `-C` options involve some preliminary steps that
+	can detect subsets of renames/copies cheaply, followed by an
+	exhaustive fallback portion that compares all remaining
+	unpaired destinations to all relevant sources. (For renames,
+	only remaining unpaired sources are relevant; for copies, all
+	original sources are relevant.) For N sources and
+	destinations, this exhaustive check is O(N^2). This option
+	prevents the exhaustive portion of rename/copy detection from
+	running if the number of source/destination files involved
+	exceeds the specified number. Defaults to diff.renameLimit.

 ifndef::git-format-patch[]
 --diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]::

From patchwork Thu Jul 15 00:45:23 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12378215
Message-Id: <00a2072baea435060b525b3907121bdf980461e9.1626309924.git.gitgitgadget@gmail.com>
Date: Thu, 15 Jul 2021 00:45:23 +0000
Subject: [PATCH v3 3/4] diffcore-rename: treat a rename_limit of 0 as unlimited
To: git@vger.kernel.org
Cc: Bagas Sanjaya, Elijah Newren, Eric Sunshine, Derrick Stolee, Jeff King, Ævar Arnfjörð Bjarmason
From: Elijah Newren

In commit
89973554b52c (diffcore-rename: make diff-tree -l0 mean -l, 2017-11-29), -l0
was given a special magical "large" value, but one which was not large
enough for some uses (as can be seen from commit 9f7e4bfa3b6d (diff: remove
silent clamp of renameLimit, 2017-11-13)). Make 0 (or a negative value) be
treated as unlimited instead and update the documentation to mention this.

Signed-off-by: Elijah Newren
---
 Documentation/diff-options.txt | 1 +
 diffcore-rename.c              | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 58acfff9289..0aebe832057 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -598,6 +598,7 @@ of a delete/create pair.
 	prevents the exhaustive portion of rename/copy detection from
 	running if the number of source/destination files involved
 	exceeds the specified number. Defaults to diff.renameLimit.
+	Note that a value of 0 is treated as unlimited.

 ifndef::git-format-patch[]
 --diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]::
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 3375e24659e..513ba7b05f1 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -1021,7 +1021,7 @@ static int too_many_rename_candidates(int num_destinations, int num_sources,
 	 * memory for the matrix anyway.
 	 */
 	if (rename_limit <= 0)
-		rename_limit = 32767;
+		return 0; /* treat as unlimited */
 	if (st_mult(num_destinations, num_sources)
 	    <= st_mult(rename_limit, rename_limit))
 		return 0;

From patchwork Thu Jul 15 00:45:24 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12378217
Date: Thu, 15 Jul 2021 00:45:24 +0000
Subject: [PATCH v3 4/4] Bump rename limit defaults (yet again)
To: git@vger.kernel.org
Cc: Bagas Sanjaya, Elijah Newren, Eric Sunshine, Derrick Stolee, Jeff King, Ævar Arnfjörð Bjarmason
From: Elijah Newren

These were last bumped in commit 92c57e5c1d29 (bump rename limit
defaults (again), 2011-02-19), and were bumped both because processors
had gotten faster, and because people were getting ugly merges that
caused problems and reporting them to the mailing list (suggesting that
folks were willing to spend more time waiting).

Since that time:
  * Linus has continued recommending that kernel folks set
    diff.renameLimit=0 (maps to 32767, currently)
  * Folks with repositories with lots of renames were happy to set
    merge.renameLimit above 32767, once the code supported that, to
    get correct cherry-picks
  * Processors have gotten faster
  * It has been discovered that the timing methodology used last time
    probably used example files that were too large.

The last point is probably worth explaining a bit more:

  * The "average" file size used appears to have been average blob size
    in the linux kernel history at the time (probably v2.6.25 or
    something close to it).
  * Since bigger files are modified more frequently, such a computation
    weights towards larger files.
  * Larger files may be more likely to be modified over time, but are
    not more likely to be renamed -- the mean and median blob size
    within a tree are a bit higher than the mean and median of blob
    sizes in the history leading up to that version for the linux
    kernel.
  * The mean blob size in v2.6.25 was half the average blob size in
    history leading to that point
  * The median blob size in v2.6.25 was about 40% of the mean blob size
    in v2.6.25.
  * Since the mean blob size is more than double the median blob size,
    any file as big as the mean will not be compared to any files of
    median size or less (because they'd be more than 50% dissimilar).
  * Since it is the number of files compared that provides the O(n^2)
    behavior, median-sized files should matter more than mean-sized
    ones.

The combined effect of the above is that the file size used in past
calculations was likely about 5x too large. Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by a
factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time
limits.

Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly. In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/
for details in that area.) For now, let's just bump the approximate
time limit from 10s to 1m.

(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've put
in a lot of work rather than just upfront telling the user there are too
many files involved.)

Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):

      N     Timing
     1300    1.995s
     7100   59.973s

So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.

Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/
in particular to avoid triggering the linear handling from
basename-guided rename detection):

    #!/bin/bash

    n=$1; shift

    rm -rf repo
    mkdir repo && cd repo
    git init -q -b main

    mkdata() {
      mkdir $1
      for i in `seq 1 $2`; do
        (sed "s/^/$i /" <../sample
         echo tag: $1
        ) >$1/$i
      done
    }

    mkdata initial $n
    git add .
    git commit -q -m initial

    mkdata new $n
    git add .
    cd new
    for i in *; do git mv $i $i.renamed; done
    cd ..
    git rm -q -rf initial
    git commit -q -m new

    time git diff-tree -M -l0 --summary HEAD^ HEAD

Signed-off-by: Elijah Newren
---
 Documentation/config/diff.txt  | 2 +-
 Documentation/config/merge.txt | 2 +-
 diff.c                         | 2 +-
 merge-ort.c                    | 2 +-
 merge-recursive.c              | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
index d1b5cfa3542..32f84838ac1 100644
--- a/Documentation/config/diff.txt
+++ b/Documentation/config/diff.txt
@@ -120,7 +120,7 @@ diff.orderFile::
 diff.renameLimit::
 	The number of files to consider in the exhaustive portion of
 	copy/rename detection; equivalent to the 'git diff' option
-	`-l`. If not set, the default value is currently 400. This
+	`-l`. If not set, the default value is currently 1000. This
 	setting has no effect if rename detection is turned off.

 diff.renames::
diff --git a/Documentation/config/merge.txt b/Documentation/config/merge.txt
index 7cd6d7883b6..e27cc639447 100644
--- a/Documentation/config/merge.txt
+++ b/Documentation/config/merge.txt
@@ -37,7 +37,7 @@ merge.renameLimit::
 	rename detection during a merge. If not specified, defaults
 	to the value of diff.renameLimit. If neither
 	merge.renameLimit nor diff.renameLimit are specified,
-	currently defaults to 1000. This setting has no effect if
+	currently defaults to 7000. This setting has no effect if
 	rename detection is turned off.

 merge.renames::
diff --git a/diff.c b/diff.c
index 2454e34cf6d..0244a371d32 100644
--- a/diff.c
+++ b/diff.c
@@ -35,7 +35,7 @@

 static int diff_detect_rename_default;
 static int diff_indent_heuristic = 1;
-static int diff_rename_limit_default = 400;
+static int diff_rename_limit_default = 1000;
 static int diff_suppress_blank_empty;
 static int diff_use_color_default = -1;
 static int diff_color_moved_default;
diff --git a/merge-ort.c b/merge-ort.c
index b954f7184a5..8a84375e940 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2558,7 +2558,7 @@ static void detect_regular_renames(struct merge_options *opt,
 	diff_opts.detect_rename = DIFF_DETECT_RENAME;
 	diff_opts.rename_limit = opt->rename_limit;
 	if (opt->rename_limit <= 0)
-		diff_opts.rename_limit = 1000;
+		diff_opts.rename_limit = 7000;
 	diff_opts.rename_score = opt->rename_score;
 	diff_opts.show_rename_progress = opt->show_rename_progress;
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
diff --git a/merge-recursive.c b/merge-recursive.c
index 4327e0cfa33..f19f8cc37bd 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1879,7 +1879,7 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *opt,
 	 */
 	if (opts.detect_rename > DIFF_DETECT_RENAME)
 		opts.detect_rename = DIFF_DETECT_RENAME;
-	opts.rename_limit = (opt->rename_limit >= 0) ? opt->rename_limit : 1000;
+	opts.rename_limit = (opt->rename_limit >= 0) ? opt->rename_limit : 7000;
 	opts.rename_score = opt->rename_score;
 	opts.show_rename_progress = opt->show_rename_progress;
 	opts.output_format = DIFF_FORMAT_NO_OUTPUT;