From patchwork Wed Jul 21 04:23:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12390013
Message-Id: <7fdfeb159d0125cce980e63bcc1415144b91e486.1626841444.git.gitgitgadget@gmail.com>
Date: Wed, 21 Jul 2021 04:23:58 +0000
Subject: [PATCH v4 1/7] merge-ort: resolve paths early when we have sufficient information
To: git@vger.kernel.org
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason, Elijah Newren, Bagas Sanjaya
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

When there are no directories involved at a given path, and all three sides have a file at
that path, and two of the three sides of history match, we can
immediately resolve the merge of that path in collect_merge_info() and do
not need to wait until process_entries().

This is actually a very minor improvement: half the time when I run it, I
see an improvement; the other half a slowdown. It seems to be in the
range of noise. However, this idea serves as the beginning of some
bigger optimizations coming in the following patches.

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index e3a5dfc7b31..6299b4f9413 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1023,6 +1023,43 @@ static int collect_merge_info_callback(int n,
 		return mask;
 	}
 
+	/*
+	 * If the sides match, and all three paths are present and are
+	 * files, then we can take either as the resolution. We can't do
+	 * this with trees, because there may be rename sources from the
+	 * merge_base.
+	 */
+	if (sides_match && filemask == 0x07) {
+		/* use side1 (== side2) version as resolution */
+		setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
+				names, names+1, side1_null, 0,
+				filemask, dirmask, 1);
+		return mask;
+	}
+
+	/*
+	 * If side1 matches mbase and all three paths are present and are
+	 * files, then we can use side2 as the resolution. We cannot
+	 * necessarily do this for trees, because there may be rename
+	 * destinations within side2.
+	 */
+	if (side1_matches_mbase && filemask == 0x07) {
+		/* use side2 version as resolution */
+		setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
+				names, names+2, side2_null, 0,
+				filemask, dirmask, 1);
+		return mask;
+	}
+
+	/* Similar to above but swapping sides 1 and 2 */
+	if (side2_matches_mbase && filemask == 0x07) {
+		/* use side1 version as resolution */
+		setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
+				names, names+1, side1_null, 0,
+				filemask, dirmask, 1);
+		return mask;
+	}
+
 	/*
 	 * Gather additional information used in rename detection.
 	 */

From patchwork Wed Jul 21 04:23:59 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12390015
Message-Id: <7a0085f2da9d9517b20745dcb8be4c15852bd56f.1626841444.git.gitgitgadget@gmail.com>
Date: Wed, 21 Jul 2021 04:23:59 +0000
Subject: [PATCH v4 2/7] merge-ort: add some more explanations in collect_merge_info_callback()
To: git@vger.kernel.org
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason, Elijah Newren, Bagas Sanjaya
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

The previous patch possibly raises some questions about whether
additional cases in collect_merge_info_callback() can be handled early.
Add some explanations in the form of comments to help explain these
better. While we're at it, add a few comments to denote what a few
boolean '0' or '1' values stand for.

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 6299b4f9413..843fa693145 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1018,8 +1018,8 @@ static int collect_merge_info_callback(int n,
 	if (side1_matches_mbase && side2_matches_mbase) {
 		/* mbase, side1, & side2 all match; use mbase as resolution */
 		setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
-				names, names+0, mbase_null, 0,
-				filemask, dirmask, 1);
+				names, names+0, mbase_null, 0 /* df_conflict */,
+				filemask, dirmask, 1 /* resolved */);
 		return mask;
 	}
 
@@ -1061,14 +1061,24 @@ static int collect_merge_info_callback(int n,
 	}
 
 	/*
-	 * Gather additional information used in rename detection.
+	 * Sometimes we can tell that a source path need not be included in
+	 * rename detection -- namely, whenever either
+	 *    side1_matches_mbase && side2_null
+	 * or
+	 *    side2_matches_mbase && side1_null
+	 * However, we call collect_rename_info() even in those cases,
+	 * because exact renames are cheap and would let us remove both a
+	 * source and destination path.
+	 * We'll cull the unneeded sources later.
 	 */
 	collect_rename_info(opt, names, dirname, fullpath,
 			    filemask, dirmask, match_mask);
 
 	/*
-	 * Record information about the path so we can resolve later in
-	 * process_entries.
+	 * None of the special cases above matched, so we have a
+	 * provisional conflict. (Rename detection might allow us to
+	 * unconflict some more cases, but that comes later so all we can
+	 * do now is record the different non-null file hashes.)
 	 */
 	setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
 			names, NULL, 0, df_conflict, filemask, dirmask, 0);

From patchwork Wed Jul 21 04:24:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12390017
Date: Wed, 21 Jul 2021 04:24:00 +0000
Subject: [PATCH v4 3/7] merge-ort: add data structures for allowable trivial directory resolves
To: git@vger.kernel.org
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason, Elijah Newren, Bagas Sanjaya
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

As noted a few commits ago, we can resolve individual files early if all
three sides of the merge have a file at the path and two of the three
sides match. We would really like to do the same thing with
directories, because being able to do a trivial directory resolve means
we don't have to recurse into the directory, potentially saving us a
huge amount of time in both collect_merge_info() and process_entries().
Unfortunately, resolving directories early would mean missing any
renames whose source or destination is underneath that directory.

If we somehow knew there weren't any renames under the directory in
question, then we could resolve it early. Sadly, it is impossible to
determine whether there are renames under the directory in question
without recursing into it, and this has traditionally kept us from ever
implementing such an optimization.

In commit f89b4f2bee ("merge-ort: skip rename detection entirely if
possible", 2021-03-11), we added an additional reason that rename
detection could be skipped entirely -- namely, if no *relevant* sources
were present. Without completing collect_merge_info_callback(), we do
not yet know if there are no relevant sources. However, we do know that
if the current directory on one side matches the merge base, then every
source file within that directory will not be RELEVANT_CONTENT, and a
few simple checks can often let us rule out RELEVANT_LOCATION as well.
This suggests we can just defer recursing into such directories until
the end of collect_merge_info.

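For reference, the file-level rule that this message wants to extend to
directories can be modeled in a few lines. This is only a rough sketch
under stated assumptions: the toy_* names are hypothetical and do not
exist in merge-ort.c, and plain strings stand in for object IDs of a
file present on all three sides.

```c
#include <string.h>

/* Hypothetical toy model; none of these names exist in merge-ort.c. */
enum toy_resolution {
	TOY_UNRESOLVED = -1,	/* all three differ: needs a content merge */
	TOY_TAKE_BASE = 0,
	TOY_TAKE_SIDE1 = 1,
	TOY_TAKE_SIDE2 = 2
};

/*
 * Each argument stands in for the object ID of a file that exists at
 * the same path in the merge base, side 1 and side 2 respectively.
 */
static enum toy_resolution toy_trivial_resolve(const char *base,
					       const char *side1,
					       const char *side2)
{
	int side1_matches_base = !strcmp(base, side1);
	int side2_matches_base = !strcmp(base, side2);
	int sides_match = !strcmp(side1, side2);

	if (side1_matches_base && side2_matches_base)
		return TOY_TAKE_BASE;	/* nothing changed anywhere */
	if (sides_match)
		return TOY_TAKE_SIDE1;	/* both sides made the same change */
	if (side1_matches_base)
		return TOY_TAKE_SIDE2;	/* only side 2 changed the file */
	if (side2_matches_base)
		return TOY_TAKE_SIDE1;	/* only side 1 changed the file */
	return TOY_UNRESOLVED;
}
```

The same comparisons are not sufficient for directories, since a
matching tree may still contain rename sources or destinations; that gap
is what the deferral described in this message is meant to close.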
Since the deferred directories are known to not add any relevant sources
due to the above properties, then if there are no relevant sources after
we've traversed all paths other than the deferred ones, then we know
there are not any relevant sources. Under those conditions, rename
detection is unnecessary, and that means we can resolve the deferred
directories without recursing into them.

Note that the logic for skipping rename detection was also modified
further in commit 76e253793c ("merge-ort, diffcore-rename: employ cached
renames when possible", 2021-01-30); in particular rename detection can
be skipped if we already have cached renames for each relevant source.
We can take advantage of this information as well with our deferral of
recursing into directories where one side matches the merge base.

Add some data structures that we will use to do these deferrals, with
some lengthy comments explaining their purpose.

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 843fa693145..d9263ec5aca 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -62,6 +62,53 @@ struct traversal_callback_data {
 	struct name_entry names[3];
 };
 
+struct deferred_traversal_data {
+	/*
+	 * possible_trivial_merges: directories to be explored only when needed
+	 *
+	 * possible_trivial_merges is a map of directory names to
+	 * dir_rename_mask. When we detect that a directory is unchanged on
+	 * one side, we can sometimes resolve the directory without recursing
+	 * into it. Renames are the only things that can prevent such an
+	 * optimization. However, for rename sources:
+	 *   - If no parent directory needed directory rename detection, then
+	 *     no path under such a directory can be a relevant_source.
+	 * and for rename destinations:
+	 *   - If no cached rename has a target path under the directory AND
+	 *   - If there are no unpaired relevant_sources elsewhere in the
+	 *     repository
+	 * then we don't need any path under this directory for a rename
+	 * destination. The only way to know the last item above is to defer
+	 * handling such directories until the end of collect_merge_info(),
+	 * in handle_deferred_entries().
+	 *
+	 * For each we store dir_rename_mask, since that's the only bit of
+	 * information we need, other than the path, to resume the recursive
+	 * traversal.
+	 */
+	struct strintmap possible_trivial_merges;
+
+	/*
+	 * trivial_merges_okay: if trivial directory merges are okay
+	 *
+	 * See possible_trivial_merges above. The "no unpaired
+	 * relevant_sources elsewhere in the repository" is a single boolean
+	 * per merge side, which we store here. Note that while 0 means no,
+	 * 1 only means "maybe" rather than "yes"; we optimistically set it
+	 * to 1 initially and only clear when we determine it is unsafe to
+	 * do trivial directory merges.
+	 */
+	unsigned trivial_merges_okay;
+
+	/*
+	 * target_dirs: ancestor directories of rename targets
+	 *
+	 * target_dirs contains all directory names that are an ancestor of
+	 * any rename destination.
+	 */
+	struct strset target_dirs;
+};
+
 struct rename_info {
 	/*
 	 * All variables that are arrays of size 3 correspond to data tracked
@@ -119,6 +166,8 @@ struct rename_info {
 	 */
 	struct strintmap relevant_sources[3];
 
+	struct deferred_traversal_data deferred[3];
+
 	/*
 	 * dir_rename_mask:
 	 *   0: optimization removing unmodified potential rename source okay
@@ -501,6 +550,11 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 			strmap_clear(&renames->dir_rename_count[i], 1);
 		}
 	}
+	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) {
+		strintmap_func(&renames->deferred[i].possible_trivial_merges);
+		strset_func(&renames->deferred[i].target_dirs);
+		renames->deferred[i].trivial_merges_okay = 1; /* 1 == maybe */
+	}
 	renames->cached_pairs_valid_side = 0;
 	renames->dir_rename_mask = 0;
 
@@ -4052,6 +4106,13 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 		strset_init_with_options(&renames->cached_target_names[i],
 					 NULL, 0);
 	}
+	for (i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) {
+		strintmap_init_with_options(&renames->deferred[i].possible_trivial_merges,
+					    0, NULL, 0);
+		strset_init_with_options(&renames->deferred[i].target_dirs,
+					 NULL, 1);
+		renames->deferred[i].trivial_merges_okay = 1; /* 1 == maybe */
+	}
 
 	/*
 	 * Although we initialize opt->priv->paths with strdup_strings=0,

From patchwork Wed Jul 21 04:24:01 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 12390019
Date: Wed, 21 Jul 2021 04:24:01 +0000
Subject: [PATCH v4 4/7] merge-ort: add a handle_deferred_entries() helper function
To: git@vger.kernel.org
Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason, Elijah Newren, Bagas Sanjaya
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

In order to allow trivial directory resolution, we first need to be able
to gather more information to determine if the optimization is safe. To
enable that, we need a way of deferring the recursion into the directory
until a later time. Naturally, deferring the entry into a subtree means
that we need some function that will later recurse into the subdirectory
exactly the same way that collect_merge_info_callback() would have done.

Add a helper function that does this. For now this function is not used
but a subsequent commit will change that. Future commits will also make
the function sometimes resolve directories instead of traversing inside.

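The defer-then-resume pattern described above can be illustrated with a
toy queue. This is a minimal sketch under stated assumptions: every
toy_* name is hypothetical, a fixed-size array stands in for the
strintmap used by the real code, and the visitor callback stands in for
resuming the tree traversal.

```c
#include <stddef.h>

#define TOY_MAX_DEFERRED 16

/* Hypothetical stand-in for one entry of possible_trivial_merges. */
struct toy_deferred {
	const char *path;		/* directory we chose not to enter yet */
	unsigned dir_rename_mask;	/* state saved so traversal can resume */
};

struct toy_queue {
	struct toy_deferred items[TOY_MAX_DEFERRED];
	size_t nr;
};

/* First pass: remember the directory instead of recursing into it. */
static int toy_defer(struct toy_queue *q, const char *path, unsigned mask)
{
	if (q->nr == TOY_MAX_DEFERRED)
		return -1;
	q->items[q->nr].path = path;
	q->items[q->nr].dir_rename_mask = mask;
	q->nr++;
	return 0;
}

/*
 * Second pass: visit every deferred directory with its saved state
 * restored. Stops at the first negative return so that errors reach
 * the caller, then empties the queue.
 */
static int toy_handle_deferred(struct toy_queue *q,
			       int (*visit)(const char *path, unsigned mask,
					    void *data),
			       void *data)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < q->nr; i++) {
		ret = visit(q->items[i].path, q->items[i].dir_rename_mask,
			    data);
		if (ret < 0)
			break;
	}
	q->nr = 0;
	return ret;
}

/* Example visitor: counts how many directories were resumed. */
static int toy_count_visit(const char *path, unsigned mask, void *data)
{
	(void)path;
	(void)mask;
	++*(int *)data;
	return 0;
}
```

Saving the mask alongside the path is the point of the design: the
second pass needs no information beyond what was recorded at deferral
time.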
Acked-by: Derrick Stolee Signed-off-by: Elijah Newren --- merge-ort.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index d9263ec5aca..f0a07684df6 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -1202,6 +1202,70 @@ static int collect_merge_info_callback(int n, return mask; } +MAYBE_UNUSED +static int handle_deferred_entries(struct merge_options *opt, + struct traverse_info *info) +{ + struct rename_info *renames = &opt->priv->renames; + struct hashmap_iter iter; + struct strmap_entry *entry; + int side, ret = 0; + + for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) { + renames->deferred[side].trivial_merges_okay = 0; + strintmap_for_each_entry(&renames->deferred[side].possible_trivial_merges, + &iter, entry) { + const char *path = entry->key; + unsigned dir_rename_mask = (intptr_t)entry->value; + struct conflict_info *ci; + unsigned dirmask; + struct tree_desc t[3]; + void *buf[3] = {NULL,}; + int i; + + ci = strmap_get(&opt->priv->paths, path); + VERIFY_CI(ci); + dirmask = ci->dirmask; + + info->name = path; + info->namelen = strlen(path); + info->pathlen = info->namelen + 1; + + for (i = 0; i < 3; i++, dirmask >>= 1) { + if (i == 1 && ci->match_mask == 3) + t[1] = t[0]; + else if (i == 2 && ci->match_mask == 5) + t[2] = t[0]; + else if (i == 2 && ci->match_mask == 6) + t[2] = t[1]; + else { + const struct object_id *oid = NULL; + if (dirmask & 1) + oid = &ci->stages[i].oid; + buf[i] = fill_tree_descriptor(opt->repo, + t+i, oid); + } + } + + ci->match_mask &= ci->filemask; + opt->priv->current_dir_name = path; + renames->dir_rename_mask = dir_rename_mask; + if (renames->dir_rename_mask == 0 || + renames->dir_rename_mask == 0x07) + ret = traverse_trees(NULL, 3, t, info); + else + ret = traverse_trees_wrapper(NULL, 3, t, info); + + for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) + free(buf[i]); + + if (ret < 0) + return ret; + } + } + return ret; +} + static int 
collect_merge_info(struct merge_options *opt, struct tree *merge_base, struct tree *side1, From patchwork Wed Jul 21 04:24:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 12390021 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1065C07E9B for ; Wed, 21 Jul 2021 04:25:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B440B611C1 for ; Wed, 21 Jul 2021 04:25:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232329AbhGUDoT (ORCPT ); Tue, 20 Jul 2021 23:44:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232029AbhGUDnd (ORCPT ); Tue, 20 Jul 2021 23:43:33 -0400 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 46CD2C061574 for ; Tue, 20 Jul 2021 21:24:10 -0700 (PDT) Received: by mail-wm1-x335.google.com with SMTP id o30-20020a05600c511eb029022e0571d1a0so191466wms.5 for ; Tue, 20 Jul 2021 21:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=YPuuJ1bhGSPcZEGk8NahRAMUPLm1QrFuQIqBcJIMtRA=; 
b=d0/nEIatA9NUyTMvPC0exf0kb69o15g9lu3e1+tv+iRx4oggo9noPR534rg3zC46zY GOBJCtlRgIQ8oIXhVL/5PAriWZUyRSboWRiu31NK2M+0qiIrMUfOCepPFvmy0rkWPJkQ 8uvf3sDMX6+gO14HBmllN4up9YS/C9Y4dL2I2qwRf8eLyBSl9IbiraDoLro6jTNv7mYh Im25EJyzy6JBBflMF9IfwRzuu1UecRWIgAK3pzbGAWLAAr6eVLoayZiC+4bjnolZkPF2 qhFbN6vMD5LfZ34IzRx6JLm/HpSvzT4YHLPTECUHO2QkvkCaPJRWpTm7miujyTbGzvd8 7byg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=YPuuJ1bhGSPcZEGk8NahRAMUPLm1QrFuQIqBcJIMtRA=; b=s+9QxBF9O+vBYH6FuqLjJxfrPKk7lPbj0n49ROl1C3QzY/4rJByI1zZeVmGhMpZBib mY///5caW8I+QFDuy5Uq42DawLqrMqtC6NoTDJ+jOIttrfbXvSjQMjF/mR9MLoH13e+i fct2lZiO/SnlzP6LXw5aU1IkhqTQd8x3SeC/cPJnT17a2yJ1l6A93IdqQBjoR4RHiHVo fGRjEpTLU3J6CneftbBr22C93Wj5dD3Rb6+tsTUJuKcujzwiwBtYkJbLj5gxfylpzx6y hGrht//3qj07S4GnxMnz1b818VhPAMwB9tf40bdSm7nuydz9cpryqxFMl754ulaAfgLu i7mQ== X-Gm-Message-State: AOAM530vewg6rad3u2rHiJDnT6dmUwywIz+KZeWi2zT66BZFj3ITyBJV NM1p4ZK1jy/8w14U9n0mjO97nLo86ZQ= X-Google-Smtp-Source: ABdhPJxQ7BeeUFVJMxAQ/uTbTgaQ8gPdaIoUKE2SUwKqUDVGCq87xexQDoy2P2Ic3L9QkDww+4AOMQ== X-Received: by 2002:a7b:cb01:: with SMTP id u1mr18954426wmj.44.1626841448868; Tue, 20 Jul 2021 21:24:08 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d8sm26456600wrv.20.2021.07.20.21.24.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 21:24:08 -0700 (PDT) Message-Id: <5b01c118f10ec78eb77a696fa20a88038dfc28c9.1626841444.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 21 Jul 2021 04:24:02 +0000 Subject: [PATCH v4 5/7] merge-ort: defer recursing into directories when merge base is matched Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Bagas Sanjaya , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

When one side of history matches the merge base (including when the merge base has no entry for the given directory), have collect_merge_info_callback() defer recursing into the directory. To ensure those entries are eventually handled, add a call to handle_deferred_entries() in collect_merge_info() after traverse_trees() returns.

Note that the condition in collect_merge_info_callback() may look more complicated than necessary at first glance; renames->trivial_merges_okay[side] is always true until handle_deferred_entries() is called, and possible_trivial_merges[side] is always empty right now (and in the future won't be filled until handle_deferred_entries() is called). However, when handle_deferred_entries() calls traverse_trees() for the relevant deferred directories, those traverse_trees() calls will once again end up in collect_merge_info_callback() for all the entries under those subdirectories. The extra conditions are there for such deferred cases and will be used more as we do more with those variables.

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c index f0a07684df6..dbccf8c62e2 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -1147,8 +1147,36 @@ static int collect_merge_info_callback(int n, struct tree_desc t[3]; void *buf[3] = {NULL, NULL, NULL}; const char *original_dir_name; - int i, ret; + int i, ret, side; + /* + * Check for whether we can avoid recursing due to one side + * matching the merge base. The side that does NOT match is + * the one that might have a rename destination we need. + */ + assert(!side1_matches_mbase || !side2_matches_mbase); + side = side1_matches_mbase ? MERGE_SIDE2 : + side2_matches_mbase ? 
MERGE_SIDE1 : MERGE_BASE; + if (filemask == 0 && (dirmask == 2 || dirmask == 4)) { + /* + * Also defer recursing into new directories; set up a + * few variables to let us do so. + */ + ci->match_mask = (7 - dirmask); + side = dirmask / 2; + } + if (renames->dir_rename_mask != 0x07 && + side != MERGE_BASE && + renames->deferred[side].trivial_merges_okay && + !strset_contains(&renames->deferred[side].target_dirs, + pi.string)) { + strintmap_set(&renames->deferred[side].possible_trivial_merges, + pi.string, renames->dir_rename_mask); + renames->dir_rename_mask = prev_dir_rename_mask; + return mask; + } + + /* We need to recurse */ ci->match_mask &= filemask; newinfo = *info; newinfo.prev = info; @@ -1202,7 +1230,6 @@ static int collect_merge_info_callback(int n, return mask; } -MAYBE_UNUSED static int handle_deferred_entries(struct merge_options *opt, struct traverse_info *info) { @@ -1291,6 +1318,8 @@ static int collect_merge_info(struct merge_options *opt, trace2_region_enter("merge", "traverse_trees", opt->repo); ret = traverse_trees(NULL, 3, t, &info); + if (ret == 0) + ret = handle_deferred_entries(opt, &info); trace2_region_leave("merge", "traverse_trees", opt->repo); return ret; From patchwork Wed Jul 21 04:24:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 12390023 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2BADC12002 for ; Wed, 21 Jul 2021 04:25:12 +0000 (UTC) 
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C1E6E61001 for ; Wed, 21 Jul 2021 04:25:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232334AbhGUDo1 (ORCPT ); Tue, 20 Jul 2021 23:44:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232111AbhGUDne (ORCPT ); Tue, 20 Jul 2021 23:43:34 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D89BAC061762 for ; Tue, 20 Jul 2021 21:24:10 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id k4so640648wrc.8 for ; Tue, 20 Jul 2021 21:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=GrqZQtDOMosDqWoAZzBDHFGfZidiSnrJH8SL/Q6ug8Y=; b=MMFYbOVJ4I5Dq9MXHb/8GvMlgFIfGboJZ82qJSPnrsoEH3Y+IFCA2Yd9ehbjXbJIWZ N86MdRPm5BK6qOJCa+23fTJhIQqFDRN219WJMM4IoKrrj4Eh+6tRiOFws4UP51JKsnMc DApD6g4T53ZnIc4nTx7V6YGcOFLX9eK/g6XHw3GixbdKZrpMntVWVMWhCgtxi8WMdlga e9JkpFmFTpiSZbRk0UuEXvoLmJEScXR/Uvcij8yWrv7bYaLzTOZ2/UrIORnr260dv9V6 J6ANFzBkIMTtxlTX/S97LatkgmbPV/2HX9cE+lNoaNFl7BR06jJCKfxfjjDrO7A0LcX7 39JQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=GrqZQtDOMosDqWoAZzBDHFGfZidiSnrJH8SL/Q6ug8Y=; b=eWlny2bgwN1gjcF+6T/64x40b69l53j/eLdP/x8NbMo2NXodgiZcVbrYMtY99KrPhv FHmYX9K854Gz2yCE9a9DcgLpBOdZgyCRspSzm8/WM34F6GcuWhGz1JD/RXl21verPVsE cu0WYFbWxye51pKiiJX+NeSHOXz1xnZ+cEgZjgazIadmWaYuQIGogck8VaDvuVovPGtv BAKecNEAIZ3/kECCvsMxjb4B6HUtUuH4PqnVQmMPqhBfyYAPfpTj9BIJpr6yqV5Y1xjy 
2gJwP/9O9uAvMJao+w6lB0ttCiVFtm2Z0jEzgYMygDDf75H02BbTGs6kG9Rbep2OkI6m szuQ== X-Gm-Message-State: AOAM532uJRSQd6dupcJ6H/eyVl3MhhKwlQ7qLU5pGUAdLGJ3tPMT4JCw zxDArpvn/lnzyOi/fmXI/HgNJuTzkFk= X-Google-Smtp-Source: ABdhPJx00A0uQ0zpPES2Yv0Aftjzg+q8QyfVlhfAJfnr2Yel8o4q3SMaJ0ddAc/sn17bw6dg58umcw== X-Received: by 2002:a5d:4e43:: with SMTP id r3mr39925074wrt.132.1626841449495; Tue, 20 Jul 2021 21:24:09 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m32sm3858803wms.23.2021.07.20.21.24.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 21:24:09 -0700 (PDT) Message-Id: <7b2112718157e65ba558842e27521df8c351f596.1626841444.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Wed, 21 Jul 2021 04:24:03 +0000 Subject: [PATCH v4 6/7] merge-ort: avoid recursing into directories when we don't need to MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Derrick Stolee , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Bagas Sanjaya , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This combines the work of the last several patches, and implements the conditions when we don't need to recurse into directories. It's perhaps easiest to see the logic by separating the fact that a directory might have both rename sources and rename destinations: * rename sources: only files present in the merge base can serve as rename sources, and only when one side deletes that file. When the tree on one side matches the merge base, that means every file within the subtree matches the merge base. This means that the skip-irrelevant-rename-detection optimization from before kicks in and we don't need any of these files as rename sources. * rename destinations: the tree that does not match the merge base might have newly added and hence unmatched destination files. 
This is what usually prevents us from doing trivial directory resolutions in the merge machinery. However, the fact that we have deferred recursing into this directory until the end means we know whether there are any unmatched relevant potential rename sources elsewhere in this merge. If there are no such unmatched relevant sources anywhere, then there is no need to look for unmatched potential rename destinations to match them with.

This informs our algorithm:

  * Search through relevant_sources; if we have entries, they had better all be reflected in cached_pairs or cached_irrelevant, otherwise they represent an unmatched potential rename source (causing the optimization to be disallowed).
  * For any relevant_source represented in cached_pairs, we do need to get the destination for each source, meaning we need to recurse into any ancestor directories of those destinations.
  * Once we've recursed into all the rename destinations for any relevant_sources in cached_pairs, we can then do the trivial directory resolution for the remaining directories.

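To make the second step above concrete, here is a minimal standalone sketch of how the ancestor directories of a rename destination can be enumerated, deepest first. It mirrors the strrchr() loop in the diff below, but ancestor_dirs(), MAX_DIRS and MAX_PATH_LEN are hypothetical names for illustration and are not part of merge-ort.c; the real code records directories in a strset and stops early once it reaches one that is already recorded.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIRS 16       /* hypothetical limit, for illustration only */
#define MAX_PATH_LEN 256  /* hypothetical limit, for illustration only */

/*
 * Hypothetical helper (not from merge-ort.c): copy every ancestor
 * directory of 'path' into dirs[], deepest first, and return how many
 * were stored.  Returns -1 if 'path' does not fit in the scratch buffer.
 * At most max_dirs ancestors are stored; any beyond that are dropped.
 */
static int ancestor_dirs(const char *path, char dirs[][MAX_PATH_LEN],
			 int max_dirs)
{
	char buf[MAX_PATH_LEN];
	char *slash;
	int n = 0;

	assert(path != NULL);
	if (strlen(path) >= sizeof(buf))
		return -1;
	strcpy(buf, path);

	/* Chop off one trailing path component per iteration */
	while (n < max_dirs && (slash = strrchr(buf, '/'))) {
		*slash = '\0';
		strcpy(dirs[n++], buf);
	}
	return n;
}
```

For a destination such as "a/b/c.txt" this yields "a/b" and then "a"; a path with no slash yields no directories, since a top-level file needs no recursion to be found.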
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows:

                         Before                  After
    no-renames:        5.235 s ±  0.042 s    205.1 ms ±  3.8 ms
    mega-renames:      9.419 s ±  0.107 s    1.564 s  ±  0.010 s
    just-one-mega:   480.1 ms ±  3.9 ms     479.5 ms ±  3.9 ms

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 99 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c index dbccf8c62e2..a013708fa79 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -1230,6 +1230,18 @@ static int collect_merge_info_callback(int n, return mask; } +static void resolve_trivial_directory_merge(struct conflict_info *ci, int side) +{ + VERIFY_CI(ci); + assert((side == 1 && ci->match_mask == 5) || + (side == 2 && ci->match_mask == 3)); + oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); + ci->merged.result.mode = ci->stages[side].mode; + ci->merged.is_null = is_null_oid(&ci->stages[side].oid); + ci->match_mask = 0; + ci->merged.clean = 1; /* (ci->filemask == 0); */ +} + static int handle_deferred_entries(struct merge_options *opt, struct traverse_info *info) { @@ -1239,9 +1251,72 @@ static int handle_deferred_entries(struct merge_options *opt, int side, ret = 0; for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) { - renames->deferred[side].trivial_merges_okay = 0; - strintmap_for_each_entry(&renames->deferred[side].possible_trivial_merges, - &iter, entry) { + unsigned optimization_okay = 1; + struct strintmap copy; + + /* Loop over the set of paths we need to know rename info for */ + strset_for_each_entry(&renames->relevant_sources[side], + &iter, entry) { + char *rename_target, *dir, *dir_marker; + struct strmap_entry *e; + + /* + * If we don't know delete/rename info for this path, + * then we need to recurse into all trees to get all + * adds to make sure we have it. 
+ */ + if (strset_contains(&renames->cached_irrelevant[side], + entry->key)) + continue; + e = strmap_get_entry(&renames->cached_pairs[side], + entry->key); + if (!e) { + optimization_okay = 0; + break; + } + + /* If this is a delete, we have enough info already */ + rename_target = e->value; + if (!rename_target) + continue; + + /* If we already walked the rename target, we're good */ + if (strmap_contains(&opt->priv->paths, rename_target)) + continue; + + /* + * Otherwise, we need to get a list of directories that + * will need to be recursed into to get this + * rename_target. + */ + dir = xstrdup(rename_target); + while ((dir_marker = strrchr(dir, '/'))) { + *dir_marker = '\0'; + if (strset_contains(&renames->deferred[side].target_dirs, + dir)) + break; + strset_add(&renames->deferred[side].target_dirs, + dir); + } + free(dir); + } + renames->deferred[side].trivial_merges_okay = optimization_okay; + /* + * We need to recurse into any directories in + * possible_trivial_merges[side] found in target_dirs[side]. + * But when we recurse, we may need to queue up some of the + * subdirectories for possible_trivial_merges[side]. Since + * we can't safely iterate through a hashmap while also adding + * entries, move the entries into 'copy', iterate over 'copy', + * and then we'll also iterate anything added into + * possible_trivial_merges[side] once this loop is done. 
+ */ + copy = renames->deferred[side].possible_trivial_merges; + strintmap_init_with_options(&renames->deferred[side].possible_trivial_merges, + 0, + NULL, + 0); + strintmap_for_each_entry(&copy, &iter, entry) { const char *path = entry->key; unsigned dir_rename_mask = (intptr_t)entry->value; struct conflict_info *ci; @@ -1254,6 +1329,13 @@ static int handle_deferred_entries(struct merge_options *opt, VERIFY_CI(ci); dirmask = ci->dirmask; + if (optimization_okay && + !strset_contains(&renames->deferred[side].target_dirs, + path)) { + resolve_trivial_directory_merge(ci, side); + continue; + } + info->name = path; info->namelen = strlen(path); info->pathlen = info->namelen + 1; @@ -1289,6 +1371,20 @@ static int handle_deferred_entries(struct merge_options *opt, if (ret < 0) return ret; } + strintmap_clear(&copy); + strintmap_for_each_entry(&renames->deferred[side].possible_trivial_merges, + &iter, entry) { + const char *path = entry->key; + struct conflict_info *ci; + + ci = strmap_get(&opt->priv->paths, path); + VERIFY_CI(ci); + + assert(renames->deferred[side].trivial_merges_okay && + !strset_contains(&renames->deferred[side].target_dirs, + path)); + resolve_trivial_directory_merge(ci, side); + } } return ret; } From patchwork Wed Jul 21 04:24:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 12390025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29E88C12002 for ; Wed, 21 Jul 2021 04:25:18 
+0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B10361019 for ; Wed, 21 Jul 2021 04:25:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232356AbhGUDoi (ORCPT ); Tue, 20 Jul 2021 23:44:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232091AbhGUDne (ORCPT ); Tue, 20 Jul 2021 23:43:34 -0400 Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76920C061574 for ; Tue, 20 Jul 2021 21:24:11 -0700 (PDT) Received: by mail-wm1-x329.google.com with SMTP id f8-20020a1c1f080000b029022d4c6cfc37so2631413wmf.5 for ; Tue, 20 Jul 2021 21:24:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=H61kpePd0RggohlQhvjJVORWuqGjEcvAkljU5/g4MKQ=; b=vaXIoWAFK1BxG/cAOGDXaqB4MnK4bjcf5pPwHSYIBCbOkl3OabXubs66pSPMXcKRw+ 07K3YpjF0/XfkL+OjWSfAJ+cxJTLeSCn+BIEJ0N8IZOeSNF+tZCnB8F7twuAMiJsOPEp 0HF805oF3ImzZTZkdEzs+uJBf5b/5lU168zn+FpPmPDOr+fesN2yvct3m2jPZnvtZ/Gc sc89NaccfLDUECacsXDcvFhfObxnN13mrWY2eCeSJv18wMI+Cw8tnN9yNYZUTbbXN2hB p7vXlNMuPUmD4Z2vdxx57Gt2BQz4fvzn3vXH43aCd7dqXR8wpgVlfmq8kP7Dl61AHxM6 89Nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=H61kpePd0RggohlQhvjJVORWuqGjEcvAkljU5/g4MKQ=; b=gEorEIvUMwlTPSgKHOyAR1vtofePdEgQhgur6C7qNnX1PRKDvYnvBLISgU5zD+CHde Ff323laiq75g88BFrFOKEb95xMzfprhFXXWCh09AhMIcci7nJIIrH6Otm0oCcl0NR4/m 5HL9E3TDXCooSiUZjGX4Owup/qPyPpF7MiscxaDFXy0qGkBWi9TLLy1lNF5udfFxht97 wsKCg7Is3sMmvdm+oCs4xxo169Mni1q5jTd/3C583tzJqaCbV7KqxRBqDgmJpz6GTG0k 
wbiL2ojGWJKNZ7Q42EaJ2Fs62vgT4mp4IGBRCUJ6MRnMXPkhWjfegep2Cot6Z4jhAYPn je4g== X-Gm-Message-State: AOAM531zsA7w08GjNUwpu3gwb4NwZhE/xUCC9bZUBDrP2wS8Wv0SLZva ibsfD1nYyzj6A7nrp3EcBhil6To+JjA= X-Google-Smtp-Source: ABdhPJxOrDYJZnhKg/29P4sdCOLzHXQv4XitoCK76FVN3Ng3QT0W2gEkNzTT9pJ8rHC0gOmOwLKyDQ== X-Received: by 2002:a05:600c:4f05:: with SMTP id l5mr1857147wmq.96.1626841450045; Tue, 20 Jul 2021 21:24:10 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s6sm30330834wrt.45.2021.07.20.21.24.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 21:24:09 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Wed, 21 Jul 2021 04:24:04 +0000 Subject: [PATCH v4 7/7] merge-ort: restart merge with cached renames to reduce process entry cost MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Derrick Stolee , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Bagas Sanjaya , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren The merge algorithm mostly consists of the following three functions: collect_merge_info() detect_and_process_renames() process_entries() Prior to the trivial directory resolution optimization of the last half dozen commits, process_entries() was consistently the slowest, followed by collect_merge_info(), then detect_and_process_renames(). When the trivial directory resolution applies, it often dramatically decreases the amount of time spent in the two slower functions. Looking at the performance results in the previous commit, the trivial directory resolution optimization helps amazingly well when there are no relevant renames. It also helps really well when reapplying a long series of linear commits (such as in a rebase or cherry-pick), since the relevant renames may well be cached from the first reapplied commit. 
But when there are any relevant renames that are not cached (represented by the just-one-mega testcase), then the optimization does not help at all.

Often, I noticed that when the optimization does not apply, it is because there are a handful of relevant sources -- maybe even only one. It felt frustrating to need to recurse into potentially hundreds or even thousands of directories just for a single rename, but it was needed for correctness.

However, staring at this list of functions and noticing that process_entries() is the most expensive and knowing I could avoid it if I had cached renames suggested a simple idea: change

    collect_merge_info()
    detect_and_process_renames()
    process_entries()

into

    collect_merge_info()
    detect_and_process_renames()
    collect_merge_info()
    detect_and_process_renames()
    process_entries()

This may seem odd and look like more work. However, note that although we run collect_merge_info() twice, the second time we get to employ trivial directory resolves, which makes it much faster, so the increased time in collect_merge_info() is small. While we run detect_and_process_renames() again, all renames are cached so it's nearly a no-op (we don't call into diffcore_rename_extended() but we do have a little bit of data structure checking and fixing up). And the big payoff comes from the fact that process_entries() will be much faster due to having far fewer entries to process.

This restarting only makes sense if we can save recursing into enough directories to make it worth our while. Introduce a simple heuristic to guide this. Note that this heuristic uses a "wanted_factor" that I have virtually no actual real world data for, just some back-of-the-envelope quasi-scientific calculations that I included in some comments and then plucked a simple round number out of thin air. It could be that tweaking this number to make it either higher or lower improves the optimization.

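As a rough illustration of the heuristic described above, the following standalone sketch compares the number of known paths before and after the deferred directories were recursed into. The helper name worth_restarting() and the guard against zero counts are assumptions made for this example; the diff in this patch performs the same integer-division comparison inline in handle_deferred_entries() with wanted_factor set to 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from merge-ort.c): report whether restarting
 * the merge with cached renames looks worthwhile.
 *
 *   path_count_before: paths known before recursing into deferred dirs
 *   path_count_after:  paths known afterwards (0 means "not measured",
 *                      i.e. the trivial resolution applied everywhere)
 *   wanted_factor:     minimum growth ratio that justifies a restart
 */
static bool worth_restarting(size_t path_count_before,
			     size_t path_count_after,
			     size_t wanted_factor)
{
	assert(wanted_factor > 0);

	/* Assumed guard: nothing to compare, and avoids dividing by zero */
	if (path_count_before == 0 || path_count_after == 0)
		return false;

	/* Integer division, matching the comparison used in the diff */
	return path_count_after / path_count_before >= wanted_factor;
}
```

With a factor of 3, growing from 100 to 300 paths would trigger a restart, while growing from 100 to 299 would not, because the integer ratio truncates to 2.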
(There's slightly more here; when I first introduced this optimization, I used a factor of 10, because I was completely confident it was big enough to not cause slowdowns in special cases. I was certain it was higher than needed. Several months later, I added the rough calculations which make me think the optimal number is close to 2; but instead of pushing to the limit, I just bumped it to 3 to reduce the risk that there are special cases where this optimization can result in slowing down the code a little. If the ratio of path counts is below 3, we probably will only see minor performance improvements at best anyway.)

Also, note that while the diffstat looks kind of long (nearly 100 lines), more than half of it is in two comments explaining how things work.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows:

                         Before                  After
    no-renames:      205.1 ms ±  3.8 ms     204.2 ms ±  3.0 ms
    mega-renames:      1.564 s ±  0.010 s     1.076 s ±  0.015 s
    just-one-mega:   479.5 ms ±  3.9 ms     364.1 ms ±  7.0 ms

Acked-by: Derrick Stolee
Signed-off-by: Elijah Newren
---
 merge-ort.c                         | 92 +++++++++++++++++++++++++++--
 t/t6423-merge-rename-directories.sh |  2 +-
 2 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c index a013708fa79..e361443087a 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -213,6 +213,7 @@ struct rename_info { * MERGE_SIDE2: cached data from side2 can be reused * MERGE_SIDE1: cached data from side1 can be reused * 0: no cached data can be reused + * -1: See redo_after_renames; both sides can be reused. */ int cached_pairs_valid_side; @@ -258,6 +259,28 @@ struct rename_info { */ struct strset cached_irrelevant[3]; + /* + * redo_after_renames: optimization flag for "restarting" the merge + * + * Sometimes it pays to detect renames, cache them, and then + * restart the merge operation from the beginning. 
The reason for + * this is that when we know where all the renames are, we know + * whether a certain directory has any paths under it affected -- + * and if a directory is not affected then it permits us to do + * trivial tree merging in more cases. Doing trivial tree merging + * prevents the need to run process_entry() on every path + * underneath trees that can be trivially merged, and + * process_entry() is more expensive than collect_merge_info() -- + * plus, the second collect_merge_info() will be much faster since + * it doesn't have to recurse into the relevant trees. + * + * Values for this flag: + * 0 = don't bother, not worth it (or conditions not yet checked) + * 1 = conditions for optimization met, optimization worthwhile + * 2 = we already did it (don't restart merge yet again) + */ + unsigned redo_after_renames; + /* * needed_limit: value needed for inexact rename detection to run * @@ -541,7 +564,8 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti, strintmap_func(&renames->relevant_sources[i]); if (!reinitialize) assert(renames->cached_pairs_valid_side == 0); - if (i != renames->cached_pairs_valid_side) { + if (i != renames->cached_pairs_valid_side && + -1 != renames->cached_pairs_valid_side) { strset_func(&renames->cached_target_names[i]); strmap_func(&renames->cached_pairs[i], 1); strset_func(&renames->cached_irrelevant[i]); @@ -1249,7 +1273,9 @@ static int handle_deferred_entries(struct merge_options *opt, struct hashmap_iter iter; struct strmap_entry *entry; int side, ret = 0; + int path_count_before, path_count_after = 0; + path_count_before = strmap_get_size(&opt->priv->paths); for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) { unsigned optimization_okay = 1; struct strintmap copy; @@ -1385,7 +1411,32 @@ static int handle_deferred_entries(struct merge_options *opt, path)); resolve_trivial_directory_merge(ci, side); } + if (!optimization_okay || path_count_after) + path_count_after = 
strmap_get_size(&opt->priv->paths); } + if (path_count_after) { + /* + * The choice of wanted_factor here does not affect + * correctness, only performance. When the + * path_count_after / path_count_before + * ratio is high, redoing after renames is a big + * performance boost. I suspect that redoing is a wash + * somewhere near a value of 2, and below that redoing will + * slow things down. I applied a fudge factor and picked + * 3; see the commit message when this was introduced for + * back of the envelope calculations for this ratio. + */ + const int wanted_factor = 3; + + /* We should only redo collect_merge_info one time */ + assert(renames->redo_after_renames == 0); + + if (path_count_after / path_count_before >= wanted_factor) { + renames->redo_after_renames = 1; + renames->cached_pairs_valid_side = -1; + } + } else if (renames->redo_after_renames == 2) + renames->redo_after_renames = 0; return ret; } @@ -2828,8 +2879,8 @@ static int compare_pairs(const void *a_, const void *b_) } /* Call diffcore_rename() to update deleted/added pairs into rename pairs */ -static void detect_regular_renames(struct merge_options *opt, - unsigned side_index) +static int detect_regular_renames(struct merge_options *opt, + unsigned side_index) { struct diff_options diff_opts; struct rename_info *renames = &opt->priv->renames; @@ -2842,7 +2893,7 @@ static void detect_regular_renames(struct merge_options *opt, * side had directory renames. 
*/ resolve_diffpair_statuses(&renames->pairs[side_index]); - return; + return 0; } partial_clear_dir_rename_count(&renames->dir_rename_count[side_index]); @@ -2868,6 +2919,8 @@ static void detect_regular_renames(struct merge_options *opt, trace2_region_leave("diff", "diffcore_rename", opt->repo); resolve_diffpair_statuses(&diff_queued_diff); + if (diff_opts.needed_rename_limit > 0) + renames->redo_after_renames = 0; if (diff_opts.needed_rename_limit > renames->needed_limit) renames->needed_limit = diff_opts.needed_rename_limit; @@ -2877,6 +2930,8 @@ static void detect_regular_renames(struct merge_options *opt, diff_queued_diff.nr = 0; diff_queued_diff.queue = NULL; diff_flush(&diff_opts); + + return 1; } /* @@ -2966,14 +3021,32 @@ static int detect_and_process_renames(struct merge_options *opt, struct diff_queue_struct combined; struct rename_info *renames = &opt->priv->renames; int need_dir_renames, s, clean = 1; + unsigned detection_run = 0; memset(&combined, 0, sizeof(combined)); if (!possible_renames(renames)) goto cleanup; trace2_region_enter("merge", "regular renames", opt->repo); - detect_regular_renames(opt, MERGE_SIDE1); - detect_regular_renames(opt, MERGE_SIDE2); + detection_run |= detect_regular_renames(opt, MERGE_SIDE1); + detection_run |= detect_regular_renames(opt, MERGE_SIDE2); + if (renames->redo_after_renames && detection_run) { + int i, side; + struct diff_filepair *p; + + /* Cache the renames, we found */ + for (side = MERGE_SIDE1; side <= MERGE_SIDE2; side++) { + for (i = 0; i < renames->pairs[side].nr; ++i) { + p = renames->pairs[side].queue[i]; + possibly_cache_new_pair(renames, p, side, NULL); + } + } + + /* Restart the merge with the cached renames */ + renames->redo_after_renames = 2; + trace2_region_leave("merge", "regular renames", opt->repo); + goto cleanup; + } use_cached_pairs(opt, &renames->cached_pairs[1], &renames->pairs[1]); use_cached_pairs(opt, &renames->cached_pairs[2], &renames->pairs[2]); trace2_region_leave("merge", "regular 
renames", opt->repo); @@ -4390,6 +4463,7 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt, opt->subtree_shift); } +redo: trace2_region_enter("merge", "collect_merge_info", opt->repo); if (collect_merge_info(opt, merge_base, side1, side2) != 0) { /* @@ -4409,6 +4483,12 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt, result->clean = detect_and_process_renames(opt, merge_base, side1, side2); trace2_region_leave("merge", "renames", opt->repo); + if (opt->priv->renames.redo_after_renames == 2) { + trace2_region_enter("merge", "reset_maps", opt->repo); + clear_or_reinit_internal_opts(opt->priv, 1); + trace2_region_leave("merge", "reset_maps", opt->repo); + goto redo; + } trace2_region_enter("merge", "process_entries", opt->repo); process_entries(opt, &working_tree_oid); diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index e834b7e6efe..d8919d276a1 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -4797,7 +4797,7 @@ test_setup_12f () { ) } -test_expect_merge_algorithm failure failure '12f: Trivial directory resolve, caching, all kinds of fun' ' +test_expect_merge_algorithm failure success '12f: Trivial directory resolve, caching, all kinds of fun' ' test_setup_12f && ( cd 12f &&