From patchwork Tue Jul 20 10:36:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12388035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A90CC07E9B for ; Tue, 20 Jul 2021 10:39:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8469F610F7 for ; Tue, 20 Jul 2021 10:39:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237053AbhGTJ7I (ORCPT ); Tue, 20 Jul 2021 05:59:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235613AbhGTJ4D (ORCPT ); Tue, 20 Jul 2021 05:56:03 -0400 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66762C0613E4 for ; Tue, 20 Jul 2021 03:36:37 -0700 (PDT) Received: by mail-wr1-x42a.google.com with SMTP id c12so7436196wrt.3 for ; Tue, 20 Jul 2021 03:36:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=uEWYiN3zMagJGskhmrwR0avZIR39YtJZYgnhZg5J5eg=; b=MuKRa98pQuqwEp2sbqvu78KQ4vpUoB99phjzKE1iwLC1q99IfMWEg3dfZN5E5Z+ZWE lNxW9soMyNvEI8rxOMeW2Q/ggCiFy3NDr0DfMAhcqKCuF/uZ0v1O8CNnXzSxKyTqVr+A 
+gFF1nQDQh1YVhVwGL7R0VODuSPbdF/gUP3aYy1ST/EUPo3dC4lLmjNxqOySe7H3U9RK Vu6ASNeOU6jmfJnDziIO9OZEzorJLhC2gKXMeqJXcicP66z3FIbXkSnKGmffrvZ0sc7q NZp+EeBSiSPNG6pCrF32nK9npqyd4CAVZTgjy/hw5nTeOptNfpe1Qg2Hk/HzPpF/ipIY SMHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=uEWYiN3zMagJGskhmrwR0avZIR39YtJZYgnhZg5J5eg=; b=lfduc9PCTbu5DMqk0NmoIVOkkNg0nyp4jDMeBPrPU7G+Nhpfs3yIAoMOj4GJr9MdOc TenLJ03KdsXXyuxug0ezW/+YMoE/SZuLnIcymjfs+pJIKhl4Fs3m1f9K7IHCSOpd570T XbouaRJn9mr1gmA1Jcmt7bufO3Xb8xVDmIvc4AXd0UNfq3wSfJeQN8WPEKS5HT2xRBs3 WYeCrItcQ03peWNHre0CBCkpMq3FIeQ6Xk1RT5eBCeuWEXDVofiixY99yIQ4MOvI2Nj3 KUvGrvlcBDJ2l7Ms3toYeNilKlD7pekHF9wKQ/3zdcNzsnHhiWbb9dWny06Etvv6yxv5 OamA== X-Gm-Message-State: AOAM532J9CrdODEaP939yqcRiXBLl5jVkCwF5j5Sggy7yvtaV8cwpo8O r3rv00FObk6UqiCMjkqHENBJuwByJwA= X-Google-Smtp-Source: ABdhPJytEmJodjjLdzxGzLblv5bBRVi12Upph/SI7ihwjW+JuklwDYnCrSlesJAya063s7HDtXrkMw== X-Received: by 2002:a5d:4048:: with SMTP id w8mr35311196wrp.82.1626777396042; Tue, 20 Jul 2021 03:36:36 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y3sm23551559wrh.16.2021.07.20.03.36.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 03:36:35 -0700 (PDT) Message-Id: <8fc8914a37b3c343cd92bb0255088f7b000ff7f7.1626777393.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 10:36:22 +0000 Subject: [PATCH v2 01/12] diff --color-moved: add perf tests Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood Add some tests so we can monitor changes to the performance of the move detection code. 
The tests record the performance of a single large diff and a sequence of smaller diffs. Signed-off-by: Phillip Wood --- t/perf/p4002-diff-color-moved.sh | 45 ++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100755 t/perf/p4002-diff-color-moved.sh diff --git a/t/perf/p4002-diff-color-moved.sh b/t/perf/p4002-diff-color-moved.sh new file mode 100755 index 00000000000..ad56bcb71e4 --- /dev/null +++ b/t/perf/p4002-diff-color-moved.sh @@ -0,0 +1,45 @@ +#!/bin/sh + +test_description='Tests diff --color-moved performance' +. ./perf-lib.sh + +test_perf_default_repo + +if ! git rev-parse --verify v2.29.0^{commit} >/dev/null +then + skip_all='skipping because tag v2.29.0 was not found' + test_done +fi + +GIT_PAGER_IN_USE=1 +test_export GIT_PAGER_IN_USE + +test_perf 'diff --no-color-moved --no-color-moved-ws large change' ' + git diff --no-color-moved --no-color-moved-ws v2.28.0 v2.29.0 +' + +test_perf 'diff --color-moved --no-color-moved-ws large change' ' + git diff --color-moved=zebra --no-color-moved-ws v2.28.0 v2.29.0 +' + +test_perf 'diff --color-moved-ws=allow-indentation-change large change' ' + git diff --color-moved=zebra --color-moved-ws=allow-indentation-change \ + v2.28.0 v2.29.0 +' + +test_perf 'log --no-color-moved --no-color-moved-ws' ' + git log --no-color-moved --no-color-moved-ws --no-merges --patch \ + -n1000 v2.29.0 +' + +test_perf 'log --color-moved --no-color-moved-ws' ' + git log --color-moved=zebra --no-color-moved-ws --no-merges --patch \ + -n1000 v2.29.0 +' + +test_perf 'log --color-moved-ws=allow-indentation-change' ' + git log --color-moved=zebra --color-moved-ws=allow-indentation-change \ + --no-merges --patch -n1000 v2.29.0 +' + +test_done From patchwork Tue Jul 20 10:36:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12388033 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDB88C07E95 for ; Tue, 20 Jul 2021 10:39:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B7563610F7 for ; Tue, 20 Jul 2021 10:39:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237008AbhGTJ64 (ORCPT ); Tue, 20 Jul 2021 05:58:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237024AbhGTJ4E (ORCPT ); Tue, 20 Jul 2021 05:56:04 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A928C0613E5 for ; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id d2so25508252wrn.0 for ; Tue, 20 Jul 2021 03:36:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=t1cGcruRbiUvTwZ0y1BCNXZXKd3TyRpHQO/UtOTBTlE=; b=XKM0ugFt9aPDAW6AWBzNBuH9dBjPvZ/BFooNENBUol841rKr8aLrtutqPepsVCk0dx 3OM00JBfGsjPJnxlJhH6Shths2JPFtuHvPK/RB7osbTHyU0PLihvaGdEVaFlRUmBWJB1 cKsc8WZIz2ejpvox+3te3RooUmEXon5geiIZ8ZxWt/WHbx246PiYQrPk7NYZrbzUGicw PNTYURgK+1fnO2kdfP45kRGidjsISpHBMH063k1Lw0CWXMQYsnGqyE/XS1c67aFXGAfU EyVww48E7My83WQKDyfrapa1ywHFJTnntW87lLAEGuUE499kuZhjsD2hk4fEA/FCcbej HBsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=t1cGcruRbiUvTwZ0y1BCNXZXKd3TyRpHQO/UtOTBTlE=; b=CiJvC7ZJITbYF3p2DZnByvPFc4/UlFSfHghjIFbQUFMrk3BCJk7LuodmIGd5TNaPry VUgXFnYYqFM0HY3XiSvCxh5pGENkRikG35WDKG+6oCrtxM22EMOvxtY85h0QkSnqSZmN o94eTbJBS+iszaozfUtSjmmcb5B9P7wo2+t/GDlQr5dcEEvNHkkwJE4OGTeo+8GUGcnV mkTwtirPeaKvlyPcsNM4/En0g1iLgfuZadpw9vtPo8gM7UCbnvF4TuqRtqgBHCmJ33g/ qvbqtkWBq0ropgDsdc5vl2RuwdSIWHR/uYDr2+jkT6kTjbcdN4jF/Mm+vbOiGgRH1dqd US2w== X-Gm-Message-State: AOAM532v0JRRbv9YTbVanolL76wPH4tZ/TMXvdRlodtv183Z2Mnr5VOZ McDPoQJHLn/r1Jo5+wVD6h9PrfhDx9o= X-Google-Smtp-Source: ABdhPJx9Szes+XgsGuV/Si+r0FKn8T5HRNl4aqYg4/ueB4lu1Lu8D+5MNHR3J92B8gL4n2dPv/6t6g== X-Received: by 2002:a5d:6b86:: with SMTP id n6mr35375279wrx.298.1626777396647; Tue, 20 Jul 2021 03:36:36 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l18sm2243340wme.29.2021.07.20.03.36.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 03:36:36 -0700 (PDT) Message-Id: <9b4e4d2674a38b6bed793e0f5661edefa6df5d23.1626777393.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 10:36:23 +0000 Subject: [PATCH v2 02/12] diff --color-moved=zebra: fix alternate coloring Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood b0a2ba4776 ("diff --color-moved=zebra: be stricter with color alternation", 2018-11-23) sought to avoid using the alternate colors unless there are two adjacent moved blocks of the same sign. Unfortunately it contains two bugs that prevented it from fixing the problem properly. 
Firstly, `last_symbol` is reset at the start of each iteration of the loop, losing the symbol of the last line, and secondly, when deciding whether to use the alternate color, it should be checking if the current line is the same sign as the last line, not a different sign. The combination of the two errors means that we still use the alternate color when we should, but we also use it when we shouldn't. This is most noticeable when using --color-moved-ws=allow-indentation-change with hunks like -this line gets indented + this line gets indented where the post image is colored with newMovedAlternate rather than newMoved. While this does not matter much, the next commit will change the coloring to be correct in this case, so let's fix the bug here to make it clear why the output is changing and add a regression test. Signed-off-by: Phillip Wood --- diff.c | 4 +-- t/t4015-diff-whitespace.sh | 72 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index 52c791574b7..cb068f8258c 100644 --- a/diff.c +++ b/diff.c @@ -1142,6 +1142,7 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_block *pmb = NULL; /* potentially moved blocks */ int pmb_nr = 0, pmb_alloc = 0; int n, flipped_block = 0, block_length = 0; + enum diff_symbol last_symbol = 0; for (n = 0; n < o->emitted_symbols->nr; n++) { @@ -1149,7 +1150,6 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_entry *key; struct moved_entry *match = NULL; struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n]; - enum diff_symbol last_symbol = 0; switch (l->s) { case DIFF_SYMBOL_PLUS: @@ -1214,7 +1214,7 @@ static void mark_color_as_moved(struct diff_options *o, } if (adjust_last_block(o, n, block_length) && - pmb_nr && last_symbol != l->s) + pmb_nr && last_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index
2c13b62d3c6..920114cd795 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1442,6 +1442,78 @@ test_expect_success 'detect permutations inside moved code -- dimmed-zebra' ' test_cmp expected actual ' +test_expect_success 'zebra alternate color is only used when necessary' ' + cat >old.txt <<-\EOF && + line 1A should be marked as oldMoved newMovedAlternate + line 1B should be marked as oldMoved newMovedAlternate + unchanged + line 2A should be marked as oldMoved newMovedAlternate + line 2B should be marked as oldMoved newMovedAlternate + line 3A should be marked as oldMovedAlternate newMoved + line 3B should be marked as oldMovedAlternate newMoved + unchanged + line 4A should be marked as oldMoved newMovedAlternate + line 4B should be marked as oldMoved newMovedAlternate + line 5A should be marked as oldMovedAlternate newMoved + line 5B should be marked as oldMovedAlternate newMoved + line 6A should be marked as oldMoved newMoved + line 6B should be marked as oldMoved newMoved + EOF + cat >new.txt <<-\EOF && + line 1A should be marked as oldMoved newMovedAlternate + line 1B should be marked as oldMoved newMovedAlternate + unchanged + line 3A should be marked as oldMovedAlternate newMoved + line 3B should be marked as oldMovedAlternate newMoved + line 2A should be marked as oldMoved newMovedAlternate + line 2B should be marked as oldMoved newMovedAlternate + unchanged + line 6A should be marked as oldMoved newMoved + line 6B should be marked as oldMoved newMoved + line 4A should be marked as oldMoved newMovedAlternate + line 4B should be marked as oldMoved newMovedAlternate + line 5A should be marked as oldMovedAlternate newMoved + line 5B should be marked as oldMovedAlternate newMoved + EOF + test_expect_code 1 git diff --no-index --color --color-moved=zebra \ + --color-moved-ws=allow-indentation-change \ + old.txt new.txt >output && + grep -v index output | test_decode_color >actual && + cat >expected <<-\EOF && + diff --git a/old.txt 
b/new.txt + --- a/old.txt + +++ b/new.txt + @@ -1,14 +1,14 @@ + -line 1A should be marked as oldMoved newMovedAlternate + -line 1B should be marked as oldMoved newMovedAlternate + + line 1A should be marked as oldMoved newMovedAlternate + + line 1B should be marked as oldMoved newMovedAlternate + unchanged + -line 2A should be marked as oldMoved newMovedAlternate + -line 2B should be marked as oldMoved newMovedAlternate + -line 3A should be marked as oldMovedAlternate newMoved + -line 3B should be marked as oldMovedAlternate newMoved + + line 3A should be marked as oldMovedAlternate newMoved + + line 3B should be marked as oldMovedAlternate newMoved + + line 2A should be marked as oldMoved newMovedAlternate + + line 2B should be marked as oldMoved newMovedAlternate + unchanged + -line 4A should be marked as oldMoved newMovedAlternate + -line 4B should be marked as oldMoved newMovedAlternate + -line 5A should be marked as oldMovedAlternate newMoved + -line 5B should be marked as oldMovedAlternate newMoved + -line 6A should be marked as oldMoved newMoved + -line 6B should be marked as oldMoved newMoved + + line 6A should be marked as oldMoved newMoved + + line 6B should be marked as oldMoved newMoved + + line 4A should be marked as oldMoved newMovedAlternate + + line 4B should be marked as oldMoved newMovedAlternate + + line 5A should be marked as oldMovedAlternate newMoved + + line 5B should be marked as oldMovedAlternate newMoved + EOF + test_cmp expected actual +' + test_expect_success 'cmd option assumes configured colored-moved' ' test_config color.diff.oldMoved "magenta" && test_config color.diff.newMoved "cyan" && From patchwork Tue Jul 20 10:36:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12388031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, 
score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37E29C07E9B for ; Tue, 20 Jul 2021 10:39:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 21FE16108B for ; Tue, 20 Jul 2021 10:39:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235277AbhGTJ6q (ORCPT ); Tue, 20 Jul 2021 05:58:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237034AbhGTJ4E (ORCPT ); Tue, 20 Jul 2021 05:56:04 -0400 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BF27C0613E6 for ; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) Received: by mail-wm1-x333.google.com with SMTP id y21-20020a7bc1950000b02902161fccabf1so1199308wmi.2 for ; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=OIRnvnQJGMfH9gMZnEIvJzcVGWMzhMT6d8yFwXGSxog=; b=WGhM9b1ZpWKH0pSq0ZL0vvK6Qh7j+s4CH3InY97qZ2uXnZHIOP1EyvvpZ90nyYBDdX WXt4ObnLEBAWSdqvtQsAG/MM8lglefOFUhVE41gxQ7gOQWpiec12soVzDh6wLShFZIRJ wVQG7DHoo5G7DB4C5LiaIL0EnDMcQgWlXXcfiED6UosOSQM5qkNNA1jq6TjMwDPc5QQ/ LNiYSETRumVbS08yWdtNOM2vzjRHCPzNU5sG07XKl20u4H1cueNgKulC0rq6UgQD4FA8 jjQ9fIf91l0zTK20X3moBOvilBNZa3jjspFgiakvjpcWLv8IyjKx3LyzIaAz4yTlO+7D 9CcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=OIRnvnQJGMfH9gMZnEIvJzcVGWMzhMT6d8yFwXGSxog=; b=GxBQL55H/BnTABs/+CKmBHJUoEgKwdd4+t3dzlZ39E1vPR6Q0A6pkfseStqZV4k5KM d6kyUY4awyetV1M3Zod23u5hIApPHIT2ObDxy40zsniZn5kiMZOvsVYSdwXQv+gfuRv4 oo8Lf/GEI7LjfAy/a56wyrasOXUim5KegeVDFAmfTZixrwJt+uinRQKV05EiRTUqTi8q dkhrVCVx2fZL8sjbNE2wAW/eAnEZzUmOGJ4Tuu+z4+blTd7oSXbfLuUNMACNXn5wLzN0 nPgeTbJ46SJoE+xPPYlE01PGcxqkmsa+EHmyFGvvk3FP6BsqwnMHBv5bT6Uu9XtC5Mn8 NXkA== X-Gm-Message-State: AOAM530Shy2lD8Vvl7mRP0BMGW0Xwqjbhq2sxG3X3+qCpnBIM5RPyVuu wHqBjUbjVdhSL624rBQbLwNw7joe9B0= X-Google-Smtp-Source: ABdhPJxcwgJa7u8k97XmAvYfDdM3uXQhabr20cJ1x6ZX2KVPHOwfECta6lrLGJQDzBrZGSedorg+fw== X-Received: by 2002:a7b:ca43:: with SMTP id m3mr30760487wml.74.1626777397216; Tue, 20 Jul 2021 03:36:37 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o14sm23517755wrj.66.2021.07.20.03.36.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 03:36:36 -0700 (PDT) Message-Id: <5512145c70ff47e9ab239f4f896498edf4c16745.1626777393.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 10:36:24 +0000 Subject: [PATCH v2 03/12] diff --color-moved: avoid false short line matches and bad zerba coloring Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood When marking moved lines it is possible for a block of potential matched lines to extend past a change in sign when there is a sequence of added lines whose text matches the text of a sequence of deleted and added lines. 
Most of the time either `match` will be NULL or `pmb_advance_or_null()` will fail when the loop encounters a change of sign but there are corner cases where `match` is non-NULL and `pmb_advance_or_null()` successfully advances the moved block despite the change in sign. One consequence of this is highlighting a short line as moved when it should not be. For example -moved line # Correctly highlighted as moved +short line # Wrongly highlighted as moved context +moved line # Correctly highlighted as moved +short line context -short line The other consequence is coloring a moved addition following a moved deletion in the wrong color. In the example below the first "+moved line 3" should be highlighted as newMoved not newMovedAlternate. -moved line 1 # Correctly highlighted as oldMoved -moved line 2 # Correctly highlighted as oldMovedAlternate +moved line 3 # Wrongly highlighted as newMovedAlternate context # Everything else is highlighted correctly +moved line 2 +moved line 3 context +moved line 1 -moved line 3 These false matches are more likely when using --color-moved-ws with the exception of --color-moved-ws=allow-indentation-change which ties the sign of the current whitespace delta to the sign of the line to avoid this problem. The fix is to check that the sign of the new line being matched is the same as the sign of the line that started the block of potential matches. 
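To make the rule in the last paragraph concrete, here is a minimal, self-contained C sketch. It is not the code from diff.c: `enum sign`, `struct line`, `has_match` and `label_blocks()` are hypothetical names invented for this illustration, and the bookkeeping of potentially moved blocks is reduced to a single flag per line. It only shows the idea that a block is closed as soon as the sign of the current line differs from the sign of the line that started the block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the types in diff.c. */
enum sign { SIGN_NONE = 0, SIGN_PLUS = '+', SIGN_MINUS = '-' };

struct line {
    enum sign sign; /* SIGN_PLUS or SIGN_MINUS for changed lines, SIGN_NONE for context */
    int has_match;  /* non-zero if the same text also occurs with the opposite sign */
};

/*
 * Write a block id for every line into block_id[] (0 means "not in a
 * moved block") and return the number of blocks found.
 *
 * A block ends when a line has no match and also when the sign of the
 * line differs from the sign of the line that started the block.
 */
int label_blocks(const struct line *lines, size_t nr, int *block_id)
{
    enum sign block_sign = SIGN_NONE; /* sign of the line that opened the block */
    int current = 0;                  /* id of the open block, 0 if none is open */
    int next_id = 0;

    assert(lines && block_id);

    for (size_t i = 0; i < nr; i++) {
        const struct line *l = &lines[i];

        /* Close the open block on a non-match or on a change of sign. */
        if (current && (!l->has_match || l->sign != block_sign)) {
            current = 0;
            block_sign = SIGN_NONE;
        }

        /* Context lines and unmatched lines never belong to a block. */
        if (!l->has_match || l->sign == SIGN_NONE) {
            block_id[i] = 0;
            continue;
        }

        /* Open a new block and remember the sign that started it. */
        if (!current) {
            current = ++next_id;
            block_sign = l->sign;
        }
        block_id[i] = current;
    }
    return next_id;
}
```

For two matched deletions followed by a matched addition, a context line and another matched addition, this sketch puts the two deletions in one block and starts a new block at the first addition, rather than letting the block run across the change of sign.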
Signed-off-by: Phillip Wood --- diff.c | 17 ++++++---- t/t4015-diff-whitespace.sh | 65 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 76 insertions(+), 6 deletions(-) diff --git a/diff.c b/diff.c index cb068f8258c..2b51b77fd20 100644 --- a/diff.c +++ b/diff.c @@ -1142,7 +1142,7 @@ static void mark_color_as_moved(struct diff_options *o, struct moved_block *pmb = NULL; /* potentially moved blocks */ int pmb_nr = 0, pmb_alloc = 0; int n, flipped_block = 0, block_length = 0; - enum diff_symbol last_symbol = 0; + enum diff_symbol moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; for (n = 0; n < o->emitted_symbols->nr; n++) { @@ -1168,7 +1168,7 @@ static void mark_color_as_moved(struct diff_options *o, flipped_block = 0; } - if (!match) { + if (pmb_nr && (!match || l->s != moved_symbol)) { int i; adjust_last_block(o, n, block_length); @@ -1177,12 +1177,13 @@ static void mark_color_as_moved(struct diff_options *o, pmb_nr = 0; block_length = 0; flipped_block = 0; - last_symbol = l->s; + } + if (!match) { + moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; continue; } if (o->color_moved == COLOR_MOVED_PLAIN) { - last_symbol = l->s; l->flags |= DIFF_SYMBOL_MOVED_LINE; continue; } @@ -1214,11 +1215,16 @@ static void mark_color_as_moved(struct diff_options *o, } if (adjust_last_block(o, n, block_length) && - pmb_nr && last_symbol == l->s) + pmb_nr && moved_symbol == l->s) flipped_block = (flipped_block + 1) % 2; else flipped_block = 0; + if (pmb_nr) + moved_symbol = l->s; + else + moved_symbol = DIFF_SYMBOL_BINARY_DIFF_HEADER; + block_length = 0; } @@ -1228,7 +1234,6 @@ static void mark_color_as_moved(struct diff_options *o, if (flipped_block && o->color_moved != COLOR_MOVED_BLOCKS) l->flags |= DIFF_SYMBOL_MOVED_LINE_ALT; } - last_symbol = l->s; } adjust_last_block(o, n, block_length); diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh index 920114cd795..3119a59f071 100755 --- a/t/t4015-diff-whitespace.sh +++ b/t/t4015-diff-whitespace.sh @@ -1514,6 
+1514,71 @@ test_expect_success 'zebra alternate color is only used when necessary' ' test_cmp expected actual ' +test_expect_success 'short lines of opposite sign do not get marked as moved' ' + cat >old.txt <<-\EOF && + this line should be marked as moved + unchanged + unchanged + unchanged + unchanged + too short + this line should be marked as oldMoved newMoved + this line should be marked as oldMovedAlternate newMoved + unchanged 1 + unchanged 2 + unchanged 3 + unchanged 4 + this line should be marked as oldMoved newMoved/newMovedAlternate + EOF + cat >new.txt <<-\EOF && + too short + unchanged + unchanged + this line should be marked as moved + too short + unchanged + unchanged + this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 1 + unchanged 2 + this line should be marked as oldMovedAlternate newMoved + this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 3 + this line should be marked as oldMoved newMoved + unchanged 4 + EOF + test_expect_code 1 git diff --no-index --color --color-moved=zebra \ + old.txt new.txt >output && cat output && + grep -v index output | test_decode_color >actual && + cat >expect <<-\EOF && + diff --git a/old.txt b/new.txt + --- a/old.txt + +++ b/new.txt + @@ -1,13 +1,15 @@ + -this line should be marked as moved + +too short + unchanged + unchanged + +this line should be marked as moved + +too short + unchanged + unchanged + -too short + -this line should be marked as oldMoved newMoved + -this line should be marked as oldMovedAlternate newMoved + +this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 1 + unchanged 2 + +this line should be marked as oldMovedAlternate newMoved + +this line should be marked as oldMoved newMoved/newMovedAlternate + unchanged 3 + +this line should be marked as oldMoved newMoved + unchanged 4 + -this line should be marked as oldMoved newMoved/newMovedAlternate + EOF + test_cmp expect actual +' + test_expect_success 'cmd option 
assumes configured colored-moved' ' test_config color.diff.oldMoved "magenta" && test_config color.diff.newMoved "cyan" && From patchwork Tue Jul 20 10:36:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12388029 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DDD5C07E95 for ; Tue, 20 Jul 2021 10:39:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6741861186 for ; Tue, 20 Jul 2021 10:39:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236812AbhGTJ6n (ORCPT ); Tue, 20 Jul 2021 05:58:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46170 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237008AbhGTJ4D (ORCPT ); Tue, 20 Jul 2021 05:56:03 -0400 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08D93C0613E7 for ; Tue, 20 Jul 2021 03:36:39 -0700 (PDT) Received: by mail-wm1-x332.google.com with SMTP id o30-20020a05600c511eb029022e0571d1a0so1729747wms.5 for ; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=F2bSDOWO8mgf1qcFiqewDbhA1sEl4sLhPseETaely2s=; b=BC6D1dHk5vIwiLY5RgVWrllZNqKDLJOVjG3XbNv4qnBhtfwpyGptvZoEOcaGt/85QJ 10QxONvV9L/PEeVRnODFrAyYt2DTvaJKN3/P3RjqqJlzgVEjJzphl6LdRsvby+jfVjSM hi+J+8V2T6JonbckSI8U0mxVKXI8ebqoNsOd7+cTwGjWFtgpqiz4QovGOCij50Z3+gWS ccakPuxAR6A4rWm0/zaMdefIjyXgKaZkbJnQUPa6qq9e50xFfpQNtImhz7SoGSRm982q bYjMGleaz4Quv5uqellp/jNB7ydTsS+ZrLB961EiL2VeZFlKo5JxuoZ9eaZaaTEVWeRs Ak1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=F2bSDOWO8mgf1qcFiqewDbhA1sEl4sLhPseETaely2s=; b=EOm8to33tuzh7l87QW14+qpqwuzj9S7JIqgj5n9rxeX1yiuadjE/WXsZz3c46YcQBZ FkZ16yMdllbgRbyH7GPXEBBWzADf55ncT0FACqHL7R6QinHDNr73M+bF2ddV2Gzzck48 BPQM8DYsk1MCHfx+1ZT5Eei1mOu9s1kF00uws0up2huEX9U9usL4z+mhJaLNncOHPXu+ xgqjiRIgEv8/HIOUzB2YqjcP7DqdlDngEtPPNLNxUtg1NZQkDX8FNdr+9hJJadZWa9wX M0OVOMjltmSA58VTdy7P/Cl0E9VDiSLq0SbTjr+m+iqYPSOE7BWAW+h+q7hE4j7vjBoy HwtQ== X-Gm-Message-State: AOAM532WZenvlQ1qBc9ON4gmXXWGUU+azWFIZOefHozUEVc9/fHL3tGJ Ls/BKI3POQuhXLsXRmM8P35LAz0egGw= X-Google-Smtp-Source: ABdhPJzaQlmiFoP99WeMeulnY4vSAHDLEI6v+C2Mu/YzLEKBQdBGpooGym9ngSnb+RSZFdNE5Sxmow== X-Received: by 2002:a1c:f70b:: with SMTP id v11mr30994845wmh.186.1626777397708; Tue, 20 Jul 2021 03:36:37 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 11sm2060115wmo.10.2021.07.20.03.36.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 03:36:37 -0700 (PDT) Message-Id: <93fdef30d64dd86493733c28f67ee22bcf5e6f58.1626777394.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 10:36:25 +0000 Subject: [PATCH v2 04/12] diff: simplify allow-indentation-change delta calculation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood Precedence: bulk List-ID: 
X-Mailing-List: git@vger.kernel.org From: Phillip Wood From: Phillip Wood Now that we reliably end a block when the sign changes we don't need the whitespace delta calculation to rely on the sign. Signed-off-by: Phillip Wood --- diff.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/diff.c b/diff.c index 2b51b77fd20..77c893b266a 100644 --- a/diff.c +++ b/diff.c @@ -864,23 +864,17 @@ static int compute_ws_delta(const struct emitted_diff_symbol *a, a_width = a->indent_width, b_off = b->indent_off, b_width = b->indent_width; - int delta; if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) { *out = INDENT_BLANKLINE; return 1; } - if (a->s == DIFF_SYMBOL_PLUS) - delta = a_width - b_width; - else - delta = b_width - a_width; - if (a_len - a_off != b_len - b_off || memcmp(a->line + a_off, b->line + b_off, a_len - a_off)) return 0; - *out = delta; + *out = a_width - b_width; return 1; } @@ -924,10 +918,7 @@ static int cmp_in_block_with_wsd(const struct diff_options *o, * match those of the current block and that the text of 'l' and 'cur' * after the indentation match. 
*/ - if (cur->es->s == DIFF_SYMBOL_PLUS) - delta = a_width - c_width; - else - delta = c_width - a_width; + delta = c_width - a_width; /* * If the previous lines of this block were all blank then set its From patchwork Tue Jul 20 10:36:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Phillip Wood X-Patchwork-Id: 12388027 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D66FC636C9 for ; Tue, 20 Jul 2021 10:39:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 45AB661003 for ; Tue, 20 Jul 2021 10:39:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236206AbhGTJ6Z (ORCPT ); Tue, 20 Jul 2021 05:58:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236994AbhGTJ4D (ORCPT ); Tue, 20 Jul 2021 05:56:03 -0400 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B491FC0613E8 for ; Tue, 20 Jul 2021 03:36:39 -0700 (PDT) Received: by mail-wr1-x431.google.com with SMTP id l7so25438663wrv.7 for ; Tue, 20 Jul 2021 03:36:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=vHfyFYpM90A53Br3+lLul75GLpxjAtbZbo0mXWfuLsQ=; b=K9nDy1DoSFxPa+o41wXOHygPnHm4g/YOudc74S1EKM0d/YX3aME8Viwy9slHteTuDs HOqOsjIm+OUv49J66FKD3srAGN79eEZ3PbZWkwe+kom8Vpfq+y9TGg5C7I9OqiwrazFM YbSZSv7s+1BsAyVY9wh4yujCy2BU3Rke4ra1UwFdW6qi2lDaJTA6Dm3NmI+56lUp2NLD fcHLQ06Dealq8FR98Q1DL5EJbKbm8G6aZvl6mPhgmX6rG5+kgFZy6x559VHQGpvmlBWs o3YwENBDugRu6rBqBGAJUmiky/GytlRhXutxhV4GOkeHWgVRCpKhhfY8SFoS4OBhw53Y FK3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=vHfyFYpM90A53Br3+lLul75GLpxjAtbZbo0mXWfuLsQ=; b=sPbt3aj9gRgkg9km1I19PMP1TBbl4QgKRBgVfxmCS//Zfrz2BXvpW1s2s/QYKzNFHI kIBrWbG1pk/eJc3aytQDfUQvEof95bZ7f0WK0OBW4Yt1z2ABMLfTpOHMvEJG3xVjdWWL N+6eVVS4zjwqq0D+mnNE+kch+zx8yI3h3CBi2T5Tqn4kOy/8clUKV8NA/JNPmTDjJJln ypjW74bZk5bcZiU6SMaeDArWXtP2xV9nivQ/taEHvNV+wXEsfGlfoEYGFvEsbKNSQVQo DgpxCPQ7F0+8qHTiA4jtM8hOlJO/9jGbqpfd4Y6IclYLwe/8cDqD0cba45SgVl/0c3N0 la2Q== X-Gm-Message-State: AOAM530el2mWcoIxOTd+yVuelrLdNnA4YucuBbGYAIU2ZcQROX/0EsOL Wztk4PuxL4lri4FWvmZCdWrP0a1g5DI= X-Google-Smtp-Source: ABdhPJw6/7+1aTVUzp21KibgYUxr7Z2JUFuhjaRhj/4OdSJrcKAfpZnBpE5pk91Bnh/J2uxSbXnq0w== X-Received: by 2002:adf:e0c4:: with SMTP id m4mr35011996wri.312.1626777398234; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e8sm1348729wrc.6.2021.07.20.03.36.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jul 2021 03:36:37 -0700 (PDT) Message-Id: <6b7a8aed4ec005b12e041846ba6c4251a2490554.1626777394.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Tue, 20 Jul 2021 10:36:26 +0000 Subject: [PATCH v2 05/12] diff --color-moved-ws=allow-indentation-change: simplify and optimize Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip 
Wood , Phillip Wood
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Phillip Wood

From: Phillip Wood

If we already have a block of potentially moved lines then as we move
down the diff we need to check if the next line of each potentially
moved line matches the current line of the diff. The implementation of
--color-moved-ws=allow-indentation-change was needlessly performing
this check on all the lines in the diff that matched the current line
rather than just the current line. To exacerbate the problem finding
all the other lines in the diff that match the current line involves a
fuzzy lookup so we were wasting even more time performing a second
comparison to filter out the non-matching lines. Fixing this reduces
time to run
  git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0
by 93% compared to master and simplifies the code.

Test                                                                 HEAD^              HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.41( 0.38+0.03)  0.41(0.37+0.04)  +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change           0.83( 0.79+0.04)  0.82(0.79+0.02)  -1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  13.68(13.59+0.07)  0.92(0.89+0.03) -93.3%
4002.4: log --no-color-moved --no-color-moved-ws                      1.31( 1.22+0.08)  1.31(1.21+0.10)  +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.47( 1.40+0.07)  1.47(1.36+0.10)  +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                 1.87( 1.77+0.09)  1.50(1.41+0.09) -19.8%

Signed-off-by: Phillip Wood
---
 diff.c | 65 ++++++++++++++++------------------------------------------
 1 file changed, 18 insertions(+), 47 deletions(-)

diff --git a/diff.c b/diff.c
index 77c893b266a..55384449170 100644
--- a/diff.c
+++ b/diff.c
@@ -881,35 +881,20 @@ static int compute_ws_delta(const struct emitted_diff_symbol *a,

 static int cmp_in_block_with_wsd(const struct diff_options *o,
 				 const struct moved_entry *cur,
-				 const struct moved_entry *match,
-				 struct moved_block *pmb,
-				 int n)
+				 const struct emitted_diff_symbol *l,
+				 struct moved_block *pmb)
 {
-	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
-	int al = cur->es->len, bl = match->es->len, cl = l->len;
+	int al = cur->es->len, bl = l->len;
 	const char *a = cur->es->line,
-		   *b = match->es->line,
-		   *c = l->line;
+		   *b = l->line;
 	int a_off = cur->es->indent_off,
 	    a_width = cur->es->indent_width,
-	    c_off = l->indent_off,
-	    c_width = l->indent_width;
+	    b_off = l->indent_off,
+	    b_width = l->indent_width;
 	int delta;

-	/*
-	 * We need to check if 'cur' is equal to 'match'. As those
-	 * are from the same (+/-) side, we do not need to adjust for
-	 * indent changes. However these were found using fuzzy
-	 * matching so we do have to check if they are equal. Here we
-	 * just check the lengths. We delay calling memcmp() to check
-	 * the contents until later as if the length comparison for a
-	 * and c fails we can avoid the call all together.
-	 */
-	if (al != bl)
-		return 1;
-
 	/* If 'l' and 'cur' are both blank then they match. */
-	if (a_width == INDENT_BLANKLINE && c_width == INDENT_BLANKLINE)
+	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE)
 		return 0;

 	/*
@@ -918,7 +903,7 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 	 * match those of the current block and that the text of 'l' and 'cur'
 	 * after the indentation match.
 	 */
-	delta = c_width - a_width;
+	delta = b_width - a_width;

 	/*
 	 * If the previous lines of this block were all blank then set its
@@ -927,9 +912,8 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 	if (pmb->wsd == INDENT_BLANKLINE)
 		pmb->wsd = delta;

-	return !(delta == pmb->wsd && al - a_off == cl - c_off &&
-		 !memcmp(a, b, al) && !
-		 memcmp(a + a_off, c + c_off, al - a_off));
+	return !(delta == pmb->wsd && al - a_off == bl - b_off &&
+		 !memcmp(a + a_off, b + b_off, al - a_off));
 }

 static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
@@ -1030,36 +1014,23 @@ static void pmb_advance_or_null(struct diff_options *o,
 }

 static void pmb_advance_or_null_multi_match(struct diff_options *o,
-					    struct moved_entry *match,
-					    struct hashmap *hm,
+					    struct emitted_diff_symbol *l,
 					    struct moved_block *pmb,
-					    int pmb_nr, int n)
+					    int pmb_nr)
 {
 	int i;
-	char *got_match = xcalloc(1, pmb_nr);
-
-	hashmap_for_each_entry_from(hm, match, ent) {
-		for (i = 0; i < pmb_nr; i++) {
-			struct moved_entry *prev = pmb[i].match;
-			struct moved_entry *cur = (prev && prev->next_line) ?
-					prev->next_line : NULL;
-			if (!cur)
-				continue;
-			if (!cmp_in_block_with_wsd(o, cur, match, &pmb[i], n))
-				got_match[i] |= 1;
-		}
-	}

 	for (i = 0; i < pmb_nr; i++) {
-		if (got_match[i]) {
+		struct moved_entry *prev = pmb[i].match;
+		struct moved_entry *cur = (prev && prev->next_line) ?
+			prev->next_line : NULL;
+		if (cur && !cmp_in_block_with_wsd(o, cur, l, &pmb[i])) {
 			/* Advance to the next line */
-			pmb[i].match = pmb[i].match->next_line;
+			pmb[i].match = cur;
 		} else {
 			moved_block_clear(&pmb[i]);
 		}
 	}
-
-	free(got_match);
 }

 static int shrink_potential_moved_blocks(struct moved_block *pmb,
@@ -1181,7 +1152,7 @@ static void mark_color_as_moved(struct diff_options *o,

 		if (o->color_moved_ws_handling &
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-			pmb_advance_or_null_multi_match(o, match, hm, pmb, pmb_nr, n);
+			pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr);
 		else
 			pmb_advance_or_null(o, match, hm, pmb, pmb_nr);


From patchwork Tue Jul 20 10:36:27 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388039
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 55FF8C636C8
	for ; Tue, 20 Jul 2021 10:40:34 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 3C09961175
	for ; Tue, 20 Jul 2021 10:40:34 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S237203AbhGTJ7N (ORCPT ); Tue, 20 Jul 2021 05:59:13 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46052 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S236262AbhGTJ4K (ORCPT ); Tue, 20 Jul 2021 05:56:10 -0400
Received: from mail-wr1-x432.google.com
(mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3AAB2C0613E9 for ; Tue, 20 Jul 2021 03:36:40 -0700 (PDT) Received: by mail-wr1-x432.google.com with SMTP id t5so25418905wrw.12 for ; Tue, 20 Jul 2021 03:36:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=3SmJDJvsOG2gFGwfBSH1N2QCy/wGVsNdWNxop02Q4HM=; b=Iheq0QSSZFs+Jl90+wQYhZu5rvlzZOwE/cGMc7AhdRnQqtQzMGkrIl9kMawO2fWjCf Nyavh4h32qW2fP2yEM8laU1k+17un3jpdTMxaHyd+Hs1QIsPIaSd/Rm9jeNtG+xPLlzt 6gk+WTDTHo2ob3gx60WcxOsYNUKaQAlL3dIuoMBvPlDPh+Q6+G5rNChk0ZvbKAvPq+XB CjWLkxKVuWXotiiyxLTBlr+hr4b8mCoVcbRdmGykk7SvbYmRDIX7CzlFPUu7bJqO+WS8 H3TKHfempahk3RrXmtnx4OeOgSuLERVUOQnH8mtZo3eGj4pmE+tiWYI1ToqE0NQLl5qy yXOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=3SmJDJvsOG2gFGwfBSH1N2QCy/wGVsNdWNxop02Q4HM=; b=hb8ERQ1rQJayyFWFiZHJtl4dJnwe54fV+k11R+QyRyvENo5zbXhXT+5ABVyWjV1JCq I4V6QrlbEyNj1kqXcVMcVkSudG5ZMLXd+CU9gBnwVvUSkF7ho8342RXab9+mJGmi7QTQ 4NXIqTG1qAm0W0R2nq89WoabZtmLl7ajyvSytJXl87hSJhsScCizDVtCYzbeKXlWTavc hWE/4RcbQOkX54rv/UX1DBi2socQhhZclEUd6nNLkwyUo8kTQHBH46HgkRPX7467qEAg CzsEEt/cwfKrpNXRTqkexDwyKvqVSLu9ju0LBrAyTiSKMnQRNC6FU5uv8spP3TkKjT6Z dvVw== X-Gm-Message-State: AOAM533iFlT5bmrsFaqmUNlKmzscgIUhRRFqGsrf60DCXmoMgsBdRYzO mxRyHDWfn4cUPupWqzmZQGmPdX3CogE= X-Google-Smtp-Source: ABdhPJwFH6kZT/LgduIDwrGtwY9wOZRIM+AstFs+fiA1jqiJx2eDsuep6NDVRhHM5Ev7MdoDc1ozLA== X-Received: by 2002:adf:fc85:: with SMTP id g5mr35349855wrr.296.1626777398874; Tue, 20 Jul 2021 03:36:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g15sm19126278wmh.44.2021.07.20.03.36.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 
Jul 2021 03:36:38 -0700 (PDT)
Message-Id:
In-Reply-To:
References:
Date: Tue, 20 Jul 2021 10:36:27 +0000
Subject: [PATCH v2 06/12] diff --color-moved: call comparison function directly
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Phillip Wood

From: Phillip Wood

This change will allow us to easily combine pmb_advance_or_null() and
pmb_advance_or_null_multi_match() in the next commit. Calling
xdiff_compare_lines() directly rather than using a function pointer
from the hash map has little effect on the run time.

Test                                                                 HEAD^            HEAD
-----------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.41(0.37+0.04)  0.41(0.39+0.02) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.82(0.79+0.02)  0.83(0.79+0.03) +1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change  0.92(0.89+0.03)  0.91(0.85+0.05) -1.1%
4002.4: log --no-color-moved --no-color-moved-ws                     1.31(1.21+0.10)  1.33(1.22+0.10) +1.5%
4002.5: log --color-moved --no-color-moved-ws                        1.47(1.36+0.10)  1.47(1.39+0.08) +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                1.50(1.41+0.09)  1.51(1.42+0.09) +0.7%

Signed-off-by: Phillip Wood
---
 diff.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index 55384449170..c056d917d0d 100644
--- a/diff.c
+++ b/diff.c
@@ -995,17 +995,20 @@ static void add_lines_to_move_detection(struct diff_options *o,
 }

 static void pmb_advance_or_null(struct diff_options *o,
-				struct moved_entry *match,
-				struct hashmap *hm,
+				struct emitted_diff_symbol *l,
 				struct moved_block *pmb,
 				int pmb_nr)
 {
 	int i;
+	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
+
 	for (i = 0; i < pmb_nr; i++) {
 		struct moved_entry *prev = pmb[i].match;
 		struct moved_entry *cur = (prev && prev->next_line) ?
 				prev->next_line : NULL;
-		if (cur && !hm->cmpfn(o, &cur->ent, &match->ent, NULL)) {
+		if (cur && xdiff_compare_lines(cur->es->line, cur->es->len,
+						l->line, l->len,
+						flags)) {
 			pmb[i].match = cur;
 		} else {
 			pmb[i].match = NULL;
@@ -1154,7 +1157,7 @@ static void mark_color_as_moved(struct diff_options *o,
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
 			pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr);
 		else
-			pmb_advance_or_null(o, match, hm, pmb, pmb_nr);
+			pmb_advance_or_null(o, l, pmb, pmb_nr);

 		pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);


From patchwork Tue Jul 20 10:36:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388037
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 4E8DEC07E9B
	for ; Tue, 20 Jul 2021 10:40:34 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 2BE3F6113C
	for ; Tue, 20 Jul 2021 10:40:34 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235683AbhGTJ7K (ORCPT ); Tue, 20 Jul 2021 05:59:10 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46200 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S236425AbhGTJ4K (ORCPT ); Tue, 20 Jul 2021 05:56:10 -0400
Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com
[IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4C77C0613EE for ; Tue, 20 Jul 2021 03:36:40 -0700 (PDT) Received: by mail-wm1-x32a.google.com with SMTP id m11-20020a05600c3b0bb0290228f19cb433so1224708wms.0 for ; Tue, 20 Jul 2021 03:36:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=80GoSsR6PdqwqviMcSNCekMfq0aB+E1fQQkO1RL2oMo=; b=Q/vKNCsF7xliISJe8s7T2P0T17NEmFIP85LncqdKGVglK9rIgBk2G1aH3VrlI9xjMB yhv8dWZdzCdPcRguBZLk/0Fs0Ged38P1cVFjkPbvTkgBkz96yLRRM3Sg2Slr4QPv3Knw YTJN+S3fwVzOlwfEFReLjBIChQOi0YXqAz+K9NQwEIJisIlmec26SBkIZwcdrZivIPY7 PApLB8pXw8CE0aMD7T0AOl7T265aDLrJcD5fM+8GzIIdakZY6wjuV4G0IRYQ0+IR8usG lMjITHXe9jSDJRQJ63K/vqBkOEjY3wIUJOLztVL6LchbUb+F28RYCd2vmUUZtYtKj5h1 ozFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=80GoSsR6PdqwqviMcSNCekMfq0aB+E1fQQkO1RL2oMo=; b=TG8nPkC9pW7u291M8c1Lie577d9ln9G7MOsaV2u2l56sziZz91rKPUwI+KqrudLuRx Yoe4y+O/USTBKa+uS4L6NdrRlazSnh2muBJKgMIbHH3N6WGKly8st4UZSZLIMzsDLeeY ykw7LVEbhZMBdVf46xW/hSjXO+dfRDVOzI8rpS4sWOv+564uCrs4yPvAmKDgmQRBz7kF f2zpj1Vm9i5gCDE7gJpnkeFJzAb0/PfDZcUZfe50YW8X0R1Sg4pKbj9dctZpcmiMOYKt CaCMaaSu+MeHsDxRsfmJ/hju5XoKWe3VF/JUOvZp5P/6Qg5Fagv4S9WHAiLbNa0yB8My /hlQ== X-Gm-Message-State: AOAM53298GV/04fJnywHQqR/bxc9EUxKBBtUZQrPngev7o2bKsTHhnH+ 9rGOFFDCPk7fdguB6MVJOb2BdwZZZd8= X-Google-Smtp-Source: ABdhPJzPgVAOd0y1xMMXJPO1oAPn+4frYFfJxNV9iRqkgHFVS+LuyfFT/U13wsquiscsumhGEIZBnA== X-Received: by 2002:a05:600c:350b:: with SMTP id h11mr12359334wmq.20.1626777399490; Tue, 20 Jul 2021 03:36:39 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h15sm22397624wrq.88.2021.07.20.03.36.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); 
Tue, 20 Jul 2021 03:36:39 -0700 (PDT)
Message-Id: <73ce9b54e869740ff23253d4153405444c673276.1626777394.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Tue, 20 Jul 2021 10:36:28 +0000
Subject: [PATCH v2 07/12] diff --color-moved: unify moved block growth functions
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Phillip Wood

From: Phillip Wood

After the last two commits pmb_advance_or_null() and
pmb_advance_or_null_multi_match() differ only in the comparison they
perform. Let's simplify the code by combining them into a single
function.

Signed-off-by: Phillip Wood
---
 diff.c | 41 ++++++++++++-----------------------------
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/diff.c b/diff.c
index c056d917d0d..b03f79b626c 100644
--- a/diff.c
+++ b/diff.c
@@ -1003,36 +1003,23 @@ static void pmb_advance_or_null(struct diff_options *o,
 	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;

 	for (i = 0; i < pmb_nr; i++) {
+		int match;
 		struct moved_entry *prev = pmb[i].match;
 		struct moved_entry *cur = (prev && prev->next_line) ?
 				prev->next_line : NULL;
-		if (cur && xdiff_compare_lines(cur->es->line, cur->es->len,
-						l->line, l->len,
-						flags)) {
-			pmb[i].match = cur;
-		} else {
-			pmb[i].match = NULL;
-		}
-	}
-}

-static void pmb_advance_or_null_multi_match(struct diff_options *o,
-					    struct emitted_diff_symbol *l,
-					    struct moved_block *pmb,
-					    int pmb_nr)
-{
-	int i;
-
-	for (i = 0; i < pmb_nr; i++) {
-		struct moved_entry *prev = pmb[i].match;
-		struct moved_entry *cur = (prev && prev->next_line) ?
-			prev->next_line : NULL;
-		if (cur && !cmp_in_block_with_wsd(o, cur, l, &pmb[i])) {
-			/* Advance to the next line */
+		if (o->color_moved_ws_handling &
+		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
+			match = cur &&
+				!cmp_in_block_with_wsd(o, cur, l, &pmb[i]);
+		else
+			match = cur &&
+				xdiff_compare_lines(cur->es->line, cur->es->len,
+						    l->line, l->len, flags);
+		if (match)
 			pmb[i].match = cur;
-		} else {
+		else
 			moved_block_clear(&pmb[i]);
-		}
 	}
 }

@@ -1153,11 +1140,7 @@ static void mark_color_as_moved(struct diff_options *o,
 			continue;
 		}

-		if (o->color_moved_ws_handling &
-		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-			pmb_advance_or_null_multi_match(o, l, pmb, pmb_nr);
-		else
-			pmb_advance_or_null(o, l, pmb, pmb_nr);
+		pmb_advance_or_null(o, l, pmb, pmb_nr);

 		pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);


From patchwork Tue Jul 20 10:36:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388043
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 315C1C07E95
	for ; Tue, 20 Jul 2021 10:40:52 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 1BF036108B
	for ; Tue, 20 Jul 2021 10:40:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S237119AbhGTKAJ (ORCPT ); Tue, 20 Jul 2021 06:00:09 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46214 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236462AbhGTJ4O (ORCPT ); Tue, 20 Jul 2021 05:56:14 -0400 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 532B7C0613EF for ; Tue, 20 Jul 2021 03:36:41 -0700 (PDT) Received: by mail-wr1-x42e.google.com with SMTP id l7so25438741wrv.7 for ; Tue, 20 Jul 2021 03:36:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=RoQDZcAdoGUDb6H+X9AUIqAxsZArFcVOc4P0K4BHhpg=; b=PFzCzA4ZnkvCy5apeXBJRhSgAQs95G79vovNutdvgDWBXOyoYGAZNIZ5y60bqgli0A 8VNoNcTri5QkwbO2t9lEwFAJxkf6mT0GU7r7qnmCfXYbMEn5mu6hT1LL2iK3+0LyTrsG gxL6E/0mOIm8iArqeJT7mRuT3K3iQXly0bdzpcCPFhDMvcs1M+Ym9Hsb0fI2DYFK+nz0 LJ/hLWvaNVYUAGGvjgefG2CO5Q9Zp8oF6Lwi0gSi1+xxdHaw3VL4qCA9OViMrgrQA/w/ lRpOMY3okFRMTYRjLdAjMSeiY42/UQjtkczw+B6jy8y4VLbT3CDSyDgCqi/vFbp7UHRq nNYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=RoQDZcAdoGUDb6H+X9AUIqAxsZArFcVOc4P0K4BHhpg=; b=OGQZKX00D4P+AGjTF0L6Ho2Rhh2giiIpqHnp19VEQ1jMQDWuHsVUA7FYscutmH17Ee j3WLdBxIGrQUTFEw0IOSWlj0Mn4hlhmZDL5cPa5d5E5HRD9zUiGZJgrqkphXQS/OEDX0 4EnvWBOO2PhdCP6AmsJ0+WkXJz/WaM8ppGfGdWWjeRLf1tMDSVTFEUGZh8GDVGY1uatr rPmRph12dVkvI+INdaMeBNGeVxP28wKBDZbZaiXQm3JJP3FOZ9xeA+UPdkLwCeZf8az+ UGDMlE1d9H1JrWYQSJCHUszsxjRqxPwUtQJPx0SrYxMxOFHKYGc0ds2Sc923CkY1NrEb sbpA== X-Gm-Message-State: AOAM531nhX8g4CBZPW6yMRLw6qwe/RorSt37WczDQlZgAz/PU8Y8A9xA lfAsDBvRqKlpkdLSvb91C682UQ1NbqQ= X-Google-Smtp-Source: ABdhPJxup/QRYTBm/nWs5764/9866HcdzaptuKO/49bT+/MqI2rM/hnbpKIQGscWuqABf9Hhom4vRA== X-Received: by 2002:adf:ec4b:: with SMTP id w11mr35276003wrn.420.1626777400036; Tue, 20 Jul 2021 03:36:40 -0700 (PDT) 
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA
	id r16sm2044885wmg.11.2021.07.20.03.36.39
	(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
	Tue, 20 Jul 2021 03:36:39 -0700 (PDT)
Message-Id:
In-Reply-To:
References:
Date: Tue, 20 Jul 2021 10:36:29 +0000
Subject: [PATCH v2 08/12] diff --color-moved: shrink potential moved blocks as we go
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Phillip Wood

From: Phillip Wood

Rather than setting `match` to NULL and then looping over the list of
potential matched blocks for a second time to remove blocks with no
matches just filter out the blocks with no matches as we go.

Signed-off-by: Phillip Wood
---
 diff.c | 42 ++++++------------------------------------
 1 file changed, 6 insertions(+), 36 deletions(-)

diff --git a/diff.c b/diff.c
index b03f79b626c..068473c0be3 100644
--- a/diff.c
+++ b/diff.c
@@ -997,12 +997,12 @@ static void add_lines_to_move_detection(struct diff_options *o,
 static void pmb_advance_or_null(struct diff_options *o,
 				struct emitted_diff_symbol *l,
 				struct moved_block *pmb,
-				int pmb_nr)
+				int *pmb_nr)
 {
-	int i;
+	int i, j;
 	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;

-	for (i = 0; i < pmb_nr; i++) {
+	for (i = 0, j = 0; i < *pmb_nr; i++) {
 		int match;
 		struct moved_entry *prev = pmb[i].match;
 		struct moved_entry *cur = (prev && prev->next_line) ?
@@ -1017,37 +1017,9 @@ static void pmb_advance_or_null(struct diff_options *o,
 				xdiff_compare_lines(cur->es->line, cur->es->len,
 						    l->line, l->len, flags);
 		if (match)
-			pmb[i].match = cur;
-		else
-			moved_block_clear(&pmb[i]);
+			pmb[j++].match = cur;
 	}
-}
-
-static int shrink_potential_moved_blocks(struct moved_block *pmb,
-					 int pmb_nr)
-{
-	int lp, rp;
-
-	/* Shrink the set of potential block to the remaining running */
-	for (lp = 0, rp = pmb_nr - 1; lp <= rp;) {
-		while (lp < pmb_nr && pmb[lp].match)
-			lp++;
-		/* lp points at the first NULL now */
-
-		while (rp > -1 && !pmb[rp].match)
-			rp--;
-		/* rp points at the last non-NULL */
-
-		if (lp < pmb_nr && rp > -1 && lp < rp) {
-			pmb[lp] = pmb[rp];
-			memset(&pmb[rp], 0, sizeof(pmb[rp]));
-			rp--;
-			lp++;
-		}
-	}
-
-	/* Remember the number of running sets */
-	return rp + 1;
+	*pmb_nr = j;
 }

 /*
@@ -1140,9 +1112,7 @@ static void mark_color_as_moved(struct diff_options *o,
 			continue;
 		}

-		pmb_advance_or_null(o, l, pmb, pmb_nr);
-
-		pmb_nr = shrink_potential_moved_blocks(pmb, pmb_nr);
+		pmb_advance_or_null(o, l, pmb, &pmb_nr);

 		if (pmb_nr == 0) {
 			/*

From patchwork Tue Jul 20 10:36:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388041
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DC567C07E95
	for ; Tue, 20 Jul 2021 10:40:46 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix)
with ESMTP id C2B4C61107 for ; Tue, 20 Jul 2021 10:40:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236268AbhGTKAF (ORCPT ); Tue, 20 Jul 2021 06:00:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236465AbhGTJ4O (ORCPT ); Tue, 20 Jul 2021 05:56:14 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1999C0613DD for ; Tue, 20 Jul 2021 03:36:41 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id u5-20020a7bc0450000b02901480e40338bso1304187wmc.1 for ; Tue, 20 Jul 2021 03:36:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Nunq3FkTqdTpNuMe+/uq7V+j0dRIWt2R3G2vbK6JIQQ=; b=t7Y/HbtqCLF/A9Hean4DFu8siLoB2UX/itioZPFiIpM3ZMRc0oHzD/oc9ExTrro97g lrAYBPuCeWRTSIx/P/uVmZ+/jgRP1Y2VREgHcuexmlmrP8UDTFyMi7qDw0u+Q3JS40s7 zKAaBd+huhLEyRzKcHTSZMJpENv8iHeToDb8aMYM919sxwr0krONtdlb5vOHUJDnHn1L 7B7f0VK6+E6KWVVpjvf6gRLIkkyf4nYrpROcqGp/ws5NDBcrNo1DYyrmlr7GhSk7dHIk De9omrMiSjDN0KvGxddB3IPGPMrld2L8beKoFobIcrBuCRGhQS8NwLhWfRxAeE8PMVYh LGTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Nunq3FkTqdTpNuMe+/uq7V+j0dRIWt2R3G2vbK6JIQQ=; b=N8V9MqFNqP6c+Ub8iLnh7/oekJX96Vmo9u5EvrlB9s55Hy/LjG9ZYt7c1zHk996J0d Ye6fZr7NRTw9UJU2fkqFc3UWbFG/Be7m2P9rWNUX4rt50/IE1bO/JVdtO9xQhAgrcnta 8v2qE/MWuldS4E79qV7wtQPsuqVVZVC2ljHKLCFDiJvwxD6ku/Y3cCVMxKCn8UFXjYFi zeijOliyIAdR4te6xulP29ybo9tYXYvMZ3N3Ftsv6l2rjAuLcvXIGRrzWuTob6eZHHhz 0fK82YvLDa97tmqyfK7TyPSxXeEmxg8LZbiGtd0+WoZvI+GyWWymdgEp45To5ePMnjov wYeA== X-Gm-Message-State: 
AOAM533Q6lJlz3kC4lyagU1V9km62qyXOLcTocgBVIsAHa8sKHnrl7oA
	VOrBI+OdGCMqKVXDzFbfu1nvakZ8gn8=
X-Google-Smtp-Source: ABdhPJwdBE/47AMbvL+4UTNwQd4oQ5yeY6jZtNIUkmmt7aH7nyq2s1e3sj/OoUIOyrVaWbbVZzCRsA==
X-Received: by 2002:a05:600c:3783:: with SMTP id o3mr31772451wmr.23.1626777400623;
	Tue, 20 Jul 2021 03:36:40 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA
	id j4sm9445820wrt.24.2021.07.20.03.36.40
	(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
	Tue, 20 Jul 2021 03:36:40 -0700 (PDT)
Message-Id: <9d0a042eae17e449d6fd5a4cb7df31a8a02d972c.1626777394.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Tue, 20 Jul 2021 10:36:30 +0000
Subject: [PATCH v2 09/12] diff --color-moved: stop clearing potential moved blocks
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Phillip Wood , =?utf-8?b?w4Z2YXIgQXJuZmo=?= =?utf-8?b?w7Zyw7A=?= Bjarmason , Phillip Wood , Phillip Wood
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Phillip Wood

From: Phillip Wood

moved_block_clear() was introduced in 74d156f4a1 ("diff
--color-moved-ws: fix double free crash", 2018-10-04) to free the
memory that was allocated when initializing a potential moved
block. However since 21536d077f ("diff --color-moved-ws: modify
allow-indentation-change", 2018-11-23) initializing a potential moved
block no longer allocates any memory. Up until the last commit we were
relying on moved_block_clear() to set the `match` pointer to NULL when
a block stopped matching, but since that commit we do not clear a
moved block that does not match so it does not make sense to clear
them elsewhere.
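
To illustrate why stale entries no longer need clearing, here is a
minimal sketch of filtering an array in place. It is not the code in
diff.c: `struct block`, its `alive` flag and keep_matching() are
hypothetical names used only for this example. Surviving entries are
copied to the front and the count shrinks, so anything at or beyond the
new count is simply never read again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct moved_block; not the definition in diff.c. */
struct block {
	int id;    /* illustrative payload */
	int alive; /* non-zero if the block still matches */
};

/*
 * Keep only the blocks that still match, compacting them to the front
 * of the array, and return the new count. Slots at or beyond the
 * returned count are stale but are never read, so they need no clearing.
 */
size_t keep_matching(struct block *blocks, size_t nr)
{
	size_t i, j = 0;

	assert(blocks || !nr);
	for (i = 0; i < nr; i++)
		if (blocks[i].alive)
			blocks[j++] = blocks[i]; /* copy the whole entry */
	return j;
}
```

A caller would replace its count with the returned value, for example
`nr = keep_matching(blocks, nr);`, and free the array once at the end.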

Signed-off-by: Phillip Wood
---
 diff.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/diff.c b/diff.c
index 068473c0be3..4b5776a5a0a 100644
--- a/diff.c
+++ b/diff.c
@@ -807,11 +807,6 @@ struct moved_block {
 	int wsd; /* The whitespace delta of this block */
 };

-static void moved_block_clear(struct moved_block *b)
-{
-	memset(b, 0, sizeof(*b));
-}
-
 #define INDENT_BLANKLINE INT_MIN

 static void fill_es_indent_data(struct emitted_diff_symbol *es)
@@ -1093,11 +1088,7 @@ static void mark_color_as_moved(struct diff_options *o,
 		}

 		if (pmb_nr && (!match || l->s != moved_symbol)) {
-			int i;
-
 			adjust_last_block(o, n, block_length);
-			for(i = 0; i < pmb_nr; i++)
-				moved_block_clear(&pmb[i]);
 			pmb_nr = 0;
 			block_length = 0;
 			flipped_block = 0;
@@ -1155,8 +1146,6 @@ static void mark_color_as_moved(struct diff_options *o,
 	}
 	adjust_last_block(o, n, block_length);

-	for(n = 0; n < pmb_nr; n++)
-		moved_block_clear(&pmb[n]);
 	free(pmb);
 }


From patchwork Tue Jul 20 10:36:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388047
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 4AC4AC07E95
	for ; Tue, 20 Jul 2021 10:41:33 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 31F6861209
	for ; Tue, 20 Jul 2021 10:41:33 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S237035AbhGTKAh
(ORCPT ); Tue, 20 Jul 2021 06:00:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236768AbhGTJ4T (ORCPT ); Tue, 20 Jul 2021 05:56:19 -0400 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8174CC0613DE for ; Tue, 20 Jul 2021 03:36:42 -0700 (PDT) Received: by mail-wm1-x32e.google.com with SMTP id w13so11996902wmc.3 for ; Tue, 20 Jul 2021 03:36:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=AE6Zyj9DjecfFGQtzrMFU916Wi+MtRfwgs9nFeFm3Vc=; b=Va20QI/+R6+E3QqJhDkcO49PJhUCZbybvJKRGqv700SqD+TGlAzaTTaFvrxWxeaUlb AOkW+5l5OZzVIKoO1GhQsO1fPWy+FyBXUd21TLE100jxtted3d520MKLbCda1L1sWy4y eunBIEbrQgxs9NKhOAnC++CeRKRz0d7obdNs5MLGoz27UcXonHRDDesAbgJilQlIR8E+ CXSvzEpyoVpcZ1IbDfKVZroe/QNBuDe9f7ivMNHsvGu+6yJXIQwqFBpIki0thagYJgXz BNhd5jikJlXTH+ICFbkSwNRpypBqulCXhPadEazw6QamgqhHVnT6WL/RGjfG1qr8TXxF sOhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=AE6Zyj9DjecfFGQtzrMFU916Wi+MtRfwgs9nFeFm3Vc=; b=mkLnKEsrNQKYJpqFTgysVElgZtnwTl28Ohc5udsfMbDpbk5bgLvzEt2u4OVtBibp0V HAq/pgJJtJuHo3i8V1UMcq/YotmP9P+p2qFsjvgsl718v3jzgD/dbIkSdLSnT0cCFc5X 0uOPRXoOAgf+8c9igu5Sj5W627pcFHK1xSLn6XAh5HAu/0y/Be2YUufAue7rG10YMFMn CAxK7wY+WNNRucxTevZ1EzRMi7OGcWwE3uc0kScief0aafW687Zy7uJjOua5zjCxIhkE N03V44IeiH1JhVFIXoVaO4qgjSbGSLZRIygIp4Moo9MBYq3fBRKvMMLjLLFG8zDi9vHm 20iA== X-Gm-Message-State: AOAM531H8FTRXhq1p12ZbndA+sEeQBG6K5v2KERwsOweXnrc3KUlzYcr 4wlTLNIh19enW5NrAQMrnqgf0/o6/Rc= X-Google-Smtp-Source: ABdhPJx8jL5OEXALW2xDERICouV21EeUTsmxgmiEo49Fj4y7nBzXzGYpWdQEVm3saoyDPjk7A0F1XQ== X-Received: 
Date: Tue, 20 Jul 2021 10:36:31 +0000
Subject: [PATCH v2 10/12] diff --color-moved-ws=allow-indentation-change: improve hash lookups
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Phillip Wood, Phillip Wood
From: Phillip Wood

From: Phillip Wood

As libxdiff does not have a whitespace flag to ignore the indentation
the code for --color-moved-ws=allow-indentation-change uses
XDF_IGNORE_WHITESPACE and then filters out any hash lookups where
there are non-indentation changes. This filtering is inefficient as
we have to perform another string comparison.

By using the offset data that we have already computed to skip the
indentation we can avoid using XDF_IGNORE_WHITESPACE and safely remove
the extra checks which improves the performance by 11% and paves the
way for the elimination of string comparisons in the next commit.

This change slightly increases the run time of other --color-moved
modes. This could be avoided by using different comparison functions
for the different modes but after the next two commits there is no
measurable benefit in doing so.
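The idea of hashing and comparing only the text after the indentation can be sketched in isolation as below. This is a minimal illustration and not the code in diff.c: struct line, fill_indent_off(), hash_after_indent(), same_after_indent() and make_line() are hypothetical names, FNV-1a stands in for xdiff_hash_string(), and the sketch records only the indentation offset, not the visual width that fill_es_indent_data() also computes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct emitted_diff_symbol. */
struct line {
	const char *text;
	size_t len;
	size_t indent_off; /* offset of the first non-whitespace byte */
};

/* Record where the leading run of spaces and tabs ends. */
static void fill_indent_off(struct line *l)
{
	size_t off = 0;

	while (off < l->len && (l->text[off] == ' ' || l->text[off] == '\t'))
		off++;
	l->indent_off = off;
}

/* FNV-1a over the bytes after the indentation (assumed hash function). */
static unsigned hash_after_indent(const struct line *l)
{
	unsigned h = 2166136261u;
	size_t i;

	for (i = l->indent_off; i < l->len; i++) {
		h ^= (unsigned char)l->text[i];
		h *= 16777619u;
	}
	return h;
}

/* Two lines match when the text after the indentation is identical. */
static int same_after_indent(const struct line *a, const struct line *b)
{
	size_t a_len = a->len - a->indent_off;
	size_t b_len = b->len - b->indent_off;

	return a_len == b_len &&
	       !memcmp(a->text + a->indent_off, b->text + b->indent_off, a_len);
}

/* Convenience constructor for the sketch. */
static struct line make_line(const char *text)
{
	struct line l = { text, strlen(text), 0 };

	fill_indent_off(&l);
	return l;
}
```

Because the hash and the comparison both start at indent_off, lines that differ only in indentation land in the same bucket and compare equal, while a whitespace change inside the line still counts as a difference. That is why no second filtering pass is needed.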

Test                                                                 HEAD^             HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.41(0.38+0.03)   0.41(0.36+0.04) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.82(0.76+0.05)   0.84(0.79+0.04) +2.4%
4002.3: diff --color-moved-ws=allow-indentation-change large change  0.91(0.88+0.03)   0.81(0.74+0.06) -11.0%
4002.4: log --no-color-moved --no-color-moved-ws                     1.32(1.21+0.10)   1.31(1.19+0.11) -0.8%
4002.5: log --color-moved --no-color-moved-ws                        1.47(1.37+0.10)   1.47(1.36+0.11) +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                1.51(1.42+0.09)   1.48(1.37+0.10) -2.0%

Signed-off-by: Phillip Wood
---
 diff.c | 66 +++++++++++++++++-----------------------------------------
 1 file changed, 19 insertions(+), 47 deletions(-)

diff --git a/diff.c b/diff.c
index 4b5776a5a0a..f899083d028 100644
--- a/diff.c
+++ b/diff.c
@@ -850,28 +850,15 @@ static void fill_es_indent_data(struct emitted_diff_symbol *es)
 }
 
 static int compute_ws_delta(const struct emitted_diff_symbol *a,
-			    const struct emitted_diff_symbol *b,
-			    int *out)
-{
-	int a_len = a->len,
-	    b_len = b->len,
-	    a_off = a->indent_off,
-	    a_width = a->indent_width,
-	    b_off = b->indent_off,
+			    const struct emitted_diff_symbol *b)
+{
+	int a_width = a->indent_width,
 	    b_width = b->indent_width;
 
-	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) {
-		*out = INDENT_BLANKLINE;
-		return 1;
-	}
-
-	if (a_len - a_off != b_len - b_off ||
-	    memcmp(a->line + a_off, b->line + b_off, a_len - a_off))
-		return 0;
-
-	*out = a_width - b_width;
+	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE)
+		return INDENT_BLANKLINE;
 
-	return 1;
+	return a_width - b_width;
 }
 
 static int cmp_in_block_with_wsd(const struct diff_options *o,
@@ -917,26 +904,17 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
 			   const void *keydata)
 {
 	const struct diff_options *diffopt = hashmap_cmp_fn_data;
-	const struct moved_entry *a, *b;
+	const struct emitted_diff_symbol *a, *b;
 	unsigned flags = diffopt->color_moved_ws_handling
 			 & XDF_WHITESPACE_FLAGS;
 
-	a = container_of(eptr, const struct moved_entry, ent);
-	b = container_of(entry_or_key, const struct moved_entry, ent);
-
-	if (diffopt->color_moved_ws_handling &
-	    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-		/*
-		 * As there is not specific white space config given,
-		 * we'd need to check for a new block, so ignore all
-		 * white space. The setup of the white space
-		 * configuration for the next block is done else where
-		 */
-		flags |= XDF_IGNORE_WHITESPACE;
+	a = container_of(eptr, const struct moved_entry, ent)->es;
+	b = container_of(entry_or_key, const struct moved_entry, ent)->es;
 
-	return !xdiff_compare_lines(a->es->line, a->es->len,
-				    b->es->line, b->es->len,
-				    flags);
+	return !xdiff_compare_lines(a->line + a->indent_off,
+				    a->len - a->indent_off,
+				    b->line + b->indent_off,
+				    b->len - b->indent_off, flags);
 }
 
 static struct moved_entry *prepare_entry(struct diff_options *o,
@@ -945,7 +923,8 @@ static struct moved_entry *prepare_entry(struct diff_options *o,
 	struct moved_entry *ret = xmalloc(sizeof(*ret));
 	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
 	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
-	unsigned int hash = xdiff_hash_string(l->line, l->len, flags);
+	unsigned int hash = xdiff_hash_string(l->line + l->indent_off,
+					      l->len - l->indent_off, flags);
 
 	hashmap_entry_init(&ret->ent, hash);
 	ret->es = l;
@@ -1113,14 +1092,11 @@ static void mark_color_as_moved(struct diff_options *o,
 			hashmap_for_each_entry_from(hm, match, ent) {
 				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
 				if (o->color_moved_ws_handling &
-				    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) {
-					if (compute_ws_delta(l, match->es,
-							     &pmb[pmb_nr].wsd))
-						pmb[pmb_nr++].match = match;
-				} else {
+				    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
+					pmb[pmb_nr].wsd = compute_ws_delta(l, match->es);
+				else
 					pmb[pmb_nr].wsd = 0;
-					pmb[pmb_nr++].match = match;
-				}
+				pmb[pmb_nr++].match = match;
 			}
 
 			if (adjust_last_block(o, n, block_length) &&
@@ -6240,10 +6216,6 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 		if (o->color_moved) {
 			struct hashmap add_lines, del_lines;
 
-			if (o->color_moved_ws_handling &
-			    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-				o->color_moved_ws_handling |= XDF_IGNORE_WHITESPACE;
-
 			hashmap_init(&del_lines, moved_entry_cmp, o, 0);
 			hashmap_init(&add_lines, moved_entry_cmp, o, 0);
 

From patchwork Tue Jul 20 10:36:32 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388045
Date: Tue, 20 Jul 2021 10:36:32 +0000
Subject: [PATCH v2 11/12] diff: use designated initializers for emitted_diff_symbol
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Phillip Wood, Phillip Wood
From: Phillip Wood

From: Phillip Wood

This makes it clearer which fields are being explicitly initialized
and will simplify the next commit where we add a new field to the
struct.

Signed-off-by: Phillip Wood
---
 diff.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index f899083d028..31a20a34240 100644
--- a/diff.c
+++ b/diff.c
@@ -1460,7 +1460,9 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 static void emit_diff_symbol(struct diff_options *o, enum diff_symbol s,
 			     const char *line, int len, unsigned flags)
 {
-	struct emitted_diff_symbol e = {line, len, flags, 0, 0, s};
+	struct emitted_diff_symbol e = {
+		.line = line, .len = len, .flags = flags, .s = s
+	};
 
 	if (o->emitted_symbols)
 		append_emitted_diff_symbol(o, &e);

From patchwork Tue Jul 20 10:36:33 2021
X-Patchwork-Submitter: Phillip Wood
X-Patchwork-Id: 12388049
Message-Id: <753554587f9bbe22e9f32a245b551ab1f38ea1bd.1626777394.git.gitgitgadget@gmail.com>
Date: Tue, 20 Jul 2021 10:36:33 +0000
Subject: [PATCH v2 12/12] diff --color-moved: intern strings
To: git@vger.kernel.org
Cc: Phillip Wood, Ævar Arnfjörð Bjarmason, Phillip Wood, Phillip Wood
From: Phillip Wood

From: Phillip Wood

Taking inspiration from xdl_classify_record() assign an id to each
addition and deletion such that lines that match for the current
--color-moved-ws mode share the same unique id. This reduces the
number of hash lookups a little (calculating the ids still involves
one hash lookup per line) but the main benefit is that when growing
blocks of potentially moved lines we can replace string comparisons
which involve chasing a pointer with a simple integer comparison.

On a large diff this commit reduces the time to run
diff --color-moved by 33% and
diff --color-moved-ws=allow-indentation-change by 26%. Compared to
master the time to run diff --color-moved-ws=allow-indentation-change
is now reduced by 95% and the overhead compared to --no-color-moved is
reduced to 50%.

Compared to the previous commit the time to run
git log --patch --color-moved is increased slightly, but compared to
master there is no change in run time.
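The interning scheme described above can be sketched on its own as follows. This is only an illustration under simplifying assumptions and not the code in this patch: struct interner, hash_str() and intern() are hypothetical names, a fixed-size open-addressing table replaces git's hashmap and mem_pool, and the sketch assumes fewer than TABLE_SIZE distinct strings (it does not grow or detect a full table).

```c
#include <assert.h>
#include <string.h>

#define TABLE_SIZE 1024 /* power of two; assumed large enough for the sketch */

/* Hypothetical interner: maps each distinct string to a small integer id. */
struct interner {
	const char *keys[TABLE_SIZE]; /* NULL marks an empty slot */
	unsigned ids[TABLE_SIZE];
	unsigned next_id;
};

/* FNV-1a, standing in for the hash used by the real code. */
static unsigned hash_str(const char *s)
{
	unsigned h = 2166136261u;

	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Return the id for s, allocating a new id the first time s is seen. */
static unsigned intern(struct interner *in, const char *s)
{
	unsigned slot = hash_str(s) & (TABLE_SIZE - 1);

	while (in->keys[slot]) {
		if (!strcmp(in->keys[slot], s))
			return in->ids[slot];
		slot = (slot + 1) & (TABLE_SIZE - 1); /* linear probing */
	}
	in->keys[slot] = s;
	in->ids[slot] = in->next_id;
	return in->next_id++;
}
```

Each line pays for one hash lookup when its id is assigned. After that, checking whether two lines match is a comparison of two unsigned values, with no need to follow the line pointers and compare the text again.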

Test                                                                 HEAD^             HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change       0.41(0.36+0.04)   0.41(0.37+0.03) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change          0.83(0.79+0.03)   0.55(0.52+0.03) -33.7%
4002.3: diff --color-moved-ws=allow-indentation-change large change  0.81(0.77+0.04)   0.60(0.55+0.05) -25.9%
4002.4: log --no-color-moved --no-color-moved-ws                     1.30(1.20+0.09)   1.31(1.22+0.08) +0.8%
4002.5: log --color-moved --no-color-moved-ws                        1.46(1.35+0.11)   1.47(1.30+0.16) +0.7%
4002.6: log --color-moved-ws=allow-indentation-change                1.46(1.38+0.07)   1.47(1.34+0.13) +0.7%

Test                                                                 master              HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change        0.40( 0.36+0.03)   0.41(0.37+0.03) +2.5%
4002.2: diff --color-moved --no-color-moved-ws large change           0.82( 0.77+0.04)   0.55(0.52+0.03) -32.9%
4002.3: diff --color-moved-ws=allow-indentation-change large change  14.10(14.04+0.04)   0.60(0.55+0.05) -95.7%
4002.4: log --no-color-moved --no-color-moved-ws                      1.31( 1.21+0.09)   1.31(1.22+0.08) +0.0%
4002.5: log --color-moved --no-color-moved-ws                         1.47( 1.37+0.09)   1.47(1.30+0.16) +0.0%
4002.6: log --color-moved-ws=allow-indentation-change                 1.86( 1.76+0.10)   1.47(1.34+0.13) -21.0%

Signed-off-by: Phillip Wood
---
 diff.c | 171 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 95 insertions(+), 76 deletions(-)

diff --git a/diff.c b/diff.c
index 31a20a34240..2956c8f7103 100644
--- a/diff.c
+++ b/diff.c
@@ -18,6 +18,7 @@
 #include "submodule-config.h"
 #include "submodule.h"
 #include "hashmap.h"
+#include "mem-pool.h"
 #include "ll-merge.h"
 #include "string-list.h"
 #include "strvec.h"
@@ -772,6 +773,7 @@ struct emitted_diff_symbol {
 	int flags;
 	int indent_off;   /* Offset to first non-whitespace character */
 	int indent_width; /* The visual width of the indentation */
+	unsigned id;
 	enum diff_symbol s;
 };
 #define EMITTED_DIFF_SYMBOL_INIT {NULL}
@@ -797,9 +799,9 @@ static void append_emitted_diff_symbol(struct diff_options *o,
 }
 
 struct moved_entry {
-	struct hashmap_entry ent;
 	const struct emitted_diff_symbol *es;
 	struct moved_entry *next_line;
+	struct moved_entry *next_match;
 };
 
 struct moved_block {
@@ -866,24 +868,24 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 				 const struct emitted_diff_symbol *l,
 				 struct moved_block *pmb)
 {
-	int al = cur->es->len, bl = l->len;
-	const char *a = cur->es->line,
-		   *b = l->line;
-	int a_off = cur->es->indent_off,
-	    a_width = cur->es->indent_width,
-	    b_off = l->indent_off,
-	    b_width = l->indent_width;
+	int a_width = cur->es->indent_width, b_width = l->indent_width;
 	int delta;
 
-	/* If 'l' and 'cur' are both blank then they match. */
-	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE)
+	/* The text of each line must match */
+	if (cur->es->id != l->id)
+		return 1;
+
+	/*
+	 * If 'l' and 'cur' are both blank then we don't need to check the
+	 * indent. We only need to check cur as we know the strings match.
+	 * */
+	if (a_width == INDENT_BLANKLINE)
 		return 0;
 
 	/*
 	 * The indent changes of the block are known and stored in pmb->wsd;
 	 * however we need to check if the indent changes of the current line
-	 * match those of the current block and that the text of 'l' and 'cur'
-	 * after the indentation match.
+	 * match those of the current block.
 	 */
 	delta = b_width - a_width;
 
@@ -894,22 +896,26 @@ static int cmp_in_block_with_wsd(const struct diff_options *o,
 	if (pmb->wsd == INDENT_BLANKLINE)
 		pmb->wsd = delta;
 
-	return !(delta == pmb->wsd && al - a_off == bl - b_off &&
-		 !memcmp(a + a_off, b + b_off, al - a_off));
+	return delta != pmb->wsd;
 }
 
-static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
-			   const struct hashmap_entry *eptr,
-			   const struct hashmap_entry *entry_or_key,
-			   const void *keydata)
+struct interned_diff_symbol {
+	struct hashmap_entry ent;
+	struct emitted_diff_symbol *es;
+};
+
+static int interned_diff_symbol_cmp(const void *hashmap_cmp_fn_data,
+				    const struct hashmap_entry *eptr,
+				    const struct hashmap_entry *entry_or_key,
+				    const void *keydata)
 {
 	const struct diff_options *diffopt = hashmap_cmp_fn_data;
 	const struct emitted_diff_symbol *a, *b;
 	unsigned flags = diffopt->color_moved_ws_handling
 			 & XDF_WHITESPACE_FLAGS;
 
-	a = container_of(eptr, const struct moved_entry, ent)->es;
-	b = container_of(entry_or_key, const struct moved_entry, ent)->es;
+	a = container_of(eptr, const struct interned_diff_symbol, ent)->es;
+	b = container_of(entry_or_key, const struct interned_diff_symbol, ent)->es;
 
 	return !xdiff_compare_lines(a->line + a->indent_off,
 				    a->len - a->indent_off,
@@ -917,55 +923,81 @@ static int moved_entry_cmp(const void *hashmap_cmp_fn_data,
 				    b->len - b->indent_off, flags);
 }
 
-static struct moved_entry *prepare_entry(struct diff_options *o,
-					 int line_no)
+static void prepare_entry(struct diff_options *o, struct emitted_diff_symbol *l,
+			  struct interned_diff_symbol *s)
 {
-	struct moved_entry *ret = xmalloc(sizeof(*ret));
-	struct emitted_diff_symbol *l = &o->emitted_symbols->buf[line_no];
 	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
 	unsigned int hash = xdiff_hash_string(l->line + l->indent_off,
 					      l->len - l->indent_off, flags);
 
-	hashmap_entry_init(&ret->ent, hash);
-	ret->es = l;
-	ret->next_line = NULL;
-
-	return ret;
+	hashmap_entry_init(&s->ent, hash);
+	s->es = l;
 }
 
-static void add_lines_to_move_detection(struct diff_options *o,
-					struct hashmap *add_lines,
-					struct hashmap *del_lines)
+struct moved_entry_list {
+	struct moved_entry *add, *del;
+};
+
+static struct moved_entry_list *add_lines_to_move_detection(struct diff_options *o,
+							    struct mem_pool *entry_mem_pool)
 {
 	struct moved_entry *prev_line = NULL;
-
+	struct mem_pool interned_pool;
+	struct hashmap interned_map;
+	struct moved_entry_list *entry_list = NULL;
+	size_t entry_list_alloc = 0;
+	unsigned id = 0;
 	int n;
+
+	hashmap_init(&interned_map, interned_diff_symbol_cmp, o, 8096);
+	mem_pool_init(&interned_pool, 1024 * 1024);
+
 	for (n = 0; n < o->emitted_symbols->nr; n++) {
-		struct hashmap *hm;
-		struct moved_entry *key;
+		struct interned_diff_symbol key;
+		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
+		struct interned_diff_symbol *s;
+		struct moved_entry *entry;
 
-		switch (o->emitted_symbols->buf[n].s) {
-		case DIFF_SYMBOL_PLUS:
-			hm = add_lines;
-			break;
-		case DIFF_SYMBOL_MINUS:
-			hm = del_lines;
-			break;
-		default:
+		if (l->s != DIFF_SYMBOL_PLUS && l->s != DIFF_SYMBOL_MINUS) {
 			prev_line = NULL;
 			continue;
 		}
 
 		if (o->color_moved_ws_handling &
 		    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
-			fill_es_indent_data(&o->emitted_symbols->buf[n]);
-		key = prepare_entry(o, n);
-		if (prev_line && prev_line->es->s == o->emitted_symbols->buf[n].s)
-			prev_line->next_line = key;
+			fill_es_indent_data(l);
 
-		hashmap_add(hm, &key->ent);
-		prev_line = key;
+		prepare_entry(o, l, &key);
+		s = hashmap_get_entry(&interned_map, &key, ent, &key.ent);
+		if (s) {
+			l->id = s->es->id;
+		} else {
+			l->id = id;
+			ALLOC_GROW_BY(entry_list, id, 1, entry_list_alloc);
+			hashmap_add(&interned_map,
+				    memcpy(mem_pool_alloc(&interned_pool,
+							  sizeof(key)),
+					   &key, sizeof(key)));
+		}
+		entry = mem_pool_alloc(entry_mem_pool, sizeof(*entry));
+		entry->es = l;
+		entry->next_line = NULL;
+		if (prev_line && prev_line->es->s == l->s)
+			prev_line->next_line = entry;
+		prev_line = entry;
+		if (l->s == DIFF_SYMBOL_PLUS) {
+			entry->next_match = entry_list[l->id].add;
+			entry_list[l->id].add = entry;
+		} else {
+			entry->next_match = entry_list[l->id].del;
+			entry_list[l->id].del = entry;
+		}
 	}
+
+	hashmap_clear(&interned_map);
+	mem_pool_discard(&interned_pool, 0);
+
+	return entry_list;
 }
 
 static void pmb_advance_or_null(struct diff_options *o,
@@ -974,7 +1006,6 @@ static void pmb_advance_or_null(struct diff_options *o,
 				int *pmb_nr)
 {
 	int i, j;
-	unsigned flags = o->color_moved_ws_handling & XDF_WHITESPACE_FLAGS;
 
 	for (i = 0, j = 0; i < *pmb_nr; i++) {
 		int match;
@@ -987,9 +1018,8 @@ static void pmb_advance_or_null(struct diff_options *o,
 			match = cur &&
 				!cmp_in_block_with_wsd(o, cur, l, &pmb[i]);
 		else
-			match = cur &&
-				xdiff_compare_lines(cur->es->line, cur->es->len,
-						    l->line, l->len, flags);
+			match = cur && cur->es->id == l->id;
+
 		if (match)
 			pmb[j++].match = cur;
 	}
@@ -1034,8 +1064,7 @@ static int adjust_last_block(struct diff_options *o, int n, int block_length)
 
 /* Find blocks of moved code, delegate actual coloring decision to helper */
 static void mark_color_as_moved(struct diff_options *o,
-				struct hashmap *add_lines,
-				struct hashmap *del_lines)
+				struct moved_entry_list *entry_list)
 {
 	struct moved_block *pmb = NULL; /* potentially moved blocks */
 	int pmb_nr = 0, pmb_alloc = 0;
@@ -1044,23 +1073,15 @@ static void mark_color_as_moved(struct diff_options *o,
 
 
 	for (n = 0; n < o->emitted_symbols->nr; n++) {
-		struct hashmap *hm = NULL;
-		struct moved_entry *key;
 		struct moved_entry *match = NULL;
 		struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
 
 		switch (l->s) {
 		case DIFF_SYMBOL_PLUS:
-			hm = del_lines;
-			key = prepare_entry(o, n);
-			match = hashmap_get_entry(hm, key, ent, NULL);
-			free(key);
+			match = entry_list[l->id].del;
 			break;
 		case DIFF_SYMBOL_MINUS:
-			hm = add_lines;
-			key = prepare_entry(o, n);
-			match = hashmap_get_entry(hm, key, ent, NULL);
-			free(key);
+			match = entry_list[l->id].add;
 			break;
 		default:
 			flipped_block = 0;
@@ -1089,7 +1110,7 @@ static void mark_color_as_moved(struct diff_options *o,
 			 * The current line is the start of a new block.
 			 * Setup the set of potential blocks.
 			 */
-			hashmap_for_each_entry_from(hm, match, ent) {
+			for (; match; match = match->next_match) {
 				ALLOC_GROW(pmb, pmb_nr + 1, pmb_alloc);
 				if (o->color_moved_ws_handling &
 				    COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE)
@@ -6216,20 +6237,18 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 
 	if (o->emitted_symbols) {
 		if (o->color_moved) {
-			struct hashmap add_lines, del_lines;
-
-			hashmap_init(&del_lines, moved_entry_cmp, o, 0);
-			hashmap_init(&add_lines, moved_entry_cmp, o, 0);
+			struct mem_pool entry_pool;
+			struct moved_entry_list *entry_list;
 
-			add_lines_to_move_detection(o, &add_lines, &del_lines);
-			mark_color_as_moved(o, &add_lines, &del_lines);
+			mem_pool_init(&entry_pool, 1024 * 1024);
+			entry_list = add_lines_to_move_detection(o,
+								 &entry_pool);
+			mark_color_as_moved(o, entry_list);
 			if (o->color_moved == COLOR_MOVED_ZEBRA_DIM)
 				dim_moved_lines(o);
 
-			hashmap_clear_and_free(&add_lines, struct moved_entry,
-					       ent);
-			hashmap_clear_and_free(&del_lines, struct moved_entry,
-					       ent);
+			mem_pool_discard(&entry_pool, 0);
+			free(entry_list);
 		}
 
 		for (i = 0; i < esm.nr; i++)