diff mbox series

[v1,9/9] diff --color-moved-ws: handle blank lines

Message ID 20181116110356.12311-10-phillip.wood@talktalk.net (mailing list archive)
State New, archived
Headers show
Series diff --color-moved-ws fixes and enhancment | expand

Commit Message

Phillip Wood Nov. 16, 2018, 11:03 a.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

When using --color-moved-ws=allow-indentation-change allow lines with
the same indentation change to be grouped across blank lines. For now
this only works if the blank lines have been moved as well, not for
blocks that have just had their indentation changed.

This completes the changes to the implementation of
--color-moved=allow-indentation-change. Running

  git diff --color-moved=allow-indentation-change v2.18.0 v2.19.0

now takes 5.0s. This is a saving of 41% from 8.5s for the optimized
version of the previous implementation and 66% from the original which
took 14.6s.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---

Notes:
    Changes since rfc:
     - Split these changes into a separate commit.
     - Detect blank lines when processing the indentation rather than
       parsing each line twice.
     - Tweaked the test to make it harder as suggested by Stefan.
     - Added timing data to the commit message.

 diff.c                     | 34 ++++++++++++++++++++++++++++---
 t/t4015-diff-whitespace.sh | 41 ++++++++++++++++++++++++++++++++++----
 2 files changed, 68 insertions(+), 7 deletions(-)

Comments

Stefan Beller Nov. 20, 2018, 6:05 p.m. UTC | #1
On Fri, Nov 16, 2018 at 3:04 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
>
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> When using --color-moved-ws=allow-indentation-change allow lines with
> the same indentation change to be grouped across blank lines. For now
> this only works if the blank lines have been moved as well, not for
> blocks that have just had their indentation changed.
>
> This completes the changes to the implementation of
> --color-moved=allow-indentation-change. Running
>
>   git diff --color-moved=allow-indentation-change v2.18.0 v2.19.0
>
> now takes 5.0s. This is a saving of 41% from 8.5s for the optimized
> version of the previous implementation and 66% from the original which
> took 14.6s.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>
> Notes:
>     Changes since rfc:
>      - Split these changes into a separate commit.
>      - Detect blank lines when processing the indentation rather than
>        parsing each line twice.
>      - Tweaked the test to make it harder as suggested by Stefan.
>      - Added timing data to the commit message.
>
>  diff.c                     | 34 ++++++++++++++++++++++++++++---
>  t/t4015-diff-whitespace.sh | 41 ++++++++++++++++++++++++++++++++++----
>  2 files changed, 68 insertions(+), 7 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 89559293e7..072b5bced6 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -792,9 +792,11 @@ static void moved_block_clear(struct moved_block *b)
>         memset(b, 0, sizeof(*b));
>  }
>
> +#define INDENT_BLANKLINE INT_MIN

Answering my question from the previous patch:
This is why we need to keep the indents signed.

This patch looks quite nice to read along.

The whole series looks good to me.
Do we need to update the docs in any way?

Thanks,
Stefan
Phillip Wood Nov. 21, 2018, 3:49 p.m. UTC | #2
On 20/11/2018 18:05, Stefan Beller wrote:
> On Fri, Nov 16, 2018 at 3:04 AM Phillip Wood <phillip.wood@talktalk.net> wrote:
>>
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> When using --color-moved-ws=allow-indentation-change allow lines with
>> the same indentation change to be grouped across blank lines. For now
>> this only works if the blank lines have been moved as well, not for
>> blocks that have just had their indentation changed.
>>
>> This completes the changes to the implementation of
>> --color-moved=allow-indentation-change. Running
>>
>>    git diff --color-moved=allow-indentation-change v2.18.0 v2.19.0
>>
>> now takes 5.0s. This is a saving of 41% from 8.5s for the optimized
>> version of the previous implementation and 66% from the original which
>> took 14.6s.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>>
>> Notes:
>>      Changes since rfc:
>>       - Split these changes into a separate commit.
>>       - Detect blank lines when processing the indentation rather than
>>         parsing each line twice.
>>       - Tweaked the test to make it harder as suggested by Stefan.
>>       - Added timing data to the commit message.
>>
>>   diff.c                     | 34 ++++++++++++++++++++++++++++---
>>   t/t4015-diff-whitespace.sh | 41 ++++++++++++++++++++++++++++++++++----
>>   2 files changed, 68 insertions(+), 7 deletions(-)
>>
>> diff --git a/diff.c b/diff.c
>> index 89559293e7..072b5bced6 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -792,9 +792,11 @@ static void moved_block_clear(struct moved_block *b)
>>          memset(b, 0, sizeof(*b));
>>   }
>>
>> +#define INDENT_BLANKLINE INT_MIN
> 
> Answering my question from the previous patch:
> This is why we need to keep the indents signed.
> 
> This patch looks quite nice to read along.
> 
> The whole series looks good to me.

Thanks

> Do we need to update the docs in any way?

I'm not sure, at the moment it does not make any promises about the 
exact behavior of --color-moved-ws=allow-indentation-change, we could 
change it to be more explicit but I'm not sure it's worth it.

Thanks for looking over these patches, I'll post a reroll soon based on 
your comments.

Phillip

> Thanks,
> Stefan
>
diff mbox series

Patch

diff --git a/diff.c b/diff.c
index 89559293e7..072b5bced6 100644
--- a/diff.c
+++ b/diff.c
@@ -792,9 +792,11 @@  static void moved_block_clear(struct moved_block *b)
 	memset(b, 0, sizeof(*b));
 }
 
+#define INDENT_BLANKLINE INT_MIN
+
 static void fill_es_indent_data(struct emitted_diff_symbol *es)
 {
-	unsigned int off = 0;
+	unsigned int off = 0, i;
 	int width = 0, tab_width = es->flags & WS_TAB_WIDTH_MASK;
 	const char *s = es->line;
 	const int len = es->len;
@@ -818,8 +820,18 @@  static void fill_es_indent_data(struct emitted_diff_symbol *es)
 		}
 	}
 
-	es->indent_off = off;
-	es->indent_width = width;
+	/* check if this line is blank */
+	for (i = off; i < len; i++)
+		if (!isspace(s[i]))
+		    break;
+
+	if (i == len) {
+		es->indent_width = INDENT_BLANKLINE;
+		es->indent_off = len;
+	} else {
+		es->indent_off = off;
+		es->indent_width = width;
+	}
 }
 
 static int compute_ws_delta(const struct emitted_diff_symbol *a,
@@ -834,6 +846,11 @@  static int compute_ws_delta(const struct emitted_diff_symbol *a,
 	    b_width = b->indent_width;
 	int delta;
 
+	if (a_width == INDENT_BLANKLINE && b_width == INDENT_BLANKLINE) {
+		*out = INDENT_BLANKLINE;
+		return 1;
+	}
+
 	if (a->s == DIFF_SYMBOL_PLUS)
 		delta = a_width - b_width;
 	else
@@ -877,6 +894,10 @@  static int cmp_in_block_with_wsd(const struct diff_options *o,
 	if (al != bl)
 		return 1;
 
+	/* If 'l' and 'cur' are both blank then they match. */
+	if (a_width == INDENT_BLANKLINE && c_width == INDENT_BLANKLINE)
+		return 0;
+
 	/*
 	 * The indent changes of the block are known and stored in pmb->wsd;
 	 * however we need to check if the indent changes of the current line
@@ -888,6 +909,13 @@  static int cmp_in_block_with_wsd(const struct diff_options *o,
 	else
 		delta = c_width - a_width;
 
+	/*
+	 * If the previous lines of this block were all blank then set its
+	 * whitespace delta.
+	 */
+	if (pmb->wsd == INDENT_BLANKLINE)
+		pmb->wsd = delta;
+
 	return !(delta == pmb->wsd && al - a_off == cl - c_off &&
 		 !memcmp(a, b, al) && !
 		 memcmp(a + a_off, c + c_off, al - a_off));
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index e023839ba6..9d6f88b07f 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1901,10 +1901,20 @@  test_expect_success 'compare whitespace delta incompatible with other space opti
 	test_i18ngrep allow-indentation-change err
 '
 
+EMPTY=''
 test_expect_success 'compare mixed whitespace delta across moved blocks' '
 
 	git reset --hard &&
 	tr Q_ "\t " <<-EOF >text.txt &&
+	${EMPTY}
+	____too short without
+	${EMPTY}
+	___being grouped across blank line
+	${EMPTY}
+	context
+	lines
+	to
+	anchor
 	____Indented text to
 	_Q____be further indented by four spaces across
 	____Qseveral lines
@@ -1918,9 +1928,18 @@  test_expect_success 'compare mixed whitespace delta across moved blocks' '
 	git commit -m "add text.txt" &&
 
 	tr Q_ "\t " <<-EOF >text.txt &&
+	context
+	lines
+	to
+	anchor
 	QIndented text to
 	QQbe further indented by four spaces across
 	Q____several lines
+	${EMPTY}
+	QQtoo short without
+	${EMPTY}
+	Q_______being grouped across blank line
+	${EMPTY}
 	Q_QThese two lines have had their
 	indentation reduced by four spaces
 	QQdifferent indentation change
@@ -1937,7 +1956,16 @@  test_expect_success 'compare mixed whitespace delta across moved blocks' '
 	<BOLD>diff --git a/text.txt b/text.txt<RESET>
 	<BOLD>--- a/text.txt<RESET>
 	<BOLD>+++ b/text.txt<RESET>
-	<CYAN>@@ -1,7 +1,7 @@<RESET>
+	<CYAN>@@ -1,16 +1,16 @@<RESET>
+	<BOLD;MAGENTA>-<RESET>
+	<BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>    too short without<RESET>
+	<BOLD;MAGENTA>-<RESET>
+	<BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>   being grouped across blank line<RESET>
+	<BOLD;MAGENTA>-<RESET>
+	 <RESET>context<RESET>
+	 <RESET>lines<RESET>
+	 <RESET>to<RESET>
+	 <RESET>anchor<RESET>
 	<BOLD;MAGENTA>-<RESET><BOLD;MAGENTA>    Indented text to<RESET>
 	<BOLD;MAGENTA>-<RESET><BRED> <RESET>	<BOLD;MAGENTA>    be further indented by four spaces across<RESET>
 	<BOLD;MAGENTA>-<RESET><BRED>    <RESET>	<BOLD;MAGENTA>several lines<RESET>
@@ -1948,9 +1976,14 @@  test_expect_success 'compare mixed whitespace delta across moved blocks' '
 	<BOLD;CYAN>+<RESET>	<BOLD;CYAN>Indented text to<RESET>
 	<BOLD;CYAN>+<RESET>		<BOLD;CYAN>be further indented by four spaces across<RESET>
 	<BOLD;CYAN>+<RESET>	<BOLD;CYAN>    several lines<RESET>
-	<BOLD;YELLOW>+<RESET>	<BRED> <RESET>	<BOLD;YELLOW>These two lines have had their<RESET>
-	<BOLD;YELLOW>+<RESET><BOLD;YELLOW>indentation reduced by four spaces<RESET>
-	<BOLD;CYAN>+<RESET>		<BOLD;CYAN>different indentation change<RESET>
+	<BOLD;YELLOW>+<RESET>
+	<BOLD;YELLOW>+<RESET>		<BOLD;YELLOW>too short without<RESET>
+	<BOLD;YELLOW>+<RESET>
+	<BOLD;YELLOW>+<RESET>	<BOLD;YELLOW>       being grouped across blank line<RESET>
+	<BOLD;YELLOW>+<RESET>
+	<BOLD;CYAN>+<RESET>	<BRED> <RESET>	<BOLD;CYAN>These two lines have had their<RESET>
+	<BOLD;CYAN>+<RESET><BOLD;CYAN>indentation reduced by four spaces<RESET>
+	<BOLD;YELLOW>+<RESET>		<BOLD;YELLOW>different indentation change<RESET>
 	<GREEN>+<RESET><BRED>  <RESET>	<GREEN>too short<RESET>
 	EOF