diff mbox series

patch-id: ignore newline at end of file in diff_flush_patch_id()

Message ID b67eb51d-75e8-62c5-d1c4-fc3015e13fc6@web.de (mailing list archive)
State New, archived
Series patch-id: ignore newline at end of file in diff_flush_patch_id()

Commit Message

René Scharfe Aug. 18, 2020, 10:08 p.m. UTC
Whitespace is ignored when calculating patch IDs.  This is done by
removing all whitespace from diff lines before hashing them, including
a newline at the end of a file.  If that newline is missing, however,
--
2.28.0

Comments

Junio C Hamano Aug. 18, 2020, 10:52 p.m. UTC | #1
René Scharfe <l.s.r@web.de> writes:

> Whitespace is ignored when calculating patch IDs.  This is done by
> removing all whitespace from diff lines before hashing them, including
> a newline at the end of a file.  If that newline is missing, however,
> diff reports that fact in a separate line containing "\ No newline at
> end of file\n", and this marker is hashed like a context line.

Ah, ouch.

> This goes against our goal of making patch IDs independent of
> whitespace.  Use the same heuristic that 2485eab55cc (git-patch-id: do
> not trip over "no newline" markers, 2011-02-17) added to git patch-id
> instead and skip diff lines that start with a backslash and a space
> and are longer than twelve characters.

Good find of previous example.  Excellent.

> Reported-by: Tilman Vogel <tilman.vogel@web.de>
> Initial-test-by: Tilman Vogel <tilman.vogel@web.de>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  diff.c            |  2 ++
>  t/t3500-cherry.sh | 23 +++++++++++++++++++++++
>  2 files changed, 25 insertions(+)

Thanks.

> diff --git a/diff.c b/diff.c
> index f9709de7b45..f175019eb7a 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -6044,6 +6044,8 @@ static void patch_id_consume(void *priv, char *line, unsigned long len)
>  	struct patch_id_t *data = priv;
>  	int new_len;
>
> +	if (len > 12 && starts_with(line, "\\ "))
> +		return;
>  	new_len = remove_space(line, len);
>
>  	the_hash_algo->update_fn(data->ctx, line, new_len);
> diff --git a/t/t3500-cherry.sh b/t/t3500-cherry.sh
> index f038f34b7c0..2b8d9cb38ed 100755
> --- a/t/t3500-cherry.sh
> +++ b/t/t3500-cherry.sh
> @@ -55,4 +55,27 @@ test_expect_success \
>       expr "$(echo $(git cherry master my-topic-branch) )" : "+ [^ ]* - .*"
>  '
>
> +test_expect_success 'cherry ignores whitespace' '
> +	git switch --orphan=upstream-with-space &&
> +	test_commit initial file &&
> +	>expect &&
> +	git switch --create=feature-without-space &&
> +
> +	# A spaceless file on the feature branch.  Expect a match upstream.
> +	printf space >file &&
> +	git add file &&
> +	git commit -m"file without space" &&
> +	git log --format="- %H" -1 >>expect &&
> +
> +	# A further change.  Should not match upstream.
> +	test_commit change file &&
> +	git log --format="+ %H" -1 >>expect &&
> +
> +	git switch upstream-with-space &&
> +	# Same as the spaceless file, just with spaces and on upstream.
> +	test_commit "file with space" file "s p a c e" file-with-space &&
> +	git cherry upstream-with-space feature-without-space >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done
> --
> 2.28.0

Johannes Schindelin Aug. 24, 2020, 12:42 p.m. UTC | #2
Hi,

On Tue, 18 Aug 2020, Junio C Hamano wrote:

> René Scharfe <l.s.r@web.de> writes:
>
> > Whitespace is ignored when calculating patch IDs.  This is done by
> > removing all whitespace from diff lines before hashing them, including
> > a newline at the end of a file.  If that newline is missing, however,
> > diff reports that fact in a separate line containing "\ No newline at
> > end of file\n", and this marker is hashed like a context line.
>
> Ah, ouch.
>
> > This goes against our goal of making patch IDs independent of
> > whitespace.  Use the same heuristic that 2485eab55cc (git-patch-id: do
> > not trip over "no newline" markers, 2011-02-17) added to git patch-id
> > instead and skip diff lines that start with a backslash and a space
> > and are longer than twelve characters.
>
> Good find of previous example.  Excellent.

Yup. Looks good to me, too. Thank you!
Dscho

Tilman Vogel Aug. 27, 2020, 9:05 a.m. UTC | #3
That's great, thanks René! Looking forward to trying that out!

Tilman


Patch

diff reports that fact in a separate line containing "\ No newline at
end of file\n", and this marker is hashed like a context line.

This goes against our goal of making patch IDs independent of
whitespace.  Use the same heuristic that 2485eab55cc (git-patch-id: do
not trip over "no newline" markers, 2011-02-17) added to git patch-id
instead and skip diff lines that start with a backslash and a space
and are longer than twelve characters.

Reported-by: Tilman Vogel <tilman.vogel@web.de>
Initial-test-by: Tilman Vogel <tilman.vogel@web.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 diff.c            |  2 ++
 t/t3500-cherry.sh | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/diff.c b/diff.c
index f9709de7b45..f175019eb7a 100644
--- a/diff.c
+++ b/diff.c
@@ -6044,6 +6044,8 @@  static void patch_id_consume(void *priv, char *line, unsigned long len)
 	struct patch_id_t *data = priv;
 	int new_len;

+	if (len > 12 && starts_with(line, "\\ "))
+		return;
 	new_len = remove_space(line, len);

 	the_hash_algo->update_fn(data->ctx, line, new_len);
diff --git a/t/t3500-cherry.sh b/t/t3500-cherry.sh
index f038f34b7c0..2b8d9cb38ed 100755
--- a/t/t3500-cherry.sh
+++ b/t/t3500-cherry.sh
@@ -55,4 +55,27 @@  test_expect_success \
      expr "$(echo $(git cherry master my-topic-branch) )" : "+ [^ ]* - .*"
 '

+test_expect_success 'cherry ignores whitespace' '
+	git switch --orphan=upstream-with-space &&
+	test_commit initial file &&
+	>expect &&
+	git switch --create=feature-without-space &&
+
+	# A spaceless file on the feature branch.  Expect a match upstream.
+	printf space >file &&
+	git add file &&
+	git commit -m"file without space" &&
+	git log --format="- %H" -1 >>expect &&
+
+	# A further change.  Should not match upstream.
+	test_commit change file &&
+	git log --format="+ %H" -1 >>expect &&
+
+	git switch upstream-with-space &&
+	# Same as the spaceless file, just with spaces and on upstream.
+	test_commit "file with space" file "s p a c e" file-with-space &&
+	git cherry upstream-with-space feature-without-space >actual &&
+	test_cmp expect actual
+'
+
 test_done
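
For readers who want to try the heuristic outside of Git, here is a minimal standalone sketch of what the patched patch_id_consume() does.  It is not the code in diff.c: the sketch_* names are made up for illustration, 64-bit FNV-1a stands in for the_hash_algo, and strncmp() stands in for starts_with().  Only the "len > 12" and backslash-space test is taken from the patch above.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the_hash_algo: 64-bit FNV-1a, not what Git uses. */
struct sketch_ctx {
	uint64_t hash;
};

void sketch_init(struct sketch_ctx *ctx)
{
	ctx->hash = 14695981039346656037ULL;	/* FNV-1a offset basis */
}

void sketch_update(struct sketch_ctx *ctx, const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		ctx->hash ^= (unsigned char)buf[i];
		ctx->hash *= 1099511628211ULL;	/* FNV-1a prime */
	}
}

/* Compact the line in place by dropping all whitespace; return the new length. */
size_t sketch_remove_space(char *line, size_t len)
{
	size_t i, out = 0;

	for (i = 0; i < len; i++)
		if (!isspace((unsigned char)line[i]))
			line[out++] = line[i];
	return out;
}

/* Same check as in the patch: skip "\ No newline at end of file" markers. */
void sketch_consume(struct sketch_ctx *ctx, char *line, size_t len)
{
	if (len > 12 && !strncmp(line, "\\ ", 2))
		return;
	sketch_update(ctx, line, sketch_remove_space(line, len));
}

/* Feed nr diff lines through sketch_consume() and return the digest. */
uint64_t sketch_patch_id(const char *const *lines, size_t nr)
{
	struct sketch_ctx ctx;
	char buf[256];
	size_t i;

	sketch_init(&ctx);
	for (i = 0; i < nr; i++) {
		size_t len = strlen(lines[i]);

		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;	/* truncate overlong lines in this sketch */
		memcpy(buf, lines[i], len);
		sketch_consume(&ctx, buf, len);
	}
	return ctx.hash;
}
```

With this, hashing the single line "+s p a c e\n" and hashing "+space\n" followed by "\ No newline at end of file\n" should give the same digest, which is the situation the new t3500 test sets up with real commits.  Without the early return the marker line would be hashed as well and the two digests would differ.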