diff mbox series

[RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty

Message ID 20240710093610.GA2076910@coredump.intra.peff.net (mailing list archive)
State New
Headers show
Series [RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty | expand

Commit Message

Jeff King July 10, 2024, 9:36 a.m. UTC
On Fri, Jul 05, 2024 at 09:39:52AM -0700, Junio C Hamano wrote:

> As to the "commit -p" issue, I think the patch parser is in the
> wrong and needs to be corrected, period.  As long as the patches
> given as input are well-formed, we should be prepared to grok
> them (we even allow manual editing of patches, right?).

Maybe this?

-- >8 --
Subject: add-patch: handle splitting hunks with diff.suppressBlankEmpty

When "add -p" parses diffs, it looks for context lines starting with a
single space. But when diff.suppressBlankEmpty is in effect, an empty
context line will omit the space, giving us a true empty line. This
confuses the parser, which is unable to split based on such a line.

It's tempting to say that we should just make sure that we generate a

Comments

Phillip Wood July 10, 2024, 1:46 p.m. UTC | #1
Hi Peff

On 10/07/2024 10:36, Jeff King wrote:
> Subject: add-patch: handle splitting hunks with diff.suppressBlankEmpty

> When "add -p" parses diffs, it looks for context lines starting with a
> single space. But when diff.suppressBlankEmpty is in effect, an empty
> context line will omit the space, giving us a true empty line. This
> confuses the parser, which is unable to split based on such a line.

> It's tempting to say that we should just make sure that we generate a
> diff without that option. But we may parse diffs not only generated by
> Git, but ones that users have manually edited. And POSIX calls the
> decision of whether to print the space here "implementation-defined".

Do we ever parse an edited hunk with this code? I'm not sure there is a
way of splitting a hunk after it has been edited as I don't think we
ever display it again.

> So let's handle both cases: a context line either starts with a space or
> consists of a totally empty line.
> 
> Reported-by: Ilya Tumaykin <itumaykin@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I'm a little worried that this creates ambiguities, since I don't think
> we are careful about following the hunk header's line counts. Imagine
> you had an input like this:
> 
>    @@ -1,2 +1,2 @@> 
>    -old
>    +new
>     stuff
> 
>    some garbage
> 
> We obviously know that "some garbage" is not a context line and is just
> trailing junk, because it does not begin with "-", "+" or space. But
> what about the blank line in between? It looks like an empty context
> line, but we can only know that it is not by respecting the counts in
> the hunk header.
> 
> I don't think we'd ever generate this ourselves, but could somebody
> manually edit a hunk into this shape? When I tried it in practice, it
> looks like we fail to apply the result even before my patch, though. I'm> not sure why that is. If I put "some garbage" without the blank line, we
> correctly realize it should be discarded. It's possible I'm just holding
> it wrong.

When we recount the hunk after it has been edited we ignore lines that
don't begin with '+', '-', ' ', or '\n' so if you add some garbage at
the end of the hunk the recounted hunk header excludes it when it gets
applied.

I think your patch looks good. I did wonder if we wanted to fix this
by normalizing context lines instead as shown in the diff below. That
might make it less likely to miss adding "|| '\n'" in future code that
is looking for a context line but I don't have a strong preference
either way.

Best Wishes

Phillip

---- >8 ----
diff --git a/add-patch.c b/add-patch.c
index d8ea05ff108..795aa772b7a 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk)
  		hunk->splittable_into++;
  }
  
+/* Empty context lines may omit the leading ' ' */
+static int normalize_marker(char marker)
+{
+	return marker == '\n' ? ' ' : marker;
+}
+
  static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
  {
  	struct strvec args = STRVEC_INIT;
@@ -485,6 +491,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
  	while (p != pend) {
  		char *eol = memchr(p, '\n', pend - p);
  		const char *deleted = NULL, *mode_change = NULL;
+		char ch = normalize_marker(*p);
  
  		if (!eol)
  			eol = pend;
@@ -532,7 +539,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
  			 * Start counting into how many hunks this one can be
  			 * split
  			 */
-			marker = *p;
+			marker = ch;
  		} else if (hunk == &file_diff->head &&
  			   starts_with(p, "new file")) {
  			file_diff->added = 1;
@@ -586,10 +593,10 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
  			    (int)(eol - (plain->buf + file_diff->head.start)),
  			    plain->buf + file_diff->head.start);
  
-		if ((marker == '-' || marker == '+') && *p == ' ')
+		if ((marker == '-' || marker == '+') && ch == ' ')
  			hunk->splittable_into++;
-		if (marker && *p != '\\')
-			marker = *p;
+		if (marker && ch != '\\')
+			marker = ch;
  
  		p = eol == pend ? pend : eol + 1;
  		hunk->end = p - plain->buf;
@@ -953,7 +960,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
  	context_line_count = 0;
  
  	while (splittable_into > 1) {
-		ch = s->plain.buf[current];
+		ch = normalize_marker(s->plain.buf[current]);
  
  		if (!ch)
  			BUG("buffer overrun while splitting hunks");
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 5d78868ac16..385c246baf0 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -1164,4 +1164,31 @@ test_expect_success 'reset -p with unmerged files' '
  	test_must_be_empty staged
  '
  
+test_expect_success 'hunk splitting works with diff.suppressBlankEmpty' '
+	cat >expect <<-\EOF &&
+	diff --git a/file b/file
+	index 777b174..ebc9684 100755
+	--- a/file
+	+++ b/file
+	@@ -2,6 +2,6 @@ p
+	 q
+	 r
+
+	-d
+	-e
+	-f
+	+s
+	+t
+	+u
+	EOF
+
+	test_config diff.suppressBlankEmpty true &&
+	test_write_lines a b c "" d e f >file &&
+	git add file &&
+	test_write_lines p q r "" s t u >file &&
+	test_write_lines s y n q | git add -p &&
+	git diff >actual &&
+	diff_cmp expect actual
+'
+
  test_done
Junio C Hamano July 10, 2024, 5:06 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> I'm a little worried that this creates ambiguities, since I don't think
> we are careful about following the hunk header's line counts. Imagine
> you had an input like this:
>
>   @@ -1,2 +1,2 @@
>   -old
>   +new
>    stuff
>
>   some garbage

True.  Especially if you allow editing of hunks, the only thing you
could make available to you are the version you gave the user and
the version you got back from the user, but comparing them would not
help resolve the ambiguity to correct the hunk header all that much.
"diff" edit mode various editors may have _could_ help as they know
each and every keystroke by the user and how the modified patch was
constructed to guess the intention better than a mere comparison of
before- and after- shape of the patch (but the last time I checked,
diff edit mode of GNU Emacs did not adjust the hunk header correctly
in some corner cases).

> I don't think we'd ever generate this ourselves, but could somebody
> manually edit a hunk into this shape? When I tried it in practice, it
> looks like we fail to apply the result even before my patch, though. I'm
> not sure why that is. If I put "some garbage" without the blank line, we
> correctly realize it should be discarded. It's possible I'm just holding
> it wrong.

I've given up on the "hunk edit" doing wrong things to a patch
already.  

The "edit" codepath used to be a lot less careless before Phillip
whipped it into a much better shape with the series that ended at
436d18f2 (Merge branch 'pw/add-p-recount', 2018-03-14), that
introduced recount_edited_hunk() that special cases "+/-/ ".  It
already knows that a completely empty line in a patch is an empty
and unmodified line (which was ported from f4d35a6b (add -p: fix
counting empty context lines in edited patches, 2018-06-11)), so
this patch does not have to do anything new there.

But "recounting" will be fooled by garbage left in the edited
result, so I think we have done all we can do to resolve possible
ambiguities.  The patch under discussion is not adding anything new
and making it any worse, I would say.

> diff --git a/add-patch.c b/add-patch.c
> index 6e176cd21a..7beead1d0a 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -588,7 +588,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
>  			    (int)(eol - (plain->buf + file_diff->head.start)),
>  			    plain->buf + file_diff->head.start);
>  
> -		if ((marker == '-' || marker == '+') && *p == ' ')
> +		if ((marker == '-' || marker == '+') && (*p == ' ' || *p == '\n'))
>  			hunk->splittable_into++;
>  		if (marker && *p != '\\')
>  			marker = *p;
> @@ -964,7 +964,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
>  		 * Is this the first context line after a chain of +/- lines?
>  		 * Then record the start of the next split hunk.
>  		 */
> -		if ((marker == '-' || marker == '+') && ch == ' ') {
> +		if ((marker == '-' || marker == '+') && (ch == ' ' || ch == '\n')) {
>  			first = 0;
>  			hunk[1].start = current;
>  			if (colored)
> @@ -979,14 +979,14 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
>  		 * Then just increment the appropriate counter and continue
>  		 * with the next line.
>  		 */
> -		if (marker != ' ' || (ch != '-' && ch != '+')) {
> +		if ((marker != ' ' && marker != '\n') || (ch != '-' && ch != '+')) {
>  next_hunk_line:
>  			/* Comment lines are attached to the previous line */
>  			if (ch == '\\')
>  				ch = marker ? marker : ' ';
>  
>  			/* current hunk not done yet */
> -			if (ch == ' ')
> +			if (ch == ' ' || ch == '\n')
>  				context_line_count++;
>  			else if (ch == '-')
>  				header->old_count++;
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 5d78868ac1..92c8e6dc8c 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -1164,4 +1164,36 @@ test_expect_success 'reset -p with unmerged files' '
>  	test_must_be_empty staged
>  '
>  
> +test_expect_success 'splitting handles diff.suppressBlankEmpty' '
> +	test_when_finished "git reset --hard" &&
> +	cat >file <<-\EOF &&
> +	1
> +	2
> +
> +	3
> +	4
> +	EOF
> +	git add file &&
> +
> +	cat >file <<-\EOF &&
> +	one
> +	two
> +
> +	three
> +	four
> +	EOF
> +	test_write_lines s n y |
> +	git -c diff.suppressBlankEmpty=true add -p &&
> +
> +	git cat-file blob :file >actual &&
> +	cat >expect <<-\EOF &&
> +	1
> +	2
> +
> +	three
> +	four
> +	EOF
> +	test_cmp expect actual
> +'
> +
>  test_done
Jeff King July 11, 2024, 9:26 p.m. UTC | #3
On Wed, Jul 10, 2024 at 02:46:30PM +0100, Phillip Wood wrote:

> > It's tempting to say that we should just make sure that we generate a
> > diff without that option. But we may parse diffs not only generated by
> > Git, but ones that users have manually edited. And POSIX calls the
> > decision of whether to print the space here "implementation-defined".
> 
> Do we ever parse an edited hunk with this code? I'm not sure there is a
> way of splitting a hunk after it has been edited as I don't think we
> ever display it again.

Hmm, I just assumed this code would see the edited hunk, but now I'm not
sure. Note that the changes required do go outside of split_hunk(); the
initial parse_diff() needs to decide whether the hunk is splittable in
the first place (as an aside, that puzzled me at first why just changing
split_hunk() was enough for the case that started this thread, but not
the one in the included test. The difference is that without the empty
line as context, the hunk in the test wouldn't be splittable at all).

But looking closer: yes, I do think parse_diff() is used only for the
initial patch. So we really would only see git-generated patches here.
Which I think takes away my ambiguity concern, but does mean the commit
message is wrong.

> > I don't think we'd ever generate this ourselves, but could somebody
> > manually edit a hunk into this shape? When I tried it in practice, it
> > looks like we fail to apply the result even before my patch, though. I'm
> > not sure why that is. If I put "some garbage" without the blank line, we
> > correctly realize it should be discarded. It's possible I'm just holding
> > it wrong.
> 
> When we recount the hunk after it has been edited we ignore lines that
> don't begin with '+', '-', ' ', or '\n' so if you add some garbage at
> the end of the hunk the recounted hunk header excludes it when it gets
> applied.

OK, that makes sense. And we could never rely on the hunk header in the
edited hunk anyway, since the whole point is that we have to recount it.
So the user must accept that an extra blank line is potential context
(and that goes all the way back to bcdd297b78 (built-in add -p:
implement hunk editing, 2019-12-13).

> I think your patch looks good. I did wonder if we wanted to fix this
> by normalizing context lines instead as shown in the diff below. That
> might make it less likely to miss adding "|| '\n'" in future code that
> is looking for a context line but I don't have a strong preference
> either way.

Yeah, I had a similar thought, but it got messy because we have to deal
with the source buffer. But the extra "char ch" you added in the patch
below fixes that. I think the result is better.

Looking at the blank-line handling in recount_edited_hunk(), we also
handle a CRLF empty line there. Should we do so here, too? If so, then
it would just be a matter of touching normalize_marker() in your patch.

Do you want to just re-send your patch with a commit message to replace
mine? (Feel free to steal the non-wrong parts of my message ;) ).

> ---- >8 ----
> diff --git a/add-patch.c b/add-patch.c
> index d8ea05ff108..795aa772b7a 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk)
>  		hunk->splittable_into++;
>  }
> +/* Empty context lines may omit the leading ' ' */
> +static int normalize_marker(char marker)
> +{
> +	return marker == '\n' ? ' ' : marker;
> +}
> +
>  static int parse_diff(struct add_p_state *s, const struct pathspec *ps)

Minor nit: missing blank line between functions.

-Peff
Jeff King July 11, 2024, 9:32 p.m. UTC | #4
On Wed, Jul 10, 2024 at 10:06:28AM -0700, Junio C Hamano wrote:

> > I don't think we'd ever generate this ourselves, but could somebody
> > manually edit a hunk into this shape? When I tried it in practice, it
> > looks like we fail to apply the result even before my patch, though. I'm
> > not sure why that is. If I put "some garbage" without the blank line, we
> > correctly realize it should be discarded. It's possible I'm just holding
> > it wrong.
> 
> I've given up on the "hunk edit" doing wrong things to a patch
> already.  
> 
> The "edit" codepath used to be a lot less careless before Phillip
> whipped it into a much better shape with the series that ended at
> 436d18f2 (Merge branch 'pw/add-p-recount', 2018-03-14), that
> introduced recount_edited_hunk() that special cases "+/-/ ".  It
> already knows that a completely empty line in a patch is an empty
> and unmodified line (which was ported from f4d35a6b (add -p: fix
> counting empty context lines in edited patches, 2018-06-11)), so
> this patch does not have to do anything new there.

Yeah, I was being paranoid without actually thinking through the
implications. As Phillip noted, we do not even run the code I changed on
the edited hunk (which already does handle the empty line). So not only
could I not break it, but it is already doing the right thing thanks to
that earlier work.

> But "recounting" will be fooled by garbage left in the edited
> result, so I think we have done all we can do to resolve possible
> ambiguities.  The patch under discussion is not adding anything new
> and making it any worse, I would say.

Yep, agreed. So modulo a slight inaccuracy in the commit message, I
think my patch is OK. That said, I like what Phillip showed in his
reply. If he wants to wrap that up into a patch, I think it could
replace mine.

-Peff
Phillip Wood July 13, 2024, 1:19 p.m. UTC | #5
Hi Peff

On 11/07/2024 22:26, Jeff King wrote:
> On Wed, Jul 10, 2024 at 02:46:30PM +0100, Phillip Wood wrote:
 >
>> I think your patch looks good. I did wonder if we wanted to fix this
>> by normalizing context lines instead as shown in the diff below. That
>> might make it less likely to miss adding "|| '\n'" in future code that
>> is looking for a context line but I don't have a strong preference
>> either way.
> 
> Yeah, I had a similar thought, but it got messy because we have to deal
> with the source buffer. But the extra "char ch" you added in the patch
> below fixes that. I think the result is better.
> 
> Looking at the blank-line handling in recount_edited_hunk(), we also
> handle a CRLF empty line there. Should we do so here, too? If so, then
> it would just be a matter of touching normalize_marker() in your patch.
> 
> Do you want to just re-send your patch with a commit message to replace
> mine? (Feel free to steal the non-wrong parts of my message ;) ).

Thanks, I'll do that

Best Wishes

Phillip

>> ---- >8 ----
>> diff --git a/add-patch.c b/add-patch.c
>> index d8ea05ff108..795aa772b7a 100644
>> --- a/add-patch.c
>> +++ b/add-patch.c
>> @@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk)
>>   		hunk->splittable_into++;
>>   }
>> +/* Empty context lines may omit the leading ' ' */
>> +static int normalize_marker(char marker)
>> +{
>> +	return marker == '\n' ? ' ' : marker;
>> +}
>> +
>>   static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> 
> Minor nit: missing blank line between functions.
> 
> -Peff
Jeff King July 13, 2024, 9:21 p.m. UTC | #6
On Sat, Jul 13, 2024 at 02:19:39PM +0100, phillip.wood123@gmail.com wrote:

> > Do you want to just re-send your patch with a commit message to replace
> > mine? (Feel free to steal the non-wrong parts of my message ;) ).
> 
> Thanks, I'll do that

Thanks. Note that mine is in 'next', but I think it would not be a big
deal to revert and replace (I'm not sure it is even on track for 2.46
anyway).

-Peff
Phillip Wood July 18, 2024, 2:22 p.m. UTC | #7
On 13/07/2024 22:21, Jeff King wrote:
> On Sat, Jul 13, 2024 at 02:19:39PM +0100, phillip.wood123@gmail.com wrote:
> 
>>> Do you want to just re-send your patch with a commit message to replace
>>> mine? (Feel free to steal the non-wrong parts of my message ;) ).
>>
>> Thanks, I'll do that
> 
> Thanks. Note that mine is in 'next', but I think it would not be a big
> deal to revert and replace (I'm not sure it is even on track for 2.46
> anyway).

Thanks for the heads-up, I see that Junio has reverted it in the latest 
what's cooking email

Phillip

> -Peff
>
Phillip Wood July 18, 2024, 2:23 p.m. UTC | #8
On 11/07/2024 22:26, Jeff King wrote:
> On Wed, Jul 10, 2024 at 02:46:30PM +0100, Phillip Wood wrote:
> 
>>> It's tempting to say that we should just make sure that we generate a
>>> diff without that option. But we may parse diffs not only generated by
>>> Git, but ones that users have manually edited. And POSIX calls the
>>> decision of whether to print the space here "implementation-defined".
>>
>> Do we ever parse an edited hunk with this code? I'm not sure there is a
>> way of splitting a hunk after it has been edited as I don't think we
>> ever display it again.
> 
> Hmm, I just assumed this code would see the edited hunk, but now I'm not
> sure. Note that the changes required do go outside of split_hunk(); the
> initial parse_diff() needs to decide whether the hunk is splittable in
> the first place (as an aside, that puzzled me at first why just changing
> split_hunk() was enough for the case that started this thread, but not
> the one in the included test. The difference is that without the empty
> line as context, the hunk in the test wouldn't be splittable at all).
> 
> But looking closer: yes, I do think parse_diff() is used only for the
> initial patch.

That's true but I've realized that we do in fact allow the user to 
re-display and split an edited hunk. If there are two changes in a file 
then you can edit the first hunk and when prompted about the second 
press 'K' to go back to the first one and then split it with 's' (I 
messed up my test for this before sending my previous mail as I changed 
two separate files rather than putting two changes in a single file). So 
split_hunk() can encounter empty context lines even without 
diff.suppressBlankEmpty being set as lots of editors strip the leading 
space when the rest of the line is empty[1].

As we haven't had any bug reports about that I suspect people are not 
splitting the hunks they edited which I guess makes sense. I think there 
is another bug lurking for anyone who does try to split an edited hunk 
as we don't update `hunk->splittable_into` after it has been edited and 
the edit might change a deleted of an empty line to a context line or 
vice versa. We should make sure any garbage at the end of the hunk is 
discarded as well so we don't show it when we display the edited hunk.

Best Wishes

Phillip

[1] When I added the code to recount the hunk header rather than relying 
on "git apply --recount" initially it did not support empty context 
lines and we received quite a few bug reports pretty quickly after it 
was released.

> So we really would only see git-generated patches here.
> Which I think takes away my ambiguity concern, but does mean the commit
> message is wrong.
> 
>>> I don't think we'd ever generate this ourselves, but could somebody
>>> manually edit a hunk into this shape? When I tried it in practice, it
>>> looks like we fail to apply the result even before my patch, though. I'm
>>> not sure why that is. If I put "some garbage" without the blank line, we
>>> correctly realize it should be discarded. It's possible I'm just holding
>>> it wrong.
>>
>> When we recount the hunk after it has been edited we ignore lines that
>> don't begin with '+', '-', ' ', or '\n' so if you add some garbage at
>> the end of the hunk the recounted hunk header excludes it when it gets
>> applied.
> 
> OK, that makes sense. And we could never rely on the hunk header in the
> edited hunk anyway, since the whole point is that we have to recount it.
> So the user must accept that an extra blank line is potential context
> (and that goes all the way back to bcdd297b78 (built-in add -p:
> implement hunk editing, 2019-12-13).
> 
>> I think your patch looks good. I did wonder if we wanted to fix this
>> by normalizing context lines instead as shown in the diff below. That
>> might make it less likely to miss adding "|| '\n'" in future code that
>> is looking for a context line but I don't have a strong preference
>> either way.
> 
> Yeah, I had a similar thought, but it got messy because we have to deal
> with the source buffer. But the extra "char ch" you added in the patch
> below fixes that. I think the result is better.
> 
> Looking at the blank-line handling in recount_edited_hunk(), we also
> handle a CRLF empty line there. Should we do so here, too? If so, then
> it would just be a matter of touching normalize_marker() in your patch.
> 
> Do you want to just re-send your patch with a commit message to replace
> mine? (Feel free to steal the non-wrong parts of my message ;) ).
> 
>> ---- >8 ----
>> diff --git a/add-patch.c b/add-patch.c
>> index d8ea05ff108..795aa772b7a 100644
>> --- a/add-patch.c
>> +++ b/add-patch.c
>> @@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk)
>>   		hunk->splittable_into++;
>>   }
>> +/* Empty context lines may omit the leading ' ' */
>> +static int normalize_marker(char marker)
>> +{
>> +	return marker == '\n' ? ' ' : marker;
>> +}
>> +
>>   static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
> 
> Minor nit: missing blank line between functions.
> 
> -Peff
>
diff mbox series

Patch

diff without that option. But we may parse diffs not only generated by
Git, but ones that users have manually edited. And POSIX calls the
decision of whether to print the space here "implementation-defined".

So let's handle both cases: a context line either starts with a space or
consists of a totally empty line.

Reported-by: Ilya Tumaykin <itumaykin@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
I'm a little worried that this creates ambiguities, since I don't think
we are careful about following the hunk header's line counts. Imagine
you had an input like this:

  @@ -1,2 +1,2 @@
  -old
  +new
   stuff

  some garbage

We obviously know that "some garbage" is not a context line and is just
trailing junk, because it does not begin with "-", "+" or space. But
what about the blank line in between? It looks like an empty context
line, but we can only know that it is not by respecting the counts in
the hunk header.

I don't think we'd ever generate this ourselves, but could somebody
manually edit a hunk into this shape? When I tried it in practice, it
looks like we fail to apply the result even before my patch, though. I'm
not sure why that is. If I put "some garbage" without the blank line, we
correctly realize it should be discarded. It's possible I'm just holding
it wrong.

 add-patch.c                |  8 ++++----
 t/t3701-add-interactive.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/add-patch.c b/add-patch.c
index 6e176cd21a..7beead1d0a 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -588,7 +588,7 @@  static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 			    (int)(eol - (plain->buf + file_diff->head.start)),
 			    plain->buf + file_diff->head.start);
 
-		if ((marker == '-' || marker == '+') && *p == ' ')
+		if ((marker == '-' || marker == '+') && (*p == ' ' || *p == '\n'))
 			hunk->splittable_into++;
 		if (marker && *p != '\\')
 			marker = *p;
@@ -964,7 +964,7 @@  static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
 		 * Is this the first context line after a chain of +/- lines?
 		 * Then record the start of the next split hunk.
 		 */
-		if ((marker == '-' || marker == '+') && ch == ' ') {
+		if ((marker == '-' || marker == '+') && (ch == ' ' || ch == '\n')) {
 			first = 0;
 			hunk[1].start = current;
 			if (colored)
@@ -979,14 +979,14 @@  static int split_hunk(struct add_p_state *s, struct file_diff *file_diff,
 		 * Then just increment the appropriate counter and continue
 		 * with the next line.
 		 */
-		if (marker != ' ' || (ch != '-' && ch != '+')) {
+		if ((marker != ' ' && marker != '\n') || (ch != '-' && ch != '+')) {
 next_hunk_line:
 			/* Comment lines are attached to the previous line */
 			if (ch == '\\')
 				ch = marker ? marker : ' ';
 
 			/* current hunk not done yet */
-			if (ch == ' ')
+			if (ch == ' ' || ch == '\n')
 				context_line_count++;
 			else if (ch == '-')
 				header->old_count++;
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 5d78868ac1..92c8e6dc8c 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -1164,4 +1164,36 @@  test_expect_success 'reset -p with unmerged files' '
 	test_must_be_empty staged
 '
 
+test_expect_success 'splitting handles diff.suppressBlankEmpty' '
+	test_when_finished "git reset --hard" &&
+	cat >file <<-\EOF &&
+	1
+	2
+
+	3
+	4
+	EOF
+	git add file &&
+
+	cat >file <<-\EOF &&
+	one
+	two
+
+	three
+	four
+	EOF
+	test_write_lines s n y |
+	git -c diff.suppressBlankEmpty=true add -p &&
+
+	git cat-file blob :file >actual &&
+	cat >expect <<-\EOF &&
+	1
+	2
+
+	three
+	four
+	EOF
+	test_cmp expect actual
+'
+
 test_done