From patchwork Wed Jul 10 09:36:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13729125 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48F5F17FD for ; Wed, 10 Jul 2024 09:36:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720604174; cv=none; b=JSJ1eOxzhoPJCvxMlBidukdGaTZ4yafv7jt8WsE7AZX7E6h1SexebRTyQeoVnSCtjbJ6eOdLEYYE5kOe05J28EclcZoX40TykvLBDXnnYShzj+01NqRswy+zuVy98jFITXlja6r03UJEUz0wD3aaroJsNhtg3moDl3gtf829vSs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720604174; c=relaxed/simple; bh=W5jbaQl7veG4IR1Paez/GbUzL+mbGOlZeL+ni1QCrQc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=IHndHWAHtCpynKYHRN69R0rQbEZUkDDO2REjCVsnyRibdHH3MulxUc4lNcahxeamXkCf9c0iC/6vYH5ith2t8k5A2oTLpRzyzVioSv2adC5lGgYTM74i3K/22Wqn2pZjfccHD83WrC/cByzLvy/W98EVxOJUw961qVe91IfEMp8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Received: (qmail 1872 invoked by uid 109); 10 Jul 2024 09:36:11 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Wed, 10 Jul 2024 09:36:11 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 11041 invoked by uid 111); 10 Jul 2024 09:36:08 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Wed, 10 Jul 2024 05:36:08 -0400 Authentication-Results: peff.net; auth=none Date: Wed, 10 Jul 2024 05:36:10 -0400 From: Jeff King To: Junio C Hamano Cc: Phillip Wood , Ilya Tumaykin , git@vger.kernel.org, Johannes Schindelin Subject: [RFC/PATCH] add-patch: handle splitting hunks with diff.suppressBlankEmpty Message-ID: <20240710093610.GA2076910@coredump.intra.peff.net> References: <9b31e86f-c408-4625-8d13-f48a209b541b@gmail.com> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: On Fri, Jul 05, 2024 at 09:39:52AM -0700, Junio C Hamano wrote: > As to the "commit -p" issue, I think the patch parser is in the > wrong and needs to be corrected, period. As long as the patches > given as input are well-formed, we should be prepared to grok > them (we even allow manual editing of patches, right?). Maybe this? -- >8 -- Subject: add-patch: handle splitting hunks with diff.suppressBlankEmpty When "add -p" parses diffs, it looks for context lines starting with a single space. But when diff.suppressBlankEmpty is in effect, an empty context line will omit the space, giving us a true empty line. This confuses the parser, which is unable to split based on such a line. It's tempting to say that we should just make sure that we generate a diff without that option. But we may parse diffs not only generated by Git, but ones that users have manually edited. And POSIX calls the decision of whether to print the space here "implementation-defined". So let's handle both cases: a context line either starts with a space or consists of a totally empty line. Reported-by: Ilya Tumaykin Signed-off-by: Jeff King --- I'm a little worried that this creates ambiguities, since I don't think we are careful about following the hunk header's line counts. Imagine you had an input like this: @@ -1,2 +1,2 @@ -old +new stuff some garbage We obviously know that "some garbage" is not a context line and is just trailing junk, because it does not begin with "-", "+" or space. But what about the blank line in between? It looks like an empty context line, but we can only know that it is not by respecting the counts in the hunk header. I don't think we'd ever generate this ourselves, but could somebody manually edit a hunk into this shape? When I tried it in practice, it looks like we fail to apply the result even before my patch, though. I'm not sure why that is. If I put "some garbage" without the blank line, we correctly realize it should be discarded. It's possible I'm just holding it wrong. add-patch.c | 8 ++++---- t/t3701-add-interactive.sh | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+), 4 deletions(-) diff --git a/add-patch.c b/add-patch.c index 6e176cd21a..7beead1d0a 100644 --- a/add-patch.c +++ b/add-patch.c @@ -588,7 +588,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) (int)(eol - (plain->buf + file_diff->head.start)), plain->buf + file_diff->head.start); - if ((marker == '-' || marker == '+') && *p == ' ') + if ((marker == '-' || marker == '+') && (*p == ' ' || *p == '\n')) hunk->splittable_into++; if (marker && *p != '\\') marker = *p; @@ -964,7 +964,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff, * Is this the first context line after a chain of +/- lines? * Then record the start of the next split hunk. */ - if ((marker == '-' || marker == '+') && ch == ' ') { + if ((marker == '-' || marker == '+') && (ch == ' ' || ch == '\n')) { first = 0; hunk[1].start = current; if (colored) @@ -979,14 +979,14 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff, * Then just increment the appropriate counter and continue * with the next line. */ - if (marker != ' ' || (ch != '-' && ch != '+')) { + if ((marker != ' ' && marker != '\n') || (ch != '-' && ch != '+')) { next_hunk_line: /* Comment lines are attached to the previous line */ if (ch == '\\') ch = marker ? marker : ' '; /* current hunk not done yet */ - if (ch == ' ') + if (ch == ' ' || ch == '\n') context_line_count++; else if (ch == '-') header->old_count++; diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 5d78868ac1..92c8e6dc8c 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -1164,4 +1164,36 @@ test_expect_success 'reset -p with unmerged files' ' test_must_be_empty staged ' +test_expect_success 'splitting handles diff.suppressBlankEmpty' ' + test_when_finished "git reset --hard" && + cat >file <<-\EOF && + 1 + 2 + + 3 + 4 + EOF + git add file && + + cat >file <<-\EOF && + one + two + + three + four + EOF + test_write_lines s n y | + git -c diff.suppressBlankEmpty=true add -p && + + git cat-file blob :file >actual && + cat >expect <<-\EOF && + 1 + 2 + + three + four + EOF + test_cmp expect actual +' + test_done