[v2,0/2] sequencer: remove use of hardcoded comment char

Message ID

pull.1603.v2.git.1698728952.gitgitgadget@gmail.com (mailing list archive)

Headers

Message-ID: <pull.1603.v2.git.1698728952.gitgitgadget@gmail.com>
In-Reply-To: <pull.1603.git.1698635292629.gitgitgadget@gmail.com>
References: <pull.1603.git.1698635292629.gitgitgadget@gmail.com>
From: "Tony Tung via GitGitGadget" <gitgitgadget@gmail.com>
Date: Tue, 31 Oct 2023 05:09:10 +0000
Subject: [PATCH v2 0/2] sequencer: remove use of hardcoded comment char
Fcc: Sent
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>,
    Tony Tung <tonytung@merly.org>,
    Tony Tung <tonytung@merly.org>

Series

sequencer: remove use of hardcoded comment char

Message

Tony Tung via GitGitGadget Oct. 31, 2023, 5:09 a.m. UTC

Instead of using the hardcoded # , use the user-defined comment_line_char.
Adds a test to prevent regressions.

Tony Tung (2):
  sequencer: remove use of comment character
  sequencer: fix remaining hardcoded comment char

 sequencer.c                   | 21 +++++++++++++------
 t/t3404-rebase-interactive.sh | 39 +++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 6 deletions(-)


base-commit: 2e8e77cbac8ac17f94eee2087187fa1718e38b14
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1603%2Fttung%2Fttung%2Fcommentchar-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1603/ttung/ttung/commentchar-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1603

Range-diff vs v1:

 1:  10598a56d64 = 1:  10598a56d64 sequencer: remove use of comment character
 -:  ----------- > 2:  c9f4ff34dbd sequencer: fix remaining hardcoded comment char

Comments

Elijah Newren Oct. 31, 2023, 6:55 a.m. UTC | #1

Hi,

On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Instead of using the hardcoded # , use the user-defined comment_line_char.
> Adds a test to prevent regressions.
>
> Tony Tung (2):
>   sequencer: remove use of comment character
>   sequencer: fix remaining hardcoded comment char

The second commit message seems to suggest that the two commits should
just be squashed; there's no explicit or even implicit reason provided
for why the two small patches are logically independent.  After
reading them carefully, and digging through the particular changes
being made and what part of the code they touch, I think I can guess
at a potential reason, but I feel like I'm crossing into the territory
of mind reading trying to articulate that reason.  (Besides, my
rationale would argue that the two patches should be split
differently.)  Perhaps a comment could be added, to either the second
commit message or the cover letter, to explain that better?

More importantly, though, I think the second commit message is
actually wrong.  Before and after applying this series:

$ git grep -c -e '".*#' -e "'#'" -- sequencer.c
sequencer.c:16

$ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
$ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx

$ git grep -c -e '".*#' -e "'#'" -- sequencer.c
sequencer.c:12

Granted, four of those lines are code comments, but that still leaves
8 hard coded references to '#' in the code at the end (i.e. the
majority are still left), meaning your second patch doesn't do what
its subject line claims.
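
The adjustment for code comments can be approximated mechanically. What
follows is only a minimal sketch using plain grep instead of git grep:
count_hash_refs is a hypothetical helper name, and its comment filter is
a heuristic that skips just those lines whose first non-blank characters
are "/*", "*" or "//".

```shell
# Count lines in a C source file that mention '#' inside a string
# literal or as a character literal, ignoring lines that look like
# C comments (heuristic: first non-blank text is "/*", "*" or "//").
count_hash_refs () {
	grep -e '".*#' -e "'#'" -- "$1" |
	grep -v -c -E '^[[:space:]]*(/\*|\*|//)'
}
```

Run as "count_hash_refs sequencer.c" from the top of a git checkout. Any
line it reports still has to be read by hand, since a match only means
the line contains a '#' literal, not that it is a comment-character bug.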

And, most important of all is still the first patch.  As I stated
elsewhere in this thread (at
CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):

"""
I think supporting comment_line_char for the TODO file provides no
value, and I think the easier fix would be undoing the uses of
comment_line_char relative to the TODO file (perhaps also leaving
in-code comments to the effect that comment_line_char just doesn't
apply to the TODO file).

However, if someone prefers to make the TODO file also respect
comment_line_char, despite its dubious value, then I expect any patch
should
  1) audit *every* reference found via git grep -e '".*#' -e "'#'" sequencer.c
  2) add a test case (or cases) involving --rebase-merges -i that
trigger the relevant code paths
If they don't do that, then I fear we might make the bug more likely
to be triggered rather than less.
"""

Personally, I would rather not accept patches changing the handling of
the TODO script relative to comment_line_char until the above is done,
and I worry that half measures _might_ end up being more hurtful than
helpful.

I feel quite differently about patches that make COMMIT_EDITMSG
handling use comment_line_char more consistently since that code
simply writes the file without re-parsing it; although fixing
everything would be best, even fixing some of them to use
comment_line_char would be welcome.  I think the first two hunks of
your second patch happen to fall into this category, so if those were
split out, then I'd say those are good partial solutions.
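
To illustrate why the write-only case is the safe one, here is a rough
sketch of commenting out a block of text with a configurable character.
It is an assumption-laden stand-in, not the code in sequencer.c:
add_commented_lines is a hypothetical name, and the comment character is
passed in as $1 rather than read from the configuration.

```shell
# Prefix every line read on stdin with a comment character.
# $1 is the comment character (a stand-in for the configured value).
# Empty input lines get the bare character, with no trailing space.
add_commented_lines () {
	comment_char=$1
	while IFS= read -r line
	do
		if [ -z "$line" ]
		then
			printf '%s\n' "$comment_char"
		else
			printf '%s %s\n' "$comment_char" "$line"
		fi
	done
}
```

Nothing here ever reads the output back, so switching the character
cannot confuse a later parsing step; that is the property that does not
hold for the TODO file.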

Phillip Wood Oct. 31, 2023, 11:18 a.m. UTC | #2

Hi Elijah

On 31/10/2023 06:55, Elijah Newren wrote:
> Hi,
> 
> On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> Instead of using the hardcoded # , use the user-defined comment_line_char.
>> Adds a test to prevent regressions.
>>
>> Tony Tung (2):
>>    sequencer: remove use of comment character
>>    sequencer: fix remaining hardcoded comment char
> 
> The second commit message seems to suggest that the two commits should
> just be squashed; there's no explicit or even implicit reason provided
> for why the two small patches are logically independent.  After
> reading them carefully, and digging through the particular changes
> being made and what part of the code they touch, I think I can guess
> at a potential reason, but I feel like I'm crossing into the territory
> of mind reading trying to articulate that reason.  (Besides, my
> rationale would argue that the two patches should be split
> differently.)  Perhaps a comment could be added, to either the second
> commit message or the cover letter, to explain that better?
> 
> More importantly, though, I think the second commit message is
> actually wrong.  Before and after applying this series:
> 
> $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> sequencer.c:16
> 
> $ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
> $ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx
> 
> $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> sequencer.c:12

As far as I can see those remaining instances are all to do with the '#' 
that separates a merge subject line from its parents. I don't think we 
need to complicate things anymore by respecting core.commentchar there 
as the '#' is not denoting a commented line, it is being used as an 
intra-line separator instead.
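
The distinction can be shown with a small sketch. It makes some
simplifying assumptions: strip_todo_comments is a hypothetical helper,
the comment character is passed as $1, and a comment is recognised only
when the character is in the first column. The merge line used as input
follows the "merge -C <commit> <label> # <oneline>" shape.

```shell
# Drop commented lines from a todo list read on stdin.
# $1 is the comment character (a stand-in for core.commentchar).
# Only a comment character in the first column marks a comment; a '#'
# later in the line, such as the separator in a merge command, is kept.
strip_todo_comments () {
	comment_char=$1
	while IFS= read -r line
	do
		case "$line" in
		"$comment_char"*)
			;; # whole line is commented out
		*)
			printf '%s\n' "$line"
			;;
		esac
	done
}
```

With ';' as the comment character, a line such as "; Rebase onto abc"
is dropped while "merge -C 5678 topic # Merge branch topic" passes
through untouched, because its '#' is a separator, not a comment marker.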

> Granted, four of those lines are code comments, but that still leaves
> 8 hard coded references to '#' in the code at the end (i.e. the
> majority are still left), meaning your second patch doesn't do what
> its subject line claims.
> 
> And, most important of all is still the first patch.  As I stated
> elsewhere in this thread (at
> CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):
> 
> """
> I think supporting comment_line_char for the TODO file provides no
> value, and I think the easier fix would be undoing the uses of
> comment_line_char relative to the TODO file (perhaps also leaving
> in-code comments to the effect that comment_line_char just doesn't
> apply to the TODO file).

I agree that I don't see much point in respecting core.commentchar in 
the TODO file as unlike a commit message a legitimate non-commented line 
will never begin with '#'. Unfortunately I think we're committed to 
respecting it - see 180bad3d10f (rebase -i: respect core.commentchar, 
2013-02-11)

> [...] 
> I feel quite differently about patches that make COMMIT_EDITMSG
> handling use comment_line_char more consistently since that code
> simply writes the file without re-parsing it; although fixing
> everything would be best, even fixing some of them to use
> comment_line_char would be welcome.  I think the first two hunks of
> your second patch happen to fall into this category, so if those were
> split out, then I'd say those are good partial solutions.

I think splitting the changes so that we have one patch that fixes the 
TODO file generation and another that fixes the commit message 
generation for fixup commands would be best.

Best Wishes

Phillip

Junio C Hamano Nov. 1, 2023, 12:16 a.m. UTC | #3

Phillip Wood <phillip.wood123@gmail.com> writes:

> As far as I can see those remaining instances are all to do with the
> '#' that separates a merge subject line from its parents. I don't
> think we need to complicate things anymore by respecting
> core.commentchar there as the '#' is not denoting a commented line, it
> is being used as an intra-line separator instead.

It is unfortunate that the format of the file needs an intra-line
separator in the first place, but I tend to agree with you that the
comment-line-char would be a terrible fit there.  '#' or any
replacement character at the beginning of a line is easy to spot as
a signal that the line in its entirety is commented out, but it is
much harder to eyeball-spot a single punctuation character in the
middle of a line.  If we do not have to look for a different
character depending on the comment-line-char setting, it would make
the system (slightly) easier to use.

> I agree that I don't see much point in respecting core.commentchar in
> the TODO file as unlike a commit message a legitimate non-commented
> line will never begin with '#'. Unfortunately I think we're committed
> to respecting it - see 180bad3d10f (rebase -i: respect
> core.commentchar, 2013-02-11)

Yeah, the ship has long sailed.

> I think splitting the changes so that we have one patch that fixes the
> TODO file generation and another that fixes the commit message
> generation for fixup commands would be best.

Yes, it would be great.

Thanks.

Elijah Newren Nov. 1, 2023, 12:21 a.m. UTC | #4

On Tue, Oct 31, 2023 at 4:18 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Elijah
>
> On 31/10/2023 06:55, Elijah Newren wrote:
> > Hi,
> >
> > On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> Instead of using the hardcoded # , use the user-defined comment_line_char.
> >> Adds a test to prevent regressions.
> >>
> >> Tony Tung (2):
> >>    sequencer: remove use of comment character
> >>    sequencer: fix remaining hardcoded comment char
> >
> > The second commit message seems to suggest that the two commits should
> > just be squashed; there's no explicit or even implicit reason provided
> > for why the two small patches are logically independent.  After
> > reading them carefully, and digging through the particular changes
> > being made and what part of the code they touch, I think I can guess
> > at a potential reason, but I feel like I'm crossing into the territory
> > of mind reading trying to articulate that reason.  (Besides, my
> > rationale would argue that the two patches should be split
> > differently.)  Perhaps a comment could be added, to either the second
> > commit message or the cover letter, to explain that better?
> >
> > More importantly, though, I think the second commit message is
> > actually wrong.  Before and after applying this series:
> >
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> > sequencer.c:16
> >
> > $ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
> > $ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx
> >
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> > sequencer.c:12
>
> As far as I can see those remaining instances are all to do with the '#'
> that separates a merge subject line from its parents. I don't think we
> need to complicate things anymore by respecting core.commentchar there
> as the '#' is not denoting a commented line, it is being used as an
> intra-line separator instead.

Ah, that might be jogging my memory slightly.  I had a patch to put a
comment before the one-line commit summaries in the TODO list
(https://github.com/git/git/commit/f1ae608477e010b96557d6fc87eed9f3f39b905e).
I think I at some point noticed comment_line_char, and went to switch
to it, probably also switching the mid-line comment char for merges,
and then noticed the potential for breakage due to the manual parsing
of those.

Anyway, I trust your analysis, but I believe some of that analysis
belongs in the relevant commit messages if we push forward with these
changes.

> > Granted, four of those lines are code comments, but that still leaves
> > 8 hard coded references to '#' in the code at the end (i.e. the
> > majority are still left), meaning your second patch doesn't do what
> > its subject line claims.
> >
> > And, most important of all is still the first patch.  As I stated
> > elsewhere in this thread (at
> > CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):
> >
> > """
> > I think supporting comment_line_char for the TODO file provides no
> > value, and I think the easier fix would be undoing the uses of
> > comment_line_char relative to the TODO file (perhaps also leaving
> > in-code comments to the effect that comment_line_char just doesn't
> > apply to the TODO file).
>
> I agree that I don't see much point in respecting core.commentchar in
> the TODO file as unlike a commit message a legitimate non-commented line
> will never begin with '#'. Unfortunately I think we're committed to
> respecting it - see 180bad3d10f (rebase -i: respect core.commentchar,
> 2013-02-11)

Thanks for digging up the old commit and the explicit mention of the
TODO file.  Kind of disappointing.  While I can't imagine anything
that would actually break by reverting this, it's not worth it at this
point.

> > [...]
> > I feel quite differently about patches that make COMMIT_EDITMSG
> > handling use comment_line_char more consistently since that code
> > simply writes the file without re-parsing it; although fixing
> > everything would be best, even fixing some of them to use
> > comment_line_char would be welcome.  I think the first two hunks of
> > your second patch happen to fall into this category, so if those were
> > split out, then I'd say those are good partial solutions.
>
> I think splitting the changes so that we have one patch that fixes the
> TODO file generation and another that fixes the commit message
> generation for fixup commands would be best.

That would seem more logical to me.