diff mbox series

sequencer: remove use of comment character

Message ID pull.1603.git.1698635292629.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series sequencer: remove use of comment character | expand

Commit Message

Tony Tung Oct. 30, 2023, 3:08 a.m. UTC
From: Tony Tung <tonytung@merly.org>

Instead of using the hardcoded `# `, use the
user-defined comment_line_char.  Adds a test
to prevent regressions.

Signed-off-by: Tony Tung <tonytung@merly.org>
---
    sequencer: remove use of hardcoded comment char
    
    Instead of using the hardcoded # , use the user-defined
    comment_line_char. Adds a test to prevent regressions.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1603%2Fttung%2Fttung%2Fcommentchar-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1603/ttung/ttung/commentchar-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1603

 sequencer.c                   |  5 +++--
 t/t3404-rebase-interactive.sh | 39 +++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 2 deletions(-)


base-commit: 2e8e77cbac8ac17f94eee2087187fa1718e38b14

Comments

Junio C Hamano Oct. 30, 2023, 4 a.m. UTC | #1
"Tony Tung via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Tony Tung <tonytung@merly.org>
>
> Instead of using the hardcoded `# `, use the
> user-defined comment_line_char.  Adds a test
> to prevent regressions.

Good spotting.

Two observations.

 (1) There are a few more places that need similar treatment in the
     same file; you may want to fix them all while at it.

 (2) The second argument to strbuf_commented_addf() is always the
     comment_line_char global variable, not just inside this file
     but all callers across the codebase.  We probably should drop
     it and have the strbuf_commented_addf() helper itself refer to
     the global.  That way, if we ever want to change the global
     variable reference to something else (e.g. function call), we
     only have to touch a single place.

The latter is meant as #leftoverbits and will be a lot wider
clean-up that we may want to do long after this patch hits out
codebase.  The "other places" I spotted for the former are the
following, but needs to be taken with a huge grain of salt, as it
has not even been compile tested.

Thanks.

 sequencer.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git c/sequencer.c w/sequencer.c
index d584cac8ed..33208b1660 100644
--- c/sequencer.c
+++ w/sequencer.c
@@ -1893,8 +1893,8 @@ static void update_squash_message_for_fixup(struct strbuf *msg)
 	size_t orig_msg_len;
 	int i = 1;
 
-	strbuf_addf(&buf1, "# %s\n", _(first_commit_msg_str));
-	strbuf_addf(&buf2, "# %s\n", _(skip_first_commit_msg_str));
+	strbuf_addf(&buf1, comment_line_char, "%s\n", _(first_commit_msg_str));
+	strbuf_addf(&buf2, comment_line_char, "%s\n", _(skip_first_commit_msg_str));
 	s = start = orig_msg = strbuf_detach(msg, &orig_msg_len);
 	while (s) {
 		const char *next;
@@ -2269,8 +2269,8 @@ static int do_pick_commit(struct repository *r,
 		next = parent;
 		next_label = msg.parent_label;
 		if (opts->commit_use_reference) {
-			strbuf_addstr(&msgbuf,
-				"# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
+			strbuf_commented_addf(&msgbuf, comment_line_char, "%s",
+				"*** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
 		} else if (skip_prefix(msg.subject, "Revert \"", &orig_subject) &&
 			   /*
 			    * We don't touch pre-existing repeated reverts, because
Elijah Newren Oct. 30, 2023, 5:26 p.m. UTC | #2
On Sun, Oct 29, 2023 at 9:01 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Tony Tung via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Tony Tung <tonytung@merly.org>
> >
> > Instead of using the hardcoded `# `, use the
> > user-defined comment_line_char.  Adds a test
> > to prevent regressions.
>
> Good spotting.
>
> Two observations.
>
>  (1) There are a few more places that need similar treatment in the
>      same file; you may want to fix them all while at it.
>
>  (2) The second argument to strbuf_commented_addf() is always the
>      comment_line_char global variable, not just inside this file
>      but all callers across the codebase.  We probably should drop
>      it and have the strbuf_commented_addf() helper itself refer to
>      the global.  That way, if we ever want to change the global
>      variable reference to something else (e.g. function call), we
>      only have to touch a single place.
>
> The latter is meant as #leftoverbits and will be a lot wider
> clean-up that we may want to do long after this patch hits out
> codebase.  The "other places" I spotted for the former are the
> following, but needs to be taken with a huge grain of salt, as it
> has not even been compile tested.
>
> Thanks.
>
>  sequencer.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git c/sequencer.c w/sequencer.c
> index d584cac8ed..33208b1660 100644
> --- c/sequencer.c
> +++ w/sequencer.c
> @@ -1893,8 +1893,8 @@ static void update_squash_message_for_fixup(struct strbuf *msg)
>         size_t orig_msg_len;
>         int i = 1;
>
> -       strbuf_addf(&buf1, "# %s\n", _(first_commit_msg_str));
> -       strbuf_addf(&buf2, "# %s\n", _(skip_first_commit_msg_str));
> +       strbuf_addf(&buf1, comment_line_char, "%s\n", _(first_commit_msg_str));
> +       strbuf_addf(&buf2, comment_line_char, "%s\n", _(skip_first_commit_msg_str));
>         s = start = orig_msg = strbuf_detach(msg, &orig_msg_len);
>         while (s) {
>                 const char *next;
> @@ -2269,8 +2269,8 @@ static int do_pick_commit(struct repository *r,
>                 next = parent;
>                 next_label = msg.parent_label;
>                 if (opts->commit_use_reference) {
> -                       strbuf_addstr(&msgbuf,
> -                               "# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
> +                       strbuf_commented_addf(&msgbuf, comment_line_char, "%s",
> +                               "*** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
>                 } else if (skip_prefix(msg.subject, "Revert \"", &orig_subject) &&
>                            /*
>                             * We don't touch pre-existing repeated reverts, because
>

I thought the point of the comment_line_char was so that commit
messages could have lines starting with '#'.  That rationale doesn't
apply to the TODO list generation or parsing, and I'm not sure if we
want to add the same complexity there.  If we do want to add the same
complexity there, I'm worried that making these changes are
insufficent; there are some other hardcoded '#' references in the code
(as a quick greps for '".*#' and "'#'" will turn up).  Since those
other references include parsing as well as generation, I think we
might actually be introducing bugs in the TODO list handling if we
only partially convert it, but someone would need to double check.
Junio C Hamano Oct. 30, 2023, 11:35 p.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

> I thought the point of the comment_line_char was so that commit
> messages could have lines starting with '#'.  That rationale doesn't
> apply to the TODO list generation or parsing, and I'm not sure if we
> want to add the same complexity there.

Thanks for a healthy dose of sanity.  I noticed existing use of
comment_line_char everywhere in sequencer.c and assumed we would
want to be consistent, but you are right to point out that they are
all about the COMMIT_EDITMSG kind of thing, and not about what
appears in "sequencer/todo".
Tony Tung Oct. 31, 2023, 4:42 a.m. UTC | #4
> On Oct 30, 2023, at 4:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Elijah Newren <newren@gmail.com> writes:
> 
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.
> 
> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo”.

I believe comment_line_char is being applied when the sequencer reads back the instructions, which is why I ran into this problem to begin with.

If the intent is not to apply it to the sequencer, then the bugfix is in the wrong place.

Thanks
Tony Tung Oct. 31, 2023, 4:50 a.m. UTC | #5
> On Oct 30, 2023, at 4:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Elijah Newren <newren@gmail.com> writes:
> 
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.
> 
> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo”.

Actually, I withdraw my previous comment.  comment_line_char is all over sequencer.c, and there are tests that say, "rebase -i respects core.commentchar”, which strongly implies that the *intent* is that rebase -i respects comment_line_char.

I would propose that we move forward with this, except perhaps removing more instances of comment_line_char.

Thanks.
Junio C Hamano Oct. 31, 2023, 5:33 a.m. UTC | #6
Junio C Hamano <gitster@pobox.com> writes:

> Elijah Newren <newren@gmail.com> writes:
>
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.

Earlier I said

> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo".

but with something as simple as

    $ git -c core.commentchar='@' rebase -i master seen^2

I can see that the references to comment_line_char in sequencer.c
are about the commented lines after the list of insn in the
generated sequencer/todo file, so even though the rationale does not
apply, isn't this already "broken" in the current code anyway?

Thanks.
Elijah Newren Oct. 31, 2023, 6:20 a.m. UTC | #7
On Mon, Oct 30, 2023 at 10:33 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Elijah Newren <newren@gmail.com> writes:
> >
> >> I thought the point of the comment_line_char was so that commit
> >> messages could have lines starting with '#'.  That rationale doesn't
> >> apply to the TODO list generation or parsing, and I'm not sure if we
> >> want to add the same complexity there.
>
> Earlier I said
>
> > Thanks for a healthy dose of sanity.  I noticed existing use of
> > comment_line_char everywhere in sequencer.c and assumed we would
> > want to be consistent, but you are right to point out that they are
> > all about the COMMIT_EDITMSG kind of thing, and not about what
> > appears in "sequencer/todo".
>
> but with something as simple as
>
>     $ git -c core.commentchar='@' rebase -i master seen^2
>
> I can see that the references to comment_line_char in sequencer.c
> are about the commented lines after the list of insn in the
> generated sequencer/todo file, so even though the rationale does not
> apply, isn't this already "broken" in the current code anyway?

Yes, I believe it is.  However, I remember specifically looking at
cases with --rebase-merges about a year and a half ago, and noted that
there was a mixture of hardcoded '#' references along with
comment_line_char.  I noted at the time that changing
comment_line_char looked like it had a bug, and that the parsing in
particular would be fooled and do wrong things if it changed.
Unfortunately, I can't find any notes from the time with the details,
so I don't remember exactly what or how it was triggered.

However, I do suspect that the references to comment_line_char in the
`rebase -i` codepaths was not for any actual intended purpose, but
just noting that they were used elsewhere in the file (for
COMMIT_EDITMSG, where it made sense) and just mimicking that code
without realizing the lack of rationale.  That would have been mere
wasted effort had the comment_line_char been consistently supported in
the TODO file editing and parsing, but it wasn't, which left TODO
editing & parsing somewhat broken.

I think supporting comment_line_char for the TODO file provides no
value, and I think the easier fix would be undoing the uses of
comment_line_char relative to the TODO file (perhaps also leaving
in-code comments to the effect that comment_line_char just doesn't
apply to the TODO file).

However, if someone prefers to make the TODO file also respect
comment_line_char, despite its dubious value, then I expect any patch
should
  1) audit *every* reference found via git grep -e '".*#' -e "'#'" sequencer.c
  2) add a test case (or cases) involving --rebase-merges -i that
trigger the relevant code paths
If they don't do that, then I fear we might make the bug more likely
to be triggered rather than less.
diff mbox series

Patch

diff --git a/sequencer.c b/sequencer.c
index d584cac8ed9..8c6666d5e43 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -6082,8 +6082,9 @@  static int add_decorations_to_list(const struct commit *commit,
 		/* If the branch is checked out, then leave a comment instead. */
 		if ((path = branch_checked_out(decoration->name))) {
 			item->command = TODO_COMMENT;
-			strbuf_addf(ctx->buf, "# Ref %s checked out at '%s'\n",
-				    decoration->name, path);
+			strbuf_commented_addf(ctx->buf, comment_line_char,
+					      "Ref %s checked out at '%s'\n",
+					      decoration->name, path);
 		} else {
 			struct string_list_item *sti;
 			item->command = TODO_UPDATE_REF;
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 8ea2bf13026..076dca87871 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1839,6 +1839,45 @@  test_expect_success '--update-refs adds label and update-ref commands' '
 	)
 '
 
+test_expect_success '--update-refs works with core.commentChar' '
+	git checkout -b update-refs-with-commentchar no-conflict-branch &&
+	test_config core.commentChar : &&
+	git branch -f base HEAD~4 &&
+	git branch -f first HEAD~3 &&
+	git branch -f second HEAD~3 &&
+	git branch -f third HEAD~1 &&
+	git commit --allow-empty --fixup=third &&
+	git branch -f is-not-reordered &&
+	git commit --allow-empty --fixup=HEAD~4 &&
+	git branch -f shared-tip &&
+	git checkout update-refs &&
+	(
+		write_script fake-editor.sh <<-\EOF &&
+		grep "^[^:]" "$1"
+		exit 1
+		EOF
+		test_set_editor "$(pwd)/fake-editor.sh" &&
+
+		cat >expect <<-EOF &&
+		pick $(git log -1 --format=%h J) J
+		fixup $(git log -1 --format=%h update-refs) fixup! J : empty
+		update-ref refs/heads/second
+		update-ref refs/heads/first
+		pick $(git log -1 --format=%h K) K
+		pick $(git log -1 --format=%h L) L
+		fixup $(git log -1 --format=%h is-not-reordered) fixup! L : empty
+		update-ref refs/heads/third
+		pick $(git log -1 --format=%h M) M
+		update-ref refs/heads/no-conflict-branch
+		update-ref refs/heads/is-not-reordered
+		update-ref refs/heads/update-refs-with-commentchar
+		EOF
+
+		test_must_fail git rebase -i --autosquash --update-refs primary shared-tip >todo &&
+		test_cmp expect todo
+	)
+'
+
 test_expect_success '--update-refs adds commands with --rebase-merges' '
 	git checkout -b update-refs-with-merge no-conflict-branch &&
 	git branch -f base HEAD~4 &&