rebase -i: Fix possibly wrong onto hash in todo
diff mbox series

Message ID 20200811131313.3349582-1-detegr@rbx.email
State New
Headers show
Series
  • rebase -i: Fix possibly wrong onto hash in todo
Related show

Commit Message

Antti Keränen Aug. 11, 2020, 1:13 p.m. UTC
'todo_list_write_to_file' may overwrite the static buffer, originating
from 'find_unique_abbrev', that was used to store the short commit hash
'c' for "# Rebase a..b onto c" message in the todo editor.

Fix by duplicating the string before usage, so subsequent calls to
'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
can't overwrite the buffer.

Found-by: Jussi Keränen <jussike@gmail.com>
Signed-off-by: Antti Keränen <detegr@rbx.email>
---
 sequencer.c                   |  7 ++++---
 t/t3404-rebase-interactive.sh | 13 +++++++++++++
 2 files changed, 17 insertions(+), 3 deletions(-)

Comments

Taylor Blau Aug. 11, 2020, 3:28 p.m. UTC | #1
Hi Antti,

Thanks for working on this. I have a few thoughts below, but I think
that this is on the right track.

On Tue, Aug 10, 2020 at 04:13:15PM +0300, Antti Keränen wrote:
> 'todo_list_write_to_file' may overwrite the static buffer, originating
> from 'find_unique_abbrev', that was used to store the short commit hash
> 'c' for "# Rebase a..b onto c" message in the todo editor.

It would be great to know the commit that regressed, or if this has
always been the case. I'm not sure if you'll have a ton of luck
bisecting, since you indicate that this overwrite *may* occur (that
makes me think that it doesn't always happen, so your efforts to bisect
the change may be noisy).

> Fix by duplicating the string before usage, so subsequent calls to
> 'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
> can't overwrite the buffer.
>
> Found-by: Jussi Keränen <jussike@gmail.com>
> Signed-off-by: Antti Keränen <detegr@rbx.email>

> ---
>  sequencer.c                   |  7 ++++---
>  t/t3404-rebase-interactive.sh | 13 +++++++++++++
>  2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index fd7701c88a..0679adb639 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -5178,13 +5178,12 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>  		    struct string_list *commands, unsigned autosquash,
>  		    struct todo_list *todo_list)
>  {
> -	const char *shortonto, *todo_file = rebase_path_todo();
> +	const char *todo_file = rebase_path_todo();
>  	struct todo_list new_todo = TODO_LIST_INIT;
>  	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
>  	struct object_id oid = onto->object.oid;
>  	int res;
> -
> -	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
> +	char *shortonto;

A minor nit is that you could probably move this line up to below the
'const char *' declaration (it makes sense why you have to drop the
const qualifier, though).

>
>  	if (buf->len == 0) {
>  		struct todo_item *item = append_new_todo(todo_list);
> @@ -5206,8 +5205,10 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>  		return error(_("nothing to do"));
>  	}
>
> +	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
>  	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
>  			     shortonto, flags);
> +	free(shortonto);

OK. I think of two things here:

  1. Why are we calling 'find_unique_abbrev()' instead of
     'short_commit_name()'? We already have a commit pointer, so I don't
     see any reason that we should be reimplementing that function (even
     though it is a one-liner).

  2. If we should indeed be calling 'short_commit_name()', are there
     other callers that need to do the same duplication? In other words:
     could you say a little bit more about what makes
     'complete_action()' special in this regard?

>  	if (res == -1)
>  		return -1;
>  	else if (res == -2) {
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 4a7d21f898..09af16753c 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1760,6 +1760,19 @@ test_expect_success 'correct error message for commit --amend after empty pick'
>  	test_i18ngrep "middle of a rebase -- cannot amend." err
>  '
>
> +test_expect_success 'todo has correct onto hash' '
> +	write_script dump-raw.sh <<-\EOF &&
> +		cat "$1"
> +	EOF

It's too bad that you have to write your own script here, since we
already have 'set_cat_todo_editor()', but it makes sense since you want
to retain the comments (something which set_cat_todo_editor explicitly
does not do).

So, this makes sense.

> +	git checkout branch1 &&
> +	(
> +		test_set_editor "$(pwd)/dump-raw.sh" &&
> +		git rebase -i HEAD~5 >actual
> +	) &&
> +	onto=$(git rev-parse --short HEAD~5) &&
> +	test_i18ngrep "^# Rebase ..* onto $onto .*" actual
> +'
> +

This all makes sense, too, and it leaves the below test as the last one
in the file, which is the right thing to do.

>  # This must be the last test in this file
>  test_expect_success '$EDITOR and friends are unchanged' '
>  	test_editor_unchanged
> --
> 2.28.0

Thanks,
Taylor
Phillip Wood Aug. 11, 2020, 3:32 p.m. UTC | #2
Hi Antti

On 11/08/2020 14:13, Antti Keränen wrote:
> 'todo_list_write_to_file' may overwrite the static buffer, originating
> from 'find_unique_abbrev', that was used to store the short commit hash
> 'c' for "# Rebase a..b onto c" message in the todo editor.
> Fix by duplicating the string before usage, so subsequent calls to
> 'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
> can't overwrite the buffer.
> 
> Found-by: Jussi Keränen <jussike@gmail.com>
> Signed-off-by: Antti Keränen <detegr@rbx.email>

Thanks for working on this

> ---
>   sequencer.c                   |  7 ++++---
>   t/t3404-rebase-interactive.sh | 13 +++++++++++++
>   2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index fd7701c88a..0679adb639 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -5178,13 +5178,12 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>   		    struct string_list *commands, unsigned autosquash,
>   		    struct todo_list *todo_list)
>   {
> -	const char *shortonto, *todo_file = rebase_path_todo();
> +	const char *todo_file = rebase_path_todo();

I'm not sure it's worth rearranging these lines. It probably does not 
matter but we could do

+	char shortonto[GIT_MAX_HEXSZ + 1];

and then later call find_unique_abbrev_r() instead so we don't have to 
worry about freeing shortonto.

>   	struct todo_list new_todo = TODO_LIST_INIT;
>   	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
>   	struct object_id oid = onto->object.oid;
>   	int res;
> -
> -	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
> +	char *shortonto;
>   
>   	if (buf->len == 0) {
>   		struct todo_item *item = append_new_todo(todo_list);
> @@ -5206,8 +5205,10 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>   		return error(_("nothing to do"));
>   	}
>   
> +	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
>   	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
>   			     shortonto, flags);
> +	free(shortonto);
>   	if (res == -1)
>   		return -1;
>   	else if (res == -2) {
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 4a7d21f898..09af16753c 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1760,6 +1760,19 @@ test_expect_success 'correct error message for commit --amend after empty pick'
>   	test_i18ngrep "middle of a rebase -- cannot amend." err
>   '
>   
> +test_expect_success 'todo has correct onto hash' '
> +	write_script dump-raw.sh <<-\EOF &&
> +		cat "$1"
> +	EOF
> +	git checkout branch1 &&
> +	(
> +		test_set_editor "$(pwd)/dump-raw.sh" &&
> +		git rebase -i HEAD~5 >actual
> +	) &&

Thanks for taking the trouble to add a test, I think all the lines above 
could be simplified to

	GIT_SEQUENCE_EDITOR=cat git rebase -i HEAD~5 branch1 >actual

> +	onto=$(git rev-parse --short HEAD~5) &&
> +	test_i18ngrep "^# Rebase ..* onto $onto .*" actual

we could lose the final .*

Many Thanks and Best Wishes

Phillip

> +'
> +
>   # This must be the last test in this file
>   test_expect_success '$EDITOR and friends are unchanged' '
>   	test_editor_unchanged
>
Taylor Blau Aug. 11, 2020, 3:36 p.m. UTC | #3
On Tue, Aug 11, 2020 at 04:32:37PM +0100, Phillip Wood wrote:
> Hi Antti
>
> On 11/08/2020 14:13, Antti Keränen wrote:
> > 'todo_list_write_to_file' may overwrite the static buffer, originating
> > from 'find_unique_abbrev', that was used to store the short commit hash
> > 'c' for "# Rebase a..b onto c" message in the todo editor.
> > Fix by duplicating the string before usage, so subsequent calls to
> > 'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
> > can't overwrite the buffer.
> >
> > Found-by: Jussi Keränen <jussike@gmail.com>
> > Signed-off-by: Antti Keränen <detegr@rbx.email>
>
> Thanks for working on this
>
> > ---
> >   sequencer.c                   |  7 ++++---
> >   t/t3404-rebase-interactive.sh | 13 +++++++++++++
> >   2 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index fd7701c88a..0679adb639 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -5178,13 +5178,12 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
> >   		    struct string_list *commands, unsigned autosquash,
> >   		    struct todo_list *todo_list)
> >   {
> > -	const char *shortonto, *todo_file = rebase_path_todo();
> > +	const char *todo_file = rebase_path_todo();
>
> I'm not sure it's worth rearranging these lines. It probably does not matter
> but we could do
>
> +	char shortonto[GIT_MAX_HEXSZ + 1];
>
> and then later call find_unique_abbrev_r() instead so we don't have to worry
> about freeing shortonto.
>
> >   	struct todo_list new_todo = TODO_LIST_INIT;
> >   	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
> >   	struct object_id oid = onto->object.oid;
> >   	int res;
> > -
> > -	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
> > +	char *shortonto;
> >   	if (buf->len == 0) {
> >   		struct todo_item *item = append_new_todo(todo_list);
> > @@ -5206,8 +5205,10 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
> >   		return error(_("nothing to do"));
> >   	}
> > +	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
> >   	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
> >   			     shortonto, flags);
> > +	free(shortonto);
> >   	if (res == -1)
> >   		return -1;
> >   	else if (res == -2) {
> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > index 4a7d21f898..09af16753c 100755
> > --- a/t/t3404-rebase-interactive.sh
> > +++ b/t/t3404-rebase-interactive.sh
> > @@ -1760,6 +1760,19 @@ test_expect_success 'correct error message for commit --amend after empty pick'
> >   	test_i18ngrep "middle of a rebase -- cannot amend." err
> >   '
> > +test_expect_success 'todo has correct onto hash' '
> > +	write_script dump-raw.sh <<-\EOF &&
> > +		cat "$1"
> > +	EOF
> > +	git checkout branch1 &&
> > +	(
> > +		test_set_editor "$(pwd)/dump-raw.sh" &&
> > +		git rebase -i HEAD~5 >actual
> > +	) &&
>
> Thanks for taking the trouble to add a test, I think all the lines above
> could be simplified to
>
> 	GIT_SEQUENCE_EDITOR=cat git rebase -i HEAD~5 branch1 >actual

Good suggestion.

> > +	onto=$(git rev-parse --short HEAD~5) &&
> > +	test_i18ngrep "^# Rebase ..* onto $onto .*" actual
>
> we could lose the final .*

Ack, I noticed this too during my review, but apparently forgot to
comment on it. I'm puzzled by the first '..*'. If you're searching for
any non-empty string, how about '.+' instead?

> Many Thanks and Best Wishes
>
> Phillip
>
> > +'
> > +
> >   # This must be the last test in this file
> >   test_expect_success '$EDITOR and friends are unchanged' '
> >   	test_editor_unchanged
> >
Thanks,
Taylor
Antti Keränen Aug. 11, 2020, 6:10 p.m. UTC | #4
Hi Taylor,

On Tue, Aug 11, 2020 at 11:28:32AM -0400, Taylor Blau wrote:
> Hi Antti,
> 
> Thanks for working on this. I have a few thoughts below, but I think
> that this is on the right track.
> 
> On Tue, Aug 10, 2020 at 04:13:15PM +0300, Antti Keränen wrote:
> > 'todo_list_write_to_file' may overwrite the static buffer, originating
> > from 'find_unique_abbrev', that was used to store the short commit hash
> > 'c' for "# Rebase a..b onto c" message in the todo editor.
> 
> It would be great to know the commit that regressed, or if this has
> always been the case. I'm not sure if you'll have a ton of luck
> bisecting, since you indicate that this overwrite *may* occur (that
> makes me think that it doesn't always happen, so your efforts to bisect
> the change may be noisy).

This was introduced when interactive rebase was refactored. The first
commit is 94bcad797966b6a3490bc6806d3ee3eed54da9d9. Would you like to
have this information in the commit message also?

The reason I stated it may occur is that the buffer find_unique_abbrev
returns is valid for 4 more calls. So the rebase must have 4 commits in
it before this happens.

> > Fix by duplicating the string before usage, so subsequent calls to
> > 'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
> > can't overwrite the buffer.
> >
> > Found-by: Jussi Keränen <jussike@gmail.com>
> > Signed-off-by: Antti Keränen <detegr@rbx.email>
> 
> > ---
> >  sequencer.c                   |  7 ++++---
> >  t/t3404-rebase-interactive.sh | 13 +++++++++++++
> >  2 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index fd7701c88a..0679adb639 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -5178,13 +5178,12 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
> >  		    struct string_list *commands, unsigned autosquash,
> >  		    struct todo_list *todo_list)
> >  {
> > -	const char *shortonto, *todo_file = rebase_path_todo();
> > +	const char *todo_file = rebase_path_todo();
> >  	struct todo_list new_todo = TODO_LIST_INIT;
> >  	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
> >  	struct object_id oid = onto->object.oid;
> >  	int res;
> > -
> > -	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
> > +	char *shortonto;
> 
> A minor nit is that you could probably move this line up to below the
> 'const char *' declaration (it makes sense why you have to drop the
> const qualifier, though).

Ack.

> > +	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
> >  	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
> >  			     shortonto, flags);
> > +	free(shortonto);
> 
> OK. I think of two things here:
> 
>   1. Why are we calling 'find_unique_abbrev()' instead of
>      'short_commit_name()'? We already have a commit pointer, so I don't
>      see any reason that we should be reimplementing that function (even
>      though it is a one-liner).
> 
>   2. If we should indeed be calling 'short_commit_name()', are there
>      other callers that need to do the same duplication? In other words:
>      could you say a little bit more about what makes
>      'complete_action()' special in this regard?

Good question. The code used find_unique_abbrev and as I'm new to the
code base I did not notice that short_commit_name essentially does the
same thing.

The reason what makes this special is that this code first calls
find_unique_abbrev which, as we know, stores its information to a static
buffer. The pointer is stored in a variable and it is assumed it does
not change.
Afterwards, it calls edit_todo_list that reuses the buffer before it
accesses the shortonto string, meaning that if we have enough commits in
the rebase, the shortonto string is overwritten.

Actually, now that you asked, I noticed that orig_head in
complete_action (actually comes from get_revision_ranges in
builtin/rebase.c) may be affected as well. Though, I'm not sure whether
there are enough calls to find_unique_abbrev in the execution path when
it is being used to actually cause a rewritten buffer. It wouldn't hurt
to duplicate it too just in case.
Antti Keränen Aug. 11, 2020, 6:15 p.m. UTC | #5
Hi Phillip and Taylor.

On Tue, Aug 11, 2020 at 04:32:37PM +0100, Phillip Wood wrote:
> > I'm not sure it's worth rearranging these lines. It probably does not matter
> > but we could do
> >
> > +	char shortonto[GIT_MAX_HEXSZ + 1];
> >
> > and then later call find_unique_abbrev_r() instead so we don't have to worry
> > about freeing shortonto.

I like this. I'll change the code, if the rest of the patch is okay.

> > Thanks for taking the trouble to add a test, I think all the lines above
> > could be simplified to
> >
> > 	GIT_SEQUENCE_EDITOR=cat git rebase -i HEAD~5 branch1 >actual

Nice! :) There was a test called 'todo count' which I mimicked. This
obviously is much cleaner solution, thanks.

> > we could lose the final .*

Ack.

On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
> Ack, I noticed this too during my review, but apparently forgot to
> comment on it. I'm puzzled by the first '..*'. If you're searching for
> any non-empty string, how about '.+' instead?

That's true. Good point. I pretty much copy&pasted the 'todo count' test
so I didn't give this much thought. I'll fix this.
Taylor Blau Aug. 11, 2020, 6:24 p.m. UTC | #6
On Tue, Aug 11, 2020 at 09:10:17PM +0300, Antti Keränen wrote:
> Hi Taylor,
>
> On Tue, Aug 11, 2020 at 11:28:32AM -0400, Taylor Blau wrote:
> > Hi Antti,
> >
> > Thanks for working on this. I have a few thoughts below, but I think
> > that this is on the right track.
> >
> > On Tue, Aug 10, 2020 at 04:13:15PM +0300, Antti Keränen wrote:
> > > 'todo_list_write_to_file' may overwrite the static buffer, originating
> > > from 'find_unique_abbrev', that was used to store the short commit hash
> > > 'c' for "# Rebase a..b onto c" message in the todo editor.
> >
> > It would be great to know the commit that regressed, or if this has
> > always been the case. I'm not sure if you'll have a ton of luck
> > bisecting, since you indicate that this overwrite *may* occur (that
> > makes me think that it doesn't always happen, so your efforts to bisect
> > the change may be noisy).
>
> This was introduced when interactive rebase was refactored. The first
> commit is 94bcad797966b6a3490bc6806d3ee3eed54da9d9. Would you like to
> have this information in the commit message also?

If it's been broken since the start, I think having a reference to
94bcad7979 isn't interesting, but "this behavior has been broken since
its introduction" is.

> The reason I stated it may occur is that the buffer find_unique_abbrev
> returns is valid for 4 more calls. So the rebase must have 4 commits in
> it before this happens.

:-). Interesting detail, too, and probably worth noting in the commit
message.

>
> > > Fix by duplicating the string before usage, so subsequent calls to
> > > 'find_unique_abbrev' or other functions calling 'hash_to_hex_algop_r'
> > > can't overwrite the buffer.
> > >
> > > Found-by: Jussi Keränen <jussike@gmail.com>
> > > Signed-off-by: Antti Keränen <detegr@rbx.email>
> >
> > > ---
> > >  sequencer.c                   |  7 ++++---
> > >  t/t3404-rebase-interactive.sh | 13 +++++++++++++
> > >  2 files changed, 17 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/sequencer.c b/sequencer.c
> > > index fd7701c88a..0679adb639 100644
> > > --- a/sequencer.c
> > > +++ b/sequencer.c
> > > @@ -5178,13 +5178,12 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
> > >  		    struct string_list *commands, unsigned autosquash,
> > >  		    struct todo_list *todo_list)
> > >  {
> > > -	const char *shortonto, *todo_file = rebase_path_todo();
> > > +	const char *todo_file = rebase_path_todo();
> > >  	struct todo_list new_todo = TODO_LIST_INIT;
> > >  	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
> > >  	struct object_id oid = onto->object.oid;
> > >  	int res;
> > > -
> > > -	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
> > > +	char *shortonto;
> >
> > A minor nit is that you could probably move this line up to below the
> > 'const char *' declaration (it makes sense why you have to drop the
> > const qualifier, though).
>
> Ack.

Feel free to discard this suggestion in favor of what Phillip is
proposing, though.

> > > +	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
> > >  	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
> > >  			     shortonto, flags);
> > > +	free(shortonto);
> >
> > OK. I think of two things here:
> >
> >   1. Why are we calling 'find_unique_abbrev()' instead of
> >      'short_commit_name()'? We already have a commit pointer, so I don't
> >      see any reason that we should be reimplementing that function (even
> >      though it is a one-liner).
> >
> >   2. If we should indeed be calling 'short_commit_name()', are there
> >      other callers that need to do the same duplication? In other words:
> >      could you say a little bit more about what makes
> >      'complete_action()' special in this regard?
>
> Good question. The code used find_unique_abbrev and as I'm new to the
> code base I did not notice that short_commit_name essentially does the
> same thing.
>
> The reason what makes this special is that this code first calls
> find_unique_abbrev which, as we know, stores its information to a static
> buffer. The pointer is stored in a variable and it is assumed it does
> not change.
> Afterwards, it calls edit_todo_list that reuses the buffer before it
> accesses the shortonto string, meaning that if we have enough commits in
> the rebase, the shortonto string is overwritten.

Thanks for the detailed analysis. Makes sense.

> Actually, now that you asked, I noticed that orig_head in
> complete_action (actually comes from get_revision_ranges in
> builtin/rebase.c) may be affected as well. Though, I'm not sure whether
> there are enough calls to find_unique_abbrev in the execution path when
> it is being used to actually cause a rewritten buffer. It wouldn't hurt
> to duplicate it too just in case.
>
> --
> Antti

Thanks,
Taylor
Junio C Hamano Aug. 11, 2020, 6:58 p.m. UTC | #7
Antti Keränen <antti@keraset.fi> writes:

> On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
>> Ack, I noticed this too during my review, but apparently forgot to
>> comment on it. I'm puzzled by the first '..*'. If you're searching for
>> any non-empty string, how about '.+' instead?
>
> That's true. Good point. I pretty much copy&pasted the 'todo count' test
> so I didn't give this much thought. I'll fix this.

Please don't shorten ..* into .+ if you are writing a portable sed
script---stick to the BRE.
Taylor Blau Aug. 11, 2020, 7:01 p.m. UTC | #8
On Tue, Aug 11, 2020 at 11:58:14AM -0700, Junio C Hamano wrote:
> Antti Keränen <antti@keraset.fi> writes:
>
> > On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
> >> Ack, I noticed this too during my review, but apparently forgot to
> >> comment on it. I'm puzzled by the first '..*'. If you're searching for
> >> any non-empty string, how about '.+' instead?
> >
> > That's true. Good point. I pretty much copy&pasted the 'todo count' test
> > so I didn't give this much thought. I'll fix this.
>
> Please don't shorten ..* into .+ if you are writing a portable sed
> script---stick to the BRE.

Sure, and sorry -- I didn't know that we cared about the difference
between BRE and ERE. Do you prefer ..* over .\+? Both should be
supported in BRE, if I'm reading [1] correctly.

Thanks,
Taylor

[1]: https://www.gnu.org/software/sed/manual/html_node/BRE-vs-ERE.html#BRE-vs-ERE
Junio C Hamano Aug. 11, 2020, 7:05 p.m. UTC | #9
Taylor Blau <me@ttaylorr.com> writes:

> On Tue, Aug 11, 2020 at 11:58:14AM -0700, Junio C Hamano wrote:
>> Antti Keränen <antti@keraset.fi> writes:
>>
>> > On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
>> >> Ack, I noticed this too during my review, but apparently forgot to
>> >> comment on it. I'm puzzled by the first '..*'. If you're searching for
>> >> any non-empty string, how about '.+' instead?
>> >
>> > That's true. Good point. I pretty much copy&pasted the 'todo count' test
>> > so I didn't give this much thought. I'll fix this.
>>
>> Please don't shorten ..* into .+ if you are writing a portable sed
>> script---stick to the BRE.
>
> Sure, and sorry -- I didn't know that we cared about the difference
> between BRE and ERE. Do you prefer ..* over .\+? Both should be
> supported in BRE, if I'm reading [1] correctly.

I thought that BRE only commands taking backslash-quoted ERE was GNU
extension?

Look for "stick to a subset of BRE" in Documention/CodingGuidelines;
we may need to update the document to raise the baseline to match
the reality of year 2020, though.
Phillip Wood Aug. 12, 2020, 1:59 p.m. UTC | #10
Hi Antti

On 11/08/2020 19:15, Antti Keränen wrote:
> Hi Phillip and Taylor.
> 
> On Tue, Aug 11, 2020 at 04:32:37PM +0100, Phillip Wood wrote:
>>> I'm not sure it's worth rearranging these lines. It probably does not matter
>>> but we could do
>>>
>>> +	char shortonto[GIT_MAX_HEXSZ + 1];
>>>
>>> and then later call find_unique_abbrev_r() instead so we don't have to worry
>>> about freeing shortonto.
> 
> I like this. I'll change the code, if the rest of the patch is okay.

Thanks great

Thanks

Phillip

> 
>>> Thanks for taking the trouble to add a test, I think all the lines above
>>> could be simplified to
>>>
>>> 	GIT_SEQUENCE_EDITOR=cat git rebase -i HEAD~5 branch1 >actual
> 
> Nice! :) There was a test called 'todo count' which I mimicked. This
> obviously is much cleaner solution, thanks.
> 
>>> we could lose the final .*
> 
> Ack.
> 
> On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
>> Ack, I noticed this too during my review, but apparently forgot to
>> comment on it. I'm puzzled by the first '..*'. If you're searching for
>> any non-empty string, how about '.+' instead?
> 
> That's true. Good point. I pretty much copy&pasted the 'todo count' test
> so I didn't give this much thought. I'll fix this.
>
Taylor Blau Aug. 12, 2020, 2:03 p.m. UTC | #11
On Tue, Aug 11, 2020 at 12:05:58PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > On Tue, Aug 11, 2020 at 11:58:14AM -0700, Junio C Hamano wrote:
> >> Antti Keränen <antti@keraset.fi> writes:
> >>
> >> > On Tue, Aug 11, 2020 at 11:36:21AM -0400, Taylor Blau wrote:
> >> >> Ack, I noticed this too during my review, but apparently forgot to
> >> >> comment on it. I'm puzzled by the first '..*'. If you're searching for
> >> >> any non-empty string, how about '.+' instead?
> >> >
> >> > That's true. Good point. I pretty much copy&pasted the 'todo count' test
> >> > so I didn't give this much thought. I'll fix this.
> >>
> >> Please don't shorten ..* into .+ if you are writing a portable sed
> >> script---stick to the BRE.
> >
> > Sure, and sorry -- I didn't know that we cared about the difference
> > between BRE and ERE. Do you prefer ..* over .\+? Both should be
> > supported in BRE, if I'm reading [1] correctly.
>
> I thought that BRE only commands taking backslash-quoted ERE was GNU
> extension?

*Grumble*, it's not anywhere in POSIX:

  https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html

...making this a GNU-ism.

> Look for "stick to a subset of BRE" in Documention/CodingGuidelines;
> we may need to update the document to raise the baseline to match
> the reality of year 2020, though.

So, I think the "reality of year 2020" is that we still write '..*'
instead of '.\+'.

Thanks,
Taylor
Junio C Hamano Aug. 12, 2020, 7:40 p.m. UTC | #12
Taylor Blau <me@ttaylorr.com> writes:

> *Grumble*, it's not anywhere in POSIX:
>
>   https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
>
> ...making this a GNU-ism.
>
>> Look for "stick to a subset of BRE" in Documention/CodingGuidelines;
>> we may need to update the document to raise the baseline to match
>> the reality of year 2020, though.
>
> So, I think the "reality of year 2020" is that we still write '..*'
> instead of '.\+'.

Another interesting thing I found is this piece in "man sed" on
Linux (hence GNU):

    REGULAR EXPRESSIONS
           ...
           The -E option switches to using extended regular  expressions  instead;
           it  has  been  supported  for  years by GNU sed, and is now included in
           POSIX.

But the "sed" documentation [*1*] that matches what you cited above (The
Open Group Base Specifications Issue 7, 2018 edition) does not
mention "-E", "-r" nor "--regexp-extended".


[Reference]

*1*
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_02

Patch
diff mbox series

diff --git a/sequencer.c b/sequencer.c
index fd7701c88a..0679adb639 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5178,13 +5178,12 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		    struct string_list *commands, unsigned autosquash,
 		    struct todo_list *todo_list)
 {
-	const char *shortonto, *todo_file = rebase_path_todo();
+	const char *todo_file = rebase_path_todo();
 	struct todo_list new_todo = TODO_LIST_INIT;
 	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
 	struct object_id oid = onto->object.oid;
 	int res;
-
-	shortonto = find_unique_abbrev(&oid, DEFAULT_ABBREV);
+	char *shortonto;
 
 	if (buf->len == 0) {
 		struct todo_item *item = append_new_todo(todo_list);
@@ -5206,8 +5205,10 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return error(_("nothing to do"));
 	}
 
+	shortonto = xstrdup(find_unique_abbrev(&oid, DEFAULT_ABBREV));
 	res = edit_todo_list(r, todo_list, &new_todo, shortrevisions,
 			     shortonto, flags);
+	free(shortonto);
 	if (res == -1)
 		return -1;
 	else if (res == -2) {
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 4a7d21f898..09af16753c 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1760,6 +1760,19 @@  test_expect_success 'correct error message for commit --amend after empty pick'
 	test_i18ngrep "middle of a rebase -- cannot amend." err
 '
 
+test_expect_success 'todo has correct onto hash' '
+	write_script dump-raw.sh <<-\EOF &&
+		cat "$1"
+	EOF
+	git checkout branch1 &&
+	(
+		test_set_editor "$(pwd)/dump-raw.sh" &&
+		git rebase -i HEAD~5 >actual
+	) &&
+	onto=$(git rev-parse --short HEAD~5) &&
+	test_i18ngrep "^# Rebase ..* onto $onto .*" actual
+'
+
 # This must be the last test in this file
 test_expect_success '$EDITOR and friends are unchanged' '
 	test_editor_unchanged