diff mbox series

[v2,2/3] rebase -i: re-fix short SHA-1 collision

Message ID 102fa568dc09c1faa2d36903ccb7e1b285dd50b2.1579304283.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Re-fix rebase -i with SHA-1 collisions | expand

Commit Message

Koji Nakamaru via GitGitGadget Jan. 17, 2020, 11:38 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 66ae9a57b88 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23), we added a test case that demonstrated how it is possible
that a previously unambiguous short commit ID could become ambiguous
*during* a rebase.

In 75c69766554 (rebase -i: fix short SHA-1 collision, 2013-08-23), we
fixed that problem simply by writing out the todo list with expanded
commit IDs (except *right* before letting the user edit the todo list,
in which case we shorten them, but we expand them right after the file
was edited).

However, the bug resurfaced as a side effect of 393adf7a6f6 (sequencer:
directly call pick_commits() from complete_action(), 2019-11-24): as of
this commit, the sequencer no longer re-reads the todo list after
writing it out with expanded commit IDs.

The only redeeming factor is that the todo list is already parsed at
that stage, including all the commits corresponding to the commands,
therefore the sequencer can continue even if the internal todo list has
short commit IDs.

That does not prevent problems, though: the sequencer writes out the
`done` and `git-rebase-todo` files incrementally (i.e. overwriting the
todo list with a version that has _short_ commit IDs), and if a merge
conflict happens, or if an `edit` or a `break` command is encountered, a
subsequent `git rebase --continue` _will_ re-read the todo list, opening
an opportunity for the "short SHA-1 collision" bug again.

To avoid that, let's make sure that we do expand the commit IDs in the
todo list as soon as we have parsed it after letting the user edit it.

Additionally, we improve the 'short SHA-1 collide' test case in t3404 to
test specifically for the case where the rebase is resumed. We also
hard-code the expected colliding short SHA-1s, to document the
expectation (and to make it easier on future readers).

Note that we specifically test that the short commit ID is used in the
`git-rebase-todo.tmp` file: this file is created by the fake editor in
the test script and reflects the state that would have been presented to
the user to edit.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 sequencer.c                   | 11 ++++++++++-
 t/t3404-rebase-interactive.sh | 15 +++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

Comments

Eric Sunshine Jan. 20, 2020, 4:49 p.m. UTC | #1
On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
>  test_expect_success SHA1 'short SHA-1 collide' '
>         test_when_finished "reset_rebase && git checkout master" &&
>         git checkout collide &&
> +       colliding_sha1=6bcda37 &&
> +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&

How much do we care that this is introducing new code with git
upstream of a pipe (considering recent efforts to eradicate such
usage)? Same question regarding several other new instances introduce
by this patch.
Johannes Schindelin Jan. 20, 2020, 8:04 p.m. UTC | #2
Hi Eric,

On Mon, 20 Jan 2020, Eric Sunshine wrote:

> On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > @@ -1264,13 +1264,24 @@ test_expect_success SHA1 'short SHA-1 setup' '
> >  test_expect_success SHA1 'short SHA-1 collide' '
> >         test_when_finished "reset_rebase && git checkout master" &&
> >         git checkout collide &&
> > +       colliding_sha1=6bcda37 &&
> > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
>
> How much do we care that this is introducing new code with git
> upstream of a pipe (considering recent efforts to eradicate such
> usage)? Same question regarding several other new instances introduce
> by this patch.

I would argue that the test case will fail if the `git` call fails. So I
am not overly concerned if that `git` call is upstream of a pipe.

Ciao,
Dscho
Eric Sunshine Jan. 20, 2020, 8:08 p.m. UTC | #3
On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Mon, 20 Jan 2020, Eric Sunshine wrote:
> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> >
> > How much do we care that this is introducing new code with git
> > upstream of a pipe (considering recent efforts to eradicate such
> > usage)? Same question regarding several other new instances introduce
> > by this patch.
>
> I would argue that the test case will fail if the `git` call fails. So I
> am not overly concerned if that `git` call is upstream of a pipe.

Unless the git command crashes _after_ it produces the correct output...
Johannes Schindelin Jan. 21, 2020, 11:49 a.m. UTC | #4
Hi Eric,

On Mon, 20 Jan 2020, Eric Sunshine wrote:

> On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > On Mon, 20 Jan 2020, Eric Sunshine wrote:
> > > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> > > <gitgitgadget@gmail.com> wrote:
> > > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> > >
> > > How much do we care that this is introducing new code with git
> > > upstream of a pipe (considering recent efforts to eradicate such
> > > usage)? Same question regarding several other new instances introduce
> > > by this patch.
> >
> > I would argue that the test case will fail if the `git` call fails. So I
> > am not overly concerned if that `git` call is upstream of a pipe.
>
> Unless the git command crashes _after_ it produces the correct output...

Yes, that is true. In a very hypothetical way, of course.

:-)

Ciao,
Dscho
Junio C Hamano Jan. 21, 2020, 10:02 p.m. UTC | #5
Eric Sunshine <sunshine@sunshineco.com> writes:

> On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> On Mon, 20 Jan 2020, Eric Sunshine wrote:
>> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
>> > <gitgitgadget@gmail.com> wrote:
>> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
>> >
>> > How much do we care that this is introducing new code with git
>> > upstream of a pipe (considering recent efforts to eradicate such
>> > usage)? Same question regarding several other new instances introduce
>> > by this patch.
>>
>> I would argue that the test case will fail if the `git` call fails. So I
>> am not overly concerned if that `git` call is upstream of a pipe.
>
> Unless the git command crashes _after_ it produces the correct output...

True.  Doesn't rev-parse have an appropriate option for this kind of
thing that gets rid of the need for "cut" in the first place?
Johannes Schindelin Jan. 22, 2020, 2:10 p.m. UTC | #6
Hi Junio,

On Tue, 21 Jan 2020, Junio C Hamano wrote:

> Eric Sunshine <sunshine@sunshineco.com> writes:
>
> > On Mon, Jan 20, 2020 at 3:04 PM Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> >> On Mon, 20 Jan 2020, Eric Sunshine wrote:
> >> > On Fri, Jan 17, 2020 at 6:38 PM Johannes Schindelin via GitGitGadget
> >> > <gitgitgadget@gmail.com> wrote:
> >> > > +       test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
> >> >
> >> > How much do we care that this is introducing new code with git
> >> > upstream of a pipe (considering recent efforts to eradicate such
> >> > usage)? Same question regarding several other new instances introduce
> >> > by this patch.
> >>
> >> I would argue that the test case will fail if the `git` call fails. So I
> >> am not overly concerned if that `git` call is upstream of a pipe.
> >
> > Unless the git command crashes _after_ it produces the correct output...
>
> True.  Doesn't rev-parse have an appropriate option for this kind of
> thing that gets rid of the need for "cut" in the first place?

You mean `git rev-parse --short=4`? That does something _sligthly_
different: it tries to shorten the OID to 4 characters _unless that would
be ambiguous_. In our case, it _will_ be ambiguous. That's why I use
`cut`.

As to the crash in `rev-parse` _after_ printing out the OID: yes, there is
a possibility for that. But that regression test is not about `rev-parse`,
so it is actually a good thing that it would not trigger on such a bug ;-)

Ciao,
Dscho
Junio C Hamano Jan. 22, 2020, 6:15 p.m. UTC | #7
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> True.  Doesn't rev-parse have an appropriate option for this kind of
>> thing that gets rid of the need for "cut" in the first place?
>
> You mean `git rev-parse --short=4`? That does something _sligthly_
> different: it tries to shorten the OID to 4 characters _unless that would
> be ambiguous_. In our case, it _will_ be ambiguous. That's why I use
> `cut`.

Ah, yes of course; we want ambiguous prefix.  I think a more
thorough test would be to see that the output with --short=$n (where
n is the length of the abbreviated object name in $colliding_sha1)
is longer than $colliding_sha1 and the output prefix-matches
$colliding_sha1 iow, something like

	abbreviated=$(git rev-parse --short=7 HEAD) &&
	case "$abbreviated" in
	"$colliding_sha1"?*) : happy ;;
	*) false ;;
	esac &&
	...

which would make sure that we are testing colliding case.


> As to the crash in `rev-parse` _after_ printing out the OID: yes, there is
> a possibility for that. But that regression test is not about `rev-parse`,
> so it is actually a good thing that it would not trigger on such a bug ;-)

No, I do not think this test should be about rev-parse working
correctly---just that if it is easy enough to make the test robust
enough against such a breakage, it would be nice to do so, that's
all.

I'm not Eric but I suspect his primary point was not about worrying
about rev-parse crashing but more about avoiding to add a pattern
less experienced developers can copy&paste without thinking enough
to realize why it would be OK here and not OK in the context of the
tests they are adding.  That would be what I would worry about more
than rev-parse crashing in the part of the test under discussion.

Thanks.
diff mbox series

Patch

diff --git a/sequencer.c b/sequencer.c
index 7c30dad59c..5f69b47e32 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5076,7 +5076,7 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 {
 	const char *shortonto, *todo_file = rebase_path_todo();
 	struct todo_list new_todo = TODO_LIST_INIT;
-	struct strbuf *buf = &todo_list->buf;
+	struct strbuf *buf = &todo_list->buf, buf2 = STRBUF_INIT;
 	struct object_id oid = onto->object.oid;
 	int res;
 
@@ -5128,6 +5128,15 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return -1;
 	}
 
+	/* Expand the commit IDs */
+	todo_list_to_strbuf(r, &new_todo, &buf2, -1, 0);
+	strbuf_swap(&new_todo.buf, &buf2);
+	strbuf_release(&buf2);
+	new_todo.total_nr -= new_todo.nr;
+	if (todo_list_parse_insn_buffer(r, new_todo.buf.buf, &new_todo) < 0)
+		BUG("invalid todo list after expanding IDs:\n%s",
+		    new_todo.buf.buf);
+
 	if (opts->allow_ff && skip_unnecessary_picks(r, &new_todo, &oid)) {
 		todo_list_release(&new_todo);
 		return error(_("could not skip unnecessary pick commands"));
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index ae6e55ce79..1cc9f36bc7 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1264,13 +1264,24 @@  test_expect_success SHA1 'short SHA-1 setup' '
 test_expect_success SHA1 'short SHA-1 collide' '
 	test_when_finished "reset_rebase && git checkout master" &&
 	git checkout collide &&
+	colliding_sha1=6bcda37 &&
+	test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
 	(
 		unset test_tick &&
 		test_tick &&
 		set_fake_editor &&
 		FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" \
-		FAKE_LINES="reword 1 2" git rebase -i HEAD~2
-	)
+		FAKE_LINES="reword 1 break 2" git rebase -i HEAD~2 &&
+		test $colliding_sha1 = "$(git rev-parse HEAD | cut -c 1-7)" &&
+		grep "^pick $colliding_sha1 " \
+			.git/rebase-merge/git-rebase-todo.tmp &&
+		grep "^pick [0-9a-f]\{40\}" \
+			.git/rebase-merge/git-rebase-todo &&
+		git rebase --continue
+	) &&
+	collide2="$(git rev-parse HEAD~1 | cut -c 1-4)" &&
+	collide3="$(git rev-parse collide3 | cut -c 1-4)" &&
+	test "$collide2" = "$collide3"
 '
 
 test_expect_success 'respect core.abbrev' '