[00/24] SHA-256 test fixes, part 8

Message ID 20200113123857.3684632-1-sandals@crustytoothpaste.net (mailing list archive)

Message

brian m. carlson Jan. 13, 2020, 12:38 p.m. UTC
This is the second-to-last series of test fixes for SHA-256.  Most of
them are rather boring, but there are a few notable exceptions.

t3305 appears to fail with SHA-256 due to the fanout not compressing as
expected.  I believe this is a legitimate bug that our transition to
SHA-256 exposes, but it's unclear to me why it happens and I'm not
familiar enough with the code to figure out what's going on[0].  I've
CC'd Dscho, since he seems to be the person most familiar with the notes
code who's still involved in the project.

I suspect that t3404 also has a bug, since the object IDs that are
supposed to collide do not, according to my instrumentation of the test.
I'm unsure what the intended collision was and consequently haven't
fixed it.  However, it does work with SHA-256 as it stands and is no
more or less functional than with SHA-1, so I've removed the
prerequisite.

I believe the fix in t5616 is correct and still supports the intent of
the test, but I'd appreciate any feedback there.  Why it works with
SHA-1 is unclear to me, but my conjecture is that it's due to ordering
of the object IDs.  I've CC'd Jonathan Tan about this issue since he
seems to be most familiar with that test.

t/lib-pack.sh is updated in this commit, but I expect a small number of
additional fixes to come in with part 9 to support t5308.

I fully expect that this series won't be picked up until after the
release, and that's fine.  It is based on master and intentionally does
not require the other in-flight test series.  I expect a reroll due to
the aforementioned suspected bugs.

Of course, feedback on any aspect of this series is welcome.

[0] While working on the transition to SHA-256, I've found myself quite
confused by the notes code in general, mostly due to the way it uses
partial object IDs.  Reading about the feature (which I'd previously
never used) was indeed helpful, though, so thanks to the folks who wrote
the documentation.

brian m. carlson (24):
  t/lib-pack: support SHA-256
  t3206: make hash size independent
  t3305: annotate with SHA1 prerequisite
  t3308: make test work with SHA-256
  t3309: make test work with SHA-256
  t3310: make test work with SHA-256
  t3311: make test work with SHA-256
  t3404: remove SHA1 prerequisite
  t4013: make test hash independent
  t4060: make test work with SHA-256
  t4211: make test hash independent
  t5302: make hash size independent
  t5309: make test hash independent
  t5313: make test hash independent
  t5321: make test hash independent
  t5515: make test hash independent
  t5318: update for SHA-256
  t5616: use correct filter syntax
  t5607: make hash size independent
  t5703: make test work with SHA-256
  t5703: switch tests to use test_oid
  t6000: abstract away SHA-1-specific constants
  t6006: make hash size independent
  t6024: update for SHA-256

 t/lib-pack.sh                                |  35 ++-
 t/t3206-range-diff.sh                        |  14 +-
 t/t3305-notes-fanout.sh                      |   2 +-
 t/t3308-notes-merge.sh                       |  83 ++++---
 t/t3309-notes-merge-auto-resolve.sh          | 228 ++++++++++++-------
 t/t3310-notes-merge-manual-resolve.sh        |  84 ++++---
 t/t3311-notes-merge-fanout.sh                |  60 +++--
 t/t3404-rebase-interactive.sh                |   4 +-
 t/t4013-diff-various.sh                      |  44 +++-
 t/t4060-diff-submodule-option-diff-format.sh | 126 +++++-----
 t/t4211-line-log.sh                          |  14 +-
 t/t5302-pack-index.sh                        |  18 +-
 t/t5309-pack-delta-cycles.sh                 |  10 +-
 t/t5313-pack-bounds-checks.sh                |  19 +-
 t/t5318-commit-graph.sh                      |   4 +-
 t/t5321-pack-large-objects.sh                |   4 +-
 t/t5515-fetch-merge-logic.sh                 |  51 ++++-
 t/t5607-clone-bundle.sh                      |   2 +-
 t/t5616-partial-clone.sh                     |   2 +-
 t/t5703-upload-pack-ref-in-want.sh           |   7 +-
 t/t6000-rev-list-misc.sh                     |  13 +-
 t/t6006-rev-list-format.sh                   |   4 +-
 t/t6024-recursive-merge.sh                   |  15 +-
 23 files changed, 562 insertions(+), 281 deletions(-)

Comments

Eric Sunshine Jan. 13, 2020, 1:41 p.m. UTC | #1
On Mon, Jan 13, 2020 at 7:40 AM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> I suspect that t3404 also has a bug, since the object IDs that are
> supposed to collide do not, according to my instrumentation of the test.
> I'm unsure what the intended collision was and consequently haven't
> fixed it.  However, it does work with SHA-256 as it stands and is no
> more or less functional than with SHA-1, so I've removed the
> prerequisite.

The test itself is fine, but it is one of those unfortunate cases of
checking for absence of something (which is a wide net). As explained
by the commit message[1] of the patch which added the test, the
collision occurred only between short OID's. The patch[2] which fixed
the problem did so by avoiding short OID's in the scripted
implementation of `git rebase -i` (and also flipped the test from
`test_expect_failure` to `test_expect_success`).

The test, as currently implemented, is very much specific to SHA-1
since the FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" it uses only produces
a collision with short OID's when SHA-1 is the hashing function, so
the prerequisite is correct and serves as documentation (even if it
doesn't affect the outcome of the test). Removing that prerequisite
should only be done if the test is updated with a different
FAKE_COMMIT_MESSAGE which causes a short OID collision when SHA-256 is
used.

[1]: 66ae9a57b8 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23)
[2]: 75c6976655 (rebase -i: fix short SHA-1 collision, 2013-08-23)

brian m. carlson Jan. 13, 2020, 11:17 p.m. UTC | #2
On 2020-01-13 at 13:41:44, Eric Sunshine wrote:
> On Mon, Jan 13, 2020 at 7:40 AM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > I suspect that t3404 also has a bug, since the object IDs that are
> > supposed to collide do not, according to my instrumentation of the test.
> > I'm unsure what the intended collision was and consequently haven't
> > fixed it.  However, it does work with SHA-256 as it stands and is no
> > more or less functional than with SHA-1, so I've removed the
> > prerequisite.
> 
> The test itself is fine, but it is one of those unfortunate cases of
> checking for absence of something (which is a wide net). As explained
> by the commit message[1] of the patch which added the test, the
> collision occurred only between short OID's. The patch[2] which fixed
> the problem did so by avoiding short OID's in the scripted
> implementation of `git rebase -i` (and also flipped the test from
> `test_expect_failure` to `test_expect_success`).
> 
> The test, as currently implemented, is very much specific to SHA-1
> since the FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" it uses only produces
> a collision with short OID's when SHA-1 is the hashing function, so
> the prerequisite is correct and serves as documentation (even if it
> doesn't affect the outcome of the test). Removing that prerequisite
> should only be done if the test is updated with a different
> FAKE_COMMIT_MESSAGE which causes a short OID collision when SHA-256 is
> used.

I'll take another look.  When I looked at the output, it looked like
they didn't collide anymore even under SHA-1, but perhaps I instrumented
the test wrong and therefore got the wrong result.  Thanks for double
checking.

Eric Sunshine Jan. 13, 2020, 11:34 p.m. UTC | #3
On Mon, Jan 13, 2020 at 6:17 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> On 2020-01-13 at 13:41:44, Eric Sunshine wrote:
> > The test itself is fine, but it is one of those unfortunate cases of
> > checking for absence of something (which is a wide net). As explained
> > by the commit message[1] of the patch which added the test, the
> > collision occurred only between short OID's. The patch[2] which fixed
> > the problem did so by avoiding short OID's in the scripted
> > implementation of `git rebase -i` (and also flipped the test from
> > `test_expect_failure` to `test_expect_success`).
> >
> > The test, as currently implemented, is very much specific to SHA-1
> > since the FAKE_COMMIT_MESSAGE="collide2 ac4f2ee" it uses only produces
> > a collision with short OID's when SHA-1 is the hashing function, so
> > the prerequisite is correct and serves as documentation (even if it
> > doesn't affect the outcome of the test). Removing that prerequisite
> > should only be done if the test is updated with a different
> > FAKE_COMMIT_MESSAGE which causes a short OID collision when SHA-256 is
> > used.
>
> I'll take another look.  When I looked at the output, it looked like
> they didn't collide anymore even under SHA-1, but perhaps I instrumented
> the test wrong and therefore got the wrong result.  Thanks for double
> checking.

They might not collide anymore if the length of a short OID has
increased since the test was written[1] (even with the "fix" patch[2]
reverted) since, to fail, the test only needed the common prefix of
the OID's to collide, where the common prefix was the length of the
short OID. So, it's possible that the test doesn't do anything anymore
if the short OID length is now longer. (This might suggest that
dropping the test would be a path forward.)
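
The effect of a longer abbreviation can be sketched in a few lines of
Python. This is only an illustration, not Git's abbreviation logic: the
two IDs are made-up values, and `shared_prefix_len` and `collides` are
hypothetical helper names.

```python
# Made-up object IDs for illustration; they share their first 7 hex digits.
OID_A = "3f9a1c27d05e"
OID_B = "3f9a1c2be481"


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading hex digits that two object IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def collides(a: str, b: str, abbrev_len: int) -> bool:
    """True if both IDs abbreviate to the same string at this length."""
    return a[:abbrev_len] == b[:abbrev_len]


if __name__ == "__main__":
    print("shared prefix:", shared_prefix_len(OID_A, OID_B))
    for length in (7, 8):
        print(length, collides(OID_A, OID_B, length))
```

With a 7-digit abbreviation the two IDs are indistinguishable; at 8
digits they are not. A test written to collide at a shorter abbreviation
length could therefore stop colliding once the length grows.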

[1]: 66ae9a57b8 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23)
[2]: 75c6976655 (rebase -i: fix short SHA-1 collision, 2013-08-23)

Johannes Schindelin Jan. 16, 2020, 12:28 a.m. UTC | #4
Hi brian,

On Mon, 13 Jan 2020, brian m. carlson wrote:

> I suspect that t3404 also has a bug, since the object IDs that are
> supposed to collide do not, according to my instrumentation of the test.
> I'm unsure what the intended collision was and consequently haven't
> fixed it.  However, it does work with SHA-256 as it stands and is no
> more or less functional than with SHA-1, so I've removed the
> prerequisite.

This test was first introduced in 66ae9a57b88 (t3404: rebase -i:
demonstrate short SHA-1 collision, 2013-08-23). This commit does, however,
not give the full history of events. The most interesting tidbit is in
this mail:
https://public-inbox.org/git/1377112378-45511-4-git-send-email-sunshine@sunshineco.com/

Sadly, I could not make the indicated revision compile in a quick and
dirty way, so I cannot hope to bisect this down. But I can do better and
try to fix it properly.

On the other hand, I can do _even better_ than that and demonstrate that
the test case is both incomplete _and_ still has the proper collision. Let
me first explain the idea behind it: we want to run an interactive rebase
on

	collide1 - collide2 - collide3

Forget about collide1, this will be the base commit for the rebase. Now,
the idea is that collide2 is reworded during the rebase so that it has the
same short SHA-1 as collide3. This means that a `pick <collide3>` will
fail if the short name is used, because the short name is ambiguous: it
could refer both to collide3 and to the reworded collide2.

And this is indeed the case: if I insert a `break` after the `reword` to
force the rebase to be interrupted and then manually execute the command
`git rebase --edit-todo`, I see this:

	$ git rebase --edit-todo
	error: short SHA1 6bcda37 is ambiguous
	hint: The candidates are:
	hint:   6bcda372 commit 2005-04-07 - collide3
	hint:   6bcda37f commit 2005-04-07 - collide2 ac4f2ee
	error: could not parse 'collide3
	'
	error: invalid line 1: pick 6bcda37 collide3

But wait! This should not happen. That is exactly what this regression
test wanted to safeguard against: the `git-rebase-todo` file should have
been written with full SHA-1s, even if `git rebase --edit-todo` should
shorten them before opening the editor and expand them again after the
editor is closed.
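
Why full IDs matter here can be sketched as follows. This is a toy model
in Python, not the sequencer's code: `candidates` and `resolve` are
hypothetical helpers, and the two 8-digit strings from the hint output
above stand in for full object IDs.

```python
# The 8-digit IDs from the hint output stand in for full object IDs here.
COLLIDE3 = "6bcda372"
REWORDED_COLLIDE2 = "6bcda37f"

# Objects known before and after collide2 has been reworded.
before_reword = {COLLIDE3}
after_reword = {COLLIDE3, REWORDED_COLLIDE2}


def candidates(name, object_ids):
    """Return every known ID that starts with `name`, sorted."""
    return sorted(oid for oid in object_ids if oid.startswith(name))


def resolve(name, object_ids):
    """Return the single ID matching `name`, or raise LookupError."""
    matches = candidates(name, object_ids)
    if not matches:
        raise LookupError(f"unknown object: {name}")
    if len(matches) > 1:
        raise LookupError(f"short ID {name} is ambiguous: {', '.join(matches)}")
    return matches[0]


if __name__ == "__main__":
    short = COLLIDE3[:7]  # the abbreviated name used in the todo line
    print(resolve(short, before_reword))    # unique before the reword
    print(candidates(short, after_reword))  # two candidates afterwards
    print(resolve(COLLIDE3, after_reword))  # the full ID still resolves
```

In this model, a todo line that stores the full ID keeps resolving after
the reword, while one that stores the 7-digit name does not.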

Uh oh.

*clicketyclick for some hours* Aha! A recent change in the interactive
rebase avoids re-reading the todo list all the time, and unfortunately we
now _also_ do not re-read the todo list right after expanding the SHA-1s
in `todo_list_write_to_file()`.

*clicketyclick for another few hours* Okay, I think I understand the
issue, and why the test passed (even if it should not have passed). I
opened https://github.com/gitgitgadget/git/pull/529 and will continue with
this project tomorrow.

Ciao,
Dscho