diff mbox series

[v3,5/6] doc hash-function-transition: move rationale upwards

Message ID ee0fa2ec1d0fb4875c6a10af26686d2d5b3cb489.1612549349.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 1d18997007de8c0403bb7187d32d1d1d3787220b
Headers show
Series doc: improvements for hash-function-transition | expand

Commit Message

Thomas Ackermann Feb. 5, 2021, 6:22 p.m. UTC
From: Thomas Ackermann <th.acker@arcor.de>

Move rationale for new hash function to beginning of document
so that it appears before the concrete move to SHA-256 is described.

Remove some of the details about SHA-1 weaknesses and add references
to the details on how the new hash function was chosen instead.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 76 +++++++++----------
 1 file changed, 34 insertions(+), 42 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Feb. 5, 2021, 8:48 p.m. UTC | #1
On Fri, Feb 05 2021, Thomas Ackermann via GitGitGadget wrote:

> diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
> index 5ff9ee027cff..0c4cb98cd4e9 100644
> --- a/Documentation/technical/hash-function-transition.txt
> +++ b/Documentation/technical/hash-function-transition.txt
> @@ -33,16 +33,9 @@ researchers. On 23 February 2017 the SHAttered attack
>  
>  Git v2.13.0 and later subsequently moved to a hardened SHA-1
>  implementation by default, which isn't vulnerable to the SHAttered
> -attack.
> +attack, but SHA-1 is still weak.
>  
> -Thus Git has in effect already migrated to a new hash that isn't SHA-1
> -and doesn't share its vulnerabilities, its new hash function just
> -happens to produce exactly the same output for all known inputs,
> -except two PDFs published by the SHAttered researchers, and the new
> -implementation (written by those researchers) claims to detect future
> -cryptanalytic collision attacks.
> -
> -Regardless, it's considered prudent to move past any variant of SHA-1
> +Thus it's considered prudent to move past any variant of SHA-1
>  to a new hash. There's no guarantee that future attacks on SHA-1 won't
>  be published in the future, and those attacks may not have viable
>  mitigations.
> @@ -57,6 +50,38 @@ SHA-1 still possesses the other properties such as fast object lookup
>  and safe error checking, but other hash functions are equally suitable
>  that are believed to be cryptographically secure.

I missed version 2 of this. I don't think it's an improvement to
completely remove the description of us using sha1collisiondetection by
default, i.e. effectively revert 5988eb631a3 (doc
hash-function-transition: clarify what SHAttered means, 2018-03-26)

I can see how my comment on v1 could have been read like that. FWIW I
didn't mean remove the whole thing, but that I don't think it adds much
value to our description of how we use SHA-1 to go into the level of
detail of mentioning several researchers by name, there's Wikipedia for
that.

I think what we should instead do is have some brief summary of the
vulnerabilities and how they're impacting git.

Maybe I'm barking up the wrong tree here, and what I'm describing should
be in a "man 5 gitsecurity" or something.

But anyway, I think it adds a lot of value to somewhere have not just
what amounts to "sha-1 sucks, see research papers", but to have some
brief human-readable summary of what the practical impact is on users.

In 2018 it was true that sha1collisiondetection was mitigating the known
attack in practice, and that's also true about this new attack[1] (maybe
there's others I missed ...).

Then there's the fact that we don't *just* rely on SHA-1, but e.g. the
"don't re-write objects we have already". So as a practical attack on
someone using git ...

Oh, and the attacks currently all seem to require file formats like JPEG
or PDF for anything practical, i.e. being able to spew in lots of
arbitrary data into some data segment, as opposed to e.g. creating a
program that compiles.

None of this is meant as some overall defense of SHA-1, just that most
of our users aren't security researchers, and will be helped by a
summary of how this system they're using using SHA-1, and having read
that it's "broken" or "believed to be weak" translates to a threat to
them in practice.

1. https://eprint.iacr.org/2020/014.pdf
Junio C Hamano Feb. 5, 2021, 9:49 p.m. UTC | #2
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I missed version 2 of this. I don't think it's an improvement to
> completely remove the description of us using sha1collisiondetection by
> default, i.e. effectively revert 5988eb631a3 (doc
> hash-function-transition: clarify what SHAttered means, 2018-03-26)
> ...
> I can see how my comment on v1 could have been read like that. FWIW I
> didn't mean remove the whole thing, but that I don't think it adds much
> value to our description of how we use SHA-1 to go into the level of
> detail of mentioning several researchers by name, there's Wikipedia for
> that.

True.

> I think what we should instead do is have some brief summary of the
> vulnerabilities and how they're impacting git.

I am not sure.

> Maybe I'm barking up the wrong tree here, and what I'm describing should
> be in a "man 5 gitsecurity" or something.

I agree with your that it belongs to some other document, but not
here, where the primary thing is to outline how the migration will
go, and the part we are seeing is merely giving a background story.
At this point in time, readers would not have to learn the details
from this document.  People already know that we are not happy with
SHA-1 and is on our way to migrate to SHA-256.

> But anyway, I think it adds a lot of value to somewhere have not just
> what amounts to "sha-1 sucks, see research papers", but to have some
> brief human-readable summary of what the practical impact is on users.

Yeah.  I think Thomas has in [v3 5/6] gives our readers about the
right level of details.  If I were to change anything, I'd do "but
SHA-1 is {+considered+} still weak."

> In 2018 it was true that sha1collisiondetection was mitigating the known
> attack in practice, and that's also true about this new attack[1] (maybe
> there's others I missed ...).
>
> Then there's the fact that we don't *just* rely on SHA-1, but e.g. the
> "don't re-write objects we have already". So as a practical attack on
> someone using git ...
>
> Oh, and the attacks currently all seem to require file formats like JPEG
> or PDF for anything practical, i.e. being able to spew in lots of
> arbitrary data into some data segment, as opposed to e.g. creating a
> program that compiles.
>
> None of this is meant as some overall defense of SHA-1, just that most
> of our users aren't security researchers, and will be helped by a
> summary of how this system they're using using SHA-1, and having read
> that it's "broken" or "believed to be weak" translates to a threat to
> them in practice.

All of the above are good thing for somebody to write about, but I
am not sure it fits well in the context of this document.  This is
primarily about how the migration should go, and the target audience
is those of us who are already committed to the plan.  The backstory
on how the plan came about makes a nice introductory reading but it
would not be productive to spend too much bits for the purpose of the
document and its target audience, I would think.

Thanks.
diff mbox series

Patch

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 5ff9ee027cff..0c4cb98cd4e9 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -33,16 +33,9 @@  researchers. On 23 February 2017 the SHAttered attack
 
 Git v2.13.0 and later subsequently moved to a hardened SHA-1
 implementation by default, which isn't vulnerable to the SHAttered
-attack.
+attack, but SHA-1 is still weak.
 
-Thus Git has in effect already migrated to a new hash that isn't SHA-1
-and doesn't share its vulnerabilities, its new hash function just
-happens to produce exactly the same output for all known inputs,
-except two PDFs published by the SHAttered researchers, and the new
-implementation (written by those researchers) claims to detect future
-cryptanalytic collision attacks.
-
-Regardless, it's considered prudent to move past any variant of SHA-1
+Thus it's considered prudent to move past any variant of SHA-1
 to a new hash. There's no guarantee that future attacks on SHA-1 won't
 be published in the future, and those attacks may not have viable
 mitigations.
@@ -57,6 +50,38 @@  SHA-1 still possesses the other properties such as fast object lookup
 and safe error checking, but other hash functions are equally suitable
 that are believed to be cryptographically secure.
 
+Choice of Hash
+--------------
+The hash to replace the hardened SHA-1 should be stronger than SHA-1
+was: we would like it to be trustworthy and useful in practice for at
+least 10 years.
+
+Some other relevant properties:
+
+1. A 256-bit hash (long enough to match common security practice; not
+   excessively long to hurt performance and disk usage).
+
+2. High quality implementations should be widely available (e.g., in
+   OpenSSL and Apple CommonCrypto).
+
+3. The hash function's properties should match Git's needs (e.g. Git
+   requires collision and 2nd preimage resistance and does not require
+   length extension resistance).
+
+4. As a tiebreaker, the hash should be fast to compute (fortunately
+   many contenders are faster than SHA-1).
+
+There were several contenders for a successor hash to SHA-1, including
+SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
+
+In late 2018 the project picked SHA-256 as its successor hash.
+
+See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
+NewHash, 2018-08-04) and numerous mailing list threads at the time,
+particularly the one starting at
+https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
+for more information.
+
 Goals
 -----
 1. The transition to SHA-256 can be done one local repository at a time.
@@ -601,39 +626,6 @@  example:
 
     git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
 
-Choice of Hash
---------------
-In early 2005, around the time that Git was written, Xiaoyun Wang,
-Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
-collisions in 2^69 operations. In August they published details.
-Luckily, no practical demonstrations of a collision in full SHA-1 were
-published until 10 years later, in 2017.
-
-Git v2.13.0 and later subsequently moved to a hardened SHA-1
-implementation by default that mitigates the SHAttered attack, but
-SHA-1 is still believed to be weak.
-
-The hash to replace this hardened SHA-1 should be stronger than SHA-1
-was: we would like it to be trustworthy and useful in practice for at
-least 10 years.
-
-Some other relevant properties:
-
-1. A 256-bit hash (long enough to match common security practice; not
-   excessively long to hurt performance and disk usage).
-
-2. High quality implementations should be widely available (e.g., in
-   OpenSSL and Apple CommonCrypto).
-
-3. The hash function's properties should match Git's needs (e.g. Git
-   requires collision and 2nd preimage resistance and does not require
-   length extension resistance).
-
-4. As a tiebreaker, the hash should be fast to compute (fortunately
-   many contenders are faster than SHA-1).
-
-We choose SHA-256.
-
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another: