diff mbox series

[2/2] Documentation: packfile-uri hash can be longer than 40 hex chars

Message ID 497c5fd18d7206c137d8a62d229d2f295c9fe4fa.1633708986.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series fetch-pack: redact packfile urls in traces | expand

Commit Message

Ivan Frade Oct. 8, 2021, 4:03 p.m. UTC
From: Ivan Frade <ifrade@google.com>

Packfile-uri line specifies a hash of 40 hex character, but with SHA256
this hash size is 64. There are already tests using SHA256 (e.g. in
ubuntu-latest/linux-clang).

Update protocol-v2 documentation to indicate that the hash size depends
on the hash algorithm in use.

Signed-off-by: Ivan Frade <ifrade@google.com>
---
 Documentation/technical/protocol-v2.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Oct. 8, 2021, 7:43 p.m. UTC | #1
On Fri, Oct 08 2021, Ivan Frade via GitGitGadget wrote:

> From: Ivan Frade <ifrade@google.com>
>
> Packfile-uri line specifies a hash of 40 hex character, but with SHA256
> this hash size is 64. There are already tests using SHA256 (e.g. in
> ubuntu-latest/linux-clang).
>
> Update protocol-v2 documentation to indicate that the hash size depends
> on the hash algorithm in use.
>
> Signed-off-by: Ivan Frade <ifrade@google.com>
> ---
>  Documentation/technical/protocol-v2.txt | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 21e8258ccf3..a23f12d6c2b 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -393,7 +393,7 @@ header. Most sections are sent only when the packfile is sent.
>      wanted-ref = obj-id SP refname
>  
>      packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
> -    packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
> +    packfile-uri = PKT-LINE((40|64)*(HEXDIGIT) SP *%x20-ff LF)
>  
>      packfile = PKT-LINE("packfile" LF)
>  	       *PKT-LINE(%x01-03 *%x00-ff)
> @@ -476,9 +476,9 @@ header. Most sections are sent only when the packfile is sent.
>  	* For each URI the server sends, it sends a hash of the pack's
>  	  contents (as output by git index-pack) followed by the URI.
>  
> -	* The hashes are 40 hex characters long. When Git upgrades to a new
> -	  hash algorithm, this might need to be updated. (It should match
> -	  whatever index-pack outputs after "pack\t" or "keep\t".
> +	* The hashes length is defined by the hash algorithm (40 hex
> +	  characters in SHA-1, 64 in SHA-256). It should match whatever
> +	  index-pack outputs after "pack\t" or "keep\t".
>  
>      packfile section
>  	* This section is only included if the client has sent 'want'

(I forgot to say in my first reply, but welcome to the Git Mailing
List!)

This is well spotted, but it seems even better to simply drop this
exhaustive listing of 40 or 64 hex digits here.

In protocol-common.txt we talk about "obj-id", and that's then used
elsewhere in protocol-v2.txt matter-of-factly, e.g. (quoting from a
handy part that happens to use "obj-id"):

    [...]
    obj-id-or-unborn = (obj-id | "unborn")
    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
    [...]

So let's just have packfile-uri do the same.

Now, the thing that *does* need to be updated then is
protocol-common.txt, or this part:

  zero-id   =  40*"0"
  obj-id    =  40*(HEXDIGIT)

Because now if you use obj-id that'll just refer back to that, but
that's also a problem with all the rest of the protocol docs.

It would seem that all our SHA-256 tests and client/servers are in
violation of the documentation, and should truncate their OIDs to 40
chars, or we could fix the docs :)

Anyway, whatever we do here this improvement is unrelated to whatever
we're doing with log redaction in your 1/2, I think it would be better
to submit as its own 1 or 2 patch series.
diff mbox series

Patch

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 21e8258ccf3..a23f12d6c2b 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -393,7 +393,7 @@  header. Most sections are sent only when the packfile is sent.
     wanted-ref = obj-id SP refname
 
     packfile-uris = PKT-LINE("packfile-uris" LF) *packfile-uri
-    packfile-uri = PKT-LINE(40*(HEXDIGIT) SP *%x20-ff LF)
+    packfile-uri = PKT-LINE((40|64)*(HEXDIGIT) SP *%x20-ff LF)
 
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
@@ -476,9 +476,9 @@  header. Most sections are sent only when the packfile is sent.
 	* For each URI the server sends, it sends a hash of the pack's
 	  contents (as output by git index-pack) followed by the URI.
 
-	* The hashes are 40 hex characters long. When Git upgrades to a new
-	  hash algorithm, this might need to be updated. (It should match
-	  whatever index-pack outputs after "pack\t" or "keep\t".
+	* The hashes length is defined by the hash algorithm (40 hex
+	  characters in SHA-1, 64 in SHA-256). It should match whatever
+	  index-pack outputs after "pack\t" or "keep\t".
 
     packfile section
 	* This section is only included if the client has sent 'want'