signature-format.txt: explain and illustrate multi-line headers

Message ID xmqqtuhlisqe.fsf_-_@gitster.g (mailing list archive)
State New, archived

Commit Message

Junio C Hamano Oct. 13, 2021, 2:06 a.m. UTC
A signature attached to a signed commit, and the contents of the
commit that merged a signed tag, are both recorded as a value of an
object header field as a multi-line value, and are subject to the
formatting convention for multi-line values in the headers, with a
leading SP signaling that the rest of the line is a continuation of
the previous line.  Most notably, an empty line in such a multi-line
value would result in a line with a sole SP on it.
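
A minimal sketch of this convention, written in Python rather than
git's C and not taken from git itself: the helper name
`fold_header_value` and the truncated signature text are made up for
illustration.  It assumes the value has no trailing newline and
prefixes every line after the first with a single SP.

```python
def fold_header_value(key: str, value: str) -> str:
    """Format `key` and a possibly multi-line `value` as one object header.

    Hypothetical helper for illustration: each line after the first is
    prefixed with a single SP, so an empty line in the value becomes a
    line that holds only that SP.
    """
    lines = value.split("\n")
    return key + " " + "\n ".join(lines) + "\n"


# Placeholder signature text, not a real signature.
signature = (
    "-----BEGIN PGP SIGNATURE-----\n"
    "Version: GnuPG v1\n"
    "\n"
    "iQEcBAAB...\n"
    "-----END PGP SIGNATURE-----"
)
header = fold_header_value("gpgsig", signature)

# The empty line of the value is now a line with a sole SP on it.
print(repr(header.split("\n")[2]))  # ' '
```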

Examples in the signature-format technical documentation include a
few of these cases but we did not show these otherwise invisible SPs
in the examples.  These trailing spaces cannot be seen on display or
on paper, and would force the readers to look for them in their
editors or pagers, even if we added them to the document.

Extend the overview section to explain the multi-line value
formatting and highlight these otherwise invisible SPs by inventing
the "a dollar-sign at the end of line that appears after SP merely
signals that there is a SP there, and the dollar-sign itself does
not appear in the real file" notation, inspired by "cat -e" output,
to help readers to learn exactly where such "a single SP that is
originally an empty line" appears in the examples.
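
The notation can be sketched as a small filter.  This is an
illustration in Python, not part of the patch, and the function name
`mark_trailing_space` is hypothetical.  Unlike "cat -e", which marks
the end of every line, it appends `$` only to lines that end with SP.

```python
def mark_trailing_space(text: str) -> str:
    """Append '$' to each line that ends with SP, and to no other line."""
    return "\n".join(
        line + "$" if line.endswith(" ") else line
        for line in text.split("\n")
    )


# Placeholder header text; the third line holds a single SP.
sample = (
    "gpgsig -----BEGIN PGP SIGNATURE-----\n"
    " Version: GnuPG v1\n"
    " \n"
    " iQEcBAAB..."
)
print(mark_trailing_space(sample))
```

Only the third line of the output gains a `$`; the other lines are
printed unchanged.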

Reported-by: Rob Browning <rlb@defaultvalue.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

    It turns out that the document has another block of text that
    needed the same treatment, so here is an attempt to follow
    through with the "$"-notation approach.

 Documentation/technical/signature-format.txt | 25 ++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

Comments

Rob Browning Oct. 14, 2021, 5:12 a.m. UTC | #1
Junio C Hamano <gitster@pobox.com> writes:

> A signature attached to a signed commit, and the contents of the
> commit that merged a signed tag, are both recorded as a value of an
> object header field as a multi-line value, and are subject to the
> formatting convention for multi-line values in the headers, with a
> leading SP signaling that the rest of the line is a continuation of
> the previous line.  Most notably, an empty line in such a multi-line
> value would result in a line with a sole SP on it.

One question I had was whether git's requirement was strictly a space,
or if it was following the rfc-822 convention where (if I remember
correctly) a tab is equivalent, i.e. the LWSP production in the grammar.

https://datatracker.ietf.org/doc/html/rfc822#section-3.2

Thanks
Junio C Hamano Oct. 14, 2021, 5:17 p.m. UTC | #2
Rob Browning <rlb@defaultvalue.org> writes:

> One question I had was whether git's requirement was strictly a space,
> or if it was following the rfc-822 convention where (if I remember
> correctly) a tab is equivalent, i.e. the LWSP production in the grammar.

We use a single SP when writing and we accept a single SP when
reading.  See commit.c::add_extra_header() for the writing side, and
commit.c::read_commit_extra_header_lines() for the reading side.

Unlike in RFC-822 style e-mail headers,

 * keywords in our object headers are not followed by a colon;

 * the value carried on our object header is not "logically a
   single line of characters";

 * our headers do not go through their "unfolding" (i.e. removal of
   CRLF eol markers to form a single long line, while preserving the
   WSP that immediately followed the CRLF).  We instead remove the
   SP that signalled the line as a continuation of the previous line
   to keep the original line structure.

With so little similarity, there is no reason for us to mimic their
"folding" rule.
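
The difference on the reading side can be sketched as follows.  This
is Python written for illustration, not a transcription of
commit.c::read_commit_extra_header_lines(); both function names are
hypothetical, and the input is assumed to be a single header together
with its continuation lines.

```python
def unfold_git(raw: str):
    """Split one object header into (key, value), keeping line structure."""
    if raw.endswith("\n"):
        raw = raw[:-1]
    key, _, rest = raw.partition(" ")
    # Drop the single SP that marks each continuation line, but keep the
    # newline, so the value stays a multi-line string.
    return key, rest.replace("\n ", "\n")


def unfold_rfc822(raw: str) -> str:
    """RFC 822 style unfolding: drop the CRLF, keep the SP or HT after it."""
    return raw.replace("\r\n ", " ").replace("\r\n\t", "\t")


raw = (
    "mergetag object 04b871796dc0420f8e7561a895b52484b701d51a\n"
    " type commit\n"
    " tag signedtag\n"
    " \n"
    " signed tag\n"
)
key, value = unfold_git(raw)
print(key)                # mergetag
print(value.split("\n"))  # five lines; the fourth one is empty again

print(unfold_rfc822("Subject: a\r\n b"))  # Subject: a b
```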

We limit to the SP and not LWSP for another reason.  Because the
exact byte sequence in the object (including the header part of
"commit" and "tag" objects) determines the name of the object, it
avoids ambiguity to make sure we do not allow unnecessary
"flexibility" in the way the same thing can be expressed.  If the
same signature is attached to a pair of otherwise identical commits
in their headers, one with SP signaling continued lines, the other
using HT (or random permutations of choice between SP/HT---making
2^N variants for an N-line signature block), we would needlessly
create many variants of the "same" commit, which is not ideal.
Rob Browning Oct. 15, 2021, 1:27 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> With so little similarity, there is no reason for us to mimic their
> "folding" rule.

Agreed.  I'd just lazily guessed it might be 822 (was also thinking of
Debian package headers), but certainly shouldn't have glossed over the
missing colon (for example) -- might be worth making sure the rules
described are covered in the technical docs, if they're not already, and
then perhaps refer to them in the section we're adjusting.

> We limit to the SP and not LWSP for another reason.  Because the
> exact byte sequence in the object (including the header part of
> "commit" and "tag" objects) determines the name of the object

Ahh, right, of course.

> we would needlessly create many variants of the "same" commit, which
> is not ideal.

Indeed.

Thanks
Junio C Hamano Oct. 15, 2021, 3:58 p.m. UTC | #4
Rob Browning <rlb@defaultvalue.org> writes:

> Agreed.  I'd just lazily guessed it might be 822 (was also thinking of
> ...
> Ahh, right, of course.
> ...
> Indeed.
>
> Thanks

So, back to the original discussion; would the replacement
documentation update be satisfactory?
Rob Browning Oct. 15, 2021, 11:29 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> Rob Browning <rlb@defaultvalue.org> writes:
>
>> Agreed.  I'd just lazily guessed it might be 822 (was also thinking of
>> ...
>> Ahh, right, of course.
>> ...
>> Indeed.
>>
>> Thanks
>
> So, back to the original discussion; would the replacement
> documentation update be satisfactory?

Certainly, and thanks.
Patch

diff --git a/Documentation/technical/signature-format.txt b/Documentation/technical/signature-format.txt
index 2c9406a56a..45448fba24 100644
--- a/Documentation/technical/signature-format.txt
+++ b/Documentation/technical/signature-format.txt
@@ -13,9 +13,25 @@  Signatures always begin with `-----BEGIN PGP SIGNATURE-----`
 and end with `-----END PGP SIGNATURE-----`, unless gpg is told to
 produce RFC1991 signatures which use `MESSAGE` instead of `SIGNATURE`.
 
+Signatures sometimes appear as a part of the normal payload
+(e.g. a signed tag has the signature block appended after the payload
+that the signature applies to), and sometimes appear in the value of
+an object header (e.g. a merge commit that merged a signed tag would
+have the entire tag contents on its "mergetag" header).  In the case
+of the latter, the usual multi-line formatting rule for object
+headers applies.  I.e. the second and subsequent lines are prefixed
+with a SP to signal that the line is continued from the previous
+line.
+
+This is even true for an originally empty line.  In the following
+examples, the end of line that ends with a whitespace letter is
+highlighted with a `$` sign; if you are trying to recreate these
+examples by hand, do not cut and paste them---they are there
+primarily to highlight extra whitespace at the end of some lines.
+
 The signed payload and the way the signature is embedded depends
 on the type of the object resp. transaction.
 
 == Tag signatures
 
 - created by: `git tag -s`
@@ -78,7 +95,7 @@  author A U Thor <author@example.com> 1465981137 +0000
 committer C O Mitter <committer@example.com> 1465981137 +0000
 gpgsig -----BEGIN PGP SIGNATURE-----
  Version: GnuPG v1
-
+ $
  iQEcBAABAgAGBQJXYRjRAAoJEGEJLoW3InGJ3IwIAIY4SA6GxY3BjL60YyvsJPh/
  HRCJwH+w7wt3Yc/9/bW2F+gF72kdHOOs2jfv+OZhq0q4OAN6fvVSczISY/82LpS7
  DVdMQj2/YcHDT4xrDNBnXnviDO9G7am/9OE77kEbXrp7QPxvhjkicHNwy2rEflAA
@@ -128,13 +145,13 @@  mergetag object 04b871796dc0420f8e7561a895b52484b701d51a
  type commit
  tag signedtag
  tagger C O Mitter <committer@example.com> 1465981006 +0000
-
+ $
  signed tag
-
+ $
  signed tag message body
  -----BEGIN PGP SIGNATURE-----
  Version: GnuPG v1
-
+ $
  iQEcBAABAgAGBQJXYRhOAAoJEGEJLoW3InGJklkIAIcnhL7RwEb/+QeX9enkXhxn
  rxfdqrvWd1K80sl2TOt8Bg/NYwrUBw/RWJ+sg/hhHp4WtvE1HDGHlkEz3y11Lkuh
  8tSxS3qKTxXUGozyPGuE90sJfExhZlW4knIQ1wt/yWqM+33E9pN4hzPqLwyrdods