[v2,0/5] Support for commits signed by multiple algorithms

Message ID 20210111035840.2437737-1-sandals@crustytoothpaste.net (mailing list archive)

Message

brian m. carlson Jan. 11, 2021, 3:58 a.m. UTC
This series introduces support for verifying commits and tags signed by
multiple algorithms.

Originally, we had planned for SHA-256 tags to stuff the signature in a
header instead of using a trailing signature, and a patch to do that was
sent out in part 1/3.  Unfortunately, for whatever reason, that patch
didn't make it into the master branch, and so we use trailing signatures
there.

We can't change this now, because otherwise it would be ambiguous
whether the trailing signature on a SHA-256 object was for the SHA-256
contents or whether the contents were a rewritten SHA-1 object with no
SHA-256 signature at all.  To do the next best thing, let's use the
trailing signature for the preferred hash algorithm and use a header for
the other variant.  This permits round-tripping, but has the downside
that tags signed with multiple algorithms can't be verified with older
versions of Git.  However, signatures created with older versions of Git
continue to be accepted.
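
As a rough illustration of the trailing-signature case, here is a minimal sketch that finds where a trailing signature begins in a tag buffer, so that everything before it can be treated as the signed payload.  The function name and the single armor marker are assumptions made for this example; this is not the code from the series, and real signatures may use other markers.

```c
#include <stddef.h>
#include <string.h>

/* Assumed marker for this sketch; other signature formats use other lines. */
static const char sig_marker[] = "-----BEGIN PGP SIGNATURE-----";

/*
 * Return the offset at which the trailing signature starts, or `len`
 * if the buffer has no line beginning with the marker.  If several
 * lines begin with the marker, the last one wins.
 */
static size_t trailing_sig_offset(const char *buf, size_t len)
{
	size_t marker_len = sizeof(sig_marker) - 1;
	size_t pos = 0, match = len;

	while (pos < len) {
		const char *eol;

		/* Only compare when enough bytes remain for a full marker. */
		if (len - pos >= marker_len &&
		    !memcmp(buf + pos, sig_marker, marker_len))
			match = pos;

		/* Advance to the start of the next line. */
		eol = memchr(buf + pos, '\n', len - pos);
		pos = eol ? (size_t)(eol - buf) + 1 : len;
	}
	return match;
}
```

The bytes before the returned offset are the payload and the bytes from the offset onward are the signature.  A signature for the other algorithm, stored in a header, would still have to be removed from that payload before verification.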

For commits, let's accept a commit that has two signatures.  We
previously created the commits correctly but didn't strip the extra
header off when verifying, so our verification indicated the signature
was bad.
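
To make the stripping step concrete, here is a minimal sketch that copies a commit buffer while dropping one named header and its continuation lines.  The helper name is hypothetical, and the header names used in the comments ("gpgsig" and "gpgsig-sha256") are assumptions for this example; this is not the code from the series.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a commit buffer into `out`, dropping every header named `key`
 * together with its continuation lines (lines starting with a space).
 * Only the header block, up to the first blank line, is examined; the
 * blank line and the message body are copied untouched.  `out` must
 * have room for at least `len` bytes.  Returns the bytes written.
 */
static size_t strip_header(const char *buf, size_t len,
			   const char *key, char *out)
{
	size_t keylen = strlen(key);
	size_t in = 0, n = 0;
	int skipping = 0;

	while (in < len) {
		const char *line = buf + in;
		const char *eol = memchr(line, '\n', len - in);
		size_t linelen = eol ? (size_t)(eol - line) + 1 : len - in;

		if (line[0] == '\n')
			break; /* blank line: end of the header block */

		/*
		 * A continuation line shares the fate of the header it
		 * continues; any other line starts a new header, e.g.
		 * "gpgsig-sha256 <first line of signature>".
		 */
		if (line[0] != ' ')
			skipping = linelen > keylen &&
				   !memcmp(line, key, keylen) &&
				   line[keylen] == ' ';

		if (!skipping) {
			memcpy(out + n, line, linelen);
			n += linelen;
		}
		in += linelen;
	}

	/* Copy the blank line and the message body verbatim. */
	memcpy(out + n, buf + in, len - in);
	return n + (len - in);
}
```

Requiring a space right after the key keeps "gpgsig" from matching a "gpgsig-sha256" line, so each header can be removed independently.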

Both of these changes allow for signing commits and tags that can be
round-tripped through both SHA-1 and SHA-256.  We verify only the
signature made with the current hash algorithm, since we currently don't
rewrite objects.

Changes from v1:
* Fix brown paper bag bug where some tests failed due to a bad rebase.

brian m. carlson (5):
  commit: ignore additional signatures when parsing signed commits
  gpg-interface: improve interface for parsing tags
  commit: allow parsing arbitrary buffers with headers
  ref-filter: hoist signature parsing
  gpg-interface: remove other signature headers before verifying

 builtin/receive-pack.c   |  4 +-
 builtin/tag.c            | 16 ++++++--
 commit.c                 | 83 ++++++++++++++++++++++++++--------------
 commit.h                 | 12 +++++-
 fmt-merge-msg.c          | 29 ++++++++------
 gpg-interface.c          | 15 +++++++-
 gpg-interface.h          |  9 ++++-
 log-tree.c               | 15 ++++----
 ref-filter.c             | 23 +++++++----
 t/t7004-tag.sh           | 25 ++++++++++++
 t/t7510-signed-commit.sh | 43 ++++++++++++++++++++-
 tag.c                    | 15 ++++----
 12 files changed, 219 insertions(+), 70 deletions(-)

Comments

Junio C Hamano Jan. 11, 2021, 10:16 p.m. UTC | #1
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> This series introduces support for verifying commits and tags signed by
> multiple algorithms.
>
> Originally, we had planned for SHA-256 tags to stuff the signature in a
> header instead of using a trailing signature, and a patch to do that was
> sent out in part 1/3.  Unfortunately, for whatever reason, that patch
> didn't make it into the master branch, and so we use trailing signatures
> there.
>
> We can't change this now, because otherwise it would be ambiguous
> whether the trailing signature on a SHA-256 object was for the SHA-256
> contents or whether the contents were a rewritten SHA-1 object with no
> SHA-256 signature at all.

How widely are SHA-256 tags in use in the real world, though?  Is it
really too late to fix that already?

brian m. carlson Jan. 11, 2021, 11:29 p.m. UTC | #2
On 2021-01-11 at 22:16:33, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > This series introduces support for verifying commits and tags signed by
> > multiple algorithms.
> >
> > Originally, we had planned for SHA-256 tags to stuff the signature in a
> > header instead of using a trailing signature, and a patch to do that was
> > sent out in part 1/3.  Unfortunately, for whatever reason, that patch
> > didn't make it into the master branch, and so we use trailing signatures
> > there.
> >
> > We can't change this now, because otherwise it would be ambiguous
> > whether the trailing signature on a SHA-256 object was for the SHA-256
> > contents or whether the contents were a rewritten SHA-1 object with no
> > SHA-256 signature at all.
> 
> How widely are SHA-256 tags in use in the real world, though?  Is it
> really too late to fix that already?

I don't know.  I don't know of any major hosting platform that supports
them, but of course many people may be using them independently on
self-hosted instances.

I don't think it matters one way or the other, honestly, because the
functionality is the same either way, whether we always put SHA-256 in a
header or whether we put the non-default algorithm in the header.
Multiply signed commits and tags are still unverifiable on older
versions because the older versions consider the header to be part of
the payload and not something to be stripped.

I just noticed this because I'm now getting to the case where we write
(and sign) both SHA-1 and SHA-256 versions of commits and I thought I'd
better send in a patch sooner rather than later, since there's a lot
more prep work (surprise) before we get to anything interesting.

Junio C Hamano Jan. 12, 2021, 2:03 a.m. UTC | #3
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> I just noticed this because I'm now getting to the case where we write
> (and sign) both SHA-1 and SHA-256 versions of commits and I thought I'd
> better send in a patch sooner rather than later, since there's a lot
> more prep work (surprise) before we get to anything interesting.

Uncomfortably excited to hear this ;-)

brian m. carlson Jan. 12, 2021, 2:24 a.m. UTC | #4
On 2021-01-12 at 02:03:08, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > I just noticed this because I'm now getting to the case where we write
> > (and sign) both SHA-1 and SHA-256 versions of commits and I thought I'd
> > better send in a patch sooner rather than later, since there's a lot
> > more prep work (surprise) before we get to anything interesting.
> 
> Uncomfortably excited to hear this ;-)

Here's a brief summary of what's ahead:

* struct object_id is going to have an algorithm member.
* Consequently, there will be per-algorithm null OIDs.
* To efficiently compare and copy OIDs of all sizes (notably in khash
  tables, where things otherwise become tricky), we'll zero-pad SHA-1
  OIDs and always compare the full buffer.
* For all of these reasons, oidread (or similar) will become practically
  required.
* Objects will be mapped into the loose object store when written, and
  each type of object will have a function to convert it between formats
  if required.
* The testsuite will learn a mode not to stuff invalid OIDs into things,
  since those will no longer work (because those OIDs can't be mapped
  and so we can't create SHA-1 versions of them).  That will necessitate
  another large set of prerequisite additions in the testsuite.
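
The first few points in the list above might look roughly like the following sketch.  All names here are hypothetical (prefixed with sketch_ to avoid suggesting they are Git's real identifiers), and the layout is an assumption for illustration only, not the planned implementation.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_SHA1_RAWSZ   20
#define SKETCH_SHA256_RAWSZ 32
#define SKETCH_MAX_RAWSZ    SKETCH_SHA256_RAWSZ

enum sketch_algo {
	SKETCH_ALGO_SHA1 = 1,
	SKETCH_ALGO_SHA256 = 2
};

/* An object ID that records which algorithm produced it. */
struct sketch_oid {
	unsigned char hash[SKETCH_MAX_RAWSZ];
	int algo;
};

static size_t sketch_rawsz(int algo)
{
	return algo == SKETCH_ALGO_SHA1 ? SKETCH_SHA1_RAWSZ
					: SKETCH_SHA256_RAWSZ;
}

/*
 * Read a raw hash into an OID.  Shorter hashes are zero-padded so the
 * whole buffer is always initialized; this is why a reader function
 * becomes practically required instead of a bare memcpy.
 */
static void sketch_oidread(struct sketch_oid *oid,
			   const unsigned char *raw, int algo)
{
	size_t n = sketch_rawsz(algo);

	memcpy(oid->hash, raw, n);
	memset(oid->hash + n, 0, SKETCH_MAX_RAWSZ - n);
	oid->algo = algo;
}

/* Per-algorithm null OID: all-zero hash tagged with the algorithm. */
static void sketch_null_oid(struct sketch_oid *oid, int algo)
{
	memset(oid->hash, 0, SKETCH_MAX_RAWSZ);
	oid->algo = algo;
}

/* Compare the full buffer regardless of algorithm; padding makes this safe. */
static int sketch_oideq(const struct sketch_oid *a,
			const struct sketch_oid *b)
{
	return !memcmp(a->hash, b->hash, SKETCH_MAX_RAWSZ);
}
```

Because the comparison always covers the full buffer, a hash table can hash and compare OIDs of either size with the same fixed-length code.  Whether equality should also take the algorithm member into account is left open in this sketch.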

In-progress work can be seen on the transition-interop branch at
https://github.com/bk2204/git.git.  I should point out that it is very
much in progress; the tip will definitely fail the testsuite in certain
cases.