mbox series

[RFC,0/2] TPM derived keys

Message ID 20240503221634.44274-1-ignat@cloudflare.com (mailing list archive)
Headers show
Series TPM derived keys | expand

Message

Ignat Korchagin May 3, 2024, 10:16 p.m. UTC
TPM derived keys get their payload from an HMAC primary key in the owner
hierarchy mixed with some metadata from the requesting process.

They are similar to trusted keys in the sense that the key security is rooted
in the TPM, but may provide easier key management for some use-cases.

One inconvenience with trusted keys is that the cryptographic material should
be provided externally. This means either wrapping the key to the TPM on the
executing system (which briefly exposes plaintext cryptographic material to
userspace) or creating the wrapped blob externally, but then we need to gather
and transfer the TPM public key to the remote system, which may be a logistical
problem sometimes.

Moreover, we need to store the wrapped key blob somewhere, and if we lose it,
the application cannot recover its data anymore.

TPM derived keys may make key management for applications easier, especially on
stateless systems as the application can always recreate its keys and the
encrypted data is bound to the device and its TPM. They allow the application
to wrap/unwrap some data to the device without worrying too much about key
management and provisioning. They are similar in a sense to device unique keys
present on many mobile devices and some IoT systems, but even better as every
application has its own unique device key.

It is also easy to quickly "wipe" all the application keys by just resetting
the TPM owner hierarchy.

It is worth mentioning that this functionality can be implemented in userspace
as a /sbin/request-key plugin. However, the advantage of the in-kernel
implementation is that the derived key material never leaves the kernel space
(unless explicitly read into userspace with proper permissions).

Current implementation supports two modes (as demonstrated by the keyctl
userspace tool):
  1. keyctl add derived test '32 path' - will derive a 32 byte key based on
     the TPM seed and the filesystem path of the requesting application. That
     is /usr/bin/keyctl and /opt/bin/keyctl would generate different keys.

  2. keyctl add derived test '32 csum' - will derive a 32 byte key based on the
     TPM seed and the IMA measurement of the requesting application. That is
     /usr/bin/keyctl and /opt/bin/keyctl would generate the same key IFF their
     code exactly matches bit for bit. The implementation does not measure the
     requesting binary itself, but rather relies on already available
     measurement. This means for this mode to work IMA needs to be enabled and
     configured for requesting applications. For example:
       # echo 'audit func=BPRM_CHECK' > \
         /sys/kernel/security/integrity/ima/policy

Open questions (apart from the obvious "is this useful?"):
  * should any other modes/derivation parameters be considered?
  * apparently in checksum mode, when calling keyring syscalls from scripts,
    we mix in the measurement of the interpreter, not the script itself. Is
    there any way to improve this?


Ignat Korchagin (2):
  tpm: add some algorithm and constant definitions from the TPM spec
  KEYS: implement derived keys

 include/linux/tpm.h                     |  16 +-
 security/keys/Kconfig                   |  16 ++
 security/keys/Makefile                  |   1 +
 security/keys/derived-keys/Makefile     |   8 +
 security/keys/derived-keys/derived.c    | 226 +++++++++++++++++++++
 security/keys/derived-keys/derived.h    |   4 +
 security/keys/derived-keys/tpm2_shash.c | 257 ++++++++++++++++++++++++
 7 files changed, 524 insertions(+), 4 deletions(-)
 create mode 100644 security/keys/derived-keys/Makefile
 create mode 100644 security/keys/derived-keys/derived.c
 create mode 100644 security/keys/derived-keys/derived.h
 create mode 100644 security/keys/derived-keys/tpm2_shash.c

Comments

Jarkko Sakkinen May 4, 2024, 12:21 a.m. UTC | #1
On Sat May 4, 2024 at 1:16 AM EEST, Ignat Korchagin wrote:
> TPM derived keys get their payload from an HMAC primary key in the owner
> hierarchy mixed with some metadata from the requesting process.

What metadata?
What is "the requesting process"?

>
> They are similar to trusted keys in the sense that the key security is rooted
> in the TPM, but may provide easier key management for some use-cases.

Which use cases?

Two first paragraphs are confusers not motivators with three undefined assets.

> One inconvenience with trusted keys is that the cryptographic material should
> be provided externally. This means either wrapping the key to the TPM on the
> executing system (which briefly exposes plaintext cryptographic material to
> userspace) or creating the wrapped blob externally, but then we need to gather
> and transfer the TPM public key to the remote system, which may be a logistical
> problem sometimes.

What are the *existential* issues?

You are start by inconviences with trusted keys without describing for
what the trusted keys are used for.


> Moreover, we need to store the wrapped key blob somewhere, and if we lose it,
> the application cannot recover its data anymore.

I don't frankly understand what you are trying to say here. Somewhere is
not a place. It is an indeterministic entity.

>
> TPM derived keys may make key management for applications easier, especially on
> stateless systems as the application can always recreate its keys and the
> encrypted data is bound to the device and its TPM. They allow the application
> to wrap/unwrap some data to the device without worrying too much about key
> management and provisioning. They are similar in a sense to device unique keys
> present on many mobile devices and some IoT systems, but even better as every
> application has its own unique device key.

Does it or does it not make it easier? Please decide.

That said hard fine from mainline perspective unless there is an
existential issue.

>
> It is also easy to quickly "wipe" all the application keys by just resetting
> the TPM owner hierarchy.
>
> It is worth mentioning that this functionality can be implemented in userspace
> as a /sbin/request-key plugin. However, the advantage of the in-kernel
> implementation is that the derived key material never leaves the kernel space
> (unless explicitly read into userspace with proper permissions).

Please describe the implementation with request-key in the context of
the use case where it is used. That is what this should have started.
Then the motivation. Then the proposal for solution. And also focus
only on existential factors.

I have no idea for what the key created with this is even used, which
makes this impossible to review.

BR, Jarkko
Ben Boeckel May 4, 2024, 1:55 p.m. UTC | #2
On Sat, May 04, 2024 at 03:21:11 +0300, Jarkko Sakkinen wrote:
> I have no idea for what the key created with this is even used, which
> makes this impossible to review.

Additionally, there is nothing in Documentation/ for how userspace might
use or create them. This includes things like their description format
and describing available options.

--Ben
Jarkko Sakkinen May 4, 2024, 2:51 p.m. UTC | #3
On Sat May 4, 2024 at 4:55 PM EEST, Ben Boeckel wrote:
> On Sat, May 04, 2024 at 03:21:11 +0300, Jarkko Sakkinen wrote:
> > I have no idea for what the key created with this is even used, which
> > makes this impossible to review.
>
> Additionally, there is nothing in Documentation/ for how userspace might
> use or create them. This includes things like their description format
> and describing available options.

The whole user story is plain out broken. Documenting a feature that has
no provable use case won't fix that part.

So it is better to start with the cover letter. With the *existing*
knowledge of the *real* issue I don't think we need this tbh.

BR, Jarkko
Jarkko Sakkinen May 4, 2024, 3:35 p.m. UTC | #4
On Sat May 4, 2024 at 5:51 PM EEST, Jarkko Sakkinen wrote:
> On Sat May 4, 2024 at 4:55 PM EEST, Ben Boeckel wrote:
> > On Sat, May 04, 2024 at 03:21:11 +0300, Jarkko Sakkinen wrote:
> > > I have no idea for what the key created with this is even used, which
> > > makes this impossible to review.
> >
> > Additionally, there is nothing in Documentation/ for how userspace might
> > use or create them. This includes things like their description format
> > and describing available options.
>
> The whole user story is plain out broken. Documenting a feature that has
> no provable use case won't fix that part.
>
> So it is better to start with the cover letter. With the *existing*
> knowledge of the *real* issue I don't think we need this tbh.

As for code I'd suggest the "Describe your changes" part from 

  https://www.kernel.org/doc/html/latest/process/submitting-patches.html

and most essentially how to split them properly.

My best bet could something along the lines that perhaps there is some
issue to be sorted out but I don't honestly believe that this will ever
be a solution for any possible problem that exist in this planet.

BR, Jarkko
Ignat Korchagin May 13, 2024, 5:09 p.m. UTC | #5
On Sat, May 4, 2024 at 5:35 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Sat May 4, 2024 at 5:51 PM EEST, Jarkko Sakkinen wrote:
> > On Sat May 4, 2024 at 4:55 PM EEST, Ben Boeckel wrote:
> > > On Sat, May 04, 2024 at 03:21:11 +0300, Jarkko Sakkinen wrote:
> > > > I have no idea for what the key created with this is even used, which
> > > > makes this impossible to review.
> > >
> > > Additionally, there is nothing in Documentation/ for how userspace might
> > > use or create them. This includes things like their description format
> > > and describing available options.
> >
> > The whole user story is plain out broken. Documenting a feature that has
> > no provable use case won't fix that part.
> >
> > So it is better to start with the cover letter. With the *existing*
> > knowledge of the *real* issue I don't think we need this tbh.
>
> As for code I'd suggest the "Describe your changes" part from
>
>   https://www.kernel.org/doc/html/latest/process/submitting-patches.html
>
> and most essentially how to split them properly.
>
> My best bet could something along the lines that perhaps there is some
> issue to be sorted out but I don't honestly believe that this will ever
> be a solution for any possible problem that exist in this planet.

Sorry, I must admit I wrote the description hastingly and too
high-level (it was pre-travelling, so probably not the right focus and
in a rush). Let me restart from scratch and describe particular
use-cases we're concerned about:

Trusted and encrypted keys are a great way to manage cryptographic
keys inside the kernel, while never exposing plaintext cryptographic
material to userspace: keys can only be read to userspace as encrypted
blobs and with trusted keys - these blobs are created with the TPM, so
only the TPM can unwrap the blobs.

One of the simplest way to create a trusted key is for an application
to request the kernel to generate a new one [1], like below with the
help of keyctl utility from keyutils:
$ keyctl add trusted kmk "new 32 keyhandle=0x81000001" @u

However, after the application generates a trusted key, it is the
responsibility of the application to manage/store it. For example, if
the application wants to reuse the key after a reboot, it needs to
read the key into userspace as an encrypted blob and store it on
persistent storage. This is challenging and sometimes not possible for
stateless/immutable/ephemeral systems, so such systems are effectively
locked out from using hardware-protected cryptographic keys.

Another point: while the fact that the application can't read the
plaintext cryptographic material into userspace is a feature of
trusted keys, it can also be a disadvantage. Since keys in plaintext
exist only in kernel context, they are useful mostly for in-kernel
systems, like dm-crypt, IMA, ecryptfs. Applications cannot easily use
trusted keys for cryptographic purposes for their own workloads: for
example, generating encrypted or MACed configuration files or
encrypting in-transit data. While since commit 7984ceb134bf ("crypto:
af_alg - Support symmetric encryption via keyring keys") it is
possible to use a trusted key via Linux Crypto API userspace interface
[2], it might not always be practical/desirable:
  * due to limitations in the Linux Crypto API implementation it is
not possible to process more than ~64Kb of data using AEAD ciphers [3]
  * needed algorithm implementations might not be enabled in the
kernel configuration file
  * compliance constraints: the utilised cryptographic implementation
must be FIPS-validated
  * performance constraints: passing large blobs of data to the kernel
for encryption is slow even with Crypto API's "zero-copy" interface
[3]

TPM derived keys attempt to address the above use cases by allowing
applications to deterministically derive unique cryptographic keys for
their own purposes directly from the TPM seed in the owner hierarchy.
The idea is that when an application requests a new key, instead of
generating a random key and wrapping it with the TPM, the
implementation generates a key via KDF(hierarchy seed, application
specific info). Therefore, the resulting keys will always be
cryptographically bound to the application itself and the device they
were generated on.

The applications then may either use in-kernel facilities, like [2],
to do crypto operations inside the kernel, so the generated
cryptographic material is never exposed to userspace (similar to
trusted/encrypted keys). Or, if they are subject to
performance/compliance/other constraints mentioned above, they can
read the key material to userspace and use a userspace crypto library.
Even with the latter approach they still get the benefit of using a
key, security of which is rooted in the TPM.

TPM derived keys also address the key storage problem for
stateless/immutable/ephemeral systems: since the derivation process is
deterministic, the same application can always re-create their keys on
the same system and doesn't need to store or back up any wrapped key
blobs. One notable use case (ironically not for a stateless system)
can be setting up proper full-disk encryption (dm-crypt plain mode
without a LUKS header), for example, to provide deniable encryption or
better resiliency to damage of encrypted media [4].

Current implementation provides two options for KDF's input for
application specific info to ensure key uniqueness:

1. A key, which is unique to a filesystem path:
$ keyctl add derived test '32 path'

Above will derive a 32 byte key based on the TPM seed and the
filesystem path of the requesting application. That is /usr/bin/keyctl
and /opt/bin/keyctl would generate different keys.

2. A key, which is cryptographically bound to the code of the
requesting application:
$ keyctl add derived test '32 csum'

Above will derive a 32 byte key based on the TPM seed and the IMA
measurement of the requesting application. That is /usr/bin/keyctl and
/opt/bin/keyctl would generate the same key if and only if their code
exactly matches bit for bit. The implementation does not measure the
requesting binary itself, but rather relies on already available
measurement. This means for this mode to work IMA needs to be enabled
and configured for requesting applications. For example:
# echo 'audit func=BPRM_CHECK' > \
   /sys/kernel/security/integrity/ima/policy

Open questions:
  * should any other modes/derivation parameters be considered as part
of application specific info?
  * apparently in checksum mode, when calling keyring syscalls from
scripts, we mix in the measurement of the interpreter, not the script
itself. Is there any way to improve this?

I would like to mention that in Cloudflare we have found large
infrastructure key management based on derived keys from per-device
unique seeds quite convenient and almost infinitely scalable and I
believe TPM derived keys can be the next evolution bringing hardware
security to the table. I understand that folks here are not required
to follow links for additional information, but if someone is
interested in more details for our approach, which has been working
well for almost 9 years, see [5].

Hope it is better this time.

Ignat

[1]: https://www.kernel.org/doc/html/latest/security/keys/trusted-encrypted.html#examples-of-trusted-and-encrypted-key-usage
[2]: https://www.kernel.org/doc/html/latest/crypto/userspace-if.html
[3]: https://blog.cloudflare.com/the-linux-crypto-api-for-user-applications
[4]: https://wiki.archlinux.org/title/Dm-crypt/Encrypting_an_entire_system#Plain_dm-crypt
[5]: https://youtu.be/2RPcIbP2xsM?si=nKbyY0gss50i04CG

> BR, Jarkko
Ignat Korchagin May 13, 2024, 5:11 p.m. UTC | #6
On Fri, May 3, 2024 at 11:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
>
> TPM derived keys get their payload from an HMAC primary key in the owner
> hierarchy mixed with some metadata from the requesting process.
>
> They are similar to trusted keys in the sense that the key security is rooted
> in the TPM, but may provide easier key management for some use-cases.
>
> One inconvenience with trusted keys is that the cryptographic material should
> be provided externally. This means either wrapping the key to the TPM on the

I would like to point out to myself I was wrong: it is possible to ask
the kernel to generate a trusted key inside the kernel locally with
"keyctl add trusted kmk "new 32" @u"

> executing system (which briefly exposes plaintext cryptographic material to
> userspace) or creating the wrapped blob externally, but then we need to gather
> and transfer the TPM public key to the remote system, which may be a logistical
> problem sometimes.
>
> Moreover, we need to store the wrapped key blob somewhere, and if we lose it,
> the application cannot recover its data anymore.
>
> TPM derived keys may make key management for applications easier, especially on
> stateless systems as the application can always recreate its keys and the
> encrypted data is bound to the device and its TPM. They allow the application
> to wrap/unwrap some data to the device without worrying too much about key
> management and provisioning. They are similar in a sense to device unique keys
> present on many mobile devices and some IoT systems, but even better as every
> application has its own unique device key.
>
> It is also easy to quickly "wipe" all the application keys by just resetting
> the TPM owner hierarchy.
>
> It is worth mentioning that this functionality can be implemented in userspace
> as a /sbin/request-key plugin. However, the advantage of the in-kernel
> implementation is that the derived key material never leaves the kernel space
> (unless explicitly read into userspace with proper permissions).
>
> Current implementation supports two modes (as demonstrated by the keyctl
> userspace tool):
>   1. keyctl add derived test '32 path' - will derive a 32 byte key based on
>      the TPM seed and the filesystem path of the requesting application. That
>      is /usr/bin/keyctl and /opt/bin/keyctl would generate different keys.
>
>   2. keyctl add derived test '32 csum' - will derive a 32 byte key based on the
>      TPM seed and the IMA measurement of the requesting application. That is
>      /usr/bin/keyctl and /opt/bin/keyctl would generate the same key IFF their
>      code exactly matches bit for bit. The implementation does not measure the
>      requesting binary itself, but rather relies on already available
>      measurement. This means for this mode to work IMA needs to be enabled and
>      configured for requesting applications. For example:
>        # echo 'audit func=BPRM_CHECK' > \
>          /sys/kernel/security/integrity/ima/policy
>
> Open questions (apart from the obvious "is this useful?"):
>   * should any other modes/derivation parameters be considered?
>   * apparently in checksum mode, when calling keyring syscalls from scripts,
>     we mix in the measurement of the interpreter, not the script itself. Is
>     there any way to improve this?
>
>
> Ignat Korchagin (2):
>   tpm: add some algorithm and constant definitions from the TPM spec
>   KEYS: implement derived keys
>
>  include/linux/tpm.h                     |  16 +-
>  security/keys/Kconfig                   |  16 ++
>  security/keys/Makefile                  |   1 +
>  security/keys/derived-keys/Makefile     |   8 +
>  security/keys/derived-keys/derived.c    | 226 +++++++++++++++++++++
>  security/keys/derived-keys/derived.h    |   4 +
>  security/keys/derived-keys/tpm2_shash.c | 257 ++++++++++++++++++++++++
>  7 files changed, 524 insertions(+), 4 deletions(-)
>  create mode 100644 security/keys/derived-keys/Makefile
>  create mode 100644 security/keys/derived-keys/derived.c
>  create mode 100644 security/keys/derived-keys/derived.h
>  create mode 100644 security/keys/derived-keys/tpm2_shash.c
>
> --
> 2.39.2
>
James Bottomley May 13, 2024, 10:33 p.m. UTC | #7
On Mon, 2024-05-13 at 18:09 +0100, Ignat Korchagin wrote:
[...]
> TPM derived keys attempt to address the above use cases by allowing
> applications to deterministically derive unique cryptographic keys
> for their own purposes directly from the TPM seed in the owner
> hierarchy. The idea is that when an application requests a new key,
> instead of generating a random key and wrapping it with the TPM, the
> implementation generates a key via KDF(hierarchy seed, application
> specific info). Therefore, the resulting keys will always be
> cryptographically bound to the application itself and the device they
> were generated on.

So I think what confuses me is what the expected cryptographic secrecy
properties of the derived keys are.  I get they're a KDF of seed and
deterministic properties, but if those mixing values are well known (as
the path or binary checksum cases) then anyone with access to the TPM
can derive the key from user space because they can easily obtain the
mixing parameters and there's no protection to the TPM keyed hash
operation.

Consider the use case where two users are using derived keys on the
same system (so same TPM).  Assuming they use them to protect sensitive
information, what prevents user1 from simply deriving user2's key and
getting the information, or am I missing the point of this?

James
Jarkko Sakkinen May 14, 2024, 12:28 a.m. UTC | #8
On Mon May 13, 2024 at 8:11 PM EEST, Ignat Korchagin wrote:
> On Fri, May 3, 2024 at 11:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
> I would like to point out to myself I was wrong: it is possible to ask
> the kernel to generate a trusted key inside the kernel locally with
> "keyctl add trusted kmk "new 32" @u"

Not in a full-time kernel position ATM as I'm working as contract
researcher up until beginning of Oct (took some industry break after
a startup went down of business), so please, politely asking, write
a bit more compact descriptions ;-) I'm trying to find a new position by
the beginning of Oct but right now I'd appreciate a bit more thought out
text descriptions.

I'm working out a small patch set with James Prestwood to add asymmetric
TPM2 keys based on his old patch set [1] but laid out on top of the
existing baseline.

I did already the key type shenanigans etc. for it and James P is laying
his pre-existing RSA code and new ECDSA on top of that. So this will
give x.509 compatibility [2]. This patch set will be out soon and likely
part of 6.11 (or almost guaranteed as most of it is done).

So by plain guess this might be along the lines what you might want?

[1] https://lore.kernel.org/all/20200518172704.29608-1-prestwoj@gmail.com/
[2] https://datatracker.ietf.org/doc/draft-woodhouse-cert-best-practice/

BR, Jarkko
Ignat Korchagin May 14, 2024, 9:50 a.m. UTC | #9
On Mon, May 13, 2024 at 11:33 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Mon, 2024-05-13 at 18:09 +0100, Ignat Korchagin wrote:
> [...]
> > TPM derived keys attempt to address the above use cases by allowing
> > applications to deterministically derive unique cryptographic keys
> > for their own purposes directly from the TPM seed in the owner
> > hierarchy. The idea is that when an application requests a new key,
> > instead of generating a random key and wrapping it with the TPM, the
> > implementation generates a key via KDF(hierarchy seed, application
> > specific info). Therefore, the resulting keys will always be
> > cryptographically bound to the application itself and the device they
> > were generated on.
>
> So I think what confuses me is what the expected cryptographic secrecy
> properties of the derived keys are.  I get they're a KDF of seed and
> deterministic properties, but if those mixing values are well known (as
> the path or binary checksum cases) then anyone with access to the TPM
> can derive the key from user space because they can easily obtain the
> mixing parameters and there's no protection to the TPM keyed hash
> operation.
>
> Consider the use case where two users are using derived keys on the
> same system (so same TPM).  Assuming they use them to protect sensitive
> information, what prevents user1 from simply deriving user2's key and
> getting the information, or am I missing the point of this?

You are correct: it is possible, but in practice it would be limited
only to privileged users/applications. I remember there was a push to
set a 666 mask for the TPM device file, but it is not how it is done
today by default. Also I think the same applies to trusted keys as
well, at least without any additional authorizations or PCR
restrictions on the blob (I remember I could manually unwrap a trusted
key blob in userspace as root).

It would be fixed if we could limit access to some TPM ops only from
the kernel, but I remember from one of your presentations that it is
generally a hard problem and that some solution was in the works (was
it based on limiting access to a resettable PCR?). I'm happy to
consider adopting it here as well somehow.

> James
>
Ignat Korchagin May 14, 2024, 10:05 a.m. UTC | #10
On Tue, May 14, 2024 at 1:28 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Mon May 13, 2024 at 8:11 PM EEST, Ignat Korchagin wrote:
> > On Fri, May 3, 2024 at 11:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
> > I would like to point out to myself I was wrong: it is possible to ask
> > the kernel to generate a trusted key inside the kernel locally with
> > "keyctl add trusted kmk "new 32" @u"
>
> Not in a full-time kernel position ATM as I'm working as contract
> researcher up until beginning of Oct (took some industry break after
> a startup went down of business), so please, politely asking, write
> a bit more compact descriptions ;-) I'm trying to find a new position by
> the beginning of Oct but right now I'd appreciate a bit more thought out
> text descriptions.
>
> I'm working out a small patch set with James Prestwood to add asymmetric
> TPM2 keys based on his old patch set [1] but laid out on top of the
> existing baseline.
>
> I did already the key type shenanigans etc. for it and James P is laying
> his pre-existing RSA code and new ECDSA on top of that. So this will

This is great. Perhaps we can finally have ECDSA software signature
support as well, which I have been trying to get in for some time now
[1]

> give x.509 compatibility [2]. This patch set will be out soon and likely
> part of 6.11 (or almost guaranteed as most of it is done).
>
> So by plain guess this might be along the lines what you might want?

I don't think so. I have seen this patchset, but unless the new
version is fundamentally different, it looks to me that the asymmetric
TPM keys are the same as trusted keys except they are asymmetric
instead of being symmetric. That is, they are still of limited use on
stateless systems and are subject to the same restrictions I described
in my revised cover description.

On top of that I'm not sure they would be widely used as "leaf" keys
by applications, maybe more as root/intermediate keys in some kind of
key hierarchy. TPMs are slow and I don't see a high-performance
web-server, for example, using asymmetric TPM keys for TLS operations.
Also, as we learned the hard way operating many TPMs in production,
some TPMs are quite unreliable and fail really fast, if you "spam"
them with a lot of crypto ops. I understand this is a HW/TPM vendor
problem, but in practice we're trying to build systems, where TPM is
used to protect/generate other keys, but most of the "leaf" crypto
operations are done in software, so we don't make the TPM do too much
crypto.

Just to clarify - I'm not arguing about the usefulness of TPM
asymmetric keys in the kernel. I would really want to see this
building block available as well, but I think it just serves a
different purpose/use case from what I'm trying to figure out in this
RFC thread.

> [1] https://lore.kernel.org/all/20200518172704.29608-1-prestwoj@gmail.com/
> [2] https://datatracker.ietf.org/doc/draft-woodhouse-cert-best-practice/
>
> BR, Jarkko

[1] https://lore.kernel.org/lkml/20221014100737.94742-2-ignat@cloudflare.com/T/
Jarkko Sakkinen May 14, 2024, 12:09 p.m. UTC | #11
On Tue May 14, 2024 at 1:05 PM EEST, Ignat Korchagin wrote:
> On Tue, May 14, 2024 at 1:28 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Mon May 13, 2024 at 8:11 PM EEST, Ignat Korchagin wrote:
> > > On Fri, May 3, 2024 at 11:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
> > > I would like to point out to myself I was wrong: it is possible to ask
> > > the kernel to generate a trusted key inside the kernel locally with
> > > "keyctl add trusted kmk "new 32" @u"
> >
> > Not in a full-time kernel position ATM as I'm working as contract
> > researcher up until beginning of Oct (took some industry break after
> > a startup went down of business), so please, politely asking, write
> > a bit more compact descriptions ;-) I'm trying to find a new position by
> > the beginning of Oct but right now I'd appreciate a bit more thought out
> > text descriptions.
> >
> > I'm working out a small patch set with James Prestwood to add asymmetric
> > TPM2 keys based on his old patch set [1] but laid out on top of the
> > existing baseline.
> >
> > I did already the key type shenanigans etc. for it and James P is laying
> > his pre-existing RSA code and new ECDSA on top of that. So this will
>
> This is great. Perhaps we can finally have ECDSA software signature
> support as well, which I have been trying to get in for some time now
> [1]

Yes exactly both.

>
> > give x.509 compatibility [2]. This patch set will be out soon and likely
> > part of 6.11 (or almost guaranteed as most of it is done).
> >
> > So by plain guess this might be along the lines what you might want?
>
> I don't think so. I have seen this patchset, but unless the new
> version is fundamentally different, it looks to me that the asymmetric
> TPM keys are the same as trusted keys except they are asymmetric
> instead of being symmetric. That is, they are still of limited use on
> stateless systems and are subject to the same restrictions I described
> in my revised cover description.

OK, hmm... can you an "apples and oranges" example what would be
most trivial use case where these don't cut?


> On top of that I'm not sure they would be widely used as "leaf" keys
> by applications, maybe more as root/intermediate keys in some kind of
> key hierarchy. TPMs are slow and I don't see a high-performance
> web-server, for example, using asymmetric TPM keys for TLS operations.
> Also, as we learned the hard way operating many TPMs in production,
> some TPMs are quite unreliable and fail really fast, if you "spam"
> them with a lot of crypto ops. I understand this is a HW/TPM vendor
> problem, but in practice we're trying to build systems, where TPM is
> used to protect/generate other keys, but most of the "leaf" crypto
> operations are done in software, so we don't make the TPM do too much
> crypto.

So what about SGX/SNP/TDX?

TPM is definitely not made for workloads :-)

> Just to clarify - I'm not arguing about the usefulness of TPM
> asymmetric keys in the kernel. I would really want to see this
> building block available as well, but I think it just serves a
> different purpose/use case from what I'm trying to figure out in this
> RFC thread.

Got it :-) NP

BR, Jarkko
Ignat Korchagin May 14, 2024, 1:11 p.m. UTC | #12
On Tue, May 14, 2024 at 1:09 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Tue May 14, 2024 at 1:05 PM EEST, Ignat Korchagin wrote:
> > On Tue, May 14, 2024 at 1:28 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >
> > > On Mon May 13, 2024 at 8:11 PM EEST, Ignat Korchagin wrote:
> > > > On Fri, May 3, 2024 at 11:16 PM Ignat Korchagin <ignat@cloudflare.com> wrote:
> > > > I would like to point out to myself I was wrong: it is possible to ask
> > > > the kernel to generate a trusted key inside the kernel locally with
> > > > "keyctl add trusted kmk "new 32" @u"
> > >
> > > Not in a full-time kernel position ATM as I'm working as contract
> > > researcher up until beginning of Oct (took some industry break after
> > > a startup went down of business), so please, politely asking, write
> > > a bit more compact descriptions ;-) I'm trying to find a new position by
> > > the beginning of Oct but right now I'd appreciate a bit more thought out
> > > text descriptions.
> > >
> > > I'm working out a small patch set with James Prestwood to add asymmetric
> > > TPM2 keys based on his old patch set [1] but laid out on top of the
> > > existing baseline.
> > >
> > > I did already the key type shenanigans etc. for it and James P is laying
> > > his pre-existing RSA code and new ECDSA on top of that. So this will
> >
> > This is great. Perhaps we can finally have ECDSA software signature
> > support as well, which I have been trying to get in for some time now
> > [1]
>
> Yes exactly both.
>
> >
> > > give x.509 compatibility [2]. This patch set will be out soon and likely
> > > part of 6.11 (or almost guaranteed as most of it is done).
> > >
> > > So by plain guess this might be along the lines what you might want?
> >
> > I don't think so. I have seen this patchset, but unless the new
> > version is fundamentally different, it looks to me that the asymmetric
> > TPM keys are the same as trusted keys except they are asymmetric
> > instead of being symmetric. That is, they are still of limited use on
> > stateless systems and are subject to the same restrictions I described
> > in my revised cover description.
>
> OK, hmm... can you an "apples and oranges" example what would be
> most trivial use case where these don't cut?

For example, a cheap NAS box with no internal storage (disks connected
externally via USB). We want:
  * disks to be encrypted and decryptable only by this NAS box
  * if someone steals one of the disks - we don't want them to see it
has encrypted data (no LUKS header)

Additionally we may want to SSH into the NAS for configuration and we
don't want the SSH server key to change after each boot (regardless if
disks are connected or not).

>
> > On top of that I'm not sure they would be widely used as "leaf" keys
> > by applications, maybe more as root/intermediate keys in some kind of
> > key hierarchy. TPMs are slow and I don't see a high-performance
> > web-server, for example, using asymmetric TPM keys for TLS operations.
> > Also, as we learned the hard way operating many TPMs in production,
> > some TPMs are quite unreliable and fail really fast, if you "spam"
> > them with a lot of crypto ops. I understand this is a HW/TPM vendor
> > problem, but in practice we're trying to build systems, where TPM is
> > used to protect/generate other keys, but most of the "leaf" crypto
> > operations are done in software, so we don't make the TPM do too much
> > crypto.
>
> So what about SGX/SNP/TDX?

In theory yes, but I have chased the tech for a while on commodity HW
and it keeps having problems.

> TPM is definitely not made for workloads :-)
>
> > Just to clarify - I'm not arguing about the usefulness of TPM
> > asymmetric keys in the kernel. I would really want to see this
> > building block available as well, but I think it just serves a
> > different purpose/use case from what I'm trying to figure out in this
> > RFC thread.
>
> Got it :-) NP
>
> BR, Jarkko
Jarkko Sakkinen May 14, 2024, 2 p.m. UTC | #13
On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> For example, a cheap NAS box with no internal storage (disks connected
> externally via USB). We want:
>   * disks to be encrypted and decryptable only by this NAS box

So how this differs from LUKS2 style, which also systemd supports where
the encryption key is anchored to PCR's? If I took hard drive out of my
Linux box, I could not decrypt it in another machine because of this.

>   * if someone steals one of the disks - we don't want them to see it
> has encrypted data (no LUKS header)

So what happens when you reconnect?

> Additionally we may want to SSH into the NAS for configuration and we
> don't want the SSH server key to change after each boot (regardless if
> disks are connected or not).

Right, interesting use case. Begin before any technical jargon exactly
with a great example like this. Then it is easier to start to anchoring
stuff and not be misleaded.

BR, Jarkko
James Bottomley May 14, 2024, 2:11 p.m. UTC | #14
On Tue, 2024-05-14 at 10:50 +0100, Ignat Korchagin wrote:
> On Mon, May 13, 2024 at 11:33 PM James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > 
> > On Mon, 2024-05-13 at 18:09 +0100, Ignat Korchagin wrote:
> > [...]
> > > TPM derived keys attempt to address the above use cases by
> > > allowing applications to deterministically derive unique
> > > cryptographic keys for their own purposes directly from the TPM
> > > seed in the owner hierarchy. The idea is that when an application
> > > requests a new key, instead of generating a random key and
> > > wrapping it with the TPM, the implementation generates a key via
> > > KDF(hierarchy seed, application specific info). Therefore, the
> > > resulting keys will always be cryptographically bound to the
> > > application itself and the device they were generated on.
> > 
> > So I think what confuses me is what the expected cryptographic
> > secrecy properties of the derived keys are.  I get they're a KDF of
> > seed and deterministic properties, but if those mixing values are
> > well known (as the path or binary checksum cases) then anyone with
> > access to the TPM can derive the key from user space because they
> > can easily obtain the mixing parameters and there's no protection
> > to the TPM keyed hash operation.
> > 
> > Consider the use case where two users are using derived keys on the
> > same system (so same TPM).  Assuming they use them to protect
> > sensitive information, what prevents user1 from simply deriving
> > user2's key and getting the information, or am I missing the point
> > of this?
> 
> You are correct: it is possible, but in practice it would be limited
> only to privileged users/applications. I remember there was a push to
> set a 666 mask for the TPM device file, but it is not how it is done
> today by default.

No, it's 660, but in consequence of that every user of the TPM is a
member of the tpm group which, since TPM use from userspace is growing,
is everyone, so it might as well have been 666.  In other words relying
on access restrictions to the TPM itself is largely useless.

>  Also I think the same applies to trusted keys as well, at least
> without any additional authorizations or PCR restrictions on the blob
> (I remember I could manually unwrap a trusted key blob in userspace
> as root).

Well, that's correct, but a TPM key file without policy still has two
protections: the file itself (so the key owner can choose what
permissions and where it is) and the key authority (or password)
although for the mechanical (unsupervised insertion) use case keys tend
not to have an authority.

> It would be fixed if we could limit access to some TPM ops only from
> the kernel, but I remember from one of your presentations that it is
> generally a hard problem and that some solution was in the works (was
> it based on limiting access to a resettable PCR?). I'm happy to
> consider adopting it here as well somehow.

Well, that was based on constructing a policy that meant only the
kernel could access the data (so it requires PCR policy).

In addition to the expected secrecy property question which I don't
think is fully answered I did think of another issue: what if the
application needs to rotate keys because of a suspected compromise? 
For sealed keys, we just generate a new one an use that in place of the
old, but for your derived keys we'd have to change one of the mixing
values, which all look to be based on fairly permanent properties of
the system.

James
Jarkko Sakkinen May 14, 2024, 2:30 p.m. UTC | #15
On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > For example, a cheap NAS box with no internal storage (disks connected
> > externally via USB). We want:
> >   * disks to be encrypted and decryptable only by this NAS box
>
> So how this differs from LUKS2 style, which also systemd supports where
> the encryption key is anchored to PCR's? If I took hard drive out of my
> Linux box, I could not decrypt it in another machine because of this.

Maybe you could replace the real LUKS2 header with a dummy LUKS2
header, which would need to be able the describe "do not use this" and
e.g. SHA256 of the actual header. And then treat the looked up header as
the header when the drive is mounted.

LUKS2 would also need to be able to have pre-defined (e.g. kernel
command-line or bootconfig) small internal storage, which would be
also encrypted with TPM's PRCs containing an array of LUKS2 header
and then look up that with SHA256 as the key.

Without knowing LUKS2 implementation to me these do not sound reaching
the impossible engineer problems so maybe this would be worth of
investigating...

BR, Jarkko
Ignat Korchagin May 14, 2024, 2:41 p.m. UTC | #16
On Tue, May 14, 2024 at 3:00 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > For example, a cheap NAS box with no internal storage (disks connected
> > externally via USB). We want:
> >   * disks to be encrypted and decryptable only by this NAS box
>
> So how this differs from LUKS2 style, which also systemd supports where
> the encryption key is anchored to PCR's? If I took hard drive out of my
> Linux box, I could not decrypt it in another machine because of this.

It differs with the fact that the disk has a clearly identifiable
LUKS2 header, which tells an adversary that this is a disk with some
data that is encrypted. With derived keys and plain dm-crypt mode
there is no LUKS header, so it is not possible to tell if it is an
encrypted disk or a disk with just random data. Additionally, if I
accidentally wipe the sector with the LUKS2 header - all my data is
lost (because the data encryption key from the header is lost). With
derived keys I can always decrypt at least some data, if the disk is
available.

> >   * if someone steals one of the disks - we don't want them to see it
> > has encrypted data (no LUKS header)
>
> So what happens when you reconnect?

We recover/derive the encryption key and unlock the disk again.

> > Additionally we may want to SSH into the NAS for configuration and we
> > don't want the SSH server key to change after each boot (regardless if
> > disks are connected or not).
>
> Right, interesting use case. Begin before any technical jargon exactly
> with a great example like this. Then it is easier to start to anchoring
> stuff and not be misleaded.
>
> BR, Jarkko
Jarkko Sakkinen May 14, 2024, 2:45 p.m. UTC | #17
On Tue May 14, 2024 at 5:41 PM EEST, Ignat Korchagin wrote:
> On Tue, May 14, 2024 at 3:00 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > For example, a cheap NAS box with no internal storage (disks connected
> > > externally via USB). We want:
> > >   * disks to be encrypted and decryptable only by this NAS box
> >
> > So how this differs from LUKS2 style, which also systemd supports where
> > the encryption key is anchored to PCR's? If I took hard drive out of my
> > Linux box, I could not decrypt it in another machine because of this.
>
> It differs with the fact that the disk has a clearly identifiable
> LUKS2 header, which tells an adversary that this is a disk with some
> data that is encrypted. With derived keys and plain dm-crypt mode
> there is no LUKS header, so it is not possible to tell if it is an
> encrypted disk or a disk with just random data. Additionally, if I
> accidentally wipe the sector with the LUKS2 header - all my data is
> lost (because the data encryption key from the header is lost). With
> derived keys I can always decrypt at least some data, if the disk is
> available.

I figured most of this out myself and sent a follow-up but yeah thnaks
for confirming my toughts. I get this part now.

Follow-ups to my follow-up...

BR, Jarkko
Ignat Korchagin May 14, 2024, 2:54 p.m. UTC | #18
On Tue, May 14, 2024 at 3:11 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Tue, 2024-05-14 at 10:50 +0100, Ignat Korchagin wrote:
> > On Mon, May 13, 2024 at 11:33 PM James Bottomley
> > <James.Bottomley@hansenpartnership.com> wrote:
> > >
> > > On Mon, 2024-05-13 at 18:09 +0100, Ignat Korchagin wrote:
> > > [...]
> > > > TPM derived keys attempt to address the above use cases by
> > > > allowing applications to deterministically derive unique
> > > > cryptographic keys for their own purposes directly from the TPM
> > > > seed in the owner hierarchy. The idea is that when an application
> > > > requests a new key, instead of generating a random key and
> > > > wrapping it with the TPM, the implementation generates a key via
> > > > KDF(hierarchy seed, application specific info). Therefore, the
> > > > resulting keys will always be cryptographically bound to the
> > > > application itself and the device they were generated on.
> > >
> > > So I think what confuses me is what the expected cryptographic
> > > secrecy properties of the derived keys are.  I get they're a KDF of
> > > seed and deterministic properties, but if those mixing values are
> > > well known (as the path or binary checksum cases) then anyone with
> > > access to the TPM can derive the key from user space because they
> > > can easily obtain the mixing parameters and there's no protection
> > > to the TPM keyed hash operation.
> > >
> > > Consider the use case where two users are using derived keys on the
> > > same system (so same TPM).  Assuming they use them to protect
> > > sensitive information, what prevents user1 from simply deriving
> > > user2's key and getting the information, or am I missing the point
> > > of this?
> >
> > You are correct: it is possible, but in practice it would be limited
> > only to privileged users/applications. I remember there was a push to
> > set a 666 mask for the TPM device file, but it is not how it is done
> > today by default.
>
> No, it's 660, but in consequence of that every user of the TPM is a
> member of the tpm group which, since TPM use from userspace is growing,
> is everyone, so it might as well have been 666.  In other words relying
> on access restrictions to the TPM itself is largely useless.
>
> >  Also I think the same applies to trusted keys as well, at least
> > without any additional authorizations or PCR restrictions on the blob
> > (I remember I could manually unwrap a trusted key blob in userspace
> > as root).
>
> Well, that's correct, but a TPM key file without policy still has two
> protections: the file itself (so the key owner can choose what
> permissions and where it is) and the key authority (or password)
> although for the mechanical (unsupervised insertion) use case keys tend
> not to have an authority.
>
> > It would be fixed if we could limit access to some TPM ops only from
> > the kernel, but I remember from one of your presentations that it is
> > generally a hard problem and that some solution was in the works (was
> > it based on limiting access to a resettable PCR?). I'm happy to
> > consider adopting it here as well somehow.
>
> Well, that was based on constructing a policy that meant only the
> kernel could access the data (so it requires PCR policy).
>
> In addition to the expected secrecy property question which I don't
> think is fully answered I did think of another issue: what if the
> application needs to rotate keys because of a suspected compromise?
> For sealed keys, we just generate a new one an use that in place of the
> old, but for your derived keys we'd have to change one of the mixing
> values, which all look to be based on fairly permanent properties of
> the system.

For our current (non-TPM based) derived key hierarchy we do allow
applications to specify a "freeform" mixing value, which in practice
may contain a key version, like "v1"/"v2" etc. This also allows
applications to derive multiple different keys for different purposes.
Perhaps, we can do the same here, for example keyctl add derived test
"<key len> (path|csum) <the rest is used as is as another mixin>". We
can also "just ship" a new version of the code (for the csum case),
which would rotate the key. Another option could be using some
optional xattr as a mixin, which can specify the version of the key or
just be a freeform input.

> James
>
Jarkko Sakkinen May 14, 2024, 3:21 p.m. UTC | #19
On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > For example, a cheap NAS box with no internal storage (disks connected
> > > externally via USB). We want:
> > >   * disks to be encrypted and decryptable only by this NAS box
> >
> > So how this differs from LUKS2 style, which also systemd supports where
> > the encryption key is anchored to PCR's? If I took hard drive out of my
> > Linux box, I could not decrypt it in another machine because of this.
>
> Maybe you could replace the real LUKS2 header with a dummy LUKS2
> header, which would need to be able the describe "do not use this" and
> e.g. SHA256 of the actual header. And then treat the looked up header as
> the header when the drive is mounted.
>
> LUKS2 would also need to be able to have pre-defined (e.g. kernel
> command-line or bootconfig) small internal storage, which would be
> also encrypted with TPM's PRCs containing an array of LUKS2 header
> and then look up that with SHA256 as the key.
>
> Without knowing LUKS2 implementation to me these do not sound reaching
> the impossible engineer problems so maybe this would be worth of
> investigating...

Or why you could not just encrypt the whole header with another key
that is only in that device? Then it would appear as random full
length.

I.e. unsealing

1. Decrypt LUKS2 header with TPM2 key
2. Use the new resulting header as it was in the place of encrypted
   stored to the external drive.
3. Decrypt key from the LUK2S header etc.

?

BR, Jarkko
Jarkko Sakkinen May 14, 2024, 3:26 p.m. UTC | #20
On Tue May 14, 2024 at 6:21 PM EEST, Jarkko Sakkinen wrote:
> On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> > On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > > For example, a cheap NAS box with no internal storage (disks connected
> > > > externally via USB). We want:
> > > >   * disks to be encrypted and decryptable only by this NAS box
> > >
> > > So how this differs from LUKS2 style, which also systemd supports where
> > > the encryption key is anchored to PCR's? If I took hard drive out of my
> > > Linux box, I could not decrypt it in another machine because of this.
> >
> > Maybe you could replace the real LUKS2 header with a dummy LUKS2
> > header, which would need to be able the describe "do not use this" and
> > e.g. SHA256 of the actual header. And then treat the looked up header as
> > the header when the drive is mounted.
> >
> > LUKS2 would also need to be able to have pre-defined (e.g. kernel
> > command-line or bootconfig) small internal storage, which would be
> > also encrypted with TPM's PRCs containing an array of LUKS2 header
> > and then look up that with SHA256 as the key.
> >
> > Without knowing LUKS2 implementation to me these do not sound reaching
> > the impossible engineer problems so maybe this would be worth of
> > investigating...
>
> Or why you could not just encrypt the whole header with another key
> that is only in that device? Then it would appear as random full
> length.
>
> I.e. unsealing
>
> 1. Decrypt LUKS2 header with TPM2 key
> 2. Use the new resulting header as it was in the place of encrypted
>    stored to the external drive.
> 3. Decrypt key from the LUK2S header etc.

Maybe something like:

1. Asymmetric for LUKS2 (just like it is)
2. Additional symmetric key, which is created as non-migratable and stored
   to the TPM2 chip. This deciphers the header, i.e. takes the random
   away.

BR, Jarkko
Ignat Korchagin May 14, 2024, 3:30 p.m. UTC | #21
On Tue, May 14, 2024 at 4:26 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Tue May 14, 2024 at 6:21 PM EEST, Jarkko Sakkinen wrote:
> > On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> > > On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > > > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > > > For example, a cheap NAS box with no internal storage (disks connected
> > > > > externally via USB). We want:
> > > > >   * disks to be encrypted and decryptable only by this NAS box
> > > >
> > > > So how this differs from LUKS2 style, which also systemd supports where
> > > > the encryption key is anchored to PCR's? If I took hard drive out of my
> > > > Linux box, I could not decrypt it in another machine because of this.
> > >
> > > Maybe you could replace the real LUKS2 header with a dummy LUKS2
> > > header, which would need to be able the describe "do not use this" and
> > > e.g. SHA256 of the actual header. And then treat the looked up header as
> > > the header when the drive is mounted.
> > >
> > > LUKS2 would also need to be able to have pre-defined (e.g. kernel
> > > command-line or bootconfig) small internal storage, which would be
> > > also encrypted with TPM's PRCs containing an array of LUKS2 header
> > > and then look up that with SHA256 as the key.
> > >
> > > Without knowing LUKS2 implementation to me these do not sound reaching
> > > the impossible engineer problems so maybe this would be worth of
> > > investigating...
> >
> > Or why you could not just encrypt the whole header with another key
> > that is only in that device? Then it would appear as random full
> > length.
> >
> > I.e. unsealing
> >
> > 1. Decrypt LUKS2 header with TPM2 key
> > 2. Use the new resulting header as it was in the place of encrypted
> >    stored to the external drive.
> > 3. Decrypt key from the LUK2S header etc.
>
> Maybe something like:
>
> 1. Asymmetric for LUKS2 (just like it is)
> 2. Additional symmetric key, which is created as non-migratable and stored
>    to the TPM2 chip. This deciphers the header, i.e. takes the random
>    away.

This could work, but you still have the problem of - if the header
gets wiped, all the data is lost.
As for storing things on the TPM chip - that doesn't scale. Today you
only think about disk encryption, tomorrow there is a new application,
which wants to do the same thing and so on. One of the features of
derived keys - you don't store anything, just recreate/derive when
needed and it scales infinitely.

> BR, Jarkko
James Bottomley May 14, 2024, 3:30 p.m. UTC | #22
On Tue, 2024-05-14 at 14:11 +0100, Ignat Korchagin wrote:
>   * if someone steals one of the disks - we don't want them to see it
> has encrypted data (no LUKS header)

What is the use case that makes this important?  In usual operation
over the network, the fact that we're setting up encryption is easily
identifiable to any packet sniffer (DHE key exchanges are fairly easy
to fingerprint), but security relies on the fact that even knowing that
we're setting up encryption, the attacker can't gain access to it.  The
fact that we are setting up encryption isn't seen as a useful thing to
conceal, so why is it important for your encrypted disk use case?

James
Ignat Korchagin May 14, 2024, 3:38 p.m. UTC | #23
On Tue, May 14, 2024 at 4:30 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Tue, 2024-05-14 at 14:11 +0100, Ignat Korchagin wrote:
> >   * if someone steals one of the disks - we don't want them to see it
> > has encrypted data (no LUKS header)
>
> What is the use case that makes this important?  In usual operation
> over the network, the fact that we're setting up encryption is easily
> identifiable to any packet sniffer (DHE key exchanges are fairly easy
> to fingerprint), but security relies on the fact that even knowing that
> we're setting up encryption, the attacker can't gain access to it.  The
> fact that we are setting up encryption isn't seen as a useful thing to
> conceal, so why is it important for your encrypted disk use case?

In some "jurisdictions" authorities can demand that you decrypt the
data for them for "reasons". On the other hand if they can't prove
there is a ciphertext in the first place - it makes their case harder.

> James
>
Jarkko Sakkinen May 14, 2024, 3:42 p.m. UTC | #24
On Tue May 14, 2024 at 6:30 PM EEST, Ignat Korchagin wrote:
> On Tue, May 14, 2024 at 4:26 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Tue May 14, 2024 at 6:21 PM EEST, Jarkko Sakkinen wrote:
> > > On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> > > > On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > > > > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > > > > For example, a cheap NAS box with no internal storage (disks connected
> > > > > > externally via USB). We want:
> > > > > >   * disks to be encrypted and decryptable only by this NAS box
> > > > >
> > > > > So how this differs from LUKS2 style, which also systemd supports where
> > > > > the encryption key is anchored to PCR's? If I took hard drive out of my
> > > > > Linux box, I could not decrypt it in another machine because of this.
> > > >
> > > > Maybe you could replace the real LUKS2 header with a dummy LUKS2
> > > > header, which would need to be able the describe "do not use this" and
> > > > e.g. SHA256 of the actual header. And then treat the looked up header as
> > > > the header when the drive is mounted.
> > > >
> > > > LUKS2 would also need to be able to have pre-defined (e.g. kernel
> > > > command-line or bootconfig) small internal storage, which would be
> > > > also encrypted with TPM's PRCs containing an array of LUKS2 header
> > > > and then look up that with SHA256 as the key.
> > > >
> > > > Without knowing LUKS2 implementation to me these do not sound reaching
> > > > the impossible engineer problems so maybe this would be worth of
> > > > investigating...
> > >
> > > Or why you could not just encrypt the whole header with another key
> > > that is only in that device? Then it would appear as random full
> > > length.
> > >
> > > I.e. unsealing
> > >
> > > 1. Decrypt LUKS2 header with TPM2 key
> > > 2. Use the new resulting header as it was in the place of encrypted
> > >    stored to the external drive.
> > > 3. Decrypt key from the LUK2S header etc.
> >
> > Maybe something like:
> >
> > 1. Asymmetric for LUKS2 (just like it is)
> > 2. Additional symmetric key, which is created as non-migratable and stored
> >    to the TPM2 chip. This deciphers the header, i.e. takes the random
> >    away.
>
> This could work, but you still have the problem of - if the header
> gets wiped, all the data is lost.
> As for storing things on the TPM chip - that doesn't scale. Today you
> only think about disk encryption, tomorrow there is a new application,
> which wants to do the same thing and so on. One of the features of
> derived keys - you don't store anything, just recreate/derive when
> needed and it scales infinitely.

OK, so now I know the problem at least and that is probably the
most important thing in this discussion, right?

So make a better story, now you also probably have better idea,
also split the patch properly by subsystem, send the patch set,
and I'll promise to revisit.

Fair enough? :-)

BR, Jarkko
James Bottomley May 14, 2024, 3:54 p.m. UTC | #25
On Tue, 2024-05-14 at 16:38 +0100, Ignat Korchagin wrote:
> On Tue, May 14, 2024 at 4:30 PM James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > 
> > On Tue, 2024-05-14 at 14:11 +0100, Ignat Korchagin wrote:
> > >   * if someone steals one of the disks - we don't want them to
> > > see it has encrypted data (no LUKS header)
> > 
> > What is the use case that makes this important?  In usual operation
> > over the network, the fact that we're setting up encryption is
> > easily identifiable to any packet sniffer (DHE key exchanges are
> > fairly easy to fingerprint), but security relies on the fact that
> > even knowing that we're setting up encryption, the attacker can't
> > gain access to it.  The fact that we are setting up encryption
> > isn't seen as a useful thing to conceal, so why is it important for
> > your encrypted disk use case?
> 
> In some "jurisdictions" authorities can demand that you decrypt the
> data for them for "reasons". On the other hand if they can't prove
> there is a ciphertext in the first place - it makes their case
> harder.

Well, this isn't necessarily a good assumption: the way to detect an
encrypted disk is to look at the entropy of the device blocks.  If the
disk is encrypted, the entropy will be pretty much maximal unlike every
other use case.  The other thing is that if the authorities have your
TPM, they already have access to the disk in this derived key scenario.
If *you* still have access to your TPM, you can update the storage seed
to shred the data.

James
Ignat Korchagin May 14, 2024, 4:01 p.m. UTC | #26
On Tue, May 14, 2024 at 4:54 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Tue, 2024-05-14 at 16:38 +0100, Ignat Korchagin wrote:
> > On Tue, May 14, 2024 at 4:30 PM James Bottomley
> > <James.Bottomley@hansenpartnership.com> wrote:
> > >
> > > On Tue, 2024-05-14 at 14:11 +0100, Ignat Korchagin wrote:
> > > >   * if someone steals one of the disks - we don't want them to
> > > > see it has encrypted data (no LUKS header)
> > >
> > > What is the use case that makes this important?  In usual operation
> > > over the network, the fact that we're setting up encryption is
> > > easily identifiable to any packet sniffer (DHE key exchanges are
> > > fairly easy to fingerprint), but security relies on the fact that
> > > even knowing that we're setting up encryption, the attacker can't
> > > gain access to it.  The fact that we are setting up encryption
> > > isn't seen as a useful thing to conceal, so why is it important for
> > > your encrypted disk use case?
> >
> > In some "jurisdictions" authorities can demand that you decrypt the
> > data for them for "reasons". On the other hand if they can't prove
> > there is a ciphertext in the first place - it makes their case
> > harder.
>
> Well, this isn't necessarily a good assumption: the way to detect an
> encrypted disk is to look at the entropy of the device blocks.  If the
> disk is encrypted, the entropy will be pretty much maximal unlike every
> other use case.  The other thing is that if the authorities have your

What if the disk is filled with random data? Would it not be at maximal entropy?

> TPM, they already have access to the disk in this derived key scenario.

I'm thinking more of a datacenter scenario here - it is much easier to
"steal" a disk rather than a server from a datacenter. So it is
possible that someone has the disk but no access to the TPM.

> If *you* still have access to your TPM, you can update the storage seed
> to shred the data.

The point here is not if I was able to shred the data or not, but the
fact I have something encrypted. Even if I shred the key I would be
considered "uncooperative and refusing to provide the key" vs "I don't
have anything encrypted in the first place".

> James
>
Ignat Korchagin May 14, 2024, 4:08 p.m. UTC | #27
On Tue, May 14, 2024 at 4:43 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Tue May 14, 2024 at 6:30 PM EEST, Ignat Korchagin wrote:
> > On Tue, May 14, 2024 at 4:26 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >
> > > On Tue May 14, 2024 at 6:21 PM EEST, Jarkko Sakkinen wrote:
> > > > On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> > > > > On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > > > > > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > > > > > For example, a cheap NAS box with no internal storage (disks connected
> > > > > > > externally via USB). We want:
> > > > > > >   * disks to be encrypted and decryptable only by this NAS box
> > > > > >
> > > > > > So how this differs from LUKS2 style, which also systemd supports where
> > > > > > the encryption key is anchored to PCR's? If I took hard drive out of my
> > > > > > Linux box, I could not decrypt it in another machine because of this.
> > > > >
> > > > > Maybe you could replace the real LUKS2 header with a dummy LUKS2
> > > > > header, which would need to be able the describe "do not use this" and
> > > > > e.g. SHA256 of the actual header. And then treat the looked up header as
> > > > > the header when the drive is mounted.
> > > > >
> > > > > LUKS2 would also need to be able to have pre-defined (e.g. kernel
> > > > > command-line or bootconfig) small internal storage, which would be
> > > > > also encrypted with TPM's PRCs containing an array of LUKS2 header
> > > > > and then look up that with SHA256 as the key.
> > > > >
> > > > > Without knowing LUKS2 implementation to me these do not sound reaching
> > > > > the impossible engineer problems so maybe this would be worth of
> > > > > investigating...
> > > >
> > > > Or why you could not just encrypt the whole header with another key
> > > > that is only in that device? Then it would appear as random full
> > > > length.
> > > >
> > > > I.e. unsealing
> > > >
> > > > 1. Decrypt LUKS2 header with TPM2 key
> > > > 2. Use the new resulting header as it was in the place of encrypted
> > > >    stored to the external drive.
> > > > 3. Decrypt key from the LUK2S header etc.
> > >
> > > Maybe something like:
> > >
> > > 1. Asymmetric for LUKS2 (just like it is)
> > > 2. Additional symmetric key, which is created as non-migratable and stored
> > >    to the TPM2 chip. This deciphers the header, i.e. takes the random
> > >    away.
> >
> > This could work, but you still have the problem of - if the header
> > gets wiped, all the data is lost.
> > As for storing things on the TPM chip - that doesn't scale. Today you
> > only think about disk encryption, tomorrow there is a new application,
> > which wants to do the same thing and so on. One of the features of
> > derived keys - you don't store anything, just recreate/derive when
> > needed and it scales infinitely.
>
> OK, so now I know the problem at least and that is probably the
> most important thing in this discussion, right?

Yes, I think so.

> So make a better story, now you also probably have better idea,
> also split the patch properly by subsystem, send the patch set,

I'm actually not super clear on this part - I have two patches: one
for TPM header definitions and another one for the keyring subsystem?
Any other subsystems in play here?

> and I'll promise to revisit.

Thanks. Would probably take some time as I want to think more on the
open questions I raised in the description, try to address some
comments from James B from other replies (key rotation for example)
and rebase on recently merged TPM encrypted sessions. But since this
is an RFC I would like to continue the discussion and gather opinions
from folks here, if there are any more concerns.

> Fair enough? :-)
>
> BR, Jarkko
Jarkko Sakkinen May 14, 2024, 4:22 p.m. UTC | #28
On Tue May 14, 2024 at 7:08 PM EEST, Ignat Korchagin wrote:
> On Tue, May 14, 2024 at 4:43 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Tue May 14, 2024 at 6:30 PM EEST, Ignat Korchagin wrote:
> > > On Tue, May 14, 2024 at 4:26 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >
> > > > On Tue May 14, 2024 at 6:21 PM EEST, Jarkko Sakkinen wrote:
> > > > > On Tue May 14, 2024 at 5:30 PM EEST, Jarkko Sakkinen wrote:
> > > > > > On Tue May 14, 2024 at 5:00 PM EEST, Jarkko Sakkinen wrote:
> > > > > > > On Tue May 14, 2024 at 4:11 PM EEST, Ignat Korchagin wrote:
> > > > > > > > For example, a cheap NAS box with no internal storage (disks connected
> > > > > > > > externally via USB). We want:
> > > > > > > >   * disks to be encrypted and decryptable only by this NAS box
> > > > > > >
> > > > > > > So how this differs from LUKS2 style, which also systemd supports where
> > > > > > > the encryption key is anchored to PCR's? If I took hard drive out of my
> > > > > > > Linux box, I could not decrypt it in another machine because of this.
> > > > > >
> > > > > > Maybe you could replace the real LUKS2 header with a dummy LUKS2
> > > > > > header, which would need to be able the describe "do not use this" and
> > > > > > e.g. SHA256 of the actual header. And then treat the looked up header as
> > > > > > the header when the drive is mounted.
> > > > > >
> > > > > > LUKS2 would also need to be able to have pre-defined (e.g. kernel
> > > > > > command-line or bootconfig) small internal storage, which would be
> > > > > > also encrypted with TPM's PRCs containing an array of LUKS2 header
> > > > > > and then look up that with SHA256 as the key.
> > > > > >
> > > > > > Without knowing LUKS2 implementation to me these do not sound reaching
> > > > > > the impossible engineer problems so maybe this would be worth of
> > > > > > investigating...
> > > > >
> > > > > Or why you could not just encrypt the whole header with another key
> > > > > that is only in that device? Then it would appear as random full
> > > > > length.
> > > > >
> > > > > I.e. unsealing
> > > > >
> > > > > 1. Decrypt LUKS2 header with TPM2 key
> > > > > 2. Use the new resulting header as it was in the place of encrypted
> > > > >    stored to the external drive.
> > > > > 3. Decrypt key from the LUK2S header etc.
> > > >
> > > > Maybe something like:
> > > >
> > > > 1. Asymmetric for LUKS2 (just like it is)
> > > > 2. Additional symmetric key, which is created as non-migratable and stored
> > > >    to the TPM2 chip. This deciphers the header, i.e. takes the random
> > > >    away.
> > >
> > > This could work, but you still have the problem of - if the header
> > > gets wiped, all the data is lost.
> > > As for storing things on the TPM chip - that doesn't scale. Today you
> > > only think about disk encryption, tomorrow there is a new application,
> > > which wants to do the same thing and so on. One of the features of
> > > derived keys - you don't store anything, just recreate/derive when
> > > needed and it scales infinitely.
> >
> > OK, so now I know the problem at least and that is probably the
> > most important thing in this discussion, right?
>
> Yes, I think so.
>
> > So make a better story, now you also probably have better idea,
> > also split the patch properly by subsystem, send the patch set,
>
> I'm actually not super clear on this part - I have two patches: one
> for TPM header definitions and another one for the keyring subsystem?
> Any other subsystems in play here?

You're absolutely right the split is fine. I look patches every
day so that must have stuck me somewhere else (sometimes does
happen).

Sorry.

> > and I'll promise to revisit.
>
> Thanks. Would probably take some time as I want to think more on the
> open questions I raised in the description, try to address some
> comments from James B from other replies (key rotation for example)
> and rebase on recently merged TPM encrypted sessions. But since this
> is an RFC I would like to continue the discussion and gather opinions
> from folks here, if there are any more concerns.

Yeah, not trying to argue of anything. Just have shoot with
stupid questions until it gets through, and not pretending
of understanding if I actually do not :-)

So I'll be ready once the next version is out.

>
> > Fair enough? :-)
> >
> > BR, Jarkko


BR, Jarkko