mbox series

[v5,00/11] Encrypted Hibernation

Message ID 20221111231636.3748636-1-evgreen@chromium.org (mailing list archive)
Headers show
Series Encrypted Hibernation | expand

Message

Evan Green Nov. 11, 2022, 11:16 p.m. UTC
We are exploring enabling hibernation in some new scenarios. However,
our security team has a few requirements, listed below:
1. The hibernate image must be encrypted with protection derived from
   both the platform (eg TPM) and user authentication data (eg
   password).
2. Hibernation must not be a vector by which a malicious userspace can
   escalate to the kernel.

Requirement #1 can be achieved solely with uswsusp, however requirement
2 necessitates mechanisms in the kernel to guarantee integrity of the
hibernate image. The kernel needs a way to authenticate that it generated
the hibernate image being loaded, and that the image has not been tampered
with. Adding support for in-kernel AEAD encryption with a TPM-sealed key
allows us to achieve both requirements with a single computation pass.

Matthew Garrett published a series [1] that aligns closely with this
goal. His series utilized the fact that PCR23 is a resettable PCR that
can be blocked from access by usermode. The TPM can create a sealed key
tied to PCR23 in two ways. First, the TPM can attest to the value of
PCR23 when the key was created, which the kernel can use on resume to
verify that the kernel must have created the key (since it is the only
one capable of modifying PCR23). It can also create a policy that enforces
PCR23 be set to a specific value as a condition of unsealing the key,
preventing usermode from unsealing the key by talking directly to the
TPM.

This series adopts that primitive as a foundation, tweaking and building
on it a bit. Where Matthew's series used the TPM-backed key to encrypt a
hash of the image, this series uses the key directly as a gcm(aes)
encryption key, which the kernel uses to encrypt and decrypt the
hibernate image in chunks of 16 pages. This provides both encryption and
integrity, which turns out to be a noticeable performance improvement over
separate passes for encryption and hashing.

The series also introduces the concept of mixing user key material into
the encryption key. This allows usermode to introduce key material
based on unspecified external authentication data (in our case derived
from something like the user password or PIN), without requiring
usermode to do a separate encryption pass.

Matthew also documented issues his series had [2] related to generating
fake images by booting alternate kernels without the PCR23 limiting.
With access to PCR23 on the same machine, usermode can create fake
hibernate images that are indistinguishable to the new kernel from
genuine ones. His post outlines a solution that involves adding more
PCRs into the creation data and policy, with some gyrations to make this
work well on a standard PC.

Our approach would be similar: on our machines PCR 0 indicates whether
the system is booted in secure/verified mode or developer mode. By
adding PCR0 to the policy, we can reject hibernate images made in
developer mode while in verified mode (or vice versa).

Additionally, mixing in the user authentication data limits both
data exfiltration attacks (eg a stolen laptop) and forged hibernation
image attacks to attackers that already know the authentication data (eg
user's password). This, combined with our relatively sealed userspace
(dm-verity on the rootfs), and some judicious clearing of the hibernate
image (such as across an OS update) further reduce the risk of an online
attack. The remaining attack space of a forgery from someone with
physical access to the device and knowledge of the authentication data
is out of scope for us, given that flipping to developer mode or
reflashing RO firmware trivially achieves the same thing.

A couple of patches still need to be written on top of this series. The
generalized functionality to OR in additional PCRs via Kconfig (like PCR
0 or 5) still needs to be added. We'll also need a patch that disallows
unencrypted forms of resume from hibernation, to fully close the door
to malicious userspace. However, I wanted to get this series out first
and get reactions from upstream before continuing to add to it.

[1] https://patchwork.kernel.org/project/linux-pm/cover/20210220013255.1083202-1-matthewgarrett@google.com/
[2] https://mjg59.dreamwidth.org/58077.html

Changes in v5:
 - Change to co-developed by Matthew (Kees)
 - Change tags on RESTRICT_PCR patch (Kees)
 - Rename to TCG_TPM2_RESTRICT_PCR
 - Do nothing on TPM1.2 devices (Jarkko, Doug)
 - Factored some math out to a helper function (Kees)
 - Constified src in tpm2_key_encode().
 - Make Matthew's tag match author
 - Removed default n in Kconfig (Kees)
 - Expanded commit message (Jarkko)
 - Use Suggested-by tag instead of made up Sourced-from (Kees)
 - ENCRYPTED_HIBERNATION should depend on TCG_TPM2_RESTRCT_PCR
 - Remove pad struct member (Kees)
 - Use a struct to access creation data (Kees)
 - Build PCR bitmask programmatically in creation data (Kees)

Changes in v4:
 - Open code tpm2_pcr_reset implementation in tpm-interface.c (Jarkko)
 - Rename interface symbol to tpm2_pcr_reset, fix kerneldocs (Jarkko)
 - Augment the commit message (Jarkko)
 - Local ordering and whitespace changes (Jarkko)
 - s/tpm_pcr_reset/tpm2_pcr_reset/ due to change in other patch
 - Variable ordering and whitespace fixes (Jarkko)
 - Add NULL check explanation in teardown (Jarkko)
 - Change strlen+1 to sizeof for static buffer (Jarkko)
 - Fix nr_allocated_banks loop overflow (found via KASAN)
 - Local variable reordering (Jarkko)
 - Local variable ordering (Jarkko)

Changes in v3:
 - Unify tpm1/2_pcr_reset prototypes (Jarkko)
 - Wait no, remove the TPM1 stuff altogether (Jarkko)
 - Remove extra From tag and blank in commit msg (Jarkko).
 - Split find_and_validate_cc() export to its own patch (Jarkko)
 - Rename tpm_find_and_validate_cc() to tpm2_find_and_validate_cc().
 - Fix up commit message (Jarkko)
 - tpm2_find_and_validate_cc() was split (Jarkko)
 - Simply fully restrict TPM1 since v2 failed to account for tunnelled
   transport sessions (Stefan and Jarkko).
 - Fix SoB and -- note ordering (Kees)
 - Add comments describing the TPM2 spec type names for the new fields
   in tpm2key.asn1 (Kees)
 - Add len buffer checks in tpm2_key_encode() (Kees)
 - Clarified creationpcrs documentation (Ben)
 - Changed funky tag to suggested-by (Kees). Matthew, holler if you want
   something different.
 - ENCRYPTED_HIBERNATION needs TRUSTED_KEYS builtin for
   key_type_trusted.
 - Remove KEYS dependency since it's covered by TRUSTED_KEYS (Kees)
 - Changed funky tag to Co-developed-by (Kees). Matthew, holler if you
   want something different.
 - Changed funky tag to Co-developed-by (Kees)

Changes in v2:
 - Fixed sparse warnings
 - Adjust hash len by 2 due to new ASN.1 storage, and add underflow
   check.
 - Rework load/create_kernel_key() to eliminate a label (Andrey)
 - Call put_device() needed from calling tpm_default_chip().
 - Add missing static on snapshot_encrypted_byte_count()
 - Fold in only the used kernel key bytes to the user key.
 - Make the user key length 32 (Eric)
 - Use CRYPTO_LIB_SHA256 for less boilerplate (Eric)
 - Fixed some sparse warnings
 - Use CRYPTO_LIB_SHA256 to get rid of sha256_data() (Eric)
 - Adjusted offsets due to new ASN.1 format, and added a creation data
   length check.
 - Fix sparse warnings
 - Fix session type comment (Andrey)
 - Eliminate extra label in get/create_kernel_key() (Andrey)
 - Call tpm_try_get_ops() before calling tpm2_flush_context().

Evan Green (10):
  tpm: Add support for in-kernel resetting of PCRs
  tpm: Export and rename tpm2_find_and_validate_cc()
  tpm: Allow PCR 23 to be restricted to kernel-only use
  security: keys: trusted: Include TPM2 creation data
  security: keys: trusted: Verify creation data
  PM: hibernate: Add kernel-based encryption
  PM: hibernate: Use TPM-backed keys to encrypt image
  PM: hibernate: Mix user key in encrypted hibernate
  PM: hibernate: Verify the digest encryption key
  PM: hibernate: seal the encryption key with a PCR policy

Matthew Garrett (1):
  security: keys: trusted: Allow storage of PCR values in creation data

 Documentation/power/userland-swsusp.rst       |    8 +
 .../security/keys/trusted-encrypted.rst       |    6 +
 drivers/char/tpm/Kconfig                      |   12 +
 drivers/char/tpm/tpm-dev-common.c             |    6 +
 drivers/char/tpm/tpm-interface.c              |   47 +
 drivers/char/tpm/tpm.h                        |   15 +
 drivers/char/tpm/tpm2-cmd.c                   |   29 +-
 drivers/char/tpm/tpm2-space.c                 |    8 +-
 include/keys/trusted-type.h                   |    9 +
 include/linux/tpm.h                           |   19 +
 include/uapi/linux/suspend_ioctls.h           |   28 +-
 kernel/power/Kconfig                          |   15 +
 kernel/power/Makefile                         |    1 +
 kernel/power/power.h                          |    1 +
 kernel/power/snapenc.c                        | 1105 +++++++++++++++++
 kernel/power/snapshot.c                       |    5 +
 kernel/power/user.c                           |   44 +-
 kernel/power/user.h                           |  117 ++
 security/keys/trusted-keys/tpm2key.asn1       |   15 +-
 security/keys/trusted-keys/trusted_tpm1.c     |    9 +
 security/keys/trusted-keys/trusted_tpm2.c     |  355 +++++-
 21 files changed, 1797 insertions(+), 57 deletions(-)
 create mode 100644 kernel/power/snapenc.c
 create mode 100644 kernel/power/user.h

Comments

Evan Green Dec. 7, 2022, 11:54 p.m. UTC | #1
Hello, it's me again!

On Fri, Nov 11, 2022 at 3:19 PM Evan Green <evgreen@chromium.org> wrote:
>
> We are exploring enabling hibernation in some new scenarios. However,
> our security team has a few requirements, listed below:
> 1. The hibernate image must be encrypted with protection derived from
>    both the platform (eg TPM) and user authentication data (eg
>    password).
> 2. Hibernation must not be a vector by which a malicious userspace can
>    escalate to the kernel.
>
> Requirement #1 can be achieved solely with uswsusp, however requirement
> 2 necessitates mechanisms in the kernel to guarantee integrity of the
> hibernate image. The kernel needs a way to authenticate that it generated
> the hibernate image being loaded, and that the image has not been tampered
> with. Adding support for in-kernel AEAD encryption with a TPM-sealed key
> allows us to achieve both requirements with a single computation pass.
>
> Matthew Garrett published a series [1] that aligns closely with this
> goal. His series utilized the fact that PCR23 is a resettable PCR that
> can be blocked from access by usermode. The TPM can create a sealed key
> tied to PCR23 in two ways. First, the TPM can attest to the value of
> PCR23 when the key was created, which the kernel can use on resume to
> verify that the kernel must have created the key (since it is the only
> one capable of modifying PCR23). It can also create a policy that enforces
> PCR23 be set to a specific value as a condition of unsealing the key,
> preventing usermode from unsealing the key by talking directly to the
> TPM.
>
> This series adopts that primitive as a foundation, tweaking and building
> on it a bit. Where Matthew's series used the TPM-backed key to encrypt a
> hash of the image, this series uses the key directly as a gcm(aes)
> encryption key, which the kernel uses to encrypt and decrypt the
> hibernate image in chunks of 16 pages. This provides both encryption and
> integrity, which turns out to be a noticeable performance improvement over
> separate passes for encryption and hashing.
>
> The series also introduces the concept of mixing user key material into
> the encryption key. This allows usermode to introduce key material
> based on unspecified external authentication data (in our case derived
> from something like the user password or PIN), without requiring
> usermode to do a separate encryption pass.
>
> Matthew also documented issues his series had [2] related to generating
> fake images by booting alternate kernels without the PCR23 limiting.
> With access to PCR23 on the same machine, usermode can create fake
> hibernate images that are indistinguishable to the new kernel from
> genuine ones. His post outlines a solution that involves adding more
> PCRs into the creation data and policy, with some gyrations to make this
> work well on a standard PC.
>
> Our approach would be similar: on our machines PCR 0 indicates whether
> the system is booted in secure/verified mode or developer mode. By
> adding PCR0 to the policy, we can reject hibernate images made in
> developer mode while in verified mode (or vice versa).
>
> Additionally, mixing in the user authentication data limits both
> data exfiltration attacks (eg a stolen laptop) and forged hibernation
> image attacks to attackers that already know the authentication data (eg
> user's password). This, combined with our relatively sealed userspace
> (dm-verity on the rootfs), and some judicious clearing of the hibernate
> image (such as across an OS update) further reduce the risk of an online
> attack. The remaining attack space of a forgery from someone with
> physical access to the device and knowledge of the authentication data
> is out of scope for us, given that flipping to developer mode or
> reflashing RO firmware trivially achieves the same thing.
>
> A couple of patches still need to be written on top of this series. The
> generalized functionality to OR in additional PCRs via Kconfig (like PCR
> 0 or 5) still needs to be added. We'll also need a patch that disallows
> unencrypted forms of resume from hibernation, to fully close the door
> to malicious userspace. However, I wanted to get this series out first
> and get reactions from upstream before continuing to add to it.
>
> [1] https://patchwork.kernel.org/project/linux-pm/cover/20210220013255.1083202-1-matthewgarrett@google.com/
> [2] https://mjg59.dreamwidth.org/58077.html
>

Doug found a practical problem with this design. The security of this
mechanism depends on the kernel being able to prevent usermode from
manipulating PCR23. While this series has managed to add that gating
to the standard /dev/tpm interface, at least on ChromeOS, there are
still many "dangerous toys" lying around that might allow a malicious
root to communicate directly with the TPM. This raw access could allow
usermode to extend PCR23 manually and forge malicious hibernate images
that appear genuine. Examples of raw access include 1) i2cget -F, 2)
unbinding the driver and binding i2c-dev instead, 3) using /dev/mem to
manipulate the i2c controller registers directly, and 4) my favorite,
remuxing the i2c pins to GPIO and bitbanging.

We did some brainstorming and came up with a pivot that has the
benefits of 1) reusing a decent chunk of this series, 2) not taking
PCR23 away from usermode (which based on other comments seemed like it
might not fly anyway), and 3) pushing the TPM interaction back down
into usermode. The new element we take advantage of is that our early
userspace is still considered trusted, as we sign the rootfs and
protect it with dm-verity.

The idea is to have early userspace ask the TPM to create a sealed key
bound to a (non-resettable) PCR. We then save the blob to disk, extend
the PCR (to prevent future unsealings in this boot), and push the key
material up to the kernel for use as a "hibernate seed". The kernel
will hold this seed in memory, and at hibernate time will use it to
encrypt a randomly generated "bulk key". The bulk key is then used to
encrypt the main hibernate image. So on disk at hibernate, we have 1)
the encrypted hibernate image, protected by the bulk key, 2) the bulk
key, protected by the seed, and finally 3) the seed, a TPM-protected
key blob that can only be unsealed when a PCR is set to its boot
value. In our own userspace implementation we'd seal this against a
firmware PCR as well, to differentiate between Verified mode and
Developer mode.

At resume time, early userspace would find the blob, successfully
unseal it (because the PCRs had reset back to the value that matches
the policy), and push the recovered seed to the kernel. It can then
push the encrypted bulk key and encrypted hibernate image. On our
systems, this works fine as the PCRs seem to always reset across
hibernate. Is that true generally as well?

So my plan for the next spin of this series looks something like:
 * Drop the tpm: and security: subsystem patches
 * Keep the gist of the PM: patches as is, but instead of the TPM stuff...
 * Introduce two new sysfs files, one to allow usermode to save the
seed into kernel memory, and another to lock out future changes to the
hibernate seed (until the next reboot).
 * Use the hibernate seed to encrypt a randomly generated bulk key,
which is then used to encrypt the main hibernate image.
 * Keep the "PM: mix user key in" patch, as we still need the image to
be encrypted with a key based on user authentication data, which this
new mechanism alone doesn't provide.

Anyone have any big objections to that plan, or see new gaping holes
in the idea? In the end I think it's actually a little nicer, as it
decouples all of the TPM-specific machinery from the concept of secure
hibernate, as well as not trying to police PCR access. Casey and Greg,
I'm going to guess you don't want to be CCed on the next spin, given
that I'm dropping the notion of taking PCR23 away from userspace.
Please holler if you would like to be CCed.

-Evan