[v5,0/9] Support microcode updates affecting SGX

Message ID 20220520103904.1216-1-cathy.zhang@intel.com (mailing list archive)

Message

Zhang, Cathy May 20, 2022, 10:38 a.m. UTC
v4:
https://lore.kernel.org/all/2682d32989ea40578065b2c14ef20b19@intel.com/T/

Changes since v4:
 - Add "goto err" for *if* branch and remove *else* branch for less
   branching for common (success) case in sgx_ioc_enclave_create().
   (Jarkko Sakkinen)
 - Add back the blank line removed unintentionally in encl.h.
   (Jarkko Sakkinen)
 - Move *-EBUSY* onto one line in the annotations of the three
   sgx_zap_* functions. (Jarkko Sakkinen)
 - Remove #include <asm/sgx.h> from microcode.h which is not needed.
   (Borislav Petkov)

Changes since v3:
 - Refine the comments when sgx_update_cpusvn_intel() is called by
   microcode_check(). (suggested by Borislav Petkov, Dave Hansen)
 - Rename update_cpusvn_intel() as sgx_update_cpusvn_intel().
   (suggested by Dave Hansen)
 - Define both the 'static inline' stub *and* the declaration for
   sgx_update_cpusvn_intel() in sgx.h. (suggested by Dave Hansen)
 - Squash patch "x86/sgx: Provide VA page non-NULL owner" and
   "x86/sgx: Save enclave pointer for VA page". Update commit log.
   (suggested by Jarkko Sakkinen)
 - Rename SGX_EPC_PAGE_GUEST as SGX_EPC_PAGE_KVM_GUEST. (suggested by
   Jarkko Sakkinen)
 - Refer to changelogs for other changes.

Changes since v2:
 - Changes are made in patch "x86/sgx: Introduce mechanism to prevent
   new initializations of EPC pages" by moving SGX2 related changes out.
   It allows this series to be applied on tip/x86/sgx branch with only
   picking up some auxiliary changes from SGX2 series, rather than
   depend on the whole set. (Jarkko Sakkinen, Reinette Chatre)

Changes since v1:
 - Remove the sysfs file svnupdate. (Thomas Gleixner, Dave Hansen)
 - Let late microcode load path call ENCLS[EUPDATESVN] procedure
   directly. (Borislav Petkov)
 - Update cover letter by removing the statement that "...triggered by
   administrators via sysfs...".
 - Drop the patch for documentation change.

cover letter:

== General Microcode Background ==

Historically, microcode updates are applied by the BIOS or early in
boot. In recent years, several trends have made these old approaches
less palatable.

First, the cadence of microcode updates has increased to deliver
security mitigations. Second, the value of those updates has increased,
meaning that any delay in applying them is unacceptable. Third, users
have become accustomed to approaches like hot patching their kernels
and have a growing aversion to reboots in general.

Users want microcode updates to behave more like hot patching a
kernel and less like a BIOS update.

== SGX Attestation Background ==

SGX enclaves have an attestation mechanism. An enclave might, for
instance, need to attest to its state before it is given a special
decryption key. Since SGX must trust the CPU microcode, attestation
incorporates the microcode versions of all processors on the system
and is affected by microcode updates. This allows the entity to which
the enclave is attesting to make deployment decisions based on the
microcode version. For example, an enclave might be denied a decryption
key if it runs on a system that has old microcode without a specific
mitigation.

Unfortunately, this attestation metric (called CPUSVN) is only a
snapshot. When the kernel first uses SGX (successfully executes any
ENCLS instruction), SGX inspects all CPUs in the system and incorporates
a record of their microcode versions into CPUSVN. Today, that value is
locked and is not updated until a reboot.

== Problems ==

This means that, although the microcode may be updated, enclaves can
never attest to this fact. Enclaves are stuck attesting to the old
version until a reboot.

Old enclaves created before the microcode update are presumed to be
compromised and must not be allowed to attest with the new microcode
version.

== Solution ==

EUPDATESVN is a new SGX instruction which allows enclave attestation
to include information about updated microcode without a reboot.

Whenever a microcode update affects SGX, the SGX attestation
architecture assumes that all running enclaves and cryptographic
assets (like internal SGX encryption keys) have been compromised.
To mitigate the impact of this presumed compromise, EUPDATESVN success
requires all SGX memory to be marked as "unused" and its contents
destroyed. This requirement ensures that no compromised enclave can
survive the EUPDATESVN procedure and provides an opportunity to
generate new cryptographic assets.

This series implements the infrastructure needed to track and tear
down bare-metal enclaves and then run EUPDATESVN. The procedure is
called by the late microcode load path after the microcode update.
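
As a rough illustration of the flow described above, here is a minimal
user-space model in C. It is only a sketch, not code from this series:
every name in it (struct model_epc, model_alloc_page(),
model_eupdatesvn(), model_update_cpusvn() and the MODEL_* status codes)
is hypothetical, and the real ENCLS[EUPDATESVN] leaf is replaced by a
plain function that refuses to run while any page is still in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names come from the patch series. */
#define MODEL_EPC_PAGES 8

enum model_status {
	MODEL_OK = 0,
	MODEL_ENOMEM = -12,        /* no free page left */
	MODEL_EBUSY = -16,         /* new initializations are blocked */
	MODEL_EPC_NOT_READY = -2,  /* assumed code: some page still in use */
};

struct model_epc {
	bool in_use[MODEL_EPC_PAGES]; /* true while a page backs an enclave */
	bool init_blocked;            /* gate against new page initializations */
	unsigned int cpusvn;          /* attestation snapshot of the microcode */
	unsigned int microcode_rev;   /* revision currently loaded on the CPUs */
};

/* Hand out a free page; returns its index or a negative status. */
int model_alloc_page(struct model_epc *epc)
{
	if (epc->init_blocked)
		return MODEL_EBUSY;

	for (size_t i = 0; i < MODEL_EPC_PAGES; i++) {
		if (!epc->in_use[i]) {
			epc->in_use[i] = true;
			return (int)i;
		}
	}
	return MODEL_ENOMEM;
}

/* Stand-in for ENCLS[EUPDATESVN]: succeeds only if every page is unused. */
int model_eupdatesvn(struct model_epc *epc)
{
	for (size_t i = 0; i < MODEL_EPC_PAGES; i++) {
		if (epc->in_use[i])
			return MODEL_EPC_NOT_READY;
	}
	epc->cpusvn = epc->microcode_rev;
	return MODEL_OK;
}

/* Late-load path: close the gate, zap all pages, update, reopen the gate. */
int model_update_cpusvn(struct model_epc *epc)
{
	int ret;

	epc->init_blocked = true;
	for (size_t i = 0; i < MODEL_EPC_PAGES; i++)
		epc->in_use[i] = false; /* enclave contents are destroyed */

	ret = model_eupdatesvn(epc);
	/* Holds in this model because every page was just zapped. */
	assert(ret == MODEL_OK);

	epc->init_blocked = false;
	return ret;
}
```

In this model, calling model_eupdatesvn() directly while an enclave page
is allocated fails and leaves cpusvn untouched, whereas
model_update_cpusvn() zaps the pages first and then picks up the new
revision. The gate matters because a page initialized between the zap
and the update would otherwise make the update fail.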

This is a very slow operation. It is, of course, exceedingly disruptive
to enclaves but should be infrequent as microcode updates are released
on the order of every few months. Also, this is not the first piece of
the SGX architecture which will destroy all enclave contents.

A follow-on series will add Virtual EPC (KVM guest) support.

Here is the spec for your reference:
https://cdrdv2.intel.com/v1/dl/getContent/648682?explicitVersion=true

This series is based on tip/x86/sgx with the following additionally
applied:

"x86/sgx: Export sgx_encl_ewb_cpumask()"
https://lore.kernel.org/lkml/YnrllJ2OqmcqLUuv@kernel.org/T/#mc6d998d583c9fa512f25219f477353c3fbd214a0
"x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()"
https://lore.kernel.org/lkml/YnrllJ2OqmcqLUuv@kernel.org/T/#me953d1983c1749daeec25f86d0f3ff09cede6c7a
"x86/sgx: Make sgx_ipi_cb() available internally"
https://lore.kernel.org/lkml/YnrllJ2OqmcqLUuv@kernel.org/T/#m5f32fbecb20b34e7745d8bd30c5d133d588b15dc
"x86/sgx: Keep record of SGX page type"
https://lore.kernel.org/lkml/YnrllJ2OqmcqLUuv@kernel.org/T/#m3a218f751b2e41950068dbcccf465426d0ec771e

Cathy Zhang (9):
  x86/sgx: Introduce mechanism to prevent new initializations of EPC
    pages
  x86/sgx: Save enclave pointer for VA page
  x86/sgx: Keep record for SGX VA and Guest page type
  x86/sgx: Save the size of each EPC section
  x86/sgx: Forced EPC page zapping for EUPDATESVN
  x86/sgx: Define error codes for ENCLS[EUPDATESVN]
  x86/sgx: Implement ENCLS[EUPDATESVN]
  x86/cpu: Call ENCLS[EUPDATESVN] procedure in microcode update
  x86/sgx: Call ENCLS[EUPDATESVN] during SGX initialization

 arch/x86/include/asm/sgx.h      |  49 ++--
 arch/x86/kernel/cpu/sgx/encl.h  |   3 +-
 arch/x86/kernel/cpu/sgx/encls.h |  14 +
 arch/x86/kernel/cpu/sgx/sgx.h   |  23 +-
 arch/x86/kernel/cpu/common.c    |  10 +
 arch/x86/kernel/cpu/sgx/encl.c  |  39 ++-
 arch/x86/kernel/cpu/sgx/ioctl.c |  51 +++-
 arch/x86/kernel/cpu/sgx/main.c  | 456 +++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/sgx/virt.c  |  22 ++
 9 files changed, 636 insertions(+), 31 deletions(-)

Comments

Thomas Gleixner May 24, 2022, 7:15 p.m. UTC | #1
Cathy,

On Fri, May 20 2022 at 18:38, Cathy Zhang wrote:
> First, the cadence of microcode updates has increased to deliver
> security mitigations. Second, the value of those updates has increased,
> meaning that any delay in applying them is unacceptable. Third, users
> have become accustomed to approaches like hot patching their kernels
> and have a growing aversion to reboots in general.
>
> Users want microcode updates to behave more like hot patching a
> kernel and less like a BIOS update.

Please don't take this personally.

What users want and what's technically correct are two different things.

Fact is that late microcode updates, especially those which change
features or add/remove functionality, are simply broken. This has been
discussed to death already and I'm not going to find all the various
threads which provided that information. lore.kernel.org has excellent
search capabilities.

As a summary, there is a long-standing request that microcode intended
for late loading needs to come with machine-readable information about
the nature of the update which tells the kernel whether there are
changes which cannot be applied post boot.

This was agreed on by Intel folks and until this materializes any
attempt to load microcode late has to be considered as unsupported. This
has been going on for years now and has been ignored.

As a consequence we are not adding a special SGX workaround for
something which is known to be broken.

What we are going to do, and I'm fast-tracking this, is:

 https://lore.kernel.org/all/20220524185324.28395-1-bp@alien8.de

which makes the SGX workaround moot.

Thanks

        tglx

Borislav Petkov May 24, 2022, 7:26 p.m. UTC | #2
On Tue, May 24, 2022 at 09:15:00PM +0200, Thomas Gleixner wrote:
> Cathy,
> 
> On Fri, May 20 2022 at 18:38, Cathy Zhang wrote:

Btw, this mail has this here too:

> Historically, microcode updates are applied by the BIOS or early in
> boot. In recent years, several trends have made these old approaches
> less palatable.

Actually, late loading is the old method. Early came after it.

> > First, the cadence of microcode updates has increased to deliver
> > security mitigations. Second, the value of those updates has increased,
> > meaning that any delay in applying them is unacceptable. Third, users
> > have become accustomed to approaches like hot patching their kernels
> > and have a growing aversion to reboots in general.

I had missed that argument: so how do those users update their kernels?
Livepatching? I don't think you can replace a whole live kernel - that
would be magic. Unless you kexec but then you can early load microcode
too.

So if you reboot your kernel because you've installed a new one, you can
just as well update microcode.

So sorry but I'm not buying this argument.

For cloud vendors who cannot reboot because they've promised their users
ponies, that's their problem. They might have a somewhat ok-ish argument.

But not for normal users - they can just as well reboot their machines
and do kernel updates together with microcode.

Thx.