Message ID | 147190832511.9523.10850626471583956499.stgit@brijesh-build-machine (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 22, 2016 at 07:25:25PM -0400, Brijesh Singh wrote: > From: Tom Lendacky <thomas.lendacky@amd.com> > > EFI data is encrypted when the kernel is run under SEV. Update the > page table references to be sure the EFI memory areas are accessed > encrypted. > > Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> > --- > arch/x86/platform/efi/efi_64.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c > index 0871ea4..98363f3 100644 > --- a/arch/x86/platform/efi/efi_64.c > +++ b/arch/x86/platform/efi/efi_64.c > @@ -213,7 +213,7 @@ void efi_sync_low_kernel_mappings(void) > > int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) > { > - unsigned long pfn, text; > + unsigned long pfn, text, flags; > efi_memory_desc_t *md; > struct page *page; > unsigned npages; > @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) > efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); > pgd = efi_pgd; > > + flags = _PAGE_NX | _PAGE_RW; > + if (sev_active) > + flags |= _PAGE_ENC; So this is confusing me. There's this patch which says EFI data is accessed in the clear: https://lkml.kernel.org/r/20160822223738.29880.6909.stgit@tlendack-t1.amdoffice.net but now here it is encrypted when SEV is enabled. Do you mean, it is encrypted here because we're in the guest kernel? Thanks.
On 22/09/2016 16:35, Borislav Petkov wrote: >> > @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) >> > efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); >> > pgd = efi_pgd; >> > >> > + flags = _PAGE_NX | _PAGE_RW; >> > + if (sev_active) >> > + flags |= _PAGE_ENC; > So this is confusing me. There's this patch which says EFI data is > accessed in the clear: > > https://lkml.kernel.org/r/20160822223738.29880.6909.stgit@tlendack-t1.amdoffice.net > > but now here it is encrypted when SEV is enabled. > > Do you mean, it is encrypted here because we're in the guest kernel? I suspect this patch is untested, and also wrong. :) The main difference between the SME and SEV encryption, from the point of view of the kernel, is that real-mode always writes unencrypted in SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode and learn about the C bit, so EFI boot data should be unprotected in SEV guests. Because the firmware volume is written to high memory in encrypted form, and because the PEI phase runs in 32-bit mode, the firmware code will be encrypted; on the other hand, data that is placed in low memory for the kernel can be unencrypted, thus limiting differences between SME and SEV. Important: I don't know what you guys are doing for SEV and Windows guests, but if you are doing something I would really appreciate doing things in the open. If Linux and Windows end up doing different things with EFI boot data, ACPI tables, etc. it will be a huge pain. On the other hand, if we can enjoy being first, that's great. In fact, I have suggested in the QEMU list that SEV guests should always use UEFI; because BIOS runs in real-mode or 32-bit non-paging protected mode, BIOS must always write encrypted data, which becomes painful in the kernel. And regarding the above "important" point, all I know is that Microsoft for sure will be happy to restrict SEV to UEFI guests. :) There are still some differences, mostly around the real mode trampoline executed by the kernel, but they should be much smaller. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 04:45:51PM +0200, Paolo Bonzini wrote: > The main difference between the SME and SEV encryption, from the point > of view of the kernel, is that real-mode always writes unencrypted in > SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode > and learn about the C bit, so EFI boot data should be unprotected in SEV > guests. Actually, it is different: you can start fully encrypted in SME, see: https://lkml.kernel.org/r/20160822223539.29880.96739.stgit@tlendack-t1.amdoffice.net The last paragraph alludes to a certain transparent mode where you're already encrypted and only certain pieces like EFI is not encrypted. I think the aim is to have the transparent mode be the default one, which makes most sense anyway. The EFI regions are unencrypted for obvious reasons and you need to access them as such. > Because the firmware volume is written to high memory in encrypted > form, and because the PEI phase runs in 32-bit mode, the firmware > code will be encrypted; on the other hand, data that is placed in low > memory for the kernel can be unencrypted, thus limiting differences > between SME and SEV. When you run fully encrypted, you still need to access EFI tables in the clear. That's why I'm confused about this patch here.
On 22/09/2016 16:59, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 04:45:51PM +0200, Paolo Bonzini wrote: >> The main difference between the SME and SEV encryption, from the point >> of view of the kernel, is that real-mode always writes unencrypted in >> SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode >> and learn about the C bit, so EFI boot data should be unprotected in SEV >> guests. > > Actually, it is different: you can start fully encrypted in SME, see: > > https://lkml.kernel.org/r/20160822223539.29880.96739.stgit@tlendack-t1.amdoffice.net > > The last paragraph alludes to a certain transparent mode where you're > already encrypted and only certain pieces like EFI is not encrypted. Which paragraph? >> Because the firmware volume is written to high memory in encrypted >> form, and because the PEI phase runs in 32-bit mode, the firmware >> code will be encrypted; on the other hand, data that is placed in low >> memory for the kernel can be unencrypted, thus limiting differences >> between SME and SEV. > > When you run fully encrypted, you still need to access EFI tables in the > clear. That's why I'm confused about this patch here. I might be wrong, but I don't think this patch was tested with OVMF or Duet. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 05:05:54PM +0200, Paolo Bonzini wrote:
> Which paragraph?
"Linux relies on BIOS to set this bit if BIOS has determined that the
reduction in the physical address space as a result of enabling memory
encryption..."
Basically, you can enable SME in the BIOS and you're all set.
On 22/09/2016 19:07, Borislav Petkov wrote: >> Which paragraph? > "Linux relies on BIOS to set this bit if BIOS has determined that the > reduction in the physical address space as a result of enabling memory > encryption..." > > Basically, you can enable SME in the BIOS and you're all set. That's not how I read it. I just figured that the BIOS has some magic things high in the physical address space and if you reduce the physical address space the BIOS (which is called from e.g. EFI runtime services) would have problems with that. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 07:08:50PM +0200, Paolo Bonzini wrote: > That's not how I read it. I just figured that the BIOS has some magic > things high in the physical address space and if you reduce the physical > address space the BIOS (which is called from e.g. EFI runtime services) > would have problems with that. Yeah, I had to ask about that myself and Tom will have it explained better in the next version. The reduction in physical address space happens when SME enabled because you need a couple of bits in the PTE with which to say which key has encrypted that page. So it is an indelible part of the SME machinery. Btw, section "7.10 Secure Memory Encryption" has an initial writeup: http://support.amd.com/TechDocs/24593.pdf Now that I skim over it, it doesn't mention the BIOS thing but that'll be updated. HTH.
On 09/22/2016 09:35 AM, Borislav Petkov wrote: > On Mon, Aug 22, 2016 at 07:25:25PM -0400, Brijesh Singh wrote: >> From: Tom Lendacky <thomas.lendacky@amd.com> >> >> EFI data is encrypted when the kernel is run under SEV. Update the >> page table references to be sure the EFI memory areas are accessed >> encrypted. >> >> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> >> --- >> arch/x86/platform/efi/efi_64.c | 14 ++++++++++++-- >> 1 file changed, 12 insertions(+), 2 deletions(-) >> >> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c >> index 0871ea4..98363f3 100644 >> --- a/arch/x86/platform/efi/efi_64.c >> +++ b/arch/x86/platform/efi/efi_64.c >> @@ -213,7 +213,7 @@ void efi_sync_low_kernel_mappings(void) >> >> int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) >> { >> - unsigned long pfn, text; >> + unsigned long pfn, text, flags; >> efi_memory_desc_t *md; >> struct page *page; >> unsigned npages; >> @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) >> efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); >> pgd = efi_pgd; >> >> + flags = _PAGE_NX | _PAGE_RW; >> + if (sev_active) >> + flags |= _PAGE_ENC; > > So this is confusing me. There's this patch which says EFI data is > accessed in the clear: > > https://lkml.kernel.org/r/20160822223738.29880.6909.stgit@tlendack-t1.amdoffice.net > > but now here it is encrypted when SEV is enabled. > > Do you mean, it is encrypted here because we're in the guest kernel? Yes, the idea is that the SEV guest will be running encrypted from the start, including the BIOS/UEFI, and so all of the EFI related data will be encrypted. Thanks, Tom > > Thanks. > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 22/09/2016 19:46, Tom Lendacky wrote: >> > Do you mean, it is encrypted here because we're in the guest kernel? > Yes, the idea is that the SEV guest will be running encrypted from the > start, including the BIOS/UEFI, and so all of the EFI related data will > be encrypted. Unless this is part of some spec, it's easier if things are the same in SME and SEV. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 08:23:36PM +0200, Paolo Bonzini wrote: > Unless this is part of some spec, it's easier if things are the same in > SME and SEV. Yeah, I was pondering over how sprinkling sev_active checks might not be so clean. I'm wondering if we could make the EFI regions presented to the guest unencrypted too, as part of some SEV-specific init routine so that the guest kernel doesn't need to do anything different.
On 22/09/2016 20:37, Borislav Petkov wrote: >> > Unless this is part of some spec, it's easier if things are the same in >> > SME and SEV. > Yeah, I was pondering over how sprinkling sev_active checks might not be > so clean. > > I'm wondering if we could make the EFI regions presented to the guest > unencrypted too, as part of some SEV-specific init routine so that the > guest kernel doesn't need to do anything different. That too, but why not fix it in the firmware?... (Again, if there's any MSFT guy looking at this offlist, let's involve him in the discussion). Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/22/2016 09:45 AM, Paolo Bonzini wrote: > > > On 22/09/2016 16:35, Borislav Petkov wrote: >>>> @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) >>>> efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); >>>> pgd = efi_pgd; >>>> >>>> + flags = _PAGE_NX | _PAGE_RW; >>>> + if (sev_active) >>>> + flags |= _PAGE_ENC; >> So this is confusing me. There's this patch which says EFI data is >> accessed in the clear: >> >> https://lkml.kernel.org/r/20160822223738.29880.6909.stgit@tlendack-t1.amdoffice.net >> >> but now here it is encrypted when SEV is enabled. >> >> Do you mean, it is encrypted here because we're in the guest kernel? > > I suspect this patch is untested, and also wrong. :) Yes, it is untested but not sure that it is wrong... It all depends on how we add SEV support to the guest UEFI BIOS. My take would be to have the EFI data and ACPI tables encrypted. > > The main difference between the SME and SEV encryption, from the point > of view of the kernel, is that real-mode always writes unencrypted in > SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode > and learn about the C bit, so EFI boot data should be unprotected in SEV > guests. > > Because the firmware volume is written to high memory in encrypted form, > and because the PEI phase runs in 32-bit mode, the firmware code will be > encrypted; on the other hand, data that is placed in low memory for the > kernel can be unencrypted, thus limiting differences between SME and SEV. I like the idea of limiting the differences but it would leave the EFI data and ACPI tables exposed and able to be manipulated. > > Important: I don't know what you guys are doing for SEV and > Windows guests, but if you are doing something I would really > appreciate doing things in the open. If Linux and Windows end > up doing different things with EFI boot data, ACPI tables, etc. > it will be a huge pain. On the other hand, if we can enjoy > being first, that's great. We haven't discussed Windows guests under SEV yet, but as you say, we need to do things the same. Thanks, Tom > > In fact, I have suggested in the QEMU list that SEV guests should always > use UEFI; because BIOS runs in real-mode or 32-bit non-paging protected > mode, BIOS must always write encrypted data, which becomes painful in > the kernel. > > And regarding the above "important" point, all I know is that Microsoft > for sure will be happy to restrict SEV to UEFI guests. :) > > There are still some differences, mostly around the real mode trampoline > executed by the kernel, but they should be much smaller. > > Paolo > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 22/09/2016 20:47, Tom Lendacky wrote: > > Because the firmware volume is written to high memory in encrypted form, > > and because the PEI phase runs in 32-bit mode, the firmware code will be > > encrypted; on the other hand, data that is placed in low memory for the > > kernel can be unencrypted, thus limiting differences between SME and SEV. > > I like the idea of limiting the differences but it would leave the EFI > data and ACPI tables exposed and able to be manipulated. Hmm, that makes sense. So I guess this has to stay, and Borislav's proposal doesn't fly either. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/22/2016 09:59 AM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 04:45:51PM +0200, Paolo Bonzini wrote: >> The main difference between the SME and SEV encryption, from the point >> of view of the kernel, is that real-mode always writes unencrypted in >> SME and always writes encrypted in SEV. But UEFI can run in 64-bit mode >> and learn about the C bit, so EFI boot data should be unprotected in SEV >> guests. > > Actually, it is different: you can start fully encrypted in SME, see: > > https://lkml.kernel.org/r/20160822223539.29880.96739.stgit@tlendack-t1.amdoffice.net > > The last paragraph alludes to a certain transparent mode where you're > already encrypted and only certain pieces like EFI is not encrypted. I > think the aim is to have the transparent mode be the default one, which > makes most sense anyway. There is a new Transparent SME mode that is now part of the overall SME support, but I'm not alluding to that in the documentation at all. In TSME mode, everything that goes through the memory controller would be encrypted and that would include EFI data, etc. TSME would be enabled through a BIOS option, thus allowing legacy OSes to benefit. > > The EFI regions are unencrypted for obvious reasons and you need to > access them as such. > >> Because the firmware volume is written to high memory in encrypted >> form, and because the PEI phase runs in 32-bit mode, the firmware >> code will be encrypted; on the other hand, data that is placed in low >> memory for the kernel can be unencrypted, thus limiting differences >> between SME and SEV. > > When you run fully encrypted, you still need to access EFI tables in the > clear. That's why I'm confused about this patch here. This patch assumes that the EFI regions of a guest would be encrypted. Thanks, Tom > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/22/2016 12:07 PM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 05:05:54PM +0200, Paolo Bonzini wrote: >> Which paragraph? > > "Linux relies on BIOS to set this bit if BIOS has determined that the > reduction in the physical address space as a result of enabling memory > encryption..." > > Basically, you can enable SME in the BIOS and you're all set. That's not what I mean here. If the BIOS sets the SMEE bit in the SYS_CFG msr then, even if the encryption bit is never used, there is still a reduction in physical address space. Transparent SME (TSME) will be a BIOS option that will result in the memory controller performing encryption no matter what. In this case all data will be encrypted without a reduction in physical address space. Thanks, Tom > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 02:04:27PM -0500, Tom Lendacky wrote: > That's not what I mean here. If the BIOS sets the SMEE bit in the > SYS_CFG msr then, even if the encryption bit is never used, there is > still a reduction in physical address space. I thought that reduction is the reservation of bits for the SME mask. What other reduction is there? > Transparent SME (TSME) will be a BIOS option that will result in the > memory controller performing encryption no matter what. In this case > all data will be encrypted without a reduction in physical address > space. Now I'm confused: aren't we reducing the address space with the SME mask? Or what reduction do you mean?
On 09/22/2016 02:11 PM, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 02:04:27PM -0500, Tom Lendacky wrote: >> That's not what I mean here. If the BIOS sets the SMEE bit in the >> SYS_CFG msr then, even if the encryption bit is never used, there is >> still a reduction in physical address space. > > I thought that reduction is the reservation of bits for the SME mask. > > What other reduction is there? There is a reduction in physical address space for the SME mask and the bits used to aid in identifying the ASID associated with the memory request. This allows for the memory controller to determine the key to be used for the encryption operation (host/hypervisor key vs. an SEV guest key). Thanks, Tom > >> Transparent SME (TSME) will be a BIOS option that will result in the >> memory controller performing encryption no matter what. In this case >> all data will be encrypted without a reduction in physical address >> space. > > Now I'm confused: aren't we reducing the address space with the SME > mask? > > Or what reduction do you mean? > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 22, 2016 at 02:49:22PM -0500, Tom Lendacky wrote: > > I thought that reduction is the reservation of bits for the SME mask. > > > > What other reduction is there? > > There is a reduction in physical address space for the SME mask and the > bits used to aid in identifying the ASID associated with the memory > request. This allows for the memory controller to determine the key to > be used for the encryption operation (host/hypervisor key vs. an SEV > guest key). Ok, I think I see what you mean: you call SME mask the bit in CPUID Fn8000_001F[EBX][5:0], i.e., the C-bit, i.e. sme_me_mask. And the other reduction is the key ASID, i.e., CPUID Fn8000_001F[EBX][11:6], i.e. sme_me_loss. I think we're on the same page - I was simply calling everything SME mask because both are together in the PTE: "Additionally, in some implementations, the physical address size of the processor may be reduced when memory encryption features are enabled, for example from 48 to 43 bits."
On 23/09/16 06:37, Borislav Petkov wrote: > On Thu, Sep 22, 2016 at 08:23:36PM +0200, Paolo Bonzini wrote: >> Unless this is part of some spec, it's easier if things are the same in >> SME and SEV. > Yeah, I was pondering over how sprinkling sev_active checks might not be > so clean. > > I'm wondering if we could make the EFI regions presented to the guest > unencrypted too, as part of some SEV-specific init routine so that the > guest kernel doesn't need to do anything different. How is this even possible? The spec clearly says under SEV only in long mode or PAE mode guest can control whether memory is encrypted via c-bit, and in other modes guest will be always in encrypted mode. Guest EFI is also virtual, so are you suggesting EFI code (or code which loads EFI) should also be modified to load EFI as unencrypted? Looks it's not even possible to happen. Thanks, -Kai > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 23, 2016 at 09:33:00PM +1200, Kai Huang wrote: > How is this even possible? The spec clearly says under SEV only in long mode > or PAE mode guest can control whether memory is encrypted via c-bit, and in > other modes guest will be always in encrypted mode. I was suggesting the hypervisor supplies the EFI ranges unencrypted. But that is not a good idea because firmware data is exposed then, see same thread from yesterday.
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index 0871ea4..98363f3 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -213,7 +213,7 @@ void efi_sync_low_kernel_mappings(void) int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) { - unsigned long pfn, text; + unsigned long pfn, text, flags; efi_memory_desc_t *md; struct page *page; unsigned npages; @@ -230,6 +230,10 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) efi_scratch.efi_pgt = (pgd_t *)__sme_pa(efi_pgd); pgd = efi_pgd; + flags = _PAGE_NX | _PAGE_RW; + if (sev_active) + flags |= _PAGE_ENC; + /* * It can happen that the physical address of new_memmap lands in memory * which is not mapped in the EFI page table. Therefore we need to go @@ -237,7 +241,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) * phys_efi_set_virtual_address_map(). */ pfn = pa_memmap >> PAGE_SHIFT; - if (kernel_map_pages_in_pgd(pgd, pfn, pa_memmap, num_pages, _PAGE_NX | _PAGE_RW)) { + if (kernel_map_pages_in_pgd(pgd, pfn, pa_memmap, num_pages, flags)) { pr_err("Error ident-mapping new memmap (0x%lx)!\n", pa_memmap); return 1; } @@ -302,6 +306,9 @@ static void __init __map_region(efi_memory_desc_t *md, u64 va) if (!(md->attribute & EFI_MEMORY_WB)) flags |= _PAGE_PCD; + if (sev_active) + flags |= _PAGE_ENC; + pfn = md->phys_addr >> PAGE_SHIFT; if (kernel_map_pages_in_pgd(pgd, pfn, va, md->num_pages, flags)) pr_warn("Error mapping PA 0x%llx -> VA 0x%llx!\n", @@ -426,6 +433,9 @@ void __init efi_runtime_update_mappings(void) (md->type != EFI_RUNTIME_SERVICES_CODE)) pf |= _PAGE_RW; + if (sev_active) + pf |= _PAGE_ENC; + /* Update the 1:1 mapping */ pfn = md->phys_addr >> PAGE_SHIFT; if (kernel_map_pages_in_pgd(pgd, pfn, md->phys_addr, md->num_pages, pf))