Message ID | 1668147701-4583-6-git-send-email-mikelley@microsoft.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Drivers: hv: Add PCI pass-thru support to Hyper-V Confidential VMs | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On 11/11/22 00:21, Michael Kelley wrote: > Hyper-V guests on AMD SEV-SNP hardware have the option of using the > "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP > architecture. With vTOM, shared vs. private memory accesses are > controlled by splitting the guest physical address space into two > halves. vTOM is the dividing line where the uppermost bit of the > physical address space is set; e.g., with 47 bits of guest physical > address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal > memory is accessible at two parallel physical addresses -- one below > vTOM and one above vTOM. Accesses below vTOM are private (encrypted) > while accesses above vTOM are shared (decrypted). In this sense, vTOM > is like the GPA.SHARED bit in Intel TDX. > > Support for Hyper-V guests using vTOM was added to the Linux kernel in > two patch sets[1][2]. This support treats the vTOM bit as part of > the physical address. For accessing shared (decrypted) memory, these > patch sets create a second kernel virtual mapping that maps to physical > addresses above vTOM. > > A better approach is to treat the vTOM bit as a protection flag, not > as part of the physical address. This new approach is like the approach > for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel > virtual mapping, the existing mapping is updated using recently added > coco mechanisms. When memory is changed between private and shared using > set_memory_decrypted() and set_memory_encrypted(), the PTEs for the > existing kernel mapping are changed to add or remove the vTOM bit > in the guest physical address, just as with TDX. The hypercalls to > change the memory status on the host side are made using the existing > callback mechanism. Everything just works, with a minor tweak to map > the I/O APIC to use private accesses. > > To accomplish the switch in approach, the following must be done in > in this single patch: > > * Update Hyper-V initialization to set the cc _mask based on vTOM > and do other coco initialization. > > * Update physical_mask so the vTOM bit is no longer treated as part > of the physical address > > * Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests. > This makes the vTOM bit part of the protection flags. > > * Code already exists to make hypercalls to inform Hyper-V about pages > changing between shared and private. Update this code to run as a > callback from __set_memory_enc_pgtable(). > > * Remove the Hyper-V special case from __set_memory_enc_dec(), and > make the normal case active for Hyper-V VMs, which have > CC_ATTR_GUEST_MEM_ENCRYPT, but not CC_ATTR_MEM_ENCRYPT. > > [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ > [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com> > --- > arch/x86/coco/core.c | 10 ++++++++- > arch/x86/hyperv/ivm.c | 45 +++++++++++++++++++++++++++++++---------- > arch/x86/include/asm/mshyperv.h | 8 ++------ > arch/x86/kernel/cpu/mshyperv.c | 15 +++++++------- > arch/x86/mm/pat/set_memory.c | 6 ++---- > 5 files changed, 54 insertions(+), 30 deletions(-) > > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c > index 06eb8910..024fbf4 100644 > --- a/arch/x86/mm/pat/set_memory.c > +++ b/arch/x86/mm/pat/set_memory.c > @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc) > > static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) > { > - if (hv_is_isolation_supported()) > - return hv_set_mem_host_visibility(addr, numpages, !enc); > - > - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) > + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) || > + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean either HOST or GUEST memory encryption, but then you check for GUEST memory encryption directly. Can your cc_platform_has() support be setup to handle the CC_ATTR_MEM_ENCRYPT attribute in some way? Thanks, Tom > return __set_memory_enc_pgtable(addr, numpages, enc); > > return 0;
From: Tom Lendacky <thomas.lendacky@amd.com> Sent: Friday, November 11, 2022 10:50 AM > > On 11/11/22 00:21, Michael Kelley wrote: [snip] > > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c > > index 06eb8910..024fbf4 100644 > > --- a/arch/x86/mm/pat/set_memory.c > > +++ b/arch/x86/mm/pat/set_memory.c > > @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long > addr, int numpages, bool enc) > > > > static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) > > { > > - if (hv_is_isolation_supported()) > > - return hv_set_mem_host_visibility(addr, numpages, !enc); > > - > > - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) > > + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) || > > + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) > > This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean > either HOST or GUEST memory encryption, but then you check for GUEST > memory encryption directly. Can your cc_platform_has() support be setup to > handle the CC_ATTR_MEM_ENCRYPT attribute in some way? > > Thanks, > Tom Current upstream code for Hyper-V guests with vTOM enables only CC_ATTR_GUEST_MEM_ENCRYPT. I had been wary of also enabling CC_ATTR_MEM_ENCRYPT because that would enable other code paths that Might not be right for the vTOM case. But looking at it more closely, enabling CC_ATTR_MEM_ENCRYPT may work. There are two problems with Hyper-V vTOM enabling CC_ATTR_MEM_ENCRYPT, but both are fixable: 1) The call to mem_encrypt_init() happens a little bit too soon. Hyper-V is fully initialized and hypercalls become possible after start_kernel() calls late_time_init(). mem_encrypt_init() needs to happen after the call to late_time_init() so that marking the swiotlb memory as decrypted can make the hypercalls to sync the page state change with the host. Moving mem_encrypt_init() a few lines later in start_kernel() works in my case, but I can't test all the cases that you probably have. This change also has the benefit of removing the call to swiotlb_update_mem_attributes() at the end of hyperv_init(), which always seemed like a hack. 2) mem_encrypt_free_decrypted_mem() is mismatched with sme_postprocess_startup() in its handling of bss decrypted memory. The decryption is done if sme_me_mask is non-zero, while the re-encryption is done if CC_ATTR_MEM_ENCRYPT is true, and those conditions won't be equivalent in a Hyper-V vTOM VM if we enable CC_ATTR_MEM_ENCRYPT (sme_me_mask is always zero in a Hyper-V vTOM VM). Changing mem_encrypt_free_decrypted_mem() to do re-encryption only if sme_me_mask is non-zero solves that problem. Note that there doesn't seem to be a way for a Hyper-V vTOM VM to have decrypted bss, since there's no way to sync the page state change with the host that early in the boot process, but I don't think there's a requirement for such, so all is good. With the above two changes, Hyper-V vTOM VMs can enable CC_ATTR_MEM_ENCRYPT. The Hyper-V hack in __set_memory_enc_dec() still goes away, and there's no change to the condition for invoking __set_memory_enc_pgtable(). Thoughts? Have I missed anything? Overall, I'm persuaded that this is a better approach and can submit a v3 patch series with these changes if you agree. Michael
On 11/13/22 10:01, Michael Kelley (LINUX) wrote: > From: Tom Lendacky <thomas.lendacky@amd.com> Sent: Friday, November 11, 2022 10:50 AM >> >> On 11/11/22 00:21, Michael Kelley wrote: > > [snip] > >>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c >>> index 06eb8910..024fbf4 100644 >>> --- a/arch/x86/mm/pat/set_memory.c >>> +++ b/arch/x86/mm/pat/set_memory.c >>> @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long >> addr, int numpages, bool enc) >>> >>> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) >>> { >>> - if (hv_is_isolation_supported()) >>> - return hv_set_mem_host_visibility(addr, numpages, !enc); >>> - >>> - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) >>> + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) || >>> + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) >> >> This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean >> either HOST or GUEST memory encryption, but then you check for GUEST >> memory encryption directly. Can your cc_platform_has() support be setup to >> handle the CC_ATTR_MEM_ENCRYPT attribute in some way? >> >> Thanks, >> Tom > > Current upstream code for Hyper-V guests with vTOM enables only > CC_ATTR_GUEST_MEM_ENCRYPT. I had been wary of also enabling > CC_ATTR_MEM_ENCRYPT because that would enable other code paths that > Might not be right for the vTOM case. But looking at it more closely, enabling > CC_ATTR_MEM_ENCRYPT may work. > > There are two problems with Hyper-V vTOM enabling CC_ATTR_MEM_ENCRYPT, > but both are fixable: > > 1) The call to mem_encrypt_init() happens a little bit too soon. Hyper-V is fully > initialized and hypercalls become possible after start_kernel() calls late_time_init(). > mem_encrypt_init() needs to happen after the call to late_time_init() so that > marking the swiotlb memory as decrypted can make the hypercalls to sync the > page state change with the host. Moving mem_encrypt_init() a few lines later in > start_kernel() works in my case, but I can't test all the cases that you probably > have. This change also has the benefit of removing the call to > swiotlb_update_mem_attributes() at the end of hyperv_init(), which always > seemed like a hack. It seems safe for SME/SEV since mem_encrypt_init() is only updating the SWIOTLB attributes at this point. I'll do some quick testing, but you might want to verify with TDX folks, too. > > 2) mem_encrypt_free_decrypted_mem() is mismatched with > sme_postprocess_startup() in its handling of bss decrypted memory. The > decryption is done if sme_me_mask is non-zero, while the re-encryption is > done if CC_ATTR_MEM_ENCRYPT is true, and those conditions won't be > equivalent in a Hyper-V vTOM VM if we enable CC_ATTR_MEM_ENCRYPT > (sme_me_mask is always zero in a Hyper-V vTOM VM). Changing > mem_encrypt_free_decrypted_mem() to do re-encryption only if sme_me_mask > is non-zero solves that problem. Note that there doesn't seem to be a way for a Hmmm, yes, this was because of an issue using the cc_platform_has() call during identity mapped paging. I think matching them in this case would be best, e.g., changing mem_encrypt_free_decrypted_mem() to check for a non-zero sme_me_mask - along with a nice comment on why it is checking sme_me_mask. Thanks, Tom > Hyper-V vTOM VM to have decrypted bss, since there's no way to sync the > page state change with the host that early in the boot process, but I don't think > there's a requirement for such, so all is good. > > With the above two changes, Hyper-V vTOM VMs can enable > CC_ATTR_MEM_ENCRYPT. The Hyper-V hack in __set_memory_enc_dec() > still goes away, and there's no change to the condition for invoking > __set_memory_enc_pgtable(). > > Thoughts? Have I missed anything? Overall, I'm persuaded that this is a better > approach and can submit a v3 patch series with these changes if you agree. > > Michael
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c index 49b44f8..de4e83e 100644 --- a/arch/x86/coco/core.c +++ b/arch/x86/coco/core.c @@ -78,7 +78,13 @@ static bool amd_cc_platform_has(enum cc_attr attr) static bool hyperv_cc_platform_has(enum cc_attr attr) { - return attr == CC_ATTR_GUEST_MEM_ENCRYPT; + switch (attr) { + case CC_ATTR_GUEST_MEM_ENCRYPT: + case CC_ATTR_HAS_PARAVISOR: + return true; + default: + return false; + } } bool cc_platform_has(enum cc_attr attr) @@ -108,6 +114,7 @@ u64 cc_mkenc(u64 val) switch (vendor) { case CC_VENDOR_AMD: return val | cc_mask; + case CC_VENDOR_HYPERV: case CC_VENDOR_INTEL: return val & ~cc_mask; default: @@ -121,6 +128,7 @@ u64 cc_mkdec(u64 val) switch (vendor) { case CC_VENDOR_AMD: return val & ~cc_mask; + case CC_VENDOR_HYPERV: case CC_VENDOR_INTEL: return val | cc_mask; default: diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c index e8be4c2..29ccbe8 100644 --- a/arch/x86/hyperv/ivm.c +++ b/arch/x86/hyperv/ivm.c @@ -13,6 +13,7 @@ #include <asm/svm.h> #include <asm/sev.h> #include <asm/io.h> +#include <asm/coco.h> #include <asm/mshyperv.h> #include <asm/hypervisor.h> @@ -233,7 +234,6 @@ void hv_ghcb_msr_read(u64 msr, u64 *value) local_irq_restore(flags); } EXPORT_SYMBOL_GPL(hv_ghcb_msr_read); -#endif /* * hv_mark_gpa_visibility - Set pages visible to host via hvcall. @@ -286,27 +286,25 @@ static int hv_mark_gpa_visibility(u16 count, const u64 pfn[], } /* - * hv_set_mem_host_visibility - Set specified memory visible to host. + * hv_vtom_set_host_visibility - Set specified memory visible to host. * * In Isolation VM, all guest memory is encrypted from host and guest * needs to set memory visible to host via hvcall before sharing memory * with host. This function works as wrap of hv_mark_gpa_visibility() * with memory base and size. */ -int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visible) +static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc) { - enum hv_mem_host_visibility visibility = visible ? - VMBUS_PAGE_VISIBLE_READ_WRITE : VMBUS_PAGE_NOT_VISIBLE; + enum hv_mem_host_visibility visibility = enc ? + VMBUS_PAGE_NOT_VISIBLE : VMBUS_PAGE_VISIBLE_READ_WRITE; u64 *pfn_array; int ret = 0; + bool result = true; int i, pfn; - if (!hv_is_isolation_supported() || !hv_hypercall_pg) - return 0; - pfn_array = kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); if (!pfn_array) - return -ENOMEM; + return false; for (i = 0, pfn = 0; i < pagecount; i++) { pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE); @@ -315,17 +313,42 @@ int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visibl if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) { ret = hv_mark_gpa_visibility(pfn, pfn_array, visibility); - if (ret) + if (ret) { + result = false; goto err_free_pfn_array; + } pfn = 0; } } err_free_pfn_array: kfree(pfn_array); - return ret; + return result; +} + +static bool hv_vtom_tlb_flush_required(bool private) +{ + return true; } +static bool hv_vtom_cache_flush_required(void) +{ + return false; +} + +void __init hv_vtom_init(void) +{ + cc_set_vendor(CC_VENDOR_HYPERV); + cc_set_mask(ms_hyperv.shared_gpa_boundary); + physical_mask &= ms_hyperv.shared_gpa_boundary - 1; + + x86_platform.guest.enc_cache_flush_required = hv_vtom_cache_flush_required; + x86_platform.guest.enc_tlb_flush_required = hv_vtom_tlb_flush_required; + x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility; +} + +#endif /* CONFIG_AMD_MEM_ENCRYPT */ + /* * hv_map_memory - map memory to extra space in the AMD SEV-SNP Isolation VM. */ diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 61f0c20..59b3310 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -174,18 +174,19 @@ static inline void hv_apic_init(void) {} int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector, struct hv_interrupt_entry *entry); int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry); -int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible); #ifdef CONFIG_AMD_MEM_ENCRYPT void hv_ghcb_msr_write(u64 msr, u64 value); void hv_ghcb_msr_read(u64 msr, u64 *value); bool hv_ghcb_negotiate_protocol(void); void hv_ghcb_terminate(unsigned int set, unsigned int reason); +void hv_vtom_init(void); #else static inline void hv_ghcb_msr_write(u64 msr, u64 value) {} static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {} static inline bool hv_ghcb_negotiate_protocol(void) { return false; } static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {} +static inline void hv_vtom_init(void) {} #endif extern bool hv_isolation_type_snp(void); @@ -241,11 +242,6 @@ static inline int hyperv_flush_guest_mapping_range(u64 as, } static inline void hv_set_register(unsigned int reg, u64 value) { } static inline u64 hv_get_register(unsigned int reg) { return 0; } -static inline int hv_set_mem_host_visibility(unsigned long addr, int numpages, - bool visible) -{ - return -1; -} #endif /* CONFIG_HYPERV */ diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 8316139..b080795 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -33,7 +33,6 @@ #include <asm/nmi.h> #include <clocksource/hyperv_timer.h> #include <asm/numa.h> -#include <asm/coco.h> /* Is Linux running as the root partition? */ bool hv_root_partition; @@ -325,8 +324,10 @@ static void __init ms_hyperv_init_platform(void) if (ms_hyperv.priv_high & HV_ISOLATION) { ms_hyperv.isolation_config_a = cpuid_eax(HYPERV_CPUID_ISOLATION_CONFIG); ms_hyperv.isolation_config_b = cpuid_ebx(HYPERV_CPUID_ISOLATION_CONFIG); - ms_hyperv.shared_gpa_boundary = - BIT_ULL(ms_hyperv.shared_gpa_boundary_bits); + + if (ms_hyperv.shared_gpa_boundary_active) + ms_hyperv.shared_gpa_boundary = + BIT_ULL(ms_hyperv.shared_gpa_boundary_bits); pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n", ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b); @@ -337,11 +338,6 @@ static void __init ms_hyperv_init_platform(void) swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary; #endif } - /* Isolation VMs are unenlightened SEV-based VMs, thus this check: */ - if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) { - if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) - cc_set_vendor(CC_VENDOR_HYPERV); - } } if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) { @@ -410,6 +406,9 @@ static void __init ms_hyperv_init_platform(void) i8253_clear_counter_on_shutdown = false; #if IS_ENABLED(CONFIG_HYPERV) + if ((hv_get_isolation_type() == HV_ISOLATION_TYPE_VBS) || + (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)) + hv_vtom_init(); /* * Setup the hook to get control post apic initialization. */ diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 06eb8910..024fbf4 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc) static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) { - if (hv_is_isolation_supported()) - return hv_set_mem_host_visibility(addr, numpages, !enc); - - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) || + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) return __set_memory_enc_pgtable(addr, numpages, enc); return 0;