Message ID | 1668624097-14884-8-git-send-email-mikelley@microsoft.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | Add PCI pass-thru support to Hyper-V Confidential VMs | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On 11/17/2022 2:41 AM, Michael Kelley wrote: > Hyper-V guests on AMD SEV-SNP hardware have the option of using the > "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP > architecture. With vTOM, shared vs. private memory accesses are > controlled by splitting the guest physical address space into two > halves. vTOM is the dividing line where the uppermost bit of the > physical address space is set; e.g., with 47 bits of guest physical > address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal > memory is accessible at two parallel physical addresses -- one below > vTOM and one above vTOM. Accesses below vTOM are private (encrypted) > while accesses above vTOM are shared (decrypted). In this sense, vTOM > is like the GPA.SHARED bit in Intel TDX. > > Support for Hyper-V guests using vTOM was added to the Linux kernel in > two patch sets[1][2]. This support treats the vTOM bit as part of > the physical address. For accessing shared (decrypted) memory, these > patch sets create a second kernel virtual mapping that maps to physical > addresses above vTOM. > > A better approach is to treat the vTOM bit as a protection flag, not > as part of the physical address. This new approach is like the approach > for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel > virtual mapping, the existing mapping is updated using recently added > coco mechanisms. When memory is changed between private and shared using > set_memory_decrypted() and set_memory_encrypted(), the PTEs for the > existing kernel mapping are changed to add or remove the vTOM bit > in the guest physical address, just as with TDX. The hypercalls to > change the memory status on the host side are made using the existing > callback mechanism. Everything just works, with a minor tweak to map > the I/O APIC to use private accesses. > > To accomplish the switch in approach, the following must be done in > in this single patch: > > * Update Hyper-V initialization to set the cc_mask based on vTOM > and do other coco initialization. > > * Update physical_mask so the vTOM bit is no longer treated as part > of the physical address > > * Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests. > This makes the vTOM bit part of the protection flags. > > * Code already exists to make hypercalls to inform Hyper-V about pages > changing between shared and private. Update this code to run as a > callback from __set_memory_enc_pgtable(). > > * Remove the Hyper-V special case from __set_memory_enc_dec() > > * Remove the Hyper-V specific call to swiotlb_update_mem_attributes() > since mem_encrypt_init() will now do it. > > [1]https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ > [2]https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ > > Signed-off-by: Michael Kelley<mikelley@microsoft.com> > --- > arch/x86/coco/core.c | 11 +++++++++- > arch/x86/hyperv/hv_init.c | 11 ---------- > arch/x86/hyperv/ivm.c | 45 +++++++++++++++++++++++++++++++---------- > arch/x86/include/asm/mshyperv.h | 8 ++------ > arch/x86/kernel/cpu/mshyperv.c | 15 +++++++------- > arch/x86/mm/pat/set_memory.c | 3 --- > 6 files changed, 53 insertions(+), 40 deletions(-) Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
On Wed, Nov 16, 2022 at 10:41:30AM -0800, Michael Kelley wrote: > Hyper-V guests on AMD SEV-SNP hardware have the option of using the > "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP > architecture. With vTOM, shared vs. private memory accesses are > controlled by splitting the guest physical address space into two > halves. vTOM is the dividing line where the uppermost bit of the > physical address space is set; e.g., with 47 bits of guest physical > address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal Unknown word [phyiscal] in commit message. Suggestions: ['physical', 'physically', ... ] Please introduce a spellchecker into your patch creation workflow. ... > @@ -108,6 +115,7 @@ u64 cc_mkenc(u64 val) > switch (vendor) { > case CC_VENDOR_AMD: > return val | cc_mask; > + case CC_VENDOR_HYPERV: > case CC_VENDOR_INTEL: > return val & ~cc_mask; > default: > @@ -121,6 +129,7 @@ u64 cc_mkdec(u64 val) > switch (vendor) { > case CC_VENDOR_AMD: > return val & ~cc_mask; > + case CC_VENDOR_HYPERV: > case CC_VENDOR_INTEL: > return val | cc_mask; > default: Uuuh, this needs a BIG FAT COMMENT. You're running on SNP and yet the enc/dec meaning is flipped. And that's because of vTOM. What happens if you have other types of SNP-based VMs on HyperV? The isolation VMs thing? Or is that the same? What happens when you do TDX guests with HyperV? This becomes wrong then. I think you need a more finer-grained check here in the sense of "is it a HyperV guest using a paravisor and vTOM is enabled" or so. Otherwise, I like the removal of the HyperV-specific checks ofc. Thx.
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 21, 2022 7:03 AM > > On Wed, Nov 16, 2022 at 10:41:30AM -0800, Michael Kelley wrote: > ... > > > @@ -108,6 +115,7 @@ u64 cc_mkenc(u64 val) > > switch (vendor) { > > case CC_VENDOR_AMD: > > return val | cc_mask; > > + case CC_VENDOR_HYPERV: > > case CC_VENDOR_INTEL: > > return val & ~cc_mask; > > default: > > @@ -121,6 +129,7 @@ u64 cc_mkdec(u64 val) > > switch (vendor) { > > case CC_VENDOR_AMD: > > return val & ~cc_mask; > > + case CC_VENDOR_HYPERV: > > case CC_VENDOR_INTEL: > > return val | cc_mask; > > default: > > Uuuh, this needs a BIG FAT COMMENT. > > You're running on SNP and yet the enc/dec meaning is flipped. And that's > because of vTOM. > > What happens if you have other types of SNP-based VMs on HyperV? The > isolation VMs thing? Or is that the same? > > What happens when you do TDX guests with HyperV? > > This becomes wrong then. > > I think you need a more finer-grained check here in the sense of "is it > a HyperV guest using a paravisor and vTOM is enabled" or so. > > Otherwise, I like the removal of the HyperV-specific checks ofc. > I think the core problem here is the naming and meaning of CC_VENDOR_HYPERV. The name was created originally when the Hyper-V vTOM handling code was a lot of special cases. With the changes in this patch series that make the vTOM functionality more mainstream, the name would be better as CC_VENDOR_AMD_VTOM. vTOM is part of the AMD SEV-SNP spec, and it's a different way of doing the encryption from the "C-bit" based approach. As much as possible, I'm trying to not make it be Hyper-V specific, though currently we have N=1 for hypervisors that offer the vTOM option, so it's a little hard to generalize. With the thinking oriented that way, a Linux guest on Hyper-V using TDX will run with CC_VENDOR_INTEL. A Linux guest on Hyper-V that is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD. Dexuan Cui just posted a patch set for initial TDX support on Hyper-V, and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if I'm wrong about that -- I haven't reviewed your patches yet). Tianyu Lan has a patch set out for Hyper-V guests using the "C-bit". That patch set still uses CC_VENDOR_HYPERV. Tianyu and I need to work through whether his patch set can run with CC_VENDOR_AMD like everyone else using the "C-bit" approach. Yes, the polarity of the AMD vTOM bit matches the polarity of the TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit". I'll add a comment to that effect. Anyway, that's where I think this should go. Does it make sense? Other thoughts? Michael
On 11/22/22 10:22, Michael Kelley (LINUX) wrote: > Anyway, that's where I think this should go. Does it make sense? > Other thoughts? I think hard-coding the C-bit behavior and/or position to a vendor was probably a bad idea. Even the comment: u64 cc_mkenc(u64 val) { /* * Both AMD and Intel use a bit in the page table to indicate * encryption status of the page. * * - for AMD, bit *set* means the page is encrypted * - for Intel *clear* means encrypted. */ doesn't make a lot of sense now. Maybe we should just have a: CC_ATTR_ENC_SET which gets set for the "AMD" behavior and is clear for the "Intel" behavior. Hyper-V code running on AMD can set the attribute to get teh "Intel" behavior. That sure beats having a Hyper-V vendor.
From: Dave Hansen <dave.hansen@intel.com> Sent: Tuesday, November 22, 2022 10:30 AM > > On 11/22/22 10:22, Michael Kelley (LINUX) wrote: > > Anyway, that's where I think this should go. Does it make sense? > > Other thoughts? > > I think hard-coding the C-bit behavior and/or position to a vendor was > probably a bad idea. Even the comment: > > u64 cc_mkenc(u64 val) > { > /* > * Both AMD and Intel use a bit in the page table to indicate > * encryption status of the page. > * > * - for AMD, bit *set* means the page is encrypted > * - for Intel *clear* means encrypted. > */ > > doesn't make a lot of sense now. Maybe we should just have a: > > CC_ATTR_ENC_SET > > which gets set for the "AMD" behavior and is clear for the "Intel" > behavior. Hyper-V code running on AMD can set the attribute to get teh > "Intel" behavior. > > That sure beats having a Hyper-V vendor. To me, figuring out the encryption bit polarity and position isn't the hard part to set up. We've got three technologies: TDX, AMD "C-bit", and arguably, AMD vTOM. Future silicon and architectural enhancements will most likely be variations and improvements on these rather than something completely new (not that I'm necessarily aware of what the processor vendors may have planned). The AMD "C-bit" technology already has sub-cases (SME, SEV, SEV-ES, SEV-SNP) because of the architectural history. Any of the technologies may get additional subcases over time. Whether those subcases are represented as new CC_VENDOR_* options, or CC_ATTR_* options on an existing CC_VENDOR_* should be driven by the level of commonality with what already exists. There's no hard-and-fast rule. Just do whatever makes the most sense. I'm proposing AMD vTOM as a separate CC_VENDOR_AMD_VTOM because it is rather different from CC_VENDOR_AMD, and not just in the polarity and position of the encryption bit. But if we really wanted to just make it a sub-case of CC_VENDOR_AMD, I could probably be convinced. The key is cleanly handling all the other attributes like CC_ATTR_GUEST_STATE_ENCRYPT, CC_ATTR_ACCESS_IOAPIC_ENCRYPTED (that I add in this patch set), CC_ATTR_GUEST_UNROLL_STRING_IO, etc. where we want to avoid too many hacky special cases. The current code structure where the CC_VENDOR_* selection determines the CC_ATTR_* values seems to work OK. Anyway, that's my thinking. CC_VENDOR_HYPERV goes away, but the behavior is still essentially the same when replaced by CC_VENDOR_AMD_VTOM. Michael
On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote: > I think the core problem here is the naming and meaning of > CC_VENDOR_HYPERV. The name was created originally when the > Hyper-V vTOM handling code was a lot of special cases. With the > changes in this patch series that make the vTOM functionality more > mainstream, the name would be better as CC_VENDOR_AMD_VTOM. No, if CC_VENDOR_HYPERV means different things depending on what kind of guests you're doing, then you should not use a CC_VENDOR at all. > vTOM is part of the AMD SEV-SNP spec, and it's a different way of > doing the encryption from the "C-bit" based approach. As much as > possible, I'm trying to not make it be Hyper-V specific, though > currently we have N=1 for hypervisors that offer the vTOM option, so > it's a little hard to generalize. Actually, it is very simple to generalize: vTOM and the paravisor and VMPL are all part of the effort to support unenlightened, unmodified guests with SNP. So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it probably would need the same contraptions. > With the thinking oriented that way, a Linux guest on Hyper-V using > TDX will run with CC_VENDOR_INTEL. A Linux guest on Hyper-V that > is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD. Right. > Dexuan Cui just posted a patch set for initial TDX support on Hyper-V, > and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if > I'm wrong about that -- I haven't reviewed your patches yet). Tianyu Lan > has a patch set out for Hyper-V guests using the "C-bit". That patch set > still uses CC_VENDOR_HYPERV. Tianyu and I need to work through > whether his patch set can run with CC_VENDOR_AMD like everyone > else using the "C-bit" approach. So I'm not sure the vendor is the right approach here. I guess we need to specify the *type* of guest being supported. > Yes, the polarity of the AMD vTOM bit matches the polarity of the > TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit". > I'll add a comment to that effect. > > Anyway, that's where I think this should go. Does it make sense? > Other thoughts? I think all that polarity doesn't matter as long as we abstract it away with, "mark encrypted" and "mark decrypted". But it is late now and I could be wrong - I'll look at the rest tomorrow-ish. Thx.
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 22, 2022 2:18 PM > > On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote: > > I think the core problem here is the naming and meaning of > > CC_VENDOR_HYPERV. The name was created originally when the > > Hyper-V vTOM handling code was a lot of special cases. With the > > changes in this patch series that make the vTOM functionality more > > mainstream, the name would be better as CC_VENDOR_AMD_VTOM. > > No, if CC_VENDOR_HYPERV means different things depending on what kind of > guests you're doing, then you should not use a CC_VENDOR at all. Agreed. My proposal is to drop CC_VENDOR_HYPERV entirely. Replace it with CC_VENDOR_AMD_VTOM (or something like that) that is set *only* by Linux guests that are running on AMD SEV-SNP processors and using the vTOM scheme instead of the AMD C-bit scheme. > > > vTOM is part of the AMD SEV-SNP spec, and it's a different way of > > doing the encryption from the "C-bit" based approach. As much as > > possible, I'm trying to not make it be Hyper-V specific, though > > currently we have N=1 for hypervisors that offer the vTOM option, so > > it's a little hard to generalize. > > Actually, it is very simple to generalize: vTOM and the paravisor and > VMPL are all part of the effort to support unenlightened, unmodified > guests with SNP. > > So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it > probably would need the same contraptions. Yes, agreed. My point about generalization is that Hyper-V is the only actual implementation today. Edge cases, like whether the IO-APIC is accessed as encrypted or as decrypted don't have a pattern yet. But that's not a blocker. Such cases can be resolved or special-cased later when/if N > 1. > > > With the thinking oriented that way, a Linux guest on Hyper-V using > > TDX will run with CC_VENDOR_INTEL. A Linux guest on Hyper-V that > > is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD. > > Right. Good. We're in agreement. :-) > > > Dexuan Cui just posted a patch set for initial TDX support on Hyper-V, > > and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if > > I'm wrong about that -- I haven't reviewed your patches yet). I confirmed with Dexuan that his new patch set for TDX guests on Hyper-V has the guest running with CC_VENDOR_INTEL, which is what we want. > > Tianyu Lan > > has a patch set out for Hyper-V guests using the "C-bit". That patch set > > still uses CC_VENDOR_HYPERV. Tianyu and I need to work through > > whether his patch set can run with CC_VENDOR_AMD like everyone > > else using the "C-bit" approach. I haven't followed up with Tianyu yet. > > So I'm not sure the vendor is the right approach here. I guess we need > to specify the *type* of guest being supported. Yes, calling it the "vendor" turns out to not quite be right because in the AMD case, the technology/architecture/scheme/"type" (or whatever you want to call it) is not 1:1 with the vendor. Intel has just one (TDX) while AMD has two (C-bit and vTOM). "vendor" is just a label, but we should get the label right to avoid future confusion. The key point is that we'll have three top-level types: * TDX * AMD with C-bit (and this has some sub-types) * AMD with vTOM The CC_ATTR_* values are then derived from the "type". > > > Yes, the polarity of the AMD vTOM bit matches the polarity of the > > TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit". > > I'll add a comment to that effect. > > > > Anyway, that's where I think this should go. Does it make sense? > > Other thoughts? > > I think all that polarity doesn't matter as long as we abstract it away > with, "mark encrypted" and "mark decrypted". Agreed. Michael
From: Michael Kelley (LINUX) <mikelley@microsoft.com> Sent: Tuesday, November 22, 2022 4:59 PM > > From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 22, 2022 2:18 PM > > > > On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote: > > > I think the core problem here is the naming and meaning of > > > CC_VENDOR_HYPERV. The name was created originally when the > > > Hyper-V vTOM handling code was a lot of special cases. With the > > > changes in this patch series that make the vTOM functionality more > > > mainstream, the name would be better as CC_VENDOR_AMD_VTOM. > > > > No, if CC_VENDOR_HYPERV means different things depending on what kind of > > guests you're doing, then you should not use a CC_VENDOR at all. > > Agreed. My proposal is to drop CC_VENDOR_HYPERV entirely. > Replace it with CC_VENDOR_AMD_VTOM (or something like that) that > is set *only* by Linux guests that are running on AMD SEV-SNP processors > and using the vTOM scheme instead of the AMD C-bit scheme. > > > > > > vTOM is part of the AMD SEV-SNP spec, and it's a different way of > > > doing the encryption from the "C-bit" based approach. As much as > > > possible, I'm trying to not make it be Hyper-V specific, though > > > currently we have N=1 for hypervisors that offer the vTOM option, so > > > it's a little hard to generalize. > > > > Actually, it is very simple to generalize: vTOM and the paravisor and > > VMPL are all part of the effort to support unenlightened, unmodified > > guests with SNP. > > > > So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it > > probably would need the same contraptions. > > Yes, agreed. My point about generalization is that Hyper-V is the only > actual implementation today. Edge cases, like whether the IO-APIC is > accessed as encrypted or as decrypted don't have a pattern yet. But > that's not a blocker. Such cases can be resolved or special-cased later > when/if N > 1. > > > > > > With the thinking oriented that way, a Linux guest on Hyper-V using > > > TDX will run with CC_VENDOR_INTEL. A Linux guest on Hyper-V that > > > is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD. > > > > Right. > > Good. We're in agreement. :-) > > > > > > Dexuan Cui just posted a patch set for initial TDX support on Hyper-V, > > > and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if > > > I'm wrong about that -- I haven't reviewed your patches yet). > > I confirmed with Dexuan that his new patch set for TDX guests on Hyper-V > has the guest running with CC_VENDOR_INTEL, which is what we want. > > > > Tianyu Lan > > > has a patch set out for Hyper-V guests using the "C-bit". That patch set > > > still uses CC_VENDOR_HYPERV. Tianyu and I need to work through > > > whether his patch set can run with CC_VENDOR_AMD like everyone > > > else using the "C-bit" approach. > > I haven't followed up with Tianyu yet. > > > > > So I'm not sure the vendor is the right approach here. I guess we need > > to specify the *type* of guest being supported. > > Yes, calling it the "vendor" turns out to not quite be right because in > the AMD case, the technology/architecture/scheme/"type" (or > whatever you want to call it) is not 1:1 with the vendor. Intel has just > one (TDX) while AMD has two (C-bit and vTOM). "vendor" is just a label, > but we should get the label right to avoid future confusion. The key point > is that we'll have three top-level types: > > * TDX > * AMD with C-bit (and this has some sub-types) > * AMD with vTOM > > The CC_ATTR_* values are then derived from the "type". > > > > > > Yes, the polarity of the AMD vTOM bit matches the polarity of the > > > TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit". > > > I'll add a comment to that effect. > > > > > > Anyway, that's where I think this should go. Does it make sense? > > > Other thoughts? > > > > I think all that polarity doesn't matter as long as we abstract it away > > with, "mark encrypted" and "mark decrypted". > > Agreed. > Boris -- Any further comment on this patch? I think we're agreement. For this patch series I propose to change the symbol "CC_VENDOR_HYPERV" to "CC_VENDOR_AMD_VTOM" and the function name hyperv_cc_platform_has() to amd_vtom_cc_platform_has(). There's no functionality change vs. the current version of this patch. The naming changes just make clear that the scope is only for the AMD vTOM encryption scheme. As discussed, the concept "vendor" isn't quite right either, but let's not add that change to this patch series. I can follow with a separate patch to change symbols with "vendor" or "VENDOR" to something else, which will probably require some discussion as to what that something else should be. Michael
On Mon, Nov 28, 2022 at 02:38:11PM +0000, Michael Kelley (LINUX) wrote: > Any further comment on this patch? I think we're agreement. For > this patch series I propose to change the symbol "CC_VENDOR_HYPERV" > to "CC_VENDOR_AMD_VTOM" and the function name > hyperv_cc_platform_has() to amd_vtom_cc_platform_has(). That doesn't sound optimal to me. So, let's clarify things first: those Isolation VMs - are they going to be the paravisors? I don't see any other option because the unmodified guest must be some old windoze.... So, if they're going to be that, then I guess this should be called CC_ATTR_PARAVISOR to denote that it is a thin layer of virt gunk between an unmodified guest and a hypervisor. And if TDX wants to do that too later, then they can use that flag too. Yes, no?
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 8:34 AM > > On Mon, Nov 28, 2022 at 02:38:11PM +0000, Michael Kelley (LINUX) wrote: > > Any further comment on this patch? I think we're agreement. For > > this patch series I propose to change the symbol "CC_VENDOR_HYPERV" > > to "CC_VENDOR_AMD_VTOM" and the function name > > hyperv_cc_platform_has() to amd_vtom_cc_platform_has(). > > That doesn't sound optimal to me. > > So, let's clarify things first: those Isolation VMs - are they going to > be the paravisors? No. The paravisor exists in another layer (VMPL 0 in the AMD vTOM architecture) that isn't visible to the Linux guest. The paravisor is a separate code base that isn't currently open source. All code in this patch series is code that runs in the Linux guest. From an encryption standpoint, the Linux guest sees the following: 1) Guest memory is encrypted by default 2) The Linux guest must set the vTOM flag in a PTE to access a page as unencrypted. 3) The Linux guest must explicitly notify the hypervisor to change a page from private (encrypted) to shared (decrypted), and vice versa. > > I don't see any other option because the unmodified guest must be some > old windoze.... What Windows guests do isn't really relevant. Again, the code in this patch series all runs directly in the Linux guest, not the paravisor. And the Linux guest isn't unmodified. We've added changes to understand vTOM and the need to communicate with the hypervisor about page changes between private and shared. But there are other changes for a fully enlightened guest that we don't have to make when using AMD vTOM, because the paravisor transparently (to the guest -- Linux or Windows) handles those issues. > > So, if they're going to be that, then I guess this should be called > > CC_ATTR_PARAVISOR > > to denote that it is a thin layer of virt gunk between an unmodified > guest and a hypervisor. No, the code is this patch series is not that at all. It's not code that is part of the paravisor. It's Linux guest code that understands it is running in an environment where AMD vTOM is the encryption scheme, which is different from the AMD C-bit encryption scheme. > > And if TDX wants to do that too later, then they can use that flag too. > Again, no. What I have proposed as CC_VENDOR_AMD_VTOM is specific to AMD's virtual-Top-of-Memory architecture. The TDX architecture doesn't really have a way to use a paravisor. To summarize, the code in this patch series is about a 3rd encryption scheme that is used by the guest. It is completely parallel to the AMD C-bit encryption scheme and the Intel TDX encryption scheme. With the AMD vTOM scheme, there is a paravisor that transparently emulates some things for the guest so there are fewer code changes needed in the guest, but this patch series is not about that paravisor code. Michael
On Mon, Nov 28, 2022 at 04:59:27PM +0000, Michael Kelley (LINUX) wrote: > 2) The Linux guest must set the vTOM flag in a PTE to access a page > as unencrypted. What exactly do you call the "vTOM flag in the PTE"? I see this here: "When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active VM, the VIRTUAL_TOM field is used to determine the C-bit for data accesses instead of the guest page table contents. All data accesses below VIRTUAL_TOM are accessed with an effective C-bit of 1 and all addresses at or above VIRTUAL_TOM are accessed with an effective C-bit of 0." Now you say "vTOM is the dividing line where the uppermost bit of the physical address space is set; e.g., with 47 bits of guest physical address space, vTOM is 0x40000000000 (bit 46 is set)." So on your guests, is VIRTUAL_TOM == 0x400000000000? Btw, that 0x4000.. does not have bit 46 set but bit 42. Bit 46 set is 0x400000000000 which means one more '0' at the end. So before we discuss this further, let's agree on the basics first. > What Windows guests do isn't really relevant. Again, the code in this patch > series all runs directly in the Linux guest, not the paravisor. And the Linux > guest isn't unmodified. We've added changes to understand vTOM and > the need to communicate with the hypervisor about page changes > between private and shared. But there are other changes for a fully > enlightened guest that we don't have to make when using AMD vTOM, > because the paravisor transparently (to the guest -- Linux or Windows) > handles those issues. So this is some other type of guest you wanna run. Where is the documentation of that thing? I'd like to know what exactly it is going to use in the kernel. > Again, no. What I have proposed as CC_VENDOR_AMD_VTOM is There's no vendor AMD_VTOM! We did the vendor thing to denote Intel or AMD wrt to confidential computing. Now you're coming up with something special. It can't be HYPERV because Hyper-V does other types of confidential solutions too, apparently. Now before this goes any further I'd like for "something special" to be defined properly and not just be a one-off Hyper-V thing. > specific to AMD's virtual-Top-of-Memory architecture. The TDX > architecture doesn't really have a way to use a paravisor. > > To summarize, the code in this patch series is about a 3rd encryption > scheme that is used by the guest. Yes, can that third thing be used by other hypervisors or is this Hyper-V special? > It is completely parallel to the AMD C-bit encryption scheme and > the Intel TDX encryption scheme. With the AMD vTOM scheme, there is > a paravisor that transparently emulates some things for the guest > so there are fewer code changes needed in the guest, but this patch > series is not about that paravisor code. Would other hypervisors want to support such a scheme? Is this architecture documented somewhere? If so, where? What would it mean for the kernel to support it. And so on and so on. Thanks.
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 9:25 AM > > On Mon, Nov 28, 2022 at 04:59:27PM +0000, Michael Kelley (LINUX) wrote: > > 2) The Linux guest must set the vTOM flag in a PTE to access a page > > as unencrypted. > > What exactly do you call the "vTOM flag in the PTE"? > > I see this here: > > "When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active > VM, the VIRTUAL_TOM field is used to determine the C-bit for data > accesses instead of the guest page table contents. All data accesses > below VIRTUAL_TOM are accessed with an effective C-bit of 1 and all > addresses at or above VIRTUAL_TOM are accessed with an effective C-bit > of 0." > > Now you say > > "vTOM is the dividing line where the uppermost bit of the physical > address space is set; e.g., with 47 bits of guest physical address > space, vTOM is 0x40000000000 (bit 46 is set)." > > So on your guests, is VIRTUAL_TOM == 0x400000000000? > > Btw, that 0x4000.. does not have bit 46 set but bit 42. Bit 46 set is > > 0x400000000000 > > which means one more '0' at the end. Yeah, looks like I dropped a '0' in my comment text. Will fix. > > So before we discuss this further, let's agree on the basics first. > > > What Windows guests do isn't really relevant. Again, the code in this patch > > series all runs directly in the Linux guest, not the paravisor. And the Linux > > guest isn't unmodified. We've added changes to understand vTOM and > > the need to communicate with the hypervisor about page changes > > between private and shared. But there are other changes for a fully > > enlightened guest that we don't have to make when using AMD vTOM, > > because the paravisor transparently (to the guest -- Linux or Windows) > > handles those issues. > > So this is some other type of guest you wanna run. > > Where is the documentation of that thing? > > I'd like to know what exactly it is going to use in the kernel. Standard Linux is the guest. It's fully functional for running general Purpose workloads that want "confidential computing" where guest memory is encrypted and the data in the guest is not visible to the host hypervisor. It's a standard Linux kernel. I'm not sure what you mean by "some other type of guest". > > > Again, no. What I have proposed as CC_VENDOR_AMD_VTOM is > > There's no vendor AMD_VTOM! > > We did the vendor thing to denote Intel or AMD wrt to confidential > computing. But vendor AMD effectively offers two different encryption schemes that could be seen by the guest VM. The hypervisor chooses which scheme a particular guest will see. Hyper-V has chosen to present the vTOM scheme to guest VMs, including normal Linux and Windows guests, that have been modestly updated to understand vTOM. > > Now you're coming up with something special. It can't be HYPERV because > Hyper-V does other types of confidential solutions too, apparently. In the future, Hyper-V may also choose to present original AMD C-bit scheme in some guest VMs, depending on the use case. And it will present the Intel TDX scheme when running on that hardware. > > Now before this goes any further I'd like for "something special" to be > defined properly and not just be a one-off Hyper-V thing. > > > specific to AMD's virtual-Top-of-Memory architecture. The TDX > > architecture doesn't really have a way to use a paravisor. > > > > To summarize, the code in this patch series is about a 3rd encryption > > scheme that is used by the guest. > > Yes, can that third thing be used by other hypervisors or is this > Hyper-V special? This third thing is part of the AMD SEV-SNP architecture and could be used by any hypervisor. > > > It is completely parallel to the AMD C-bit encryption scheme and > > the Intel TDX encryption scheme. With the AMD vTOM scheme, there is > > a paravisor that transparently emulates some things for the guest > > so there are fewer code changes needed in the guest, but this patch > > series is not about that paravisor code. > > Would other hypervisors want to support such a scheme? > > Is this architecture documented somewhere? If so, where? The AMD vTOM scheme is documented in the AMD SEV-SNP architecture specs. > > What would it mean for the kernel to support it. The Linux kernel running as a guest already supports it, at least when running on Hyper-V. The code went into the 5.15 kernel [1][2] about a year ago. But that code is more Hyper-V specific than is desirable. This patch set makes the AMD vTOM support more parallel to the Intel TDX and AMD C-bit support. To my knowledge, KVM does not support the AMD vTOM scheme. Someone from AMD may have a better sense whether adding that support is likely in the future. [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ Michael
On Mon, Nov 28, 2022 at 05:55:11PM +0000, Michael Kelley (LINUX) wrote: > But vendor AMD effectively offers two different encryption schemes that > could be seen by the guest VM. The hypervisor chooses which scheme a > particular guest will see. Hyper-V has chosen to present the vTOM scheme > to guest VMs, including normal Linux and Windows guests, that have been > modestly updated to understand vTOM. If this is a standard SNP guest then you can detect vTOM support using SEV_FEATURES. See this thread here: https://lore.kernel.org/r/20221117044433.244656-1-nikunj@amd.com Which then means, you don't need any special gunk except extending this patch above to check SNP has vTOM support. > In the future, Hyper-V may also choose to present original AMD C-bit scheme > in some guest VMs, depending on the use case. And it will present the Intel > TDX scheme when running on that hardware. And all those should JustWork(tm) because we already support such guests. > To my knowledge, KVM does not support the AMD vTOM scheme. > Someone from AMD may have a better sense whether adding that > support is likely in the future. Yah, see above. Thx.
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 11:57 AM > > On Mon, Nov 28, 2022 at 05:55:11PM +0000, Michael Kelley (LINUX) wrote: > > But vendor AMD effectively offers two different encryption schemes that > > could be seen by the guest VM. The hypervisor chooses which scheme a > > particular guest will see. Hyper-V has chosen to present the vTOM scheme > > to guest VMs, including normal Linux and Windows guests, that have been > > modestly updated to understand vTOM. > > If this is a standard SNP guest then you can detect vTOM support using > SEV_FEATURES. See this thread here: > > https://lore.kernel.org/all/20221117044433.244656-1-nikunj@amd.com/ > > Which then means, you don't need any special gunk except extending this > patch above to check SNP has vTOM support. Yes, we could do that. But this patch set is still needed to get the preliminaries in place. The initial code for supporting AMD vTOM that went upstream a year ago is too Hyper-V specific. This patch set removes the Hyper-V specific stuff, and integrates vTOM support into the overall confidential computing framework in arch/x86/coco/core.c. The Hyper-V code was already somewhat there via CC_VENDOR_HYPERV, but that really should be named something else now. So that's why I'm suggesting CC_VENDOR_AMD_VTOM. > > > In the future, Hyper-V may also choose to present original AMD C-bit scheme > > in some guest VMs, depending on the use case. And it will present the Intel > > TDX scheme when running on that hardware. > > And all those should JustWork(tm) because we already support such guests. Agreed. That's where we want to be. Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM and Hyper-V, you find some places where incorrect assumptions were made or some generalizations are needed. Dexuan Cui's patch set for TDX support is fixing those places that he has encountered. But with those fixes, the TDX support will JustWork(tm) for Linux guests on Hyper-V. I haven't gone deeply into the situation with AMD C-bit support on Hyper-V. Tianyu Lan's set of patches for that support is a bit bigger, and I'm planning to look closely to understand whether it's also just fixing incorrect assumptions and such, or whether they really are some differences with Hyper-V. If there are differences, I want to understand why. Michael
On Tue, Nov 29, 2022 at 01:15:39AM +0000, Michael Kelley (LINUX) wrote: > So that's why I'm suggesting CC_VENDOR_AMD_VTOM. There's no CC_VENDOR_AMD_VTOM. How many times do I need to explain this?! CC_VENDOR is well the CC vendor - not some special case. IOW, the goal here is for generic SNP functionality to be the same for *all* SNP guests. We won't do the AMD's version of vTOM enablement, Hyper-V's version of vTOM enablement and so on. It must be a single vTOM feature which works on *all* hypervisors as vTOM is a generic SNP feature - not Hyper-V feature. > Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM > and Hyper-V, you find some places where incorrect assumptions were > made or some generalizations are needed. Dexuan Cui's patch set for > TDX support is fixing those places that he has encountered. But with > those fixes, the TDX support will JustWork(tm) for Linux guests on > Hyper-V. That sounds like the right direction to take. > I haven't gone deeply into the situation with AMD C-bit support on > Hyper-V. Tianyu Lan's set of patches for that support is a bit bigger, > and I'm planning to look closely to understand whether it's also just > fixing incorrect assumptions and such, or whether they really are > some differences with Hyper-V. If there are differences, I want to > understand why. You and me too. So I guess we should look at Tianyu's set first. Thx.
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 29, 2022 12:41 AM > > On Tue, Nov 29, 2022 at 01:15:39AM +0000, Michael Kelley (LINUX) wrote: > > So that's why I'm suggesting CC_VENDOR_AMD_VTOM. > > There's no CC_VENDOR_AMD_VTOM. How many times do I need to explain this?! > CC_VENDOR is well the CC vendor - not some special case. Let's back up half a step. We have CC_VENDOR_INTEL and CC_VENDOR_AMD because that's a convenient way to identify two fairly different schemes for doing guest encryption. The schemes have some things in common, but the details are pretty different. Tagging the schemes based on the vendor makes sense because the schemes were independently developed by each processor vendor. But it turns out that AMD really has two fairly different schemes: the C-bit scheme and the vTOM scheme. The details of these two AMD schemes are pretty different. vTOM is *not* just a minor option on the C-bit scheme. It's an either/or -- a guest VM is either doing the C-bit scheme or the vTOM scheme, not some combination. Linux code in coco/core.c could choose to treat C-bit and vTOM as two sub-schemes under CC_VENDOR_AMD, but that makes the code a bit messy because we end up with "if" statements to figure out whether to do things the C-bit way or the vTOM way. The difference in the C-bit way and the vTOM way is not Hyper-V specific. Any hypervisor running on an AMD processor can make the same choice to offer C-bit VMs or vTOM VMs. Or we could model the two AMD schemes as two different vendors, which is what I'm suggesting. Doing so recognizes that the two schemes are fairly disjoint, and it makes the code cleaner. Tom Lendacky -- any thoughts on this question from AMD's standpoint? > > IOW, the goal here is for generic SNP functionality to be the same for > *all* SNP guests. We won't do the AMD's version of vTOM enablement, > Hyper-V's version of vTOM enablement and so on. It must be a single > vTOM feature which works on *all* hypervisors as vTOM is a generic SNP > feature - not Hyper-V feature. Yes, there's only one version of vTOM enablement -- the AMD version. But the broader AMD SNP functionality is bifurcated: there's the C-bit scheme and the vTOM scheme. The question is how Linux code should model those two different schemes. I'm suggesting that things are cleaner conceptually and in the code if we model the two different AMD schemes as two different vendors. Michael > > > Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM > > and Hyper-V, you find some places where incorrect assumptions were > > made or some generalizations are needed. Dexuan Cui's patch set for > > TDX support is fixing those places that he has encountered. But with > > those fixes, the TDX support will JustWork(tm) for Linux guests on > > Hyper-V. > > That sounds like the right direction to take. > > > I haven't gone deeply into the situation with AMD C-bit support on > > Hyper-V. Tianyu Lan's set of patches for that support is a bit bigger, > > and I'm planning to look closely to understand whether it's also just > > fixing incorrect assumptions and such, or whether they really are > > some differences with Hyper-V. If there are differences, I want to > > understand why. > > You and me too. So I guess we should look at Tianyu's set first. >
On Tue, Nov 29, 2022 at 03:49:06PM +0000, Michael Kelley (LINUX) wrote: > But it turns out that AMD really has two fairly different schemes: > the C-bit scheme and the vTOM scheme. Except it doesn't: "In the VMSA of an SNP-active guest, the VIRTUAL_TOM field designates a 2MB aligned guest physical address called the virtual top of memory. When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active VM, the VIRTUAL_TOM..." So SEV_FEATURES[1] is vTOM and it is part of SNP. Why do you keep harping on this being something else is beyond me... I already pointed you to the patch which adds this along with the other SEV_FEATURES. > The details of these two AMD schemes are pretty different. vTOM is > *not* just a minor option on the C-bit scheme. It's an either/or -- a > guest VM is either doing the C-bit scheme or the vTOM scheme, not some > combination. Linux code in coco/core.c could choose to treat C-bit and > vTOM as two sub-schemes under CC_VENDOR_AMD, but that makes the code a > bit messy because we end up with "if" statements to figure out whether > to do things the C-bit way or the vTOM way. Are you saying that that: if (cc_vendor == CC_VENDOR_AMD && sev_features & MSR_AMD64_SNP_VTOM_ENABLED) is messy? Why? We will have to support vTOM sooner or later. > Or we could model the two AMD schemes as two different vendors, > which is what I'm suggesting. Doing so recognizes that the two schemes > are fairly disjoint, and it makes the code cleaner. How is that any different from the above check? You *need* some sort of a check to differentiate between the two anyway.
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 29, 2022 9:47 AM > > On Tue, Nov 29, 2022 at 03:49:06PM +0000, Michael Kelley (LINUX) wrote: > > But it turns out that AMD really has two fairly different schemes: > > the C-bit scheme and the vTOM scheme. > > Except it doesn't: > > "In the VMSA of an SNP-active guest, the VIRTUAL_TOM field designates > a 2MB aligned guest physical address called the virtual top of memory. > When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active > VM, the VIRTUAL_TOM..." > > So SEV_FEATURES[1] is vTOM and it is part of SNP. > > Why do you keep harping on this being something else is beyond me... > > I already pointed you to the patch which adds this along with the other > SEV_FEATURES. > > > The details of these two AMD schemes are pretty different. vTOM is > > *not* just a minor option on the C-bit scheme. It's an either/or -- a > > guest VM is either doing the C-bit scheme or the vTOM scheme, not some > > combination. Linux code in coco/core.c could choose to treat C-bit and > > vTOM as two sub-schemes under CC_VENDOR_AMD, but that makes the code a > > bit messy because we end up with "if" statements to figure out whether > > to do things the C-bit way or the vTOM way. > > Are you saying that that: > > if (cc_vendor == CC_VENDOR_AMD && > sev_features & MSR_AMD64_SNP_VTOM_ENABLED) > > is messy? Why? > > We will have to support vTOM sooner or later. > > > Or we could model the two AMD schemes as two different vendors, > > which is what I'm suggesting. Doing so recognizes that the two schemes > > are fairly disjoint, and it makes the code cleaner. > > How is that any different from the above check? > > You *need* some sort of a check to differentiate between the two anyway. > Alright. Enough conceptual debate. I'll do a v4 of the patch series with the AMD C-bit and vTOM schemes folder under CC_VENDOR_AMD and we can see if there's any further feedback. I should have that v4 out later today or tomorrow. Michael
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c index 49b44f8..f5e1f2d 100644 --- a/arch/x86/coco/core.c +++ b/arch/x86/coco/core.c @@ -78,7 +78,14 @@ static bool amd_cc_platform_has(enum cc_attr attr) static bool hyperv_cc_platform_has(enum cc_attr attr) { - return attr == CC_ATTR_GUEST_MEM_ENCRYPT; + switch (attr) { + case CC_ATTR_GUEST_MEM_ENCRYPT: + case CC_ATTR_MEM_ENCRYPT: + case CC_ATTR_EMULATED_IOAPIC: + return true; + default: + return false; + } } bool cc_platform_has(enum cc_attr attr) @@ -108,6 +115,7 @@ u64 cc_mkenc(u64 val) switch (vendor) { case CC_VENDOR_AMD: return val | cc_mask; + case CC_VENDOR_HYPERV: case CC_VENDOR_INTEL: return val & ~cc_mask; default: @@ -121,6 +129,7 @@ u64 cc_mkdec(u64 val) switch (vendor) { case CC_VENDOR_AMD: return val & ~cc_mask; + case CC_VENDOR_HYPERV: case CC_VENDOR_INTEL: return val | cc_mask; default: diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index f49bc3e..89a97d7 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -29,7 +29,6 @@ #include <linux/syscore_ops.h> #include <clocksource/hyperv_timer.h> #include <linux/highmem.h> -#include <linux/swiotlb.h> int hyperv_init_cpuhp; u64 hv_current_partition_id = ~0ull; @@ -504,16 +503,6 @@ void __init hyperv_init(void) /* Query the VMs extended capability once, so that it can be cached. */ hv_query_ext_cap(0); -#ifdef CONFIG_SWIOTLB - /* - * Swiotlb bounce buffer needs to be mapped in extra address - * space. Map function doesn't work in the early place and so - * call swiotlb_update_mem_attributes() here. - */ - if (hv_is_isolation_supported()) - swiotlb_update_mem_attributes(); -#endif - return; clean_guest_os_id: diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c index e8be4c2..29ccbe8 100644 --- a/arch/x86/hyperv/ivm.c +++ b/arch/x86/hyperv/ivm.c @@ -13,6 +13,7 @@ #include <asm/svm.h> #include <asm/sev.h> #include <asm/io.h> +#include <asm/coco.h> #include <asm/mshyperv.h> #include <asm/hypervisor.h> @@ -233,7 +234,6 @@ void hv_ghcb_msr_read(u64 msr, u64 *value) local_irq_restore(flags); } EXPORT_SYMBOL_GPL(hv_ghcb_msr_read); -#endif /* * hv_mark_gpa_visibility - Set pages visible to host via hvcall. @@ -286,27 +286,25 @@ static int hv_mark_gpa_visibility(u16 count, const u64 pfn[], } /* - * hv_set_mem_host_visibility - Set specified memory visible to host. + * hv_vtom_set_host_visibility - Set specified memory visible to host. * * In Isolation VM, all guest memory is encrypted from host and guest * needs to set memory visible to host via hvcall before sharing memory * with host. This function works as wrap of hv_mark_gpa_visibility() * with memory base and size. */ -int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visible) +static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc) { - enum hv_mem_host_visibility visibility = visible ? - VMBUS_PAGE_VISIBLE_READ_WRITE : VMBUS_PAGE_NOT_VISIBLE; + enum hv_mem_host_visibility visibility = enc ? + VMBUS_PAGE_NOT_VISIBLE : VMBUS_PAGE_VISIBLE_READ_WRITE; u64 *pfn_array; int ret = 0; + bool result = true; int i, pfn; - if (!hv_is_isolation_supported() || !hv_hypercall_pg) - return 0; - pfn_array = kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL); if (!pfn_array) - return -ENOMEM; + return false; for (i = 0, pfn = 0; i < pagecount; i++) { pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE); @@ -315,17 +313,42 @@ int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visibl if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) { ret = hv_mark_gpa_visibility(pfn, pfn_array, visibility); - if (ret) + if (ret) { + result = false; goto err_free_pfn_array; + } pfn = 0; } } err_free_pfn_array: kfree(pfn_array); - return ret; + return result; +} + +static bool hv_vtom_tlb_flush_required(bool private) +{ + return true; } +static bool hv_vtom_cache_flush_required(void) +{ + return false; +} + +void __init hv_vtom_init(void) +{ + cc_set_vendor(CC_VENDOR_HYPERV); + cc_set_mask(ms_hyperv.shared_gpa_boundary); + physical_mask &= ms_hyperv.shared_gpa_boundary - 1; + + x86_platform.guest.enc_cache_flush_required = hv_vtom_cache_flush_required; + x86_platform.guest.enc_tlb_flush_required = hv_vtom_tlb_flush_required; + x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility; +} + +#endif /* CONFIG_AMD_MEM_ENCRYPT */ + /* * hv_map_memory - map memory to extra space in the AMD SEV-SNP Isolation VM. */ diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h index 61f0c20..59b3310 100644 --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -174,18 +174,19 @@ static inline void hv_apic_init(void) {} int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector, struct hv_interrupt_entry *entry); int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry); -int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible); #ifdef CONFIG_AMD_MEM_ENCRYPT void hv_ghcb_msr_write(u64 msr, u64 value); void hv_ghcb_msr_read(u64 msr, u64 *value); bool hv_ghcb_negotiate_protocol(void); void hv_ghcb_terminate(unsigned int set, unsigned int reason); +void hv_vtom_init(void); #else static inline void hv_ghcb_msr_write(u64 msr, u64 value) {} static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {} static inline bool hv_ghcb_negotiate_protocol(void) { return false; } static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {} +static inline void hv_vtom_init(void) {} #endif extern bool hv_isolation_type_snp(void); @@ -241,11 +242,6 @@ static inline int hyperv_flush_guest_mapping_range(u64 as, } static inline void hv_set_register(unsigned int reg, u64 value) { } static inline u64 hv_get_register(unsigned int reg) { return 0; } -static inline int hv_set_mem_host_visibility(unsigned long addr, int numpages, - bool visible) -{ - return -1; -} #endif /* CONFIG_HYPERV */ diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index 8316139..b080795 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -33,7 +33,6 @@ #include <asm/nmi.h> #include <clocksource/hyperv_timer.h> #include <asm/numa.h> -#include <asm/coco.h> /* Is Linux running as the root partition? */ bool hv_root_partition; @@ -325,8 +324,10 @@ static void __init ms_hyperv_init_platform(void) if (ms_hyperv.priv_high & HV_ISOLATION) { ms_hyperv.isolation_config_a = cpuid_eax(HYPERV_CPUID_ISOLATION_CONFIG); ms_hyperv.isolation_config_b = cpuid_ebx(HYPERV_CPUID_ISOLATION_CONFIG); - ms_hyperv.shared_gpa_boundary = - BIT_ULL(ms_hyperv.shared_gpa_boundary_bits); + + if (ms_hyperv.shared_gpa_boundary_active) + ms_hyperv.shared_gpa_boundary = + BIT_ULL(ms_hyperv.shared_gpa_boundary_bits); pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n", ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b); @@ -337,11 +338,6 @@ static void __init ms_hyperv_init_platform(void) swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary; #endif } - /* Isolation VMs are unenlightened SEV-based VMs, thus this check: */ - if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) { - if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE) - cc_set_vendor(CC_VENDOR_HYPERV); - } } if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) { @@ -410,6 +406,9 @@ static void __init ms_hyperv_init_platform(void) i8253_clear_counter_on_shutdown = false; #if IS_ENABLED(CONFIG_HYPERV) + if ((hv_get_isolation_type() == HV_ISOLATION_TYPE_VBS) || + (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)) + hv_vtom_init(); /* * Setup the hook to get control post apic initialization. */ diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 06eb8910..0757cfe 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -2126,9 +2126,6 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc) static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) { - if (hv_is_isolation_supported()) - return hv_set_mem_host_visibility(addr, numpages, !enc); - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) return __set_memory_enc_pgtable(addr, numpages, enc);
Hyper-V guests on AMD SEV-SNP hardware have the option of using the "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP architecture. With vTOM, shared vs. private memory accesses are controlled by splitting the guest physical address space into two halves. vTOM is the dividing line where the uppermost bit of the physical address space is set; e.g., with 47 bits of guest physical address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal memory is accessible at two parallel physical addresses -- one below vTOM and one above vTOM. Accesses below vTOM are private (encrypted) while accesses above vTOM are shared (decrypted). In this sense, vTOM is like the GPA.SHARED bit in Intel TDX. Support for Hyper-V guests using vTOM was added to the Linux kernel in two patch sets[1][2]. This support treats the vTOM bit as part of the physical address. For accessing shared (decrypted) memory, these patch sets create a second kernel virtual mapping that maps to physical addresses above vTOM. A better approach is to treat the vTOM bit as a protection flag, not as part of the physical address. This new approach is like the approach for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel virtual mapping, the existing mapping is updated using recently added coco mechanisms. When memory is changed between private and shared using set_memory_decrypted() and set_memory_encrypted(), the PTEs for the existing kernel mapping are changed to add or remove the vTOM bit in the guest physical address, just as with TDX. The hypercalls to change the memory status on the host side are made using the existing callback mechanism. Everything just works, with a minor tweak to map the I/O APIC to use private accesses. To accomplish the switch in approach, the following must be done in in this single patch: * Update Hyper-V initialization to set the cc_mask based on vTOM and do other coco initialization. * Update physical_mask so the vTOM bit is no longer treated as part of the physical address * Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests. This makes the vTOM bit part of the protection flags. * Code already exists to make hypercalls to inform Hyper-V about pages changing between shared and private. Update this code to run as a callback from __set_memory_enc_pgtable(). * Remove the Hyper-V special case from __set_memory_enc_dec() * Remove the Hyper-V specific call to swiotlb_update_mem_attributes() since mem_encrypt_init() will now do it. [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ Signed-off-by: Michael Kelley <mikelley@microsoft.com> --- arch/x86/coco/core.c | 11 +++++++++- arch/x86/hyperv/hv_init.c | 11 ---------- arch/x86/hyperv/ivm.c | 45 +++++++++++++++++++++++++++++++---------- arch/x86/include/asm/mshyperv.h | 8 ++------ arch/x86/kernel/cpu/mshyperv.c | 15 +++++++------- arch/x86/mm/pat/set_memory.c | 3 --- 6 files changed, 53 insertions(+), 40 deletions(-)