Message ID | 20240530111643.1091816-30-pankaj.gupta@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add AMD Secure Nested Paging (SEV-SNP) support | expand |
On Thu, May 30, 2024 at 1:17 PM Pankaj Gupta <pankaj.gupta@amd.com> wrote: > diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c > index def77a4429fb24e62748 > +static void pc_system_flash_map(PCMachineState *pcms, > + MemoryRegion *rom_memory) > +{ > + pc_system_flash_map_partial(pcms, rom_memory, 0, false); > +} > + > void pc_system_firmware_init(PCMachineState *pcms, > MemoryRegion *rom_memory) > { > @@ -238,9 +248,12 @@ void pc_system_firmware_init(PCMachineState *pcms, > } > } > > - if (!pflash_blk[0]) { > + if (!pflash_blk[0] || sev_snp_enabled()) { > /* Machine property pflash0 not set, use ROM mode */ > x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, false); > + if (sev_snp_enabled()) { > + pc_system_flash_map_partial(pcms, rom_memory, 3653632, true); > + } This number is a bit too specific. :) The main issue here is that we want to have both a ROM and a non-executable pflash device. I think in this case (which should be gated by machine_require_guest_memfd(MACHINE(pcms)), just like in earlier patches), we need to: 1) give an error if pflash_blk[1] is specified, i.e. support only one flash device 2) possibly, give a warning if -bios is _not_ specified. 3) map pflash_blk[0] below the BIOS ROM and expect it to be just the variables The need to use -bios for code and pflash0 (if desired) for variables also needs to be documented of course. Some parts of this patch can be salvaged, others are not needed anymore... I'll let you figure it out. :) Paolo
On Thu, May 30, 2024 at 06:16:41AM -0500, Pankaj Gupta wrote: > From: Michael Roth <michael.roth@amd.com> > > SEV-ES and SEV-SNP support OVMF images with non-volatile storage in > cases where the storage area is generated as a separate image as part > of the OVMF build process. IIUC, right now all OVMF builds for SEV/-ES/-SNP should be done as so called "stateless" image. ie *without* any separate NVRAM image, because that image will not be covered by the VM boot measurement and thus the NVRAM state is liable to undermine trust of the VM. Using NVRAM for SNP is theoretically possible in future but would be reliant on SVSM providing a secure encryption mechanism on the storage. > > Currently these are exposed with unit=0 corresponding to the actual BIOS > image, and unit=1 corresponding to the storage image. However, pflash > images are mapped guest memory using read-only memslots, which are not > allowed in conjunction with guest_memfd-backed ranges. This makes that > approach unusable for SEV-SNP, where the BIOS range will be encrypted > and mapped as private guest_memfd-backed memory. For this reason, > SEV-SNP will instead rely on -bios to handle loading the BIOS image. > > To allow for pflash to still be used for the storage image, rework the > existing logic to remove assumptions that unit=0 contains the BIOS image > when SEV-SNP, so that it can instead be used to handle only the storage > image. Mixing both BIOS and pflash is pretty undesirable, not least because that setup cannot be currently represented by the firmware descriptor format described by docs/interop/firmware.json. So at the very least this patch is incomplete, as it would need to propose changes to the firmware.json to allow this setup to be expressed. I really wish we didn't have to introduce this though - is there really no way to make it possible to use pflash for both CODE & VARS with SNP, as is done with traditional VMs, so we don't diverge in setup, needing yet more changes up the mgmt stack ? > > Signed-off-by: Michael Roth <michael.roth@amd.com> > Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com> > --- > hw/i386/pc_sysfw.c | 47 +++++++++++++++++++++++++++++----------------- > 1 file changed, 30 insertions(+), 17 deletions(-) > > diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c > index def77a442d..7f97e62b16 100644 > --- a/hw/i386/pc_sysfw.c > +++ b/hw/i386/pc_sysfw.c > @@ -125,21 +125,10 @@ void pc_system_flash_cleanup_unused(PCMachineState *pcms) > } > } > > -/* > - * Map the pcms->flash[] from 4GiB downward, and realize. > - * Map them in descending order, i.e. pcms->flash[0] at the top, > - * without gaps. > - * Stop at the first pcms->flash[0] lacking a block backend. > - * Set each flash's size from its block backend. Fatal error if the > - * size isn't a non-zero multiple of 4KiB, or the total size exceeds > - * pcms->max_fw_size. > - * > - * If pcms->flash[0] has a block backend, its memory is passed to > - * pc_isa_bios_init(). Merging several flash devices for isa-bios is > - * not supported. > - */ > -static void pc_system_flash_map(PCMachineState *pcms, > - MemoryRegion *rom_memory) > +static void pc_system_flash_map_partial(PCMachineState *pcms, > + MemoryRegion *rom_memory, > + hwaddr offset, > + bool storage_only) > { > X86MachineState *x86ms = X86_MACHINE(pcms); > PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); > @@ -154,6 +143,8 @@ static void pc_system_flash_map(PCMachineState *pcms, > > assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled); > > + total_size = offset; > + > for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) { > hwaddr gpa; > > @@ -192,7 +183,7 @@ static void pc_system_flash_map(PCMachineState *pcms, > sysbus_realize_and_unref(SYS_BUS_DEVICE(system_flash), &error_fatal); > sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0, gpa); > > - if (i == 0) { > + if (i == 0 && !storage_only) { > flash_mem = pflash_cfi01_get_memory(system_flash); > if (pcmc->isa_bios_alias) { > x86_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem, > @@ -211,6 +202,25 @@ static void pc_system_flash_map(PCMachineState *pcms, > } > } > > +/* > + * Map the pcms->flash[] from 4GiB downward, and realize. > + * Map them in descending order, i.e. pcms->flash[0] at the top, > + * without gaps. > + * Stop at the first pcms->flash[0] lacking a block backend. > + * Set each flash's size from its block backend. Fatal error if the > + * size isn't a non-zero multiple of 4KiB, or the total size exceeds > + * pcms->max_fw_size. > + * > + * If pcms->flash[0] has a block backend, its memory is passed to > + * pc_isa_bios_init(). Merging several flash devices for isa-bios is > + * not supported. > + */ > +static void pc_system_flash_map(PCMachineState *pcms, > + MemoryRegion *rom_memory) > +{ > + pc_system_flash_map_partial(pcms, rom_memory, 0, false); > +} > + > void pc_system_firmware_init(PCMachineState *pcms, > MemoryRegion *rom_memory) > { > @@ -238,9 +248,12 @@ void pc_system_firmware_init(PCMachineState *pcms, > } > } > > - if (!pflash_blk[0]) { > + if (!pflash_blk[0] || sev_snp_enabled()) { > /* Machine property pflash0 not set, use ROM mode */ > x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, false); > + if (sev_snp_enabled()) { > + pc_system_flash_map_partial(pcms, rom_memory, 3653632, true); > + } > } else { > if (kvm_enabled() && !kvm_readonly_mem_enabled()) { > /* > -- > 2.34.1 > With regards, Daniel
On Mon, Jun 3, 2024 at 1:55 PM Daniel P. Berrangé <berrange@redhat.com> wrote: > I really wish we didn't have to introduce this though - is there really > no way to make it possible to use pflash for both CODE & VARS with SNP, > as is done with traditional VMs, so we don't diverge in setup, needing > yet more changes up the mgmt stack ? No, you cannot use pflash for CODE in either SNP or TDX. The hardware does not support it. One possibility is to only support non-pflash-based variable store. This is not yet in QEMU, but it is how both AWS and Google implemented UEFI variables and I think Gerd was going to work on it for QEMU. Paolo
On Mon, Jun 03, 2024 at 12:55:43PM +0100, Daniel P. Berrangé wrote: > On Thu, May 30, 2024 at 06:16:41AM -0500, Pankaj Gupta wrote: > > From: Michael Roth <michael.roth@amd.com> > > > > SEV-ES and SEV-SNP support OVMF images with non-volatile storage in > > cases where the storage area is generated as a separate image as part > > of the OVMF build process. > > IIUC, right now all OVMF builds for SEV/-ES/-SNP should be done as so > called "stateless" image. ie *without* any separate NVRAM image, because > that image will not be covered by the VM boot measurement and thus the > NVRAM state is liable to undermine trust of the VM. Technically both stateless and stateful are supported for SEV-ES, so this is mainly to provide similar support for SNP. Unfortunately we can't re-use the existing QEMU topology because of the limitations are using read-only ROMs. But I'm not sure it's something we have a hard requirement for, and even then maybe there are other alternatives to consider that would offer better security. In any case, I think it makes total sense to decouple this from the series and re-submit this as a separate patchset if we determine that it's truly necessary (and if so, address Paolo's comments, and handle the 64K-alignment thing in the context of that work). So for now maybe we should plan to drop it from qemu-coco-queue and focus on the stateless builds for the initial code merge. -Mike > > Using NVRAM for SNP is theoretically possible in future but would be > reliant on SVSM providing a secure encryption mechanism on the storage. > > > > > > > Currently these are exposed with unit=0 corresponding to the actual BIOS > > image, and unit=1 corresponding to the storage image. However, pflash > > images are mapped guest memory using read-only memslots, which are not > > allowed in conjunction with guest_memfd-backed ranges. This makes that > > approach unusable for SEV-SNP, where the BIOS range will be encrypted > > and mapped as private guest_memfd-backed memory. For this reason, > > SEV-SNP will instead rely on -bios to handle loading the BIOS image. > > > > To allow for pflash to still be used for the storage image, rework the > > existing logic to remove assumptions that unit=0 contains the BIOS image > > when SEV-SNP, so that it can instead be used to handle only the storage > > image. > > Mixing both BIOS and pflash is pretty undesirable, not least because > that setup cannot be currently represented by the firmware descriptor > format described by docs/interop/firmware.json. > > So at the very least this patch is incomplete, as it would need to > propose changes to the firmware.json to allow this setup to be expressed. > > I really wish we didn't have to introduce this though - is there really > no way to make it possible to use pflash for both CODE & VARS with SNP, > as is done with traditional VMs, so we don't diverge in setup, needing > yet more changes up the mgmt stack ? > > > > > Signed-off-by: Michael Roth <michael.roth@amd.com> > > Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com> > > --- > > hw/i386/pc_sysfw.c | 47 +++++++++++++++++++++++++++++----------------- > > 1 file changed, 30 insertions(+), 17 deletions(-) > > > > diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c > > index def77a442d..7f97e62b16 100644 > > --- a/hw/i386/pc_sysfw.c > > +++ b/hw/i386/pc_sysfw.c > > @@ -125,21 +125,10 @@ void pc_system_flash_cleanup_unused(PCMachineState *pcms) > > } > > } > > > > -/* > > - * Map the pcms->flash[] from 4GiB downward, and realize. > > - * Map them in descending order, i.e. pcms->flash[0] at the top, > > - * without gaps. > > - * Stop at the first pcms->flash[0] lacking a block backend. > > - * Set each flash's size from its block backend. Fatal error if the > > - * size isn't a non-zero multiple of 4KiB, or the total size exceeds > > - * pcms->max_fw_size. > > - * > > - * If pcms->flash[0] has a block backend, its memory is passed to > > - * pc_isa_bios_init(). Merging several flash devices for isa-bios is > > - * not supported. > > - */ > > -static void pc_system_flash_map(PCMachineState *pcms, > > - MemoryRegion *rom_memory) > > +static void pc_system_flash_map_partial(PCMachineState *pcms, > > + MemoryRegion *rom_memory, > > + hwaddr offset, > > + bool storage_only) > > { > > X86MachineState *x86ms = X86_MACHINE(pcms); > > PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); > > @@ -154,6 +143,8 @@ static void pc_system_flash_map(PCMachineState *pcms, > > > > assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled); > > > > + total_size = offset; > > + > > for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) { > > hwaddr gpa; > > > > @@ -192,7 +183,7 @@ static void pc_system_flash_map(PCMachineState *pcms, > > sysbus_realize_and_unref(SYS_BUS_DEVICE(system_flash), &error_fatal); > > sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0, gpa); > > > > - if (i == 0) { > > + if (i == 0 && !storage_only) { > > flash_mem = pflash_cfi01_get_memory(system_flash); > > if (pcmc->isa_bios_alias) { > > x86_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem, > > @@ -211,6 +202,25 @@ static void pc_system_flash_map(PCMachineState *pcms, > > } > > } > > > > +/* > > + * Map the pcms->flash[] from 4GiB downward, and realize. > > + * Map them in descending order, i.e. pcms->flash[0] at the top, > > + * without gaps. > > + * Stop at the first pcms->flash[0] lacking a block backend. > > + * Set each flash's size from its block backend. Fatal error if the > > + * size isn't a non-zero multiple of 4KiB, or the total size exceeds > > + * pcms->max_fw_size. > > + * > > + * If pcms->flash[0] has a block backend, its memory is passed to > > + * pc_isa_bios_init(). Merging several flash devices for isa-bios is > > + * not supported. > > + */ > > +static void pc_system_flash_map(PCMachineState *pcms, > > + MemoryRegion *rom_memory) > > +{ > > + pc_system_flash_map_partial(pcms, rom_memory, 0, false); > > +} > > + > > void pc_system_firmware_init(PCMachineState *pcms, > > MemoryRegion *rom_memory) > > { > > @@ -238,9 +248,12 @@ void pc_system_firmware_init(PCMachineState *pcms, > > } > > } > > > > - if (!pflash_blk[0]) { > > + if (!pflash_blk[0] || sev_snp_enabled()) { > > /* Machine property pflash0 not set, use ROM mode */ > > x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, false); > > + if (sev_snp_enabled()) { > > + pc_system_flash_map_partial(pcms, rom_memory, 3653632, true); > > + } > > } else { > > if (kvm_enabled() && !kvm_readonly_mem_enabled()) { > > /* > > -- > > 2.34.1 > > > > With regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| >
On Mon, Jun 3, 2024 at 4:28 PM Michael Roth <michael.roth@amd.com> wrote: > So for now maybe we should plan to drop it from qemu-coco-queue and > focus on the stateless builds for the initial code merge. Yes, I included it in qemu-coco-queue to ensure that other things didn't break split firmware (or they were properly identified), but basically everything else in qemu-coco-queue is ready for merge. Please double check "i386/sev: Allow measured direct kernel boot on SNP" as well, as I did some reorganization of the code into a new class method for sev-guest and sev-snp-guest objects. Paolo
On Mon, Jun 03, 2024 at 04:31:45PM +0200, Paolo Bonzini wrote: > On Mon, Jun 3, 2024 at 4:28 PM Michael Roth <michael.roth@amd.com> wrote: > > So for now maybe we should plan to drop it from qemu-coco-queue and > > focus on the stateless builds for the initial code merge. > > Yes, I included it in qemu-coco-queue to ensure that other things > didn't break split firmware (or they were properly identified), but > basically everything else in qemu-coco-queue is ready for merge. > > Please double check "i386/sev: Allow measured direct kernel boot on > SNP" as well, as I did some reorganization of the code into a new > class method for sev-guest and sev-snp-guest objects. The patch changes look sensible to me, and I re-tested the following permutations of split/unsplit OVMF with/without kernel hashing and everything looks good (this is with PATCH 29/31 reverted): |split |split |unsplit |unsplit | |hashing=on|hashing=off|hashing=on|hashing=off| |----------|-----------|----------|-----------| svm | n/a | PASS | n/a | PASS | sev | n/a | PASS | PASS | PASS | sev-es | n/a | PASS | PASS | PASS | snp | n/a | n/a | PASS | PASS | (split + hashing=on is not possible because hashing requires AmdSevX64 OVMF package which is built unsplit-only by design. split tests done using OvmfPkgX64) -Mike > > Paolo >
On Mon, Jun 03, 2024 at 03:38:05PM GMT, Paolo Bonzini wrote: > On Mon, Jun 3, 2024 at 1:55 PM Daniel P. Berrangé <berrange@redhat.com> wrote: > > I really wish we didn't have to introduce this though - is there really > > no way to make it possible to use pflash for both CODE & VARS with SNP, > > as is done with traditional VMs, so we don't diverge in setup, needing > > yet more changes up the mgmt stack ? > > No, you cannot use pflash for CODE in either SNP or TDX. The hardware > does not support it. > > One possibility is to only support non-pflash-based variable store. > This is not yet in QEMU, but it is how both AWS and Google implemented > UEFI variables and I think Gerd was going to work on it for QEMU. Yes, working on and off on it. Progress is slower that I wish it would be due to getting side tracked into other important edk2 things ... But, yes, the longer-term plan is that edk2 wouldn't manage the variable store itself. It will be either qemu (non-confidential setups), or the svsm (confidential setups). Where we are going to store svsm state (vtpm, efi vars, ...) is not fully clear yet. pflash is one option, but we are also checking out alternatives like virtio-blk (via virtio-mmio). take care, Gerd
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c index def77a442d..7f97e62b16 100644 --- a/hw/i386/pc_sysfw.c +++ b/hw/i386/pc_sysfw.c @@ -125,21 +125,10 @@ void pc_system_flash_cleanup_unused(PCMachineState *pcms) } } -/* - * Map the pcms->flash[] from 4GiB downward, and realize. - * Map them in descending order, i.e. pcms->flash[0] at the top, - * without gaps. - * Stop at the first pcms->flash[0] lacking a block backend. - * Set each flash's size from its block backend. Fatal error if the - * size isn't a non-zero multiple of 4KiB, or the total size exceeds - * pcms->max_fw_size. - * - * If pcms->flash[0] has a block backend, its memory is passed to - * pc_isa_bios_init(). Merging several flash devices for isa-bios is - * not supported. - */ -static void pc_system_flash_map(PCMachineState *pcms, - MemoryRegion *rom_memory) +static void pc_system_flash_map_partial(PCMachineState *pcms, + MemoryRegion *rom_memory, + hwaddr offset, + bool storage_only) { X86MachineState *x86ms = X86_MACHINE(pcms); PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); @@ -154,6 +143,8 @@ static void pc_system_flash_map(PCMachineState *pcms, assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled); + total_size = offset; + for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) { hwaddr gpa; @@ -192,7 +183,7 @@ static void pc_system_flash_map(PCMachineState *pcms, sysbus_realize_and_unref(SYS_BUS_DEVICE(system_flash), &error_fatal); sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0, gpa); - if (i == 0) { + if (i == 0 && !storage_only) { flash_mem = pflash_cfi01_get_memory(system_flash); if (pcmc->isa_bios_alias) { x86_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem, @@ -211,6 +202,25 @@ static void pc_system_flash_map(PCMachineState *pcms, } } +/* + * Map the pcms->flash[] from 4GiB downward, and realize. + * Map them in descending order, i.e. pcms->flash[0] at the top, + * without gaps. + * Stop at the first pcms->flash[0] lacking a block backend. + * Set each flash's size from its block backend. Fatal error if the + * size isn't a non-zero multiple of 4KiB, or the total size exceeds + * pcms->max_fw_size. + * + * If pcms->flash[0] has a block backend, its memory is passed to + * pc_isa_bios_init(). Merging several flash devices for isa-bios is + * not supported. + */ +static void pc_system_flash_map(PCMachineState *pcms, + MemoryRegion *rom_memory) +{ + pc_system_flash_map_partial(pcms, rom_memory, 0, false); +} + void pc_system_firmware_init(PCMachineState *pcms, MemoryRegion *rom_memory) { @@ -238,9 +248,12 @@ void pc_system_firmware_init(PCMachineState *pcms, } } - if (!pflash_blk[0]) { + if (!pflash_blk[0] || sev_snp_enabled()) { /* Machine property pflash0 not set, use ROM mode */ x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, false); + if (sev_snp_enabled()) { + pc_system_flash_map_partial(pcms, rom_memory, 3653632, true); + } } else { if (kvm_enabled() && !kvm_readonly_mem_enabled()) { /*