Message ID | 20240704000019.3928862-1-michael.roth@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | i386/sev: Don't allow automatic fallback to legacy KVM_SEV*_INIT | expand |
On Thu, Jul 4, 2024 at 2:01 AM Michael Roth <michael.roth@amd.com> wrote: > Currently if the 'legacy-vm-type' property of the sev-guest object is > left unset, QEMU will attempt to use the newer KVM_SEV_INIT2 kernel > interface in conjunction with the newer KVM_X86_SEV_VM and > KVM_X86_SEV_ES_VM KVM VM types. > > This can lead to measurement changes if, for instance, an SEV guest was > created on a host that originally had an older kernel that didn't > support KVM_SEV_INIT2, but is booted on the same host later on after the > host kernel was upgraded. I think this is the right thing to do for SEV-ES. I agree that it's bad to require a very new kernel (6.10 will be released only a month before QEMU 9.1), on the other hand the KVM_SEV_ES_INIT API is broken in several ways. As long as there is a way to go back to it, and it's not changed by old machine types, not using it for SEV-ES is the better choice for upstream. On the other hand, I think it makes no difference for SEV? Should we always use KVM_SEV_INIT, or alternatively fall back as it was before this patch? Paolo > Cc: Daniel P. Berrangé <berrange@redhat.com> > Cc: Paolo Bonzini <pbonzini@redhat.com> > cc: kvm@vger.kernel.org > Signed-off-by: Michael Roth <michael.roth@amd.com> > --- > qapi/qom.json | 11 ++++++----- > target/i386/sev.c | 30 ++++++++++++++++++++++++------ > 2 files changed, 30 insertions(+), 11 deletions(-) > > diff --git a/qapi/qom.json b/qapi/qom.json > index 8bd299265e..a212c009aa 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -912,11 +912,12 @@ > # @handle: SEV firmware handle (default: 0) > # > # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. > -# The newer KVM_SEV_INIT2 interface syncs additional vCPU > -# state when initializing the VMSA structures, which will > -# result in a different guest measurement. Set this to > -# maintain compatibility with older QEMU or kernel versions > -# that rely on legacy KVM_SEV_INIT behavior. > +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs > +# additional vCPU state when initializing the VMSA structures, > +# which will result in a different guest measurement. Set > +# this to force compatibility with older QEMU or kernel > +# versions that rely on legacy KVM_SEV_INIT behavior. > +# Otherwise, QEMU will require KVM_SEV_INIT2 for SEV guests. > # (default: false) (since 9.1) > # > # Since: 2.12 > diff --git a/target/i386/sev.c b/target/i386/sev.c > index 3ab8b3c28b..8f56c0cf0c 100644 > --- a/target/i386/sev.c > +++ b/target/i386/sev.c > @@ -1347,14 +1347,22 @@ static int sev_kvm_type(X86ConfidentialGuest *cg) > goto out; > } > > + if (sev_guest->legacy_vm_type) { > + sev_common->kvm_type = KVM_X86_DEFAULT_VM; > + goto out; > + } > + > kvm_type = (sev_guest->policy & SEV_POLICY_ES) ? > KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM; > - if (kvm_is_vm_type_supported(kvm_type) && !sev_guest->legacy_vm_type) { > - sev_common->kvm_type = kvm_type; > - } else { > - sev_common->kvm_type = KVM_X86_DEFAULT_VM; > + if (!kvm_is_vm_type_supported(kvm_type)) { > + error_report("SEV: host kernel does not support requested %s VM type. To allow use of " > + "legacy KVM_X86_DEFAULT_VM VM type, the 'legacy-vm-type' argument must be " > + "set to true for the sev-guest object.", > + kvm_type == KVM_X86_SEV_VM ? "KVM_X86_SEV_VM" : "KVM_X86_SEV_ES_VM"); > + return -1; > } > > + sev_common->kvm_type = kvm_type; > out: > return sev_common->kvm_type; > } > @@ -1445,14 +1453,24 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp) > } > > trace_kvm_sev_init(); > - if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) { > + switch (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common))) { > + case KVM_X86_DEFAULT_VM: > cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; > > ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); > - } else { > + break; > + case KVM_X86_SEV_VM: > + case KVM_X86_SEV_ES_VM: > + case KVM_X86_SNP_VM: { > struct kvm_sev_init args = { 0 }; > > ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); > + break; > + } > + default: > + error_setg(errp, "%s: host kernel does not support the requested SEV configuration.", > + __func__); > + return -1; > } > > if (ret) { > -- > 2.25.1 >
On Thu, Jul 04, 2024 at 08:51:05AM +0200, Paolo Bonzini wrote: > On Thu, Jul 4, 2024 at 2:01 AM Michael Roth <michael.roth@amd.com> wrote: > > Currently if the 'legacy-vm-type' property of the sev-guest object is > > left unset, QEMU will attempt to use the newer KVM_SEV_INIT2 kernel > > interface in conjunction with the newer KVM_X86_SEV_VM and > > KVM_X86_SEV_ES_VM KVM VM types. > > > > This can lead to measurement changes if, for instance, an SEV guest was > > created on a host that originally had an older kernel that didn't > > support KVM_SEV_INIT2, but is booted on the same host later on after the > > host kernel was upgraded. > > I think this is the right thing to do for SEV-ES. I agree that it's > bad to require a very new kernel (6.10 will be released only a month > before QEMU 9.1), on the other hand the KVM_SEV_ES_INIT API is broken > in several ways. As long as there is a way to go back to it, and it's > not changed by old machine types, not using it for SEV-ES is the > better choice for upstream. Broken how ? I know there was the regression with the 'debug_swap' parameter, but was something that should just be fixed in the kernel, rather than breaking userspace. What else is a problem ? I don't think its reasonable for QEMU to require a brand new kernel for new machine types, given SEV & SEV-ES have been deployed for many years already. With regards, Daniel
On Thu, Jul 4, 2024 at 10:42 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > On Thu, Jul 04, 2024 at 08:51:05AM +0200, Paolo Bonzini wrote: > > On Thu, Jul 4, 2024 at 2:01 AM Michael Roth <michael.roth@amd.com> wrote: > > > Currently if the 'legacy-vm-type' property of the sev-guest object is > > > left unset, QEMU will attempt to use the newer KVM_SEV_INIT2 kernel > > > interface in conjunction with the newer KVM_X86_SEV_VM and > > > KVM_X86_SEV_ES_VM KVM VM types. > > > > > > This can lead to measurement changes if, for instance, an SEV guest was > > > created on a host that originally had an older kernel that didn't > > > support KVM_SEV_INIT2, but is booted on the same host later on after the > > > host kernel was upgraded. > > > > I think this is the right thing to do for SEV-ES. I agree that it's > > bad to require a very new kernel (6.10 will be released only a month > > before QEMU 9.1), on the other hand the KVM_SEV_ES_INIT API is broken > > in several ways. As long as there is a way to go back to it, and it's > > not changed by old machine types, not using it for SEV-ES is the > > better choice for upstream. > > Broken how ? I know there was the regression with the 'debug_swap' > parameter, but was something that should just be fixed in the kernel, > rather than breaking userspace. What else is a problem ? The debug_swap parameter simply could not be enabled in the old API without breaking measurements. The new API *is the fix* to allow using it (though QEMU doesn't have the option plumbed in yet). There is no extensibility. Enabling debug_swap by default is also a thorny problem; it cannot be enabled by default because not all CPUs support it, and also we'd have the same problem that we cannot enable debug_swap on new machine types without requiring a new kernel. Tying the default to the -cpu model would work but it is confusing. But I guess we can add support for debug_swap, disabled by default and switch to the new API if debug_swap is enabled. > I don't think its reasonable for QEMU to require a brand new kernel > for new machine types, given SEV & SEV-ES have been deployed for > many years already. I think it's reasonable if the fix is displayed right into the error message. It's only needed for SEV-ES though, SEV can use the old and new APIs interchangeably. Paolo
On Thu, Jul 04, 2024 at 11:31:16AM +0200, Paolo Bonzini wrote: > On Thu, Jul 4, 2024 at 10:42 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > > > On Thu, Jul 04, 2024 at 08:51:05AM +0200, Paolo Bonzini wrote: > > > On Thu, Jul 4, 2024 at 2:01 AM Michael Roth <michael.roth@amd.com> wrote: > > > > Currently if the 'legacy-vm-type' property of the sev-guest object is > > > > left unset, QEMU will attempt to use the newer KVM_SEV_INIT2 kernel > > > > interface in conjunction with the newer KVM_X86_SEV_VM and > > > > KVM_X86_SEV_ES_VM KVM VM types. > > > > > > > > This can lead to measurement changes if, for instance, an SEV guest was > > > > created on a host that originally had an older kernel that didn't > > > > support KVM_SEV_INIT2, but is booted on the same host later on after the > > > > host kernel was upgraded. > > > > > > I think this is the right thing to do for SEV-ES. I agree that it's > > > bad to require a very new kernel (6.10 will be released only a month > > > before QEMU 9.1), on the other hand the KVM_SEV_ES_INIT API is broken > > > in several ways. As long as there is a way to go back to it, and it's > > > not changed by old machine types, not using it for SEV-ES is the > > > better choice for upstream. > > > > Broken how ? I know there was the regression with the 'debug_swap' > > parameter, but was something that should just be fixed in the kernel, > > rather than breaking userspace. What else is a problem ? > > The debug_swap parameter simply could not be enabled in the old API > without breaking measurements. The new API *is the fix* to allow using > it (though QEMU doesn't have the option plumbed in yet). There is no > extensibility. > > Enabling debug_swap by default is also a thorny problem; it cannot be > enabled by default because not all CPUs support it, and also we'd have > the same problem that we cannot enable debug_swap on new machine types > without requiring a new kernel. Tying the default to the -cpu model > would work but it is confusing. Presumably we can tie it to '-cpu host' without much problem, and then just leave it as an opt-in feature flag for named CPU models. > But I guess we can add support for debug_swap, disabled by default and > switch to the new API if debug_swap is enabled. > > > I don't think its reasonable for QEMU to require a brand new kernel > > for new machine types, given SEV & SEV-ES have been deployed for > > many years already. > > I think it's reasonable if the fix is displayed right into the error > message. It's only needed for SEV-ES though, SEV can use the old and > new APIs interchangeably. FYI currently it is proposed to unconditionally force set legacy-vm-type=true in libvirt, so QEMU guests would *never* use the new ioctl, to fix what we consider to be a QEMU / kernel guest ABI regression. If libvirt adds this though, it is basically a forever setting we would never be able to remove as removing would break ABI compat again. https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/KP24LPFV4OCMN45TDHNXQVBPDZ56QSRR/ I'd much rather QEMU did NOT set this by default in its machine types, so libvirt doesn't have to add this forced flag. That way downstream distros who /can/ guarantee new enough kernels, can still use legacy-vm-type=false in their downstream machine types, without libvirt overriding this. With regards, Daniel
On Thu, Jul 4, 2024 at 11:39 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > The debug_swap parameter simply could not be enabled in the old API > > without breaking measurements. The new API *is the fix* to allow using > > it (though QEMU doesn't have the option plumbed in yet). There is no > > extensibility. > > > > Enabling debug_swap by default is also a thorny problem; it cannot be > > enabled by default because not all CPUs support it, and also we'd have > > the same problem that we cannot enable debug_swap on new machine types > > without requiring a new kernel. Tying the default to the -cpu model > > would work but it is confusing. > > Presumably we can tie it to '-cpu host' without much problem, and > then just leave it as an opt-in feature flag for named CPU models. '-cpu host' for SEV-SNP is also problematic, since CPUID is part of the measurement. It's okay for starting guests in a quick and dirty manner, but it cannot be used if measurement is in use. It's weird to have "-cpu" provide the default for "-object", since -object is created much earlier than CPUs. But then "-cpu" itself is weird because it's a kind of "factory" for future objects. Maybe we should redo the same exercise I did to streamline machine initialization, this time focusing on -cpu/-machine/-accel, to understand the various phases and where sev-{,snp-}guest initialization fits. > > I think it's reasonable if the fix is displayed right into the error > > message. It's only needed for SEV-ES though, SEV can use the old and > > new APIs interchangeably. > > FYI currently it is proposed to unconditionally force set legacy-vm-type=true > in libvirt, so QEMU guests would *never* use the new ioctl, to fix what we > consider to be a QEMU / kernel guest ABI regression. Ok, so it's probably best to apply both this and your patch for now. Later debug-swap can be enabled and will automatically disable legacy-vm-type if the user left it to the default. If you want to test this combo and send a pull request (either to me or to Richard), that would help because I'm mostly away for a few days. Paolo
On Thu, Jul 04, 2024 at 11:53:33AM +0200, Paolo Bonzini wrote: > On Thu, Jul 4, 2024 at 11:39 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > > The debug_swap parameter simply could not be enabled in the old API > > > without breaking measurements. The new API *is the fix* to allow using > > > it (though QEMU doesn't have the option plumbed in yet). There is no > > > extensibility. > > > > > > Enabling debug_swap by default is also a thorny problem; it cannot be > > > enabled by default because not all CPUs support it, and also we'd have > > > the same problem that we cannot enable debug_swap on new machine types > > > without requiring a new kernel. Tying the default to the -cpu model > > > would work but it is confusing. > > > > Presumably we can tie it to '-cpu host' without much problem, and > > then just leave it as an opt-in feature flag for named CPU models. > > '-cpu host' for SEV-SNP is also problematic, since CPUID is part of > the measurement. It's okay for starting guests in a quick and dirty > manner, but it cannot be used if measurement is in use. > > It's weird to have "-cpu" provide the default for "-object", since > -object is created much earlier than CPUs. But then "-cpu" itself is > weird because it's a kind of "factory" for future objects. Maybe we > should redo the same exercise I did to streamline machine > initialization, this time focusing on -cpu/-machine/-accel, to > understand the various phases and where sev-{,snp-}guest > initialization fits. > > > > I think it's reasonable if the fix is displayed right into the error > > > message. It's only needed for SEV-ES though, SEV can use the old and > > > new APIs interchangeably. > > > > FYI currently it is proposed to unconditionally force set legacy-vm-type=true > > in libvirt, so QEMU guests would *never* use the new ioctl, to fix what we > > consider to be a QEMU / kernel guest ABI regression. > > Ok, so it's probably best to apply both this and your patch for now. > Later debug-swap can be enabled and will automatically disable > legacy-vm-type if the user left it to the default. I think this seems like the ideal default behavior, where QEMU will continue to stick with legacy interface unless the user specifically enables a new option like debug-swap that relies on KVM_SEV_INIT2. So I reworked this patch to make legacy-vm-type an OnOffAuto field, where 'auto' implements the above semantics, and 'on'/'off' continue to behave as they do here in v1. At the moment, since QEMU doesn't actually expose anything that requires KVM_SEV_INIT2 for SEV-ES, setting legacy-vm-type to 'auto' or 'on' end up being equivalent, but by defaulting to 'auto' things will continue to "just work" on the libvirt side even when new features are enabled. And by adding a bit of that infrastructure now it's less likely that some option gets exposed in a way that doesn't abide by the above semantics. So as part of v2 I switched the default for 9.1+ to 'auto'. But if v1+Daniel's patch is still preferred that should be fine too. -Mike > > If you want to test this combo and send a pull request (either to me > or to Richard), that would help because I'm mostly away for a few > days. > > Paolo >
On Tue, Jul 09, 2024 at 11:03:19PM -0500, Michael Roth wrote: > On Thu, Jul 04, 2024 at 11:53:33AM +0200, Paolo Bonzini wrote: > > On Thu, Jul 4, 2024 at 11:39 AM Daniel P. Berrangé <berrange@redhat.com> wrote: > > > > The debug_swap parameter simply could not be enabled in the old API > > > > without breaking measurements. The new API *is the fix* to allow using > > > > it (though QEMU doesn't have the option plumbed in yet). There is no > > > > extensibility. > > > > > > > > Enabling debug_swap by default is also a thorny problem; it cannot be > > > > enabled by default because not all CPUs support it, and also we'd have > > > > the same problem that we cannot enable debug_swap on new machine types > > > > without requiring a new kernel. Tying the default to the -cpu model > > > > would work but it is confusing. > > > > > > Presumably we can tie it to '-cpu host' without much problem, and > > > then just leave it as an opt-in feature flag for named CPU models. > > > > '-cpu host' for SEV-SNP is also problematic, since CPUID is part of > > the measurement. It's okay for starting guests in a quick and dirty > > manner, but it cannot be used if measurement is in use. > > > > It's weird to have "-cpu" provide the default for "-object", since > > -object is created much earlier than CPUs. But then "-cpu" itself is > > weird because it's a kind of "factory" for future objects. Maybe we > > should redo the same exercise I did to streamline machine > > initialization, this time focusing on -cpu/-machine/-accel, to > > understand the various phases and where sev-{,snp-}guest > > initialization fits. > > > > > > I think it's reasonable if the fix is displayed right into the error > > > > message. It's only needed for SEV-ES though, SEV can use the old and > > > > new APIs interchangeably. > > > > > > FYI currently it is proposed to unconditionally force set legacy-vm-type=true > > > in libvirt, so QEMU guests would *never* use the new ioctl, to fix what we > > > consider to be a QEMU / kernel guest ABI regression. > > > > Ok, so it's probably best to apply both this and your patch for now. > > Later debug-swap can be enabled and will automatically disable > > legacy-vm-type if the user left it to the default. > > I think this seems like the ideal default behavior, where QEMU will > continue to stick with legacy interface unless the user specifically > enables a new option like debug-swap that relies on KVM_SEV_INIT2. > So I reworked this patch to make legacy-vm-type an OnOffAuto field, > where 'auto' implements the above semantics, and 'on'/'off' continue > to behave as they do here in v1. > > At the moment, since QEMU doesn't actually expose anything that requires > KVM_SEV_INIT2 for SEV-ES, setting legacy-vm-type to 'auto' or 'on' end > up being equivalent, but by defaulting to 'auto' things will continue to > "just work" on the libvirt side even when new features are enabled. And > by adding a bit of that infrastructure now it's less likely that some > option gets exposed in a way that doesn't abide by the above semantics. > > So as part of v2 I switched the default for 9.1+ to 'auto'. But if > v1+Daniel's patch is still preferred that should be fine too. I think your v2 patch is good on its own. It avoids the issues that concerned me and has sensible future behaviour too. With regards, Daniel
diff --git a/qapi/qom.json b/qapi/qom.json index 8bd299265e..a212c009aa 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -912,11 +912,12 @@ # @handle: SEV firmware handle (default: 0) # # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. -# The newer KVM_SEV_INIT2 interface syncs additional vCPU -# state when initializing the VMSA structures, which will -# result in a different guest measurement. Set this to -# maintain compatibility with older QEMU or kernel versions -# that rely on legacy KVM_SEV_INIT behavior. +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs +# additional vCPU state when initializing the VMSA structures, +# which will result in a different guest measurement. Set +# this to force compatibility with older QEMU or kernel +# versions that rely on legacy KVM_SEV_INIT behavior. +# Otherwise, QEMU will require KVM_SEV_INIT2 for SEV guests. # (default: false) (since 9.1) # # Since: 2.12 diff --git a/target/i386/sev.c b/target/i386/sev.c index 3ab8b3c28b..8f56c0cf0c 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -1347,14 +1347,22 @@ static int sev_kvm_type(X86ConfidentialGuest *cg) goto out; } + if (sev_guest->legacy_vm_type) { + sev_common->kvm_type = KVM_X86_DEFAULT_VM; + goto out; + } + kvm_type = (sev_guest->policy & SEV_POLICY_ES) ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM; - if (kvm_is_vm_type_supported(kvm_type) && !sev_guest->legacy_vm_type) { - sev_common->kvm_type = kvm_type; - } else { - sev_common->kvm_type = KVM_X86_DEFAULT_VM; + if (!kvm_is_vm_type_supported(kvm_type)) { + error_report("SEV: host kernel does not support requested %s VM type. To allow use of " + "legacy KVM_X86_DEFAULT_VM VM type, the 'legacy-vm-type' argument must be " + "set to true for the sev-guest object.", + kvm_type == KVM_X86_SEV_VM ? "KVM_X86_SEV_VM" : "KVM_X86_SEV_ES_VM"); + return -1; } + sev_common->kvm_type = kvm_type; out: return sev_common->kvm_type; } @@ -1445,14 +1453,24 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp) } trace_kvm_sev_init(); - if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) { + switch (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common))) { + case KVM_X86_DEFAULT_VM: cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); - } else { + break; + case KVM_X86_SEV_VM: + case KVM_X86_SEV_ES_VM: + case KVM_X86_SNP_VM: { struct kvm_sev_init args = { 0 }; ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); + break; + } + default: + error_setg(errp, "%s: host kernel does not support the requested SEV configuration.", + __func__); + return -1; } if (ret) {
Currently if the 'legacy-vm-type' property of the sev-guest object is left unset, QEMU will attempt to use the newer KVM_SEV_INIT2 kernel interface in conjunction with the newer KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM KVM VM types. This can lead to measurement changes if, for instance, an SEV guest was created on a host that originally had an older kernel that didn't support KVM_SEV_INIT2, but is booted on the same host later on after the host kernel was upgraded. Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> cc: kvm@vger.kernel.org Signed-off-by: Michael Roth <michael.roth@amd.com> --- qapi/qom.json | 11 ++++++----- target/i386/sev.c | 30 ++++++++++++++++++++++++------ 2 files changed, 30 insertions(+), 11 deletions(-)