Message ID | 20240322181116.1228416-13-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86, kvm: common confidential computing subset | expand |
On 22/3/24 19:11, Paolo Bonzini wrote: > So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the > guest state is encrypted, in which case they do nothing. For the new > API using VM types, instead, the ioctls will fail which is a safer and > more robust approach. > > The new API will be the only one available for SEV-SNP and TDX, but it > is also usable for SEV and SEV-ES. In preparation for that, require > architecture-specific KVM code to communicate the point at which guest > state is protected (which must be after kvm_cpu_synchronize_post_init(), > though that might change in the future in order to suppor migration). > From that point, skip reading registers so that cpu->vcpu_dirty is > never true: if it ever becomes true, kvm_arch_put_registers() will > fail miserably. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > include/sysemu/kvm.h | 2 ++ > include/sysemu/kvm_int.h | 1 + > accel/kvm/kvm-all.c | 14 ++++++++++++-- > target/i386/sev.c | 1 + > 4 files changed, 16 insertions(+), 2 deletions(-) Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
On 3/23/2024 2:11 AM, Paolo Bonzini wrote: > So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the > guest state is encrypted, in which case they do nothing. For the new > API using VM types, instead, the ioctls will fail which is a safer and > more robust approach. > > The new API will be the only one available for SEV-SNP and TDX, but it > is also usable for SEV and SEV-ES. In preparation for that, require > architecture-specific KVM code to communicate the point at which guest > state is protected (which must be after kvm_cpu_synchronize_post_init(), > though that might change in the future in order to suppor migration). > From that point, skip reading registers so that cpu->vcpu_dirty is > never true: if it ever becomes true, kvm_arch_put_registers() will > fail miserably. > > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > include/sysemu/kvm.h | 2 ++ > include/sysemu/kvm_int.h | 1 + > accel/kvm/kvm-all.c | 14 ++++++++++++-- > target/i386/sev.c | 1 + > 4 files changed, 16 insertions(+), 2 deletions(-) > > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h > index fad9a7e8ff3..302e8f6f1e5 100644 > --- a/include/sysemu/kvm.h > +++ b/include/sysemu/kvm.h > @@ -539,6 +539,8 @@ bool kvm_dirty_ring_enabled(void); > > uint32_t kvm_dirty_ring_size(void); > > +void kvm_mark_guest_state_protected(void); > + > /** > * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page > * reported for the VM. > diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h > index 882e37e12c5..3496be7997a 100644 > --- a/include/sysemu/kvm_int.h > +++ b/include/sysemu/kvm_int.h > @@ -87,6 +87,7 @@ struct KVMState > bool kernel_irqchip_required; > OnOffAuto kernel_irqchip_split; > bool sync_mmu; > + bool guest_state_protected; > uint64_t manual_dirty_log_protect; > /* The man page (and posix) say ioctl numbers are signed int, but > * they're not. Linux, glibc and *BSD all treat ioctl numbers as > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > index a8cecd040eb..05fa3533c66 100644 > --- a/accel/kvm/kvm-all.c > +++ b/accel/kvm/kvm-all.c > @@ -2698,7 +2698,7 @@ bool kvm_cpu_check_are_resettable(void) > > static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg) > { > - if (!cpu->vcpu_dirty) { > + if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) { > int ret = kvm_arch_get_registers(cpu); > if (ret) { > error_report("Failed to get registers: %s", strerror(-ret)); > @@ -2712,7 +2712,7 @@ static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg) > > void kvm_cpu_synchronize_state(CPUState *cpu) > { > - if (!cpu->vcpu_dirty) { > + if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) { > run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL); > } > } > @@ -2747,6 +2747,11 @@ static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg) > > void kvm_cpu_synchronize_post_init(CPUState *cpu) > { > + /* > + * This runs before the machine_init_done notifiers, and is the last > + * opportunity to synchronize the state of confidential guests. > + */ > + assert(!kvm_state->guest_state_protected); So, this requires confidential guests to call kvm_mark_guest_state_protected() in its machine_init_done notifier callback? But for TDX, the guest_state is protected at the beginning, not some time later when machine_init_done. > run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL); > } > > @@ -4094,3 +4099,8 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp) > query_stats_schema_vcpu(first_cpu, &stats_args); > } > } > + > +void kvm_mark_guest_state_protected(void) > +{ > + kvm_state->guest_state_protected = true; > +} > diff --git a/target/i386/sev.c b/target/i386/sev.c > index b8f79d34d19..c49a8fd55eb 100644 > --- a/target/i386/sev.c > +++ b/target/i386/sev.c > @@ -755,6 +755,7 @@ sev_launch_get_measure(Notifier *notifier, void *unused) > if (ret) { > exit(1); > } > + kvm_mark_guest_state_protected(); > } > > /* query the measurement blob length */
On Tue, Mar 26, 2024 at 4:48 PM Xiaoyao Li <xiaoyao.li@intel.com> wrote: > So, this requires confidential guests to call > kvm_mark_guest_state_protected() in its machine_init_done notifier callback? > > But for TDX, the guest_state is protected at the beginning, not some > time later when machine_init_done. Good point, I will change this to an "if". Paolo
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index fad9a7e8ff3..302e8f6f1e5 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -539,6 +539,8 @@ bool kvm_dirty_ring_enabled(void); uint32_t kvm_dirty_ring_size(void); +void kvm_mark_guest_state_protected(void); + /** * kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page * reported for the VM. diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 882e37e12c5..3496be7997a 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -87,6 +87,7 @@ struct KVMState bool kernel_irqchip_required; OnOffAuto kernel_irqchip_split; bool sync_mmu; + bool guest_state_protected; uint64_t manual_dirty_log_protect; /* The man page (and posix) say ioctl numbers are signed int, but * they're not. Linux, glibc and *BSD all treat ioctl numbers as diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index a8cecd040eb..05fa3533c66 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2698,7 +2698,7 @@ bool kvm_cpu_check_are_resettable(void) static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg) { - if (!cpu->vcpu_dirty) { + if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) { int ret = kvm_arch_get_registers(cpu); if (ret) { error_report("Failed to get registers: %s", strerror(-ret)); @@ -2712,7 +2712,7 @@ static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg) void kvm_cpu_synchronize_state(CPUState *cpu) { - if (!cpu->vcpu_dirty) { + if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) { run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL); } } @@ -2747,6 +2747,11 @@ static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg) void kvm_cpu_synchronize_post_init(CPUState *cpu) { + /* + * This runs before the machine_init_done notifiers, and is the last + * opportunity to synchronize the state of confidential guests. + */ + assert(!kvm_state->guest_state_protected); run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL); } @@ -4094,3 +4099,8 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp) query_stats_schema_vcpu(first_cpu, &stats_args); } } + +void kvm_mark_guest_state_protected(void) +{ + kvm_state->guest_state_protected = true; +} diff --git a/target/i386/sev.c b/target/i386/sev.c index b8f79d34d19..c49a8fd55eb 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -755,6 +755,7 @@ sev_launch_get_measure(Notifier *notifier, void *unused) if (ret) { exit(1); } + kvm_mark_guest_state_protected(); } /* query the measurement blob length */
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the guest state is encrypted, in which case they do nothing. For the new API using VM types, instead, the ioctls will fail which is a safer and more robust approach. The new API will be the only one available for SEV-SNP and TDX, but it is also usable for SEV and SEV-ES. In preparation for that, require architecture-specific KVM code to communicate the point at which guest state is protected (which must be after kvm_cpu_synchronize_post_init(), though that might change in the future in order to suppor migration). From that point, skip reading registers so that cpu->vcpu_dirty is never true: if it ever becomes true, kvm_arch_put_registers() will fail miserably. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- include/sysemu/kvm.h | 2 ++ include/sysemu/kvm_int.h | 1 + accel/kvm/kvm-all.c | 14 ++++++++++++-- target/i386/sev.c | 1 + 4 files changed, 16 insertions(+), 2 deletions(-)