diff mbox series

[RFC,v3,26/59] KVM: x86: Introduce vm_teardown() hook in kvm_arch_vm_destroy()

Message ID 1fa2d0db387a99352d44247728c5b8ae5f5cab4d.1637799475.git.isaku.yamahata@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: X86: TDX support | expand

Commit Message

Isaku Yamahata Nov. 25, 2021, 12:20 a.m. UTC
From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a second kvm_x86_ops hook in kvm_arch_vm_destroy() to support TDX's
destruction path, which needs to first put the VM into a teardown state,
then free per-vCPU resource, and finally free per-VM resources.

Note, this knowingly creates a discrepancy in nomenclature for SVM as
svm_vm_teardown() invokes avic_vm_destroy() and sev_vm_destroy().
Moving the now-misnamed functions or renaming them is left to a future
patch so as not to introduce a functional change for SVM.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h | 1 +
 arch/x86/include/asm/kvm_host.h    | 1 +
 arch/x86/kvm/svm/svm.c             | 5 +++--
 arch/x86/kvm/vmx/vmx.c             | 7 +++++++
 arch/x86/kvm/x86.c                 | 3 ++-
 5 files changed, 14 insertions(+), 3 deletions(-)

Comments

Thomas Gleixner Nov. 25, 2021, 7:46 p.m. UTC | #1
On Wed, Nov 24 2021 at 16:20, isaku yamahata wrote:
> Add a second kvm_x86_ops hook in kvm_arch_vm_destroy() to support TDX's
> destruction path, which needs to first put the VM into a teardown state,
> then free per-vCPU resource, and finally free per-VM resources.
>
> Note, this knowingly creates a discrepancy in nomenclature for SVM as
> svm_vm_teardown() invokes avic_vm_destroy() and sev_vm_destroy().
> Moving the now-misnamed functions or renaming them is left to a future
> patch so as not to introduce a functional change for SVM.

That's just the wrong way around. Fixup SVM first and then add the TDX
muck on top. Stop this 'left to a future patch' nonsense. I know for
sure that those future patches never materialize.

The way it works is:

    1) Refactor the code to make room for your new functionality in a
       way that the existing code still works.

    2) Add your new muck on top.

Anything else is not acceptable at all.

Thanks,

        tglx
Paolo Bonzini Nov. 25, 2021, 8:54 p.m. UTC | #2
On 11/25/21 20:46, Thomas Gleixner wrote:
> On Wed, Nov 24 2021 at 16:20, isaku yamahata wrote:
>> Add a second kvm_x86_ops hook in kvm_arch_vm_destroy() to support TDX's
>> destruction path, which needs to first put the VM into a teardown state,
>> then free per-vCPU resource, and finally free per-VM resources.
>>
>> Note, this knowingly creates a discrepancy in nomenclature for SVM as
>> svm_vm_teardown() invokes avic_vm_destroy() and sev_vm_destroy().
>> Moving the now-misnamed functions or renaming them is left to a future
>> patch so as not to introduce a functional change for SVM.
> That's just the wrong way around. Fixup SVM first and then add the TDX
> muck on top. Stop this 'left to a future patch' nonsense. I know for
> sure that those future patches never materialize.

Or just keep vm_destroy for the "early" destruction, and give a new name 
to the new hook.  It is used to give back the TDCS memory, so perhaps 
you can call it vm_free?

Paolo
Thomas Gleixner Nov. 25, 2021, 9:11 p.m. UTC | #3
On Thu, Nov 25 2021 at 21:54, Paolo Bonzini wrote:
> On 11/25/21 20:46, Thomas Gleixner wrote:
>> On Wed, Nov 24 2021 at 16:20, isaku yamahata wrote:
>>> Add a second kvm_x86_ops hook in kvm_arch_vm_destroy() to support TDX's
>>> destruction path, which needs to first put the VM into a teardown state,
>>> then free per-vCPU resource, and finally free per-VM resources.
>>>
>>> Note, this knowingly creates a discrepancy in nomenclature for SVM as
>>> svm_vm_teardown() invokes avic_vm_destroy() and sev_vm_destroy().
>>> Moving the now-misnamed functions or renaming them is left to a future
>>> patch so as not to introduce a functional change for SVM.
>> That's just the wrong way around. Fixup SVM first and then add the TDX
>> muck on top. Stop this 'left to a future patch' nonsense. I know for
>> sure that those future patches never materialize.
>
> Or just keep vm_destroy for the "early" destruction, and give a new name 
> to the new hook.  It is used to give back the TDCS memory, so perhaps 
> you can call it vm_free?

Up to you, but the current approach is bogus. I rather go for a fully
symmetric interface and let the various incarnations opt in at the right
place. Similar to what cpu hotplug states are implementing.

Thanks,

        tglx
Sean Christopherson Nov. 29, 2021, 6:16 p.m. UTC | #4
On Thu, Nov 25, 2021, Thomas Gleixner wrote:
> On Thu, Nov 25 2021 at 21:54, Paolo Bonzini wrote:
> > On 11/25/21 20:46, Thomas Gleixner wrote:
> >> On Wed, Nov 24 2021 at 16:20, isaku yamahata wrote:
> >>> Add a second kvm_x86_ops hook in kvm_arch_vm_destroy() to support TDX's
> >>> destruction path, which needs to first put the VM into a teardown state,
> >>> then free per-vCPU resource, and finally free per-VM resources.
> >>>
> >>> Note, this knowingly creates a discrepancy in nomenclature for SVM as
> >>> svm_vm_teardown() invokes avic_vm_destroy() and sev_vm_destroy().
> >>> Moving the now-misnamed functions or renaming them is left to a future
> >>> patch so as not to introduce a functional change for SVM.
> >> That's just the wrong way around. Fixup SVM first and then add the TDX
> >> muck on top. Stop this 'left to a future patch' nonsense. I know for
> >> sure that those future patches never materialize.
> >
> > Or just keep vm_destroy for the "early" destruction, and give a new name 
> > to the new hook.  It is used to give back the TDCS memory, so perhaps 
> > you can call it vm_free?
> 
> Up to you, but the current approach is bogus. I rather go for a fully
> symmetric interface and let the various incarnations opt in at the right
> place. Similar to what cpu hotplug states are implementing.

Naming aside, that's what is being done, TDX simply needs two hooks instead of one
due to the way KVM handles VM and vCPU destruction.  The alternative would be to
shove and duplicate what is currently common x86 code into VMX/SVM, which IMO is
far worse.

Regarding the naming, I 100% agree SVM should be refactored prior to adding TDX
stuff if we choose to go with vm_teardown() and vm_destroy() instead of Paolo's
suggestion of vm_destroy() and vm_free().  When this patch/code was originally
written, letting SVM become stale was a deliberate choice to reduce conflicts with
upstream as we knew the code would live out of tree for quite some time.  But that
was purely meant to be development "hack", not upstream behavior.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 30754f2b8a99..1009541fd6c2 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -20,6 +20,7 @@  KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(is_vm_type_supported)
 KVM_X86_OP(vm_init)
+KVM_X86_OP_NULL(vm_teardown)
 KVM_X86_OP_NULL(vm_destroy)
 KVM_X86_OP(vcpu_create)
 KVM_X86_OP(vcpu_free)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 74c3c1629563..3bc5417f94ec 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1321,6 +1321,7 @@  struct kvm_x86_ops {
 	bool (*is_vm_type_supported)(unsigned long vm_type);
 	unsigned int vm_size;
 	int (*vm_init)(struct kvm *kvm);
+	void (*vm_teardown)(struct kvm *kvm);
 	void (*vm_destroy)(struct kvm *kvm);
 
 	/* Create, but do not attach this VCPU */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2ef77d4566a9..1bf410add13b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4556,7 +4556,7 @@  static void svm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 	sev_vcpu_deliver_sipi_vector(vcpu, vector);
 }
 
-static void svm_vm_destroy(struct kvm *kvm)
+static void svm_vm_teardown(struct kvm *kvm)
 {
 	avic_vm_destroy(kvm);
 	sev_vm_destroy(kvm);
@@ -4597,7 +4597,8 @@  static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.is_vm_type_supported = svm_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_svm),
 	.vm_init = svm_vm_init,
-	.vm_destroy = svm_vm_destroy,
+	.vm_teardown = svm_vm_teardown,
+	.vm_destroy = NULL,
 
 	.prepare_guest_switch = svm_prepare_guest_switch,
 	.vcpu_load = svm_vcpu_load,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f2905e00b063..5e4c6ac9fe69 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6890,6 +6890,11 @@  static int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static void vmx_vm_destroy(struct kvm *kvm)
+{
+
+}
+
 static int __init vmx_check_processor_compat(void)
 {
 	struct vmcs_config vmcs_conf;
@@ -7513,6 +7518,8 @@  static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.is_vm_type_supported = vmx_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_vmx),
 	.vm_init = vmx_vm_init,
+	.vm_teardown = NULL,
+	.vm_destroy = vmx_vm_destroy,
 
 	.vcpu_create = vmx_create_vcpu,
 	.vcpu_free = vmx_free_vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f2091c4b928a..88b55ddb22a7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11613,7 +11613,7 @@  void kvm_arch_destroy_vm(struct kvm *kvm)
 		__x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, 0, 0);
 		mutex_unlock(&kvm->slots_lock);
 	}
-	static_call_cond(kvm_x86_vm_destroy)(kvm);
+	static_call_cond(kvm_x86_vm_teardown)(kvm);
 	kvm_free_msr_filter(srcu_dereference_check(kvm->arch.msr_filter, &kvm->srcu, 1));
 	kvm_pic_destroy(kvm);
 	kvm_ioapic_destroy(kvm);
@@ -11624,6 +11624,7 @@  void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_page_track_cleanup(kvm);
 	kvm_xen_destroy_vm(kvm);
 	kvm_hv_destroy_vm(kvm);
+	static_call_cond(kvm_x86_vm_destroy)(kvm);
 }
 
 static void memslot_rmap_free(struct kvm_memory_slot *slot)