diff mbox series

[12/13] KVM: nVMX: Don't mark vmcs12 as dirty when L1 writes pin controls

Message ID 20190507191805.9932-13-sean.j.christopherson@intel.com (mailing list archive)
State New, archived
Headers show
Series KVM: VMX: Reduce VMWRITEs to VMCS controls | expand

Commit Message

Sean Christopherson May 7, 2019, 7:18 p.m. UTC
Pin controls doesn't affect dirty logic, e.g. the preemption timer value
is loaded from vmcs12 even if vmcs12 is "clean", i.e. there is no need
to mark vmcs12 dirty when L1 writes pin controls.

KVM currently toggles the VMX_PREEMPTION_TIMER control flag when it
disables or enables the timer.  The VMWRITE to toggle the flag can be
responsible for a large percentage of vmcs12 dirtying when running KVM
as L1 (depending on the behavior of L2).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini June 6, 2019, 5:16 p.m. UTC | #1
On 07/05/19 21:18, Sean Christopherson wrote:
> +			 * path of prepare_vmcs02.  Pin controls is an exception as
> +			 * writing pin controls doesn't affect KVM's dirty logic and
> +			 * the VMX_PREEMPTION_TIMER flag may be toggled frequently,
> +			 * but not frequently enough to justify shadowing.
>  			 */
> +		case PIN_BASED_VM_EXEC_CONTROL:
>  			break;

Hmm, is it really that bad to shadow it?

Paolo
Paolo Bonzini June 6, 2019, 5:24 p.m. UTC | #2
On 07/05/19 21:18, Sean Christopherson wrote:
> Pin controls doesn't affect dirty logic, e.g. the preemption timer value
> is loaded from vmcs12 even if vmcs12 is "clean", i.e. there is no need
> to mark vmcs12 dirty when L1 writes pin controls.
> 
> KVM currently toggles the VMX_PREEMPTION_TIMER control flag when it
> disables or enables the timer.  The VMWRITE to toggle the flag can be
> responsible for a large percentage of vmcs12 dirtying when running KVM
> as L1 (depending on the behavior of L2).
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

I think either we wait for patch 13 to get in the wild so that
VMX_PREEMPTION_TIMER writes do not become so frequent, or we can do
something like

--------- 8< ------------
From: Paolo Bonzini <pbonzini@redhat.com>
Subject: [PATCH] KVM: nVMX: shadow pin based execution controls

The VMX_PREEMPTION_TIMER flag may be toggled frequently, though not
*very* frequently.  Since it does not affect KVM's dirty logic, e.g.
the preemption timer value is loaded from vmcs12 even if vmcs12 is
"clean", there is no need to mark vmcs12 dirty when L1 writes pin
controls, and shadowing the field achieves that.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
index 4cea018ba285..eb1ecd16fd22 100644
--- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
+++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
@@ -47,6 +47,7 @@
 SHADOW_FIELD_RO(GUEST_CS_AR_BYTES, guest_cs_ar_bytes)
 SHADOW_FIELD_RO(GUEST_SS_AR_BYTES, guest_ss_ar_bytes)
 SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control)
+SHADOW_FIELD_RW(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control)
 SHADOW_FIELD_RW(EXCEPTION_BITMAP, exception_bitmap)
 SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE,
vm_entry_exception_error_code)
 SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field)
Sean Christopherson June 6, 2019, 7:05 p.m. UTC | #3
On Thu, Jun 06, 2019 at 07:24:38PM +0200, Paolo Bonzini wrote:
> On 07/05/19 21:18, Sean Christopherson wrote:
> > Pin controls doesn't affect dirty logic, e.g. the preemption timer value
> > is loaded from vmcs12 even if vmcs12 is "clean", i.e. there is no need
> > to mark vmcs12 dirty when L1 writes pin controls.
> > 
> > KVM currently toggles the VMX_PREEMPTION_TIMER control flag when it
> > disables or enables the timer.  The VMWRITE to toggle the flag can be
> > responsible for a large percentage of vmcs12 dirtying when running KVM
> > as L1 (depending on the behavior of L2).
> > 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> I think either we wait for patch 13 to get in the wild so that
> VMX_PREEMPTION_TIMER writes do not become so frequent, or we can do
> something like

I'd prefer to get something in now.  I assume a fair number of users will
be running current/older versions of KVM as L1 for years to come.

I have no objection to shadowing pin controls.  I opted for the cheesy
approach because I couldn't provide numbers that actually showed a
performance improvement by shadowing.

> --------- 8< ------------
> From: Paolo Bonzini <pbonzini@redhat.com>
> Subject: [PATCH] KVM: nVMX: shadow pin based execution controls
> 
> The VMX_PREEMPTION_TIMER flag may be toggled frequently, though not
> *very* frequently.  Since it does not affect KVM's dirty logic, e.g.
> the preemption timer value is loaded from vmcs12 even if vmcs12 is
> "clean", there is no need to mark vmcs12 dirty when L1 writes pin
> controls, and shadowing the field achieves that.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> diff --git a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
> b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
> index 4cea018ba285..eb1ecd16fd22 100644
> --- a/arch/x86/kvm/vmx/vmcs_shadow_fields.h
> +++ b/arch/x86/kvm/vmx/vmcs_shadow_fields.h
> @@ -47,6 +47,7 @@
>  SHADOW_FIELD_RO(GUEST_CS_AR_BYTES, guest_cs_ar_bytes)
>  SHADOW_FIELD_RO(GUEST_SS_AR_BYTES, guest_ss_ar_bytes)
>  SHADOW_FIELD_RW(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control)
> +SHADOW_FIELD_RW(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control)
>  SHADOW_FIELD_RW(EXCEPTION_BITMAP, exception_bitmap)
>  SHADOW_FIELD_RW(VM_ENTRY_EXCEPTION_ERROR_CODE,
> vm_entry_exception_error_code)
>  SHADOW_FIELD_RW(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field)
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6ecdcfc67245..652022a77b64 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4508,8 +4508,12 @@  static int handle_vmwrite(struct kvm_vcpu *vcpu)
 			/*
 			 * The fields that can be updated by L1 without a vmexit are
 			 * always updated in the vmcs02, the others go down the slow
-			 * path of prepare_vmcs02.
+			 * path of prepare_vmcs02.  Pin controls is an exception as
+			 * writing pin controls doesn't affect KVM's dirty logic and
+			 * the VMX_PREEMPTION_TIMER flag may be toggled frequently,
+			 * but not frequently enough to justify shadowing.
 			 */
+		case PIN_BASED_VM_EXEC_CONTROL:
 			break;
 		default:
 			vmx->nested.dirty_vmcs12 = true;