[v3,03/37] KVM: nVMX: Invalidate all EPTP contexts when emulating INVEPT for L1
diff mbox series

Message ID 20200320212833.3507-4-sean.j.christopherson@intel.com
State New
Headers show
Series
  • KVM: x86: TLB flushing fixes and enhancements
Related show

Commit Message

Sean Christopherson March 20, 2020, 9:27 p.m. UTC
Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
changes to the EPT tables managed by L1 need to be recognized, and
relying on KVM to always flush L2's EPTP context on nested VM-Enter is
dangerous.

Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
TLB flush if necessary, e.g. if L1 has never entered L2 then there is
nothing to be done.

Nuking all L2 roots is overkill for the single-context variant, but it's
the safe and easy bet.  A more precise zap mechanism will be added in
the future.  Add a TODO to call out that KVM only needs to invalidate
affected contexts.

Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Vitaly Kuznetsov March 23, 2020, 3:24 p.m. UTC | #1
Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> changes to the EPT tables managed by L1 need to be recognized, and
> relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> dangerous.
>
> Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> nothing to be done.
>
> Nuking all L2 roots is overkill for the single-context variant, but it's
> the safe and easy bet.  A more precise zap mechanism will be added in
> the future.  Add a TODO to call out that KVM only needs to invalidate
> affected contexts.
>
> Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
> Reported-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index f3774cef4fd4..9624cea4ed9f 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -5160,12 +5160,12 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  		if (!nested_vmx_check_eptp(vcpu, operand.eptp))
>  			return nested_vmx_failValid(vcpu,
>  				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
> +
> +		/* TODO: sync only the target EPTP context. */
>  		fallthrough;
>  	case VMX_EPT_EXTENT_GLOBAL:
> -	/*
> -	 * TODO: Sync the necessary shadow EPT roots here, rather than
> -	 * at the next emulated VM-entry.
> -	 */
> +		kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu,
> +				   KVM_MMU_ROOTS_ALL);
>  		break;

An ignorant reader may wonder "and how do we know that L1 actaully uses
EPT" as he may find out that guest_mmu is not being used otherwise. The
answer to the question will likely be "if L1 doesn't use EPT for some of
its guests than there's nothing we should do here as we will be
resetting root_mmu when switching to/from them". Hope the ignorant
reviewer typing this is not very wrong :-)

>  	default:
>  		BUG_ON(1);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Sean Christopherson March 23, 2020, 3:53 p.m. UTC | #2
On Mon, Mar 23, 2020 at 04:24:05PM +0100, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> > changes to the EPT tables managed by L1 need to be recognized, and
> > relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> > dangerous.
> >
> > Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> > TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> > nothing to be done.
> >
> > Nuking all L2 roots is overkill for the single-context variant, but it's
> > the safe and easy bet.  A more precise zap mechanism will be added in
> > the future.  Add a TODO to call out that KVM only needs to invalidate
> > affected contexts.
> >
> > Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
> > Reported-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >  arch/x86/kvm/vmx/nested.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index f3774cef4fd4..9624cea4ed9f 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -5160,12 +5160,12 @@ static int handle_invept(struct kvm_vcpu *vcpu)
> >  		if (!nested_vmx_check_eptp(vcpu, operand.eptp))
> >  			return nested_vmx_failValid(vcpu,
> >  				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
> > +
> > +		/* TODO: sync only the target EPTP context. */
> >  		fallthrough;
> >  	case VMX_EPT_EXTENT_GLOBAL:
> > -	/*
> > -	 * TODO: Sync the necessary shadow EPT roots here, rather than
> > -	 * at the next emulated VM-entry.
> > -	 */
> > +		kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu,
> > +				   KVM_MMU_ROOTS_ALL);
> >  		break;
> 
> An ignorant reader may wonder "and how do we know that L1 actaully uses
> EPT" as he may find out that guest_mmu is not being used otherwise. The
> answer to the question will likely be "if L1 doesn't use EPT for some of
> its guests than there's nothing we should do here as we will be
> resetting root_mmu when switching to/from them". Hope the ignorant
> reviewer typing this is not very wrong :-)

A different way to put it would be:

  KVM never uses root_mmu to hold nested EPT roots.

Invalidating too much is functionally ok, though sub-optimal for performance.
Invalidating too little is what we really care about.

FWIW, VMX currently uses guest_mmu iff nested EPT is enabled.  In theory,
KVM could be enhanced to also used guest_mmu when nested-TDP is disabled,
e.g. to enable VMX to preserve L1's root_mmu when emulating INVVPID.  That
would likely be a decent performance boost for nested VMX+VPID without
nested EPT, but I'm guessing that the cross-section of users that care
about nested performance and don't use nested EPT is quite small.

> >  	default:
> >  		BUG_ON(1);
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> -- 
> Vitaly
>
Jim Mattson March 23, 2020, 4:24 p.m. UTC | #3
On Fri, Mar 20, 2020 at 2:29 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> changes to the EPT tables managed by L1 need to be recognized, and
> relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> dangerous.
>
> Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> nothing to be done.
>
> Nuking all L2 roots is overkill for the single-context variant, but it's
> the safe and easy bet.  A more precise zap mechanism will be added in
> the future.  Add a TODO to call out that KVM only needs to invalidate
> affected contexts.
>
> Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")

The bug existed well before the commit indicated in the "Fixes" line.
Sean Christopherson March 23, 2020, 4:28 p.m. UTC | #4
On Mon, Mar 23, 2020 at 09:24:25AM -0700, Jim Mattson wrote:
> On Fri, Mar 20, 2020 at 2:29 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> > changes to the EPT tables managed by L1 need to be recognized, and
> > relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> > dangerous.
> >
> > Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> > TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> > nothing to be done.
> >
> > Nuking all L2 roots is overkill for the single-context variant, but it's
> > the safe and easy bet.  A more precise zap mechanism will be added in
> > the future.  Add a TODO to call out that KVM only needs to invalidate
> > affected contexts.
> >
> > Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
> 
> The bug existed well before the commit indicated in the "Fixes" line.

Ah, my bad.  A cursory glance at commit b119019847fbc makes that quite
obvious.  This should be

  Fixes: bfd0a56b9000 ("nEPT: Nested INVEPT")
Jim Mattson March 23, 2020, 4:36 p.m. UTC | #5
On Mon, Mar 23, 2020 at 9:28 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Mon, Mar 23, 2020 at 09:24:25AM -0700, Jim Mattson wrote:
> > On Fri, Mar 20, 2020 at 2:29 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> > > changes to the EPT tables managed by L1 need to be recognized, and
> > > relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> > > dangerous.
> > >
> > > Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> > > TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> > > nothing to be done.
> > >
> > > Nuking all L2 roots is overkill for the single-context variant, but it's
> > > the safe and easy bet.  A more precise zap mechanism will be added in
> > > the future.  Add a TODO to call out that KVM only needs to invalidate
> > > affected contexts.
> > >
> > > Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
> >
> > The bug existed well before the commit indicated in the "Fixes" line.
>
> Ah, my bad.  A cursory glance at commit b119019847fbc makes that quite
> obvious.  This should be
>
>   Fixes: bfd0a56b9000 ("nEPT: Nested INVEPT")

Actually, I think that things were fine back then (though we
gratuitously flushed L1's TLB as a result of an emulated INVEPT). The
problem started when we stopped flushing the TLB on every emulated
VM-entry (i.e. L1 -> L2 transitions). I'm not sure what that commit
was, but I think you referenced it in an earlier email.
Sean Christopherson March 23, 2020, 4:44 p.m. UTC | #6
On Mon, Mar 23, 2020 at 09:36:22AM -0700, Jim Mattson wrote:
> On Mon, Mar 23, 2020 at 9:28 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Mon, Mar 23, 2020 at 09:24:25AM -0700, Jim Mattson wrote:
> > > On Fri, Mar 20, 2020 at 2:29 PM Sean Christopherson
> > > <sean.j.christopherson@intel.com> wrote:
> > > >
> > > > Free all L2 (guest_mmu) roots when emulating INVEPT for L1.  Outstanding
> > > > changes to the EPT tables managed by L1 need to be recognized, and
> > > > relying on KVM to always flush L2's EPTP context on nested VM-Enter is
> > > > dangerous.
> > > >
> > > > Similar to handle_invpcid(), rely on kvm_mmu_free_roots() to do a remote
> > > > TLB flush if necessary, e.g. if L1 has never entered L2 then there is
> > > > nothing to be done.
> > > >
> > > > Nuking all L2 roots is overkill for the single-context variant, but it's
> > > > the safe and easy bet.  A more precise zap mechanism will be added in
> > > > the future.  Add a TODO to call out that KVM only needs to invalidate
> > > > affected contexts.
> > > >
> > > > Fixes: b119019847fbc ("kvm: nVMX: Remove unnecessary sync_roots from handle_invept")
> > >
> > > The bug existed well before the commit indicated in the "Fixes" line.
> >
> > Ah, my bad.  A cursory glance at commit b119019847fbc makes that quite
> > obvious.  This should be
> >
> >   Fixes: bfd0a56b9000 ("nEPT: Nested INVEPT")
> 
> Actually, I think that things were fine back then (though we
> gratuitously flushed L1's TLB as a result of an emulated INVEPT). The
> problem started when we stopped flushing the TLB on every emulated
> VM-entry (i.e. L1 -> L2 transitions). I'm not sure what that commit
> was, but I think you referenced it in an earlier email.

Hmm, true.  I was thinking it was the original commit because it didn't
operate on guest_mmu, but guest_mmu didn't exist back then.  So I think

  Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")

would be appropriate?
Paolo Bonzini March 23, 2020, 11:50 p.m. UTC | #7
On 23/03/20 17:44, Sean Christopherson wrote:
> So I think
> 
>   Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
> 
> would be appropriate?
> 

Yes.  I changed it and also added the comment

+		/*
+		 * Nested EPT roots are always held through guest_mmu,
+		 * not root_mmu.
+		 */

which isn't unlike what you suggested elsewhere in the thread.

Paolo
Jim Mattson March 24, 2020, 12:12 a.m. UTC | #8
On Mon, Mar 23, 2020 at 4:51 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 23/03/20 17:44, Sean Christopherson wrote:
> > So I think
> >
> >   Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
> >
> > would be appropriate?
> >
>
> Yes.

I think it was actually commit efebf0aaec3d ("KVM: nVMX: Do not flush
TLB on L1<->L2 transitions if L1 uses VPID and EPT").
Sean Christopherson March 30, 2020, 6:38 p.m. UTC | #9
On Mon, Mar 23, 2020 at 05:12:04PM -0700, Jim Mattson wrote:
> On Mon, Mar 23, 2020 at 4:51 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 23/03/20 17:44, Sean Christopherson wrote:
> > > So I think
> > >
> > >   Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
> > >
> > > would be appropriate?
> > >
> >
> > Yes.
> 
> I think it was actually commit efebf0aaec3d ("KVM: nVMX: Do not flush
> TLB on L1<->L2 transitions if L1 uses VPID and EPT").

Hmm, commit efebf0aaec3d it only changed flushing behavior, it didn't
affect KVM's behavior with respect to refreshing unsync'd SPTE, i.e.
reloading guest_mmu.

It's somewhat of a moot point, because _technically_ there is no bug since,
at the time of this fix, KVM always flushes and reloads on nested VM-Enter.

Patch
diff mbox series

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f3774cef4fd4..9624cea4ed9f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5160,12 +5160,12 @@  static int handle_invept(struct kvm_vcpu *vcpu)
 		if (!nested_vmx_check_eptp(vcpu, operand.eptp))
 			return nested_vmx_failValid(vcpu,
 				VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+
+		/* TODO: sync only the target EPTP context. */
 		fallthrough;
 	case VMX_EPT_EXTENT_GLOBAL:
-	/*
-	 * TODO: Sync the necessary shadow EPT roots here, rather than
-	 * at the next emulated VM-entry.
-	 */
+		kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu,
+				   KVM_MMU_ROOTS_ALL);
 		break;
 	default:
 		BUG_ON(1);