KVM: x86/MMU: Zap all when removing memslot if VM has assigned device

Message ID 20190815151228.32242-1-sean.j.christopherson@intel.com

Commit Message

Sean Christopherson Aug. 15, 2019, 3:12 p.m. UTC
Alex Williamson reported regressions with device assignment when KVM
changed its memslot removal logic to zap only the SPTEs for the memslot
being removed.  The source of the bug is unknown at this time, and root
causing the issue will likely be a slow process.  In the short term, fix
the regression by zapping all SPTEs when removing a memslot from a VM
with assigned device(s).

Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---

An alternative to a full revert.  I assume this would be easy to
backport, and also easy to revert or quirk depending on where the bug
is hiding.

 arch/x86/kvm/mmu.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Alex Williamson Aug. 15, 2019, 7:42 p.m. UTC | #1
On Thu, 15 Aug 2019 08:12:28 -0700
Sean Christopherson <sean.j.christopherson@intel.com> wrote:

> Alex Williamson reported regressions with device assignment when KVM
> changed its memslot removal logic to zap only the SPTEs for the memslot
> being removed.  The source of the bug is unknown at this time, and root
> causing the issue will likely be a slow process.  In the short term, fix
> the regression by zapping all SPTEs when removing a memslot from a VM
> with assigned device(s).
> 
> Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
> Reported-by: Alex Williamson <alex.williamson@redhat.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
> 
> An alternative to a full revert.  I assume this would be easy to
> backport, and also easy to revert or quirk depending on where the bug
> is hiding.
> 
>  arch/x86/kvm/mmu.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 8f72526e2f68..358b93882ac6 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5659,6 +5659,17 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  	bool flush;
>  	gfn_t gfn;
>  
> +	/*
> +	 * Zapping only the removed memslot introduced regressions for VMs with
> +	 * assigned devices.  It is unknown what piece of code is buggy.  Until
> +	 * the source of the bug is identified, zap everything if the VM has an
> +	 * assigned device.
> +	 */
> +	if (kvm_arch_has_assigned_device(kvm)) {
> +		kvm_mmu_zap_all(kvm);
> +		return;
> +	}
> +
>  	spin_lock(&kvm->mmu_lock);
>  
>  	if (list_empty(&kvm->arch.active_mmu_pages))

Though if we want to zoom in a little further, the patch below seems to
work.  Both versions perhaps just highlight that we don't really know
why the original code doesn't work with device assignment, whether it's
something special about GPU mappings, or whether it hints at something
more generally wrong and difficult to trigger.

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 24843cf49579..3956b5844479 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5670,7 +5670,8 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
 		gfn = slot->base_gfn + i;
 
 		for_each_valid_sp(kvm, sp, gfn) {
-			if (sp->gfn != gfn)
+			if (sp->gfn != gfn &&
+			    !kvm_arch_has_assigned_device(kvm))
 				continue;
 
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
Paolo Bonzini Aug. 16, 2019, 7:16 a.m. UTC | #2
On 15/08/19 17:12, Sean Christopherson wrote:
> Alex Williamson reported regressions with device assignment when KVM
> changed its memslot removal logic to zap only the SPTEs for the memslot
> being removed.  The source of the bug is unknown at this time, and root
> causing the issue will likely be a slow process.  In the short term, fix
> the regression by zapping all SPTEs when removing a memslot from a VM
> with assigned device(s).
> 
> Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
> Reported-by: Alex Williamson <alex.williamson@redhat.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
> 
> An alternative to a full revert.  I assume this would be easy to
> backport, and also easy to revert or quirk depending on where the bug
> is hiding.

We're not sure that it only happens with assigned devices; it's just
that assigned BARs are the memslots that are more likely to be
reprogrammed at boot.  So this patch feels unsafe.

Paolo

> 
>  arch/x86/kvm/mmu.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 8f72526e2f68..358b93882ac6 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5659,6 +5659,17 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  	bool flush;
>  	gfn_t gfn;
>  
> +	/*
> +	 * Zapping only the removed memslot introduced regressions for VMs with
> +	 * assigned devices.  It is unknown what piece of code is buggy.  Until
> +	 * the source of the bug is identified, zap everything if the VM has an
> +	 * assigned device.
> +	 */
> +	if (kvm_arch_has_assigned_device(kvm)) {
> +		kvm_mmu_zap_all(kvm);
> +		return;
> +	}
> +
>  	spin_lock(&kvm->mmu_lock);
>  
>  	if (list_empty(&kvm->arch.active_mmu_pages))
>

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8f72526e2f68..358b93882ac6 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5659,6 +5659,17 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
 	bool flush;
 	gfn_t gfn;
 
+	/*
+	 * Zapping only the removed memslot introduced regressions for VMs with
+	 * assigned devices.  It is unknown what piece of code is buggy.  Until
+	 * the source of the bug is identified, zap everything if the VM has an
+	 * assigned device.
+	 */
+	if (kvm_arch_has_assigned_device(kvm)) {
+		kvm_mmu_zap_all(kvm);
+		return;
+	}
+
 	spin_lock(&kvm->mmu_lock);
 
 	if (list_empty(&kvm->arch.active_mmu_pages))