
[RESEND] KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended

Message ID 20120814151712.GA14582@amt.cnet (mailing list archive)
State New, archived

Commit Message

Marcelo Tosatti Aug. 14, 2012, 3:17 p.m. UTC
On Tue, Aug 14, 2012 at 09:06:51AM +0900, Takuya Yoshikawa wrote:
> On Mon, 13 Aug 2012 19:15:23 -0300
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
> 
> > On Fri, Aug 10, 2012 at 05:16:12PM +0900, Takuya Yoshikawa wrote:
> > > The following commit changed mmu_shrink() so that it would skip VMs
> > > whose n_used_mmu_pages was not zero and try to free pages from others:
> > > 
> > >   commit 1952639665e92481c34c34c3e2a71bf3e66ba362
> > >   KVM: MMU: do not iterate over all VMs in mmu_shrink()
> > > 
> > > This patch fixes the function so that it can free mmu pages as before.
> > > Note that the "if (!nr_to_scan--)" check is removed since we do not
> > > try to free mmu pages from more than one VM.
> > > 
> > > Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
> > > Cc: Gleb Natapov <gleb@redhat.com>
> > > ---
> > >  This patch just recovers the original behaviour and is not related
> > >  to how to improve mmu_shrink() further; so please apply.
> > 
> > Before 1952639665e92481c34 the code was capped at nr_to_scan loops. So
> > removing the if (!nr_to_scan--) check does change behaviour.
> > 
> > Am i missing something here?
> 
> No.  You are right about that.
> 
> But as Gleb and I confirmed when I first sent this patch, the possibility
> that we see "n_used_mmu_pages == 0" 128 times is so low that it is
> almost impossible to see the effect.
> 
> If you prefer to have the check, I will do so.

"kvm->arch.n_used_mmu_pages == 0" is an unlikely scenario, agree
with that.

-               if (kvm->arch.n_used_mmu_pages > 0) {
-                       if (!nr_to_scan--)
-                               break;
+               if (!kvm->arch.n_used_mmu_pages)
                        continue;
-               }

                idx = srcu_read_lock(&kvm->srcu);
                spin_lock(&kvm->mmu_lock);

This patch removes the maximum number of (successful) loop iterations,
which is nr_to_scan == sc->nr_to_scan.

The description above where you say 'possibility that we see
"n_used_mmu_pages == 0" 128 times' does not match the patch above.
If the patch is correct, then please explain it clearly in the
changelog.

What is the reasoning to remove nr_to_scan? What tests did you perform?


Comments

Takuya Yoshikawa Aug. 15, 2012, 2:11 a.m. UTC | #1
On Tue, 14 Aug 2012 12:17:12 -0300
Marcelo Tosatti <mtosatti@redhat.com> wrote:

> -               if (kvm->arch.n_used_mmu_pages > 0) {
> -                       if (!nr_to_scan--)
> -                               break;

-- (*1)

> +               if (!kvm->arch.n_used_mmu_pages)
>                         continue;

-- (*2)

> -               }
> 
>                 idx = srcu_read_lock(&kvm->srcu);
>                 spin_lock(&kvm->mmu_lock);
> 
> This patch removes the maximum number of (successful) loop iterations,
> which is nr_to_scan == sc->nr_to_scan.

IIUC, there was no successful loop from the beginning:

  if (kvm->arch.n_used_mmu_pages > 0) {
    if (!nr_to_scan--)
      break;  -- (*1)
    continue; -- (*2)
  }

Before the patch, even when we found a VM with kvm->arch.n_used_mmu_pages
greater than 0, we would do one of two things:
  skip it (*2), or
  break   (*1) once nr_to_scan reached 0.

We only reach
  kvm_mmu_remove_some_alloc_mmu_pages(kvm, &invalid_list);
  kvm_mmu_commit_zap_page(kvm, &invalid_list);
when kvm->arch.n_used_mmu_pages == 0, which is probably why

  commit 85b7059169e128c57a3a8a3e588fb89cb2031da1
  KVM: MMU: fix shrinking page from the empty mmu

could hit the very unlikely condition so easily.

So we were just looping over VMs, trying to free pages from empty MMUs.
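
For reference, here is a simplified sketch of the pre-patch loop, put
together from the hunks quoted in this thread (nr_to_scan starts at
sc->nr_to_scan; the locking around vm_list, the shrinker's return
value, and what happens after a successful free are omitted):

  list_for_each_entry(kvm, &vm_list, vm_list) {
          int idx;
          LIST_HEAD(invalid_list);

          if (kvm->arch.n_used_mmu_pages > 0) {
                  if (!nr_to_scan--)
                          break;    /* (*1): stop scanning entirely */
                  continue;         /* (*2): skip this VM */
          }

          /*
           * Reached only when n_used_mmu_pages == 0: both paths above
           * skip every VM that actually holds shadow pages, so there
           * is nothing left to free here.
           */
          idx = srcu_read_lock(&kvm->srcu);
          spin_lock(&kvm->mmu_lock);
          kvm_mmu_remove_some_alloc_mmu_pages(kvm, &invalid_list);
          kvm_mmu_commit_zap_page(kvm, &invalid_list);
          spin_unlock(&kvm->mmu_lock);
          srcu_read_unlock(&kvm->srcu, idx);
  }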

> The description above where you say 'possibility that we see
> "n_used_mmu_pages == 0" 128 times' does not match the patch above.

Sorry about that.

> If the patch is correct, then please explain it clearly in the
> changelog.

Yes, I will do so.

> What is the reasoning to remove nr_to_scan? What tests did you perform?

I just confirmed:
 - without the patch, mmu_shrink() did not free any pages:
   it just checked all VMs and did "continue"
 - with my patch, it could free pages from the first VM with
   (n_used_mmu_pages > 0)

About nr_to_scan:
If my explanation above is right, this check is not functioning at all.
But since it will not hurt anyone and may help us when we change
our batch size, I won't remove it in the next version.
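
Keeping the check in the next version might look something like this
(a rough sketch of the idea, not the actual next version):

  /*
   * Never scan more than sc->nr_to_scan VM instances; in practice
   * this should rarely trigger since we free pages from at most
   * one VM per call.
   */
  if (!nr_to_scan--)
          break;

  if (!kvm->arch.n_used_mmu_pages)
          continue;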

Thanks,
	Takuya

Marcelo Tosatti Aug. 15, 2012, 6:22 p.m. UTC | #2
On Wed, Aug 15, 2012 at 11:11:51AM +0900, Takuya Yoshikawa wrote:
> On Tue, 14 Aug 2012 12:17:12 -0300
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
> [...]

Ouch.

OK, please resend with dumb-proof changelog.

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 9651c2c..4aeec72 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4154,11 +4154,8 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
                 * want to shrink a VM that only started to populate its MMU
                 * anyway.
                 */
-               if (kvm->arch.n_used_mmu_pages > 0) {
-                       if (!nr_to_scan--)
-                               break;
+               if (!kvm->arch.n_used_mmu_pages)
                        continue;
-               }

                idx = srcu_read_lock(&kvm->srcu);
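
With the patch applied, the loop body reduces to roughly the following
(a sketch based on the hunk above; the code that frees pages from the
chosen VM and then stops, which this patch does not touch, is assumed
from the changelog's "we do not try to free mmu pages from more than
one VM"):

  list_for_each_entry(kvm, &vm_list, vm_list) {
          int idx;
          LIST_HEAD(invalid_list);

          if (!kvm->arch.n_used_mmu_pages)
                  continue;    /* empty MMU, nothing to free */

          idx = srcu_read_lock(&kvm->srcu);
          spin_lock(&kvm->mmu_lock);
          kvm_mmu_remove_some_alloc_mmu_pages(kvm, &invalid_list);
          kvm_mmu_commit_zap_page(kvm, &invalid_list);
          spin_unlock(&kvm->mmu_lock);
          srcu_read_unlock(&kvm->srcu, idx);
          break;    /* free from one VM only */
  }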