KVM: MMU: Segregate mmu pages created with different cr4.pge settings

Message ID 20090107134606.GA4977@amt.cnet
State Accepted, archived

Commit Message

Marcelo Tosatti Jan. 7, 2009, 1:46 p.m. UTC
On Wed, Jan 07, 2009 at 01:32:41PM +0200, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Let me shoot at one direction: a shadow page with PGE bit in either
>> state is created. Later that shadow page is nuked (via mmu notifiers,
>> for example). 
>
> I doubt that mmu notifiers were invoked in this case (the bug would be  
> very rare); in any case we flush the tlb.

This comment is worrying:

        /*
         * FIXME: Tis shouldn't be necessary here, but there is a flush
         * missing in the MMU code. Until we find this bug, flush the
         * complete TLB here on an NPF
         */
        if (npt_enabled)
                svm_flush_tlb(&svm->vcpu);

Alexander, you might want to try this patch, -ENONPT here (and revert the previous
one). I have no clue, what else could be causing this?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Alexander Graf Jan. 8, 2009, 7:53 p.m. UTC | #1
Sorry for the late reply - I wanted to know how kvm hangs in the host
kernel context :)

On 07.01.2009, at 14:46, Marcelo Tosatti <mtosatti@redhat.com> wrote:

> On Wed, Jan 07, 2009 at 01:32:41PM +0200, Avi Kivity wrote:
>> Marcelo Tosatti wrote:
>>> Let me shoot at one direction: a shadow page with PGE bit in either
>>> state is created. Later that shadow page is nuked (via mmu notifiers,
>>> for example).
>>
>> I doubt that mmu notifiers were invoked in this case (the bug would be
>> very rare); in any case we flush the tlb.
>
> This comment is worrying
>
>        /*
>         * FIXME: Tis shouldn't be necessary here, but there is a flush
>         * missing in the MMU code. Until we find this bug, flush the
>         * complete TLB here on an NPF
>         */
>        if (npt_enabled)
>                svm_flush_tlb(&svm->vcpu);
>

This is in because netbench in an NPT guest failed after a few
minutes (see Alex W's bug report) and this appeared to fix it.


> Alexander, you might want to try this patch, -ENONPT here (and revert
> the previous one).

Eh, what?

> I have no clue, what else could be causing this?
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 10bdb2a..bf68e5b 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -33,6 +33,7 @@
> #include <asm/cmpxchg.h>
> #include <asm/io.h>
> #include <asm/vmx.h>
> +#include <asm/tlbflush.h>
>
> /*
>  * When setting this variable to true it enables Two-Dimensional-Paging
> @@ -1850,6 +1851,11 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
>
>        if (*iterator.sptep == shadow_trap_nonpresent_pte) {
>            pseudo_gfn = (iterator.addr & PT64_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
> +
> +                        kvm_flush_remote_tlbs(vcpu->kvm);
> +                        kvm_mmu_flush_tlb(vcpu);
> +                        __flush_tlb();
> +
>            sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
>                          iterator.level - 1,
>                          1, ACC_ALL, iterator.sptep);
Marcelo Tosatti Jan. 9, 2009, 12:36 a.m. UTC | #2
On Thu, Jan 08, 2009 at 08:53:21PM +0100, Alexander Graf wrote:
> Sorry for the late reply - I wanted to know who kvm hangs in the host  
> kernel context :)
>
> On 07.01.2009, at 14:46, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>
>> On Wed, Jan 07, 2009 at 01:32:41PM +0200, Avi Kivity wrote:
>>> Marcelo Tosatti wrote:
>>>> Let me shoot at one direction: a shadow page with PGE bit in either
>>>> state is created. Later that shadow page is nuked (via mmu notifiers,
>>>> for example).
>>>
>>> I doubt that mmu notifiers were invoked in this case (the bug would be
>>> very rare); in any case we flush the tlb.
>>
>> This comment is worrying
>>
>>        /*
>>         * FIXME: Tis shouldn't be necessary here, but there is a flush
>>         * missing in the MMU code. Until we find this bug, flush the
>>         * complete TLB here on an NPF
>>         */
>>        if (npt_enabled)
>>                svm_flush_tlb(&svm->vcpu);
>>
>
> This is in, because netbench in an npt-guest failed after a few minutes 
> (see Alex W's bug report) and this appeard to fix it.

Right. The comment is scary, that's all. Maybe the hypothetical missing
flush is also the cause of this bug you're seeing. Or not.

>> Alexander, you might want to try this patch, -ENONPT here (and revert
>> the previous one).
>
> Eh, what?

I meant I have no NPT box to test myself easily.

"revert the previous one" = remove the !tdp_enabled test on set_cr4.

The patch below is just a hack to flush the TLB of remote vcpus before
updating the host TLB, to confirm an experimental theory (read: guess).

Hum, NPT is surely not available inside the guest for ESX to use?

Yeah, gathering more information would be helpful.

>> I have no clue, what else could be causing this?
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 10bdb2a..bf68e5b 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -33,6 +33,7 @@
>> #include <asm/cmpxchg.h>
>> #include <asm/io.h>
>> #include <asm/vmx.h>
>> +#include <asm/tlbflush.h>
>>
>> /*
>>  * When setting this variable to true it enables Two-Dimensional-Paging
>> @@ -1850,6 +1851,11 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
>>
>>        if (*iterator.sptep == shadow_trap_nonpresent_pte) {
>>            pseudo_gfn = (iterator.addr & PT64_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
>> +
>> +                        kvm_flush_remote_tlbs(vcpu->kvm);
>> +                        kvm_mmu_flush_tlb(vcpu);
>> +                        __flush_tlb();
>> +
>>            sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
>>                          iterator.level - 1,
>>                          1, ACC_ALL, iterator.sptep);
Alexander Graf Jan. 9, 2009, 10:43 a.m. UTC | #3
On 09.01.2009, at 01:36, Marcelo Tosatti <mtosatti@redhat.com> wrote:

> On Thu, Jan 08, 2009 at 08:53:21PM +0100, Alexander Graf wrote:
>> Sorry for the late reply - I wanted to know who kvm hangs in the host
>> kernel context :)
>>
>> On 07.01.2009, at 14:46, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>
>>> On Wed, Jan 07, 2009 at 01:32:41PM +0200, Avi Kivity wrote:
>>>> Marcelo Tosatti wrote:
>>>>> Let me shoot at one direction: a shadow page with PGE bit in either
>>>>> state is created. Later that shadow page is nuked (via mmu notifiers,
>>>>> for example).
>>>>
>>>> I doubt that mmu notifiers were invoked in this case (the bug would be
>>>> very rare); in any case we flush the tlb.
>>>
>>> This comment is worrying
>>>
>>>       /*
>>>        * FIXME: Tis shouldn't be necessary here, but there is a flush
>>>        * missing in the MMU code. Until we find this bug, flush the
>>>        * complete TLB here on an NPF
>>>        */
>>>       if (npt_enabled)
>>>               svm_flush_tlb(&svm->vcpu);
>>>
>>
>> This is in, because netbench in an npt-guest failed after a few minutes
>> (see Alex W's bug report) and this appeard to fix it.
>
> Right. The comment is scary, thats all. Maybe the hypothetical missing
> flush is also the cause for this bug you're not. Or not.
>
>>> Alexander, you might want to try this patch, -ENONPT here (and revert
>>> the previous one).
>>
>> Eh, what?
>
> I meant I have no NPT box to test myself easily.

Ah, what a pity :o

>
> "revert the previous one" = remove the !tdp_enabled test on set_cr4.
>
> The patch below is just a hack to flush the TLB of remote vcpu's before
> updating the host TLB. To confirm an experimental theory (read: guess).

Ok, will do.

>
>
> Hum, NPT is surely not available inside the guest for ESX to use?

NPT can only be used with SVM, and I haven't seen ESX call SVM
instructions at all so far.

Alex

> Yeah, gathering more information would be helpful.
>
>>> I have no clue, what else could be causing this?
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 10bdb2a..bf68e5b 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -33,6 +33,7 @@
>>> #include <asm/cmpxchg.h>
>>> #include <asm/io.h>
>>> #include <asm/vmx.h>
>>> +#include <asm/tlbflush.h>
>>>
>>> /*
>>> * When setting this variable to true it enables Two-Dimensional-Paging
>>> @@ -1850,6 +1851,11 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
>>>
>>>       if (*iterator.sptep == shadow_trap_nonpresent_pte) {
>>>           pseudo_gfn = (iterator.addr & PT64_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
>>> +
>>> +                        kvm_flush_remote_tlbs(vcpu->kvm);
>>> +                        kvm_mmu_flush_tlb(vcpu);
>>> +                        __flush_tlb();
>>> +
>>>           sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
>>>                         iterator.level - 1,
>>>                         1, ACC_ALL, iterator.sptep);

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 10bdb2a..bf68e5b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -33,6 +33,7 @@ 
 #include <asm/cmpxchg.h>
 #include <asm/io.h>
 #include <asm/vmx.h>
+#include <asm/tlbflush.h>
 
 /*
  * When setting this variable to true it enables Two-Dimensional-Paging
@@ -1850,6 +1851,11 @@  static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 
 		if (*iterator.sptep == shadow_trap_nonpresent_pte) {
 			pseudo_gfn = (iterator.addr & PT64_DIR_BASE_ADDR_MASK) >> PAGE_SHIFT;
+
+                        kvm_flush_remote_tlbs(vcpu->kvm);
+                        kvm_mmu_flush_tlb(vcpu);
+                        __flush_tlb();
+                        
 			sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
 					      iterator.level - 1,
 					      1, ACC_ALL, iterator.sptep);
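
Illustration (not part of the patch): the theory being tested in this thread is that some vcpu keeps using a stale cached translation because a flush is missing somewhere. The toy user-space model below is only a sketch of that general idea and makes no claim about where, or whether, KVM actually misses a flush. Every name in it (toy_vm, toy_translate, toy_flush_all, toy_install) is hypothetical and exists only for this example; none of it is kernel API.

```c
/* Toy user-space model of stale cached translations. NOT kernel code;
 * all names here are hypothetical and exist only for this illustration. */
#include <assert.h>   /* used by the usage check shown after this block */
#include <stdbool.h>
#include <stddef.h>

#define NR_VCPUS 2

struct toy_vm {
	int  mapping;              /* current translation (stand-in for an spte) */
	int  tlb[NR_VCPUS];        /* per-vcpu cached copy of the translation */
	bool tlb_valid[NR_VCPUS];  /* whether the cached copy may be used */
};

/* A vcpu translates: reuse the cached entry if valid, else "walk" and cache. */
static int toy_translate(struct toy_vm *vm, size_t vcpu)
{
	if (!vm->tlb_valid[vcpu]) {
		vm->tlb[vcpu] = vm->mapping;
		vm->tlb_valid[vcpu] = true;
	}
	return vm->tlb[vcpu];
}

/* Invalidate every vcpu's cached entry (the role a remote flush plays). */
static void toy_flush_all(struct toy_vm *vm)
{
	for (size_t i = 0; i < NR_VCPUS; i++)
		vm->tlb_valid[i] = false;
}

/* Install a new mapping, optionally flushing all cached copies first. */
static void toy_install(struct toy_vm *vm, int new_mapping, bool flush_first)
{
	if (flush_first)
		toy_flush_all(vm);
	vm->mapping = new_mapping;
}
```

Usage, under the same assumptions: start with `struct toy_vm vm = { .mapping = 1 };` and let vcpu 1 cache the translation with `toy_translate(&vm, 1)`. After `toy_install(&vm, 2, false)`, vcpu 1 still gets the old value 1 back; after `toy_install(&vm, 3, true)`, it gets 3. The model is single-threaded, so it says nothing about the ordering races a real flush has to handle.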