
[v2,06/14] KVM: ARM: Memory virtualization setup

Message ID CANM98qKAFQgfwRuFP2WOP4Af+1Un-wuQPC8-TiGi4B+s3XB2hQ@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Christoffer Dall Oct. 6, 2012, 9:33 p.m. UTC
On Thu, Oct 4, 2012 at 10:23 PM, Min-gyu Kim <mingyu84.kim@samsung.com> wrote:
>
>
>> -----Original Message-----
>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
>> Behalf Of Christoffer Dall
>> Sent: Monday, October 01, 2012 6:11 PM
>> To: kvm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> kvmarm@lists.cs.columbia.edu
>> Cc: Marc Zyngier
>> Subject: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
>>
>> +static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache
>> *cache,
>> +                        phys_addr_t addr, const pte_t *new_pte) {
>> +     pgd_t *pgd;
>> +     pud_t *pud;
>> +     pmd_t *pmd;
>> +     pte_t *pte, old_pte;
>> +
>> +     /* Create 2nd stage page table mapping - Level 1 */
>> +     pgd = kvm->arch.pgd + pgd_index(addr);
>> +     pud = pud_offset(pgd, addr);
>> +     if (pud_none(*pud)) {
>> +             if (!cache)
>> +                     return; /* ignore calls from kvm_set_spte_hva */
>> +             pmd = mmu_memory_cache_alloc(cache);
>> +             pud_populate(NULL, pud, pmd);
>> +             pmd += pmd_index(addr);
>> +             get_page(virt_to_page(pud));
>> +     } else
>> +             pmd = pmd_offset(pud, addr);
>> +
>> +     /* Create 2nd stage page table mapping - Level 2 */
>> +     if (pmd_none(*pmd)) {
>> +             if (!cache)
>> +                     return; /* ignore calls from kvm_set_spte_hva */
>> +             pte = mmu_memory_cache_alloc(cache);
>> +             clean_pte_table(pte);
>> +             pmd_populate_kernel(NULL, pmd, pte);
>> +             pte += pte_index(addr);
>> +             get_page(virt_to_page(pmd));
>> +     } else
>> +             pte = pte_offset_kernel(pmd, addr);
>> +
>> +     /* Create 2nd stage page table mapping - Level 3 */
>> +     old_pte = *pte;
>> +     set_pte_ext(pte, *new_pte, 0);
>> +     if (pte_present(old_pte))
>> +             __kvm_tlb_flush_vmid(kvm);
>> +     else
>> +             get_page(virt_to_page(pte));
>> +}
>
>
> I'm not sure about the 3-level page table, but isn't it necessary to
> clean the page table for the 2nd level as well? There are two
> mmu_memory_cache_alloc calls: one is followed by clean_pte_table and
> the other isn't.

hmm, it probably is - I couldn't really find the common case where
this is done in the kernel normally (except for some custom loop in
ioremap and idmap), but I added this fix:

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 5394a52..f11ba27f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -430,6 +430,7 @@ static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		if (!cache)
 			return; /* ignore calls from kvm_set_spte_hva */
 		pmd = mmu_memory_cache_alloc(cache);
+		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
 		pud_populate(NULL, pud, pmd);
 		pmd += pmd_index(addr);
 		get_page(virt_to_page(pud));


>
> And why do you ignore calls from kvm_set_spte_hva? It is supposed to happen when
> host moves the page, right? Then you ignore the case because it can be handled
> later when fault actually happens? Is there any other reason that I miss?
>

kvm_set_spte_hva tells us that a page at some IPA is going to be
backed by another physical page, which means we must adjust the stage
2 mapping. However, if we don't have that page mapped in the stage 2
page table, we don't need to do anything, and certainly don't want to
start allocating unnecessary level2 and level3 page tables.
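
For context, the call path in question is the MMU-notifier handler; a rough
sketch of what such a caller looks like (hva_to_gpa() and PAGE_S2 are
illustrative names here, not necessarily the exact helpers in this series):

	/* MMU notifier: the host remapped hva to a new physical page. */
	void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
	{
		phys_addr_t ipa = hva_to_gpa(kvm, hva);	/* hypothetical lookup */
		pte_t stage2_pte = pfn_pte(pte_pfn(pte), PAGE_S2);

		/* cache == NULL: update an existing mapping only, never allocate */
		stage2_set_pte(kvm, NULL, ipa, &stage2_pte);
	}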


Thanks!
-Christoffer

Comments

Marc Zyngier Oct. 9, 2012, 12:56 p.m. UTC | #1
On Sat, 6 Oct 2012 17:33:43 -0400, Christoffer Dall
<c.dall@virtualopensystems.com> wrote:
> On Thu, Oct 4, 2012 at 10:23 PM, Min-gyu Kim <mingyu84.kim@samsung.com>
> wrote:
>>
>>
>>> -----Original Message-----
>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
>>> Behalf Of Christoffer Dall
>>> Sent: Monday, October 01, 2012 6:11 PM
>>> To: kvm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>>> kvmarm@lists.cs.columbia.edu
>>> Cc: Marc Zyngier
>>> Subject: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
>>>
>>> +static void stage2_set_pte(struct kvm *kvm, struct
kvm_mmu_memory_cache
>>> *cache,
>>> +                        phys_addr_t addr, const pte_t *new_pte) {
>>> +     pgd_t *pgd;
>>> +     pud_t *pud;
>>> +     pmd_t *pmd;
>>> +     pte_t *pte, old_pte;
>>> +
>>> +     /* Create 2nd stage page table mapping - Level 1 */
>>> +     pgd = kvm->arch.pgd + pgd_index(addr);
>>> +     pud = pud_offset(pgd, addr);
>>> +     if (pud_none(*pud)) {
>>> +             if (!cache)
>>> +                     return; /* ignore calls from kvm_set_spte_hva */
>>> +             pmd = mmu_memory_cache_alloc(cache);
>>> +             pud_populate(NULL, pud, pmd);
>>> +             pmd += pmd_index(addr);
>>> +             get_page(virt_to_page(pud));
>>> +     } else
>>> +             pmd = pmd_offset(pud, addr);
>>> +
>>> +     /* Create 2nd stage page table mapping - Level 2 */
>>> +     if (pmd_none(*pmd)) {
>>> +             if (!cache)
>>> +                     return; /* ignore calls from kvm_set_spte_hva */
>>> +             pte = mmu_memory_cache_alloc(cache);
>>> +             clean_pte_table(pte);
>>> +             pmd_populate_kernel(NULL, pmd, pte);
>>> +             pte += pte_index(addr);
>>> +             get_page(virt_to_page(pmd));
>>> +     } else
>>> +             pte = pte_offset_kernel(pmd, addr);
>>> +
>>> +     /* Create 2nd stage page table mapping - Level 3 */
>>> +     old_pte = *pte;
>>> +     set_pte_ext(pte, *new_pte, 0);
>>> +     if (pte_present(old_pte))
>>> +             __kvm_tlb_flush_vmid(kvm);
>>> +     else
>>> +             get_page(virt_to_page(pte));
>>> +}
>>
>>
>> I'm not sure about the 3-level page table, but isn't it necessary to
>> clean the page table for the 2nd level as well? There are two
>> mmu_memory_cache_alloc calls: one is followed by clean_pte_table and
>> the other isn't.
> 
> hmm, it probably is - I couldn't really find the common case where
> this is done in the kernel normally (except for some custom loop in
> ioremap and idmap), but I added this fix:
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 5394a52..f11ba27f 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -430,6 +430,7 @@ static void stage2_set_pte(struct kvm *kvm, struct
> kvm_mmu_memory_cache *cache,
>  		if (!cache)
>  			return; /* ignore calls from kvm_set_spte_hva */
>  		pmd = mmu_memory_cache_alloc(cache);
> +		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
>  		pud_populate(NULL, pud, pmd);
>  		pmd += pmd_index(addr);
>  		get_page(virt_to_page(pud));

There ought to be a test of ID_MMFR3[23:20] to find out whether or not it
is useful to clean the dcache. Not sure if that's a massive gain, but it is
certainly an optimisation to consider for the kernel as a whole.
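
Such a check could read ID_MMFR3 and look at the coherent-walk field in
bits [23:20]; a nonzero value means the table walker can read page tables
straight from the data cache, so the clean is unnecessary. A minimal
sketch (hypothetical helper, not part of this patch):

	static bool pgtable_needs_dcache_clean(void)
	{
		unsigned int mmfr3;

		/* ID_MMFR3: mrc p15, 0, <Rt>, c0, c1, 7 */
		asm volatile("mrc p15, 0, %0, c0, c1, 7" : "=r" (mmfr3));

		/* Coherent-walk field of 0 means walks don't snoop the cache. */
		return ((mmfr3 >> 20) & 0xf) == 0;
	}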

        M.
??? Oct. 10, 2012, 1:02 a.m. UTC | #2
> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Tuesday, October 09, 2012 9:56 PM
> To: Christoffer Dall
> Cc: Min-gyu Kim; ???; linux-arm-kernel@lists.infradead.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu
> Subject: Re: [kvmarm] [PATCH v2 06/14] KVM: ARM: Memory virtualization
> setup
> 
> On Sat, 6 Oct 2012 17:33:43 -0400, Christoffer Dall
> <c.dall@virtualopensystems.com> wrote:
> > On Thu, Oct 4, 2012 at 10:23 PM, Min-gyu Kim
> > <mingyu84.kim@samsung.com>
> > wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org]
> >>> On Behalf Of Christoffer Dall
> >>> Sent: Monday, October 01, 2012 6:11 PM
> >>> To: kvm@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> >>> kvmarm@lists.cs.columbia.edu
> >>> Cc: Marc Zyngier
> >>> Subject: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
> >>>
> >>> +static void stage2_set_pte(struct kvm *kvm, struct
> kvm_mmu_memory_cache
> >>> *cache,
> >>> +                        phys_addr_t addr, const pte_t *new_pte) {
> >>> +     pgd_t *pgd;
> >>> +     pud_t *pud;
> >>> +     pmd_t *pmd;
> >>> +     pte_t *pte, old_pte;
> >>> +
> >>> +     /* Create 2nd stage page table mapping - Level 1 */
> >>> +     pgd = kvm->arch.pgd + pgd_index(addr);
> >>> +     pud = pud_offset(pgd, addr);
> >>> +     if (pud_none(*pud)) {
> >>> +             if (!cache)
> >>> +                     return; /* ignore calls from kvm_set_spte_hva */
> >>> +             pmd = mmu_memory_cache_alloc(cache);
> >>> +             pud_populate(NULL, pud, pmd);
> >>> +             pmd += pmd_index(addr);
> >>> +             get_page(virt_to_page(pud));
> >>> +     } else
> >>> +             pmd = pmd_offset(pud, addr);
> >>> +
> >>> +     /* Create 2nd stage page table mapping - Level 2 */
> >>> +     if (pmd_none(*pmd)) {
> >>> +             if (!cache)
> >>> +                     return; /* ignore calls from kvm_set_spte_hva */
> >>> +             pte = mmu_memory_cache_alloc(cache);
> >>> +             clean_pte_table(pte);
> >>> +             pmd_populate_kernel(NULL, pmd, pte);
> >>> +             pte += pte_index(addr);
> >>> +             get_page(virt_to_page(pmd));
> >>> +     } else
> >>> +             pte = pte_offset_kernel(pmd, addr);
> >>> +
> >>> +     /* Create 2nd stage page table mapping - Level 3 */
> >>> +     old_pte = *pte;
> >>> +     set_pte_ext(pte, *new_pte, 0);
> >>> +     if (pte_present(old_pte))
> >>> +             __kvm_tlb_flush_vmid(kvm);
> >>> +     else
> >>> +             get_page(virt_to_page(pte));
> >>> +}
> >>
> >>
> >> I'm not sure about the 3-level page table, but isn't it necessary to
> >> clean the page table for the 2nd level as well? There are two
> >> mmu_memory_cache_alloc calls: one is followed by clean_pte_table and
> >> the other isn't.
> >
> > hmm, it probably is - I couldn't really find the common case where
> > this is done in the kernel normally (except for some custom loop in
> > ioremap and idmap), but I added this fix:
> >
> > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c index
> > 5394a52..f11ba27f 100644
> > --- a/arch/arm/kvm/mmu.c
> > +++ b/arch/arm/kvm/mmu.c
> > @@ -430,6 +430,7 @@ static void stage2_set_pte(struct kvm *kvm, struct
> > kvm_mmu_memory_cache *cache,
> >  		if (!cache)
> >  			return; /* ignore calls from kvm_set_spte_hva */
> >  		pmd = mmu_memory_cache_alloc(cache);
> > +		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
> >  		pud_populate(NULL, pud, pmd);
> >  		pmd += pmd_index(addr);
> >  		get_page(virt_to_page(pud));
> 
> There ought to be a test of ID_MMFR3[23:20] to find out whether or not it
> is useful to clean the dcache. Not sure if that's a massive gain, but it
> is certainly an optimisation to consider for the kernel as a whole.

That's part of what I was wondering. clean_dcache_area is compiled to a nop
if TLB_CAN_READ_FROM_L1_CACHE is defined (mm/proc-v7.S), but I couldn't find
any place where that macro is actually defined. If it is the BSP's job to
define TLB_CAN_READ_FROM_L1_CACHE according to ID_MMFR3[23:20], then there
would be no need to worry about whether or not to clean.

Am I right?
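
If the decision were made at run time instead of at build time, the call
site could presumably become something like this (sketch, reusing the
hypothetical pgtable_needs_dcache_clean() helper from above):

	pmd = mmu_memory_cache_alloc(cache);
	if (pgtable_needs_dcache_clean())	/* skip on coherent-walk CPUs */
		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
	pud_populate(NULL, pud, pmd);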

> 
>         M.
> --
> Fast, cheap, reliable. Pick two.

Patch

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 5394a52..f11ba27f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -430,6 +430,7 @@ static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		if (!cache)
 			return; /* ignore calls from kvm_set_spte_hva */
 		pmd = mmu_memory_cache_alloc(cache);
+		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
 		pud_populate(NULL, pud, pmd);
 		pmd += pmd_index(addr);
 		get_page(virt_to_page(pud));