[MM-PART2,RESEND,v2,14/19] xen/arm32: mm: Avoid cleaning the cache for secondary CPUs page-tables

Message ID 20190514122456.28559-15-julien.grall@arm.com (mailing list archive)
State New, archived
Series xen/arm: Clean-up & fixes in boot/mm code

Commit Message

Julien Grall May 14, 2019, 12:24 p.m. UTC
The page-table walker is configured to use the same shareability and
cacheability as the access performed when updating the page-tables. This
means cleaning the cache for secondary CPUs' runtime page-tables is
unnecessary.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>

---
    Changes in v2:
        - Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 3 ---
 1 file changed, 3 deletions(-)
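
For context, a minimal sketch of the operation the patch removes: cleaning every
cache line of a virtual address range to the Point of Coherency on arm32. The
function name and the CACHE_LINE_BYTES constant are illustrative, not Xen's
actual definitions (Xen derives the real line size from the CTR register):

#define CACHE_LINE_BYTES 64 /* illustrative; the real size comes from CTR */

static void clean_dcache_va_range_sketch(const void *start, unsigned long size)
{
    const char *p = start;
    const char *end = p + size;

    for ( ; p < end; p += CACHE_LINE_BYTES )
        /* DCCMVAC: clean data cache line by VA to the Point of Coherency */
        asm volatile ("mcr p15, 0, %0, c7, c10, 1" : : "r" (p) : "memory");

    /* Complete all cleans before the caller proceeds */
    asm volatile ("dsb" ::: "memory");
}

Because the page-table walker makes cacheable, coherent accesses (see the
discussion below), these cleans add nothing for the runtime page-tables.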

Comments

Stefano Stabellini June 4, 2019, 11:11 p.m. UTC | #1
On Tue, 14 May 2019, Julien Grall wrote:
> The page-table walker is configured to use the same shareability and
> cacheability as the access performed when updating the page-tables. This
> means cleaning the cache for secondary CPUs' runtime page-tables is
> unnecessary.

All right. Is there an explicit configuration for the shareability and
cacheability used by the page-table walker or is it specified as such in
the Arm Arm? Also, isn't it possible that CPUs on a different cluster
(big.LITTLE) would have issues with this if the cache could be split
between the two clusters?


> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
> 
> ---
>     Changes in v2:
>         - Add Andrii's reviewed-by
> ---
>  xen/arch/arm/mm.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index cda2847d00..6db7dda0da 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -769,9 +769,6 @@ int init_secondary_pagetables(int cpu)
>          write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
>      }
>  
> -    clean_dcache_va_range(first, PAGE_SIZE);
> -    clean_dcache_va_range(domheap, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> -
>      per_cpu(xen_pgtable, cpu) = first;
>      per_cpu(xen_dommap, cpu) = domheap;
>  
> -- 
> 2.11.0
>
Julien Grall June 5, 2019, 10:19 a.m. UTC | #2
On 05/06/2019 00:11, Stefano Stabellini wrote:
> On Tue, 14 May 2019, Julien Grall wrote:
>> The page-table walker is configured to use the same shareability and
>> cacheability as the access performed when updating the page-tables. This
>> means cleaning the cache for secondary CPUs' runtime page-tables is
>> unnecessary.
> 
> All right. Is there an explicit configuration for the shareability and
> cacheability used by the page-table walker or is it specified as such in
> the Arm Arm?

See the configuration of TCR_EL2; I can mention it.
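
For reference, a sketch of the relevant TCR_EL2 (HTCR on arm32) fields. The
bit positions below are architectural; the macro names are illustrative rather
than Xen's exact definitions:

/*
 * With these fields set, translation table walks are Write-Back
 * Write-Allocate cacheable and Inner Shareable, i.e. they use the same
 * attributes as Xen's own accesses to the page-tables.
 */
#define TCR_IRGN0_WBWA  (0x1u << 8)   /* Inner cacheability of walks: WB-WA */
#define TCR_ORGN0_WBWA  (0x1u << 10)  /* Outer cacheability of walks: WB-WA */
#define TCR_SH0_IS      (0x3u << 12)  /* Walks are Inner Shareable */

With walks cacheable and in the same shareability domain as the CPUs, the
walker snoops the data cache directly, so a clean to the Point of Coherency is
not needed on its behalf.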

> Also, isn't it possible that CPUs on a different cluster
> (big.LITTLE) would have issues with this if the cache could be split
> between the two clusters?

I don't understand this... The cache should be coherent when a CPU leaves EL3.
But we already share some bits of the page tables between the processors (see
create_xen_page_tables). So I don't see where there is a possible problem here.

Cheers,

> 
> 
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
>>
>> ---
>>      Changes in v2:
>>          - Add Andrii's reviewed-by
>> ---
>>   xen/arch/arm/mm.c | 3 ---
>>   1 file changed, 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index cda2847d00..6db7dda0da 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -769,9 +769,6 @@ int init_secondary_pagetables(int cpu)
>>           write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
>>       }
>>   
>> -    clean_dcache_va_range(first, PAGE_SIZE);
>> -    clean_dcache_va_range(domheap, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
>> -
>>       per_cpu(xen_pgtable, cpu) = first;
>>       per_cpu(xen_dommap, cpu) = domheap;
>>   
>> -- 
>> 2.11.0
>>
Julien Grall June 10, 2019, 10:15 a.m. UTC | #3
Hi,

On 05/06/2019 11:19, Julien Grall wrote:
> On 05/06/2019 00:11, Stefano Stabellini wrote:
>> On Tue, 14 May 2019, Julien Grall wrote:
>>> The page-table walker is configured to use the same shareability and
>>> cacheability as the access performed when updating the page-tables. This
>>> means cleaning the cache for secondary CPUs' runtime page-tables is
>>> unnecessary.
>>
>> All right. Is there an explicit configuration for the shareability and
>> cacheability used by the page-table walker or is it specified as such in
>> the Arm Arm?
> 
> See the configuration of TCR_EL2, I can mention it.
> 
>> Also, isn't it possible that CPUs on a different cluster
>> (big.LITTLE) would have issues with this if the cache could be split
>> between the two clusters?

Gentle ping: can you please clarify your question?

> 
> I don't understand this... The cache should be coherent when a CPU leaves EL3.
> But we already share some bits of the page tables between the processors (see
> create_xen_page_tables). So I don't see where there is a possible problem here.

Cheers,
Stefano Stabellini June 10, 2019, 8:28 p.m. UTC | #4
On Wed, 5 Jun 2019, Julien Grall wrote:
> On 05/06/2019 00:11, Stefano Stabellini wrote:
> > On Tue, 14 May 2019, Julien Grall wrote:
> > > The page-table walker is configured to use the same shareability and
> > > cacheability as the access performed when updating the page-tables. This
> > > means cleaning the cache for secondary CPUs' runtime page-tables is
> > > unnecessary.
> > 
> > All right. Is there an explicit configuration for the shareability and
> > cacheability used by the page-table walker or is it specified as such in
> > the Arm Arm?
> 
> See the configuration of TCR_EL2, I can mention it.
 
That would be nice. I double-checked and it is as you wrote.


> > Also, isn't it possible that CPUs on a different cluster
> > (big.LITTLE) would have issues with this if the cache could be split
> > between the two clusters?
> 
> I don't understand this... The cache should be coherent when a CPU leaves EL3.
> But we already share some bits of the page tables between the processors (see
> create_xen_page_tables). So I don't see where there is a possible problem
> here.

If the cache is always coherent across the clusters and the
page-table walkers of different clusters, then this is fine.

 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
> > > 
> > > ---
> > >      Changes in v2:
> > >          - Add Andrii's reviewed-by
> > > ---
> > >   xen/arch/arm/mm.c | 3 ---
> > >   1 file changed, 3 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > index cda2847d00..6db7dda0da 100644
> > > --- a/xen/arch/arm/mm.c
> > > +++ b/xen/arch/arm/mm.c
> > > @@ -769,9 +769,6 @@ int init_secondary_pagetables(int cpu)
> > >           write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)],
> > > pte);
> > >       }
> > >   -    clean_dcache_va_range(first, PAGE_SIZE);
> > > -    clean_dcache_va_range(domheap, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > > -
> > >       per_cpu(xen_pgtable, cpu) = first;
> > >       per_cpu(xen_dommap, cpu) = domheap;
> > >   
> > > -- 
> > > 2.11.0
> > > 
> 
> -- 
> Julien Grall
>
Julien Grall June 10, 2019, 8:40 p.m. UTC | #5
Hi Stefano,

On 6/10/19 9:28 PM, Stefano Stabellini wrote:
> On Wed, 5 Jun 2019, Julien Grall wrote:
>> On 05/06/2019 00:11, Stefano Stabellini wrote:
>>> On Tue, 14 May 2019, Julien Grall wrote:
>>>> The page-table walker is configured to use the same shareability and
>>>> cacheability as the access performed when updating the page-tables. This
>>>> means cleaning the cache for secondary CPUs' runtime page-tables is
>>>> unnecessary.
>>>
>>> All right. Is there an explicit configuration for the shareability and
>>> cacheability used by the page-table walker or is it specified as such in
>>> the Arm Arm?
>>
>> See the configuration of TCR_EL2, I can mention it.
>   
> That would be nice. I double-checked and it is as you wrote.

Sure. How about:

"The page-table walker is configured by TCR_EL2 to use shareability and 
cacheability as the access performed when updating the page-tables. [...]"

> 
> 
>>> Also, isn't it possible that CPUs on a different cluster
>>> (big.LITTLE) would have issues with this if the cache could be split
>>> between the two clusters?
>>
>> I don't understand this... The cache should be coherent when a CPU leaves EL3.
>> But we already share some bits of the page tables between the processors (see
>> create_xen_page_tables). So I don't see where there is a possible problem
>> here.
> 
> If the cache is always coherent across the clusters and the
> pagetable-walkers of different clusters, then this is fine.

Xen (and Linux) is built on the assumption that all the CPUs (and the
page-table walker) are in the same shareability domain (i.e.
Inner Shareable). If you have a platform where that is not the case, then
Xen is going to be badly broken.

This is also in line with the expectation from the Arm Arm (B2-123 in DDI
0487D.a):

"The Inner Shareable domain is expected to be the set of PEs controlled 
by a single hypervisor or operating system."

Do you have a case where Xen needs to run on PEs in different domains?

Cheers,
Stefano Stabellini June 10, 2019, 8:54 p.m. UTC | #6
On Mon, 10 Jun 2019, Julien Grall wrote:
> Hi Steano,
> 
> On 6/10/19 9:28 PM, Stefano Stabellini wrote:
> > On Wed, 5 Jun 2019, Julien Grall wrote:
> > > On 05/06/2019 00:11, Stefano Stabellini wrote:
> > > > On Tue, 14 May 2019, Julien Grall wrote:
> > > > > The page-table walker is configured to use the same shareability and
> > > > > cacheability as the access performed when updating the page-tables.
> > > > > This
> > > > > means cleaning the cache for secondary CPUs' runtime page-tables is
> > > > > unnecessary.
> > > > 
> > > > All right. Is there an explicit configuration for the shareability and
> > > > cacheability used by the page-table walker or is it specified as such in
> > > > the Arm Arm?
> > > 
> > > See the configuration of TCR_EL2, I can mention it.
> > That would be nice. I double-checked and it is as you wrote.
> 
> Sure. How about:
> 
> "The page-table walker is configured by TCR_EL2 to use shareability and
> cacheability as the access performed when updating the page-tables. [...]"

That's great, thank you. With that, add my reviewed-by.


> > > > Also, isn't it possible that CPUs on a different cluster
> > > > (big.LITTLE) would have issues with this if the cache could be split
> > > > between the two clusters?
> > > 
> > > I don't understand this... The cache should be coherent when a CPU leaves
> > > EL3. But we already share some bits of the page tables between the
> > > processors (see create_xen_page_tables). So I don't see where there is a
> > > possible problem here.
> > 
> > If the cache is always coherent across the clusters and the
> > page-table walkers of different clusters, then this is fine.
> 
> Xen (and Linux) is built on the assumption that all the CPUs (and the
> page-table walker) are in the same shareability domain (i.e. Inner
> Shareable). If you have a platform where that is not the case, then Xen is
> going to be badly broken.
> 
> This is also in line with the expectation from the Arm Arm (B2-123 in DDI
> 0487D.a):
> 
> "The Inner Shareable domain is expected to be the set of PEs controlled by a
> single hypervisor or operating system."
> 
> Do you have a case where Xen needs to run on PEs in different domains?

No, thankfully I don't :-)

I was worried that some big.LITTLE SoCs might be built like that. (I
don't have any big.LITTLE machines here to confirm or deny it.) It is good
that we don't have to worry about it.

Patch

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index cda2847d00..6db7dda0da 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -769,9 +769,6 @@ int init_secondary_pagetables(int cpu)
         write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
     }
 
-    clean_dcache_va_range(first, PAGE_SIZE);
-    clean_dcache_va_range(domheap, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
-
     per_cpu(xen_pgtable, cpu) = first;
     per_cpu(xen_dommap, cpu) = domheap;
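
For completeness, the removal is safe because the PTE write itself is already
ordered and becomes visible to the walker through the coherent cache. Below is
a sketch of an arm32 write_pte-style helper, assuming a simplified lpae_t with
a single 64-bit bits field (close to, but not necessarily identical to, Xen's):

#include <stdint.h>

typedef struct { uint64_t bits; } lpae_t; /* simplified; Xen's uses bitfields */

static inline void write_pte_sketch(lpae_t *p, lpae_t pte)
{
    asm volatile (
        /* Ensure prior writes under the old mapping have completed */
        "dsb;"
        /* STRD stores the 64-bit entry atomically on LPAE-capable CPUs */
        "strd %0, %H0, [%1];"
        /* Make the new entry visible before any subsequent access */
        "dsb;"
        : : "r" (pte.bits), "r" (p) : "memory");
}

Note there is no cache maintenance between the store and the barriers: with
TCR_EL2/HTCR configured for cacheable, Inner Shareable walks, the walker
observes the update through the coherent data cache, which is exactly why the
two clean_dcache_va_range() calls above can go.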