
[For Xen-4.10 Resend 3/3] Avoid excess icache flushes in populate_physmap() before domain has been created

Message ID 20170515141012.6612-4-punit.agrawal@arm.com (mailing list archive)
State New, archived

Commit Message

Punit Agrawal May 15, 2017, 2:10 p.m. UTC
populate_physmap() calls alloc_heap_pages() per requested
extent, and alloc_heap_pages() invalidates the entire icache per
extent. During domain creation, the icache invalidations can be
deferred until all the extents have been allocated, as there is no
risk of executing stale instructions from the icache.

Introduce a new flag "MEMF_no_icache_flush" to prevent
alloc_heap_pages() from performing icache maintenance operations. Use
the flag in populate_physmap() before the domain has been unpaused,
and perform the required icache maintenance at the end of the
allocation.

One concern is the lack of synchronisation around testing
"creation_finished". But in practice the window where it is out of
sync should be small enough not to matter.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
---
 xen/common/memory.c        | 31 ++++++++++++++++++++++---------
 xen/common/page_alloc.c    |  2 +-
 xen/include/asm-x86/page.h |  4 ++++
 xen/include/xen/mm.h       |  2 ++
 4 files changed, 29 insertions(+), 10 deletions(-)

Comments

Jan Beulich May 17, 2017, 3:46 p.m. UTC | #1
>>> On 15.05.17 at 16:10, <punit.agrawal@arm.com> wrote:
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>  
>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>  
> +static inline void invalidate_icache(void)
> +{
> +}

This function clearly does not do what its name says, so there should
be a brief comment saying why.

Everything else looks reasonable.

Jan
Punit Agrawal May 17, 2017, 4:15 p.m. UTC | #2
Jan Beulich <JBeulich@suse.com> writes:

>>>> On 15.05.17 at 16:10, <punit.agrawal@arm.com> wrote:
>> --- a/xen/include/asm-x86/page.h
>> +++ b/xen/include/asm-x86/page.h
>> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>>  
>>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>>  
>> +static inline void invalidate_icache(void)
>> +{
>> +}
>
> This function clearly does not do what its name says, so there should
> be a brief comment saying why.

Ack. I've added the following comment block above the function
definition.

/*
 * While allocating memory for a domain, invalidate_icache() is called
 * to ensure that guest vcpu does not execute any stale instructions
 * from the recently allocated memory. There is nothing to be done
 * here as icaches are coherent on x86.
 */

My x86 foo is weak and I'd appreciate if somebody familiar with x86
could give this a once over.

>
> Everything else looks reasonable.

Thanks, Jan. I'll post a new version once the ARM maintainers have had a
chance to comment.

>
> Jan
Jan Beulich May 18, 2017, 8:26 a.m. UTC | #3
>>> On 17.05.17 at 18:15, <punit.agrawal@arm.com> wrote:
> Jan Beulich <JBeulich@suse.com> writes:
> 
>>>>> On 15.05.17 at 16:10, <punit.agrawal@arm.com> wrote:
>>> --- a/xen/include/asm-x86/page.h
>>> +++ b/xen/include/asm-x86/page.h
>>> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>>>  
>>>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>>>  
>>> +static inline void invalidate_icache(void)
>>> +{
>>> +}
>>
>> This function clearly does not do what its name says, so there should
>> be a brief comment saying why.
> 
> Ack. I've added the following comment block above the function
> definition.
> 
> /*
>  * While allocating memory for a domain, invalidate_icache() is called
>  * to ensure that guest vcpu does not execute any stale instructions
>  * from the recently allocated memory. There is nothing to be done
>  * here as icaches are coherent on x86.
>  */
> 
> My x86 foo is weak and I'd appreciate if somebody familiar with x86
> could give this a once over.

The text looks okay, but it ties the function to its _current_ single
user. I'd like to ask you to rather use just the last sentence, and it's
questionable (a matter of taste mostly) whether such a comment
wouldn't better be placed inside the function. And perhaps it would
be a good idea to insert "sufficiently", as they're not fully coherent
(see the self-modifying code constraints on MP systems).

Jan
Punit Agrawal May 18, 2017, 9:22 a.m. UTC | #4
"Jan Beulich" <JBeulich@suse.com> writes:

>>>> On 17.05.17 at 18:15, <punit.agrawal@arm.com> wrote:
>> Jan Beulich <JBeulich@suse.com> writes:
>> 
>>>>>> On 15.05.17 at 16:10, <punit.agrawal@arm.com> wrote:
>>>> --- a/xen/include/asm-x86/page.h
>>>> +++ b/xen/include/asm-x86/page.h
>>>> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>>>>  
>>>>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>>>>  
>>>> +static inline void invalidate_icache(void)
>>>> +{
>>>> +}
>>>
>>> This function clearly does not do what its name says, so there should
>>> be a brief comment saying why.
>> 
>> Ack. I've added the following comment block above the function
>> definition.
>> 
>> /*
>>  * While allocating memory for a domain, invalidate_icache() is called
>>  * to ensure that guest vcpu does not execute any stale instructions
>>  * from the recently allocated memory. There is nothing to be done
>>  * here as icaches are coherent on x86.
>>  */
>> 
>> My x86 foo is weak and I'd appreciate if somebody familiar with x86
>> could give this a once over.
>
> The text looks okay, but it ties the function to its _current_ single
> user. I'd like to ask you to rather use just the last sentence, and it's
> questionable (a matter of taste mostly) whether such a comment
> wouldn't better be placed inside the function. And perhaps it would
> be a good idea to insert "sufficiently", as they're not fully coherent
> (see the self-modifying code constraints on MP systems).

I've updated the text and moved it inside the function.

Thanks for the feedback!

>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
Stefano Stabellini May 23, 2017, 10:13 p.m. UTC | #5
On Mon, 15 May 2017, Punit Agrawal wrote:
> populate_physmap() calls alloc_heap_pages() per requested
> extent. alloc_heap_pages() invalidates the entire icache per
> extent. During domain creation, the icache invalidations can be deferred
> until all the extents have been allocated as there is no risk of
> executing stale instructions from the icache.
> 
> Introduce a new flag "MEMF_no_icache_flush" to be used to prevent
> alloc_heap_pages() from performing icache maintenance operations. Use
> the flag in populate_physmap() before the domain has been unpaused and
> perform required icache maintenance function at the end of the
> allocation.
> 
> One concern is the lack of synchronisation around testing for
> "creation_finished". But it seems, in practice the window where it is
> out of sync should be small enough to not matter.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/common/memory.c        | 31 ++++++++++++++++++++++---------
>  xen/common/page_alloc.c    |  2 +-
>  xen/include/asm-x86/page.h |  4 ++++
>  xen/include/xen/mm.h       |  2 ++
>  4 files changed, 29 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 52879e7438..34d2dda8b4 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -152,16 +152,26 @@ static void populate_physmap(struct memop_args *a)
>                              max_order(curr_d)) )
>          return;
>  
> -    /*
> -     * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
> -     * TLB-flushes. After VM creation, this is a security issue (it can
> -     * make pages accessible to guest B, when guest A may still have a
> -     * cached mapping to them). So we do this only during domain creation,
> -     * when the domain itself has not yet been unpaused for the first
> -     * time.
> -     */
>      if ( unlikely(!d->creation_finished) )
> +    {
> +        /*
> +         * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
> +         * TLB-flushes. After VM creation, this is a security issue (it can
> +         * make pages accessible to guest B, when guest A may still have a
> +         * cached mapping to them). So we do this only during domain creation,
> +         * when the domain itself has not yet been unpaused for the first
> +         * time.
> +         */
>          a->memflags |= MEMF_no_tlbflush;
> +        /*
> +         * With MEMF_no_icache_flush, alloc_heap_pages() will skip
> +         * performing icache flushes. We do it only before domain
> +         * creation as once the domain is running there is a danger of
> +         * executing instructions from stale caches if icache flush is
> +         * delayed.
> +         */
> +        a->memflags |= MEMF_no_icache_flush;
> +    }
>  
>      for ( i = a->nr_done; i < a->nr_extents; i++ )
>      {
> @@ -211,7 +221,6 @@ static void populate_physmap(struct memop_args *a)
>                  }
>  
>                  mfn = gpfn;
> -                page = mfn_to_page(mfn);
>              }
>              else
>              {
> @@ -255,6 +264,10 @@ static void populate_physmap(struct memop_args *a)
>  out:
>      if ( need_tlbflush )
>          filtered_flush_tlb_mask(tlbflush_timestamp);
> +
> +    if ( a->memflags & MEMF_no_icache_flush )
> +        invalidate_icache();
> +
>      a->nr_done = i;
>  }
>  
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index eba78f1a3d..8bcef6a547 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -833,7 +833,7 @@ static struct page_info *alloc_heap_pages(
>          /* Ensure cache and RAM are consistent for platforms where the
>           * guest can control its own visibility of/through the cache.
>           */
> -        flush_page_to_ram(page_to_mfn(&pg[i]), true);
> +        flush_page_to_ram(page_to_mfn(&pg[i]), !(memflags & MEMF_no_icache_flush));
>      }
>  
>      spin_unlock(&heap_lock);
> diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
> index 4cadb12646..3a375282f6 100644
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>  
>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>  
> +static inline void invalidate_icache(void)
> +{
> +}
> +
>  #endif /* __X86_PAGE_H__ */
>  
>  /*
> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index 88de3c1fa6..ee50d4cd7b 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -224,6 +224,8 @@ struct npfec {
>  #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
>  #define _MEMF_no_tlbflush 6
>  #define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
> +#define _MEMF_no_icache_flush 7
> +#define  MEMF_no_icache_flush (1U<<_MEMF_no_icache_flush)
>  #define _MEMF_node        8
>  #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
>  #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
> -- 
> 2.11.0
>
Punit Agrawal May 24, 2017, 10:30 a.m. UTC | #6
Stefano Stabellini <sstabellini@kernel.org> writes:

> On Mon, 15 May 2017, Punit Agrawal wrote:
>> populate_physmap() calls alloc_heap_pages() per requested
>> extent. alloc_heap_pages() invalidates the entire icache per
>> extent. During domain creation, the icache invalidations can be deferred
>> until all the extents have been allocated as there is no risk of
>> executing stale instructions from the icache.
>> 
>> Introduce a new flag "MEMF_no_icache_flush" to be used to prevent
>> alloc_heap_pages() from performing icache maintenance operations. Use
>> the flag in populate_physmap() before the domain has been unpaused and
>> perform required icache maintenance function at the end of the
>> allocation.
>> 
>> One concern is the lack of synchronisation around testing for
>> "creation_finished". But it seems, in practice the window where it is
>> out of sync should be small enough to not matter.
>> 
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thanks, Stefano!

I'll add the tags and post a new version with the changes suggested by
Jan.

>
>
>> ---
>>  xen/common/memory.c        | 31 ++++++++++++++++++++++---------
>>  xen/common/page_alloc.c    |  2 +-
>>  xen/include/asm-x86/page.h |  4 ++++
>>  xen/include/xen/mm.h       |  2 ++
>>  4 files changed, 29 insertions(+), 10 deletions(-)
>> 
>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>> index 52879e7438..34d2dda8b4 100644
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -152,16 +152,26 @@ static void populate_physmap(struct memop_args *a)
>>                              max_order(curr_d)) )
>>          return;
>>  
>> -    /*
>> -     * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
>> -     * TLB-flushes. After VM creation, this is a security issue (it can
>> -     * make pages accessible to guest B, when guest A may still have a
>> -     * cached mapping to them). So we do this only during domain creation,
>> -     * when the domain itself has not yet been unpaused for the first
>> -     * time.
>> -     */
>>      if ( unlikely(!d->creation_finished) )
>> +    {
>> +        /*
>> +         * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
>> +         * TLB-flushes. After VM creation, this is a security issue (it can
>> +         * make pages accessible to guest B, when guest A may still have a
>> +         * cached mapping to them). So we do this only during domain creation,
>> +         * when the domain itself has not yet been unpaused for the first
>> +         * time.
>> +         */
>>          a->memflags |= MEMF_no_tlbflush;
>> +        /*
>> +         * With MEMF_no_icache_flush, alloc_heap_pages() will skip
>> +         * performing icache flushes. We do it only before domain
>> +         * creation as once the domain is running there is a danger of
>> +         * executing instructions from stale caches if icache flush is
>> +         * delayed.
>> +         */
>> +        a->memflags |= MEMF_no_icache_flush;
>> +    }
>>  
>>      for ( i = a->nr_done; i < a->nr_extents; i++ )
>>      {
>> @@ -211,7 +221,6 @@ static void populate_physmap(struct memop_args *a)
>>                  }
>>  
>>                  mfn = gpfn;
>> -                page = mfn_to_page(mfn);
>>              }
>>              else
>>              {
>> @@ -255,6 +264,10 @@ static void populate_physmap(struct memop_args *a)
>>  out:
>>      if ( need_tlbflush )
>>          filtered_flush_tlb_mask(tlbflush_timestamp);
>> +
>> +    if ( a->memflags & MEMF_no_icache_flush )
>> +        invalidate_icache();
>> +
>>      a->nr_done = i;
>>  }
>>  
>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>> index eba78f1a3d..8bcef6a547 100644
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -833,7 +833,7 @@ static struct page_info *alloc_heap_pages(
>>          /* Ensure cache and RAM are consistent for platforms where the
>>           * guest can control its own visibility of/through the cache.
>>           */
>> -        flush_page_to_ram(page_to_mfn(&pg[i]), true);
>> +        flush_page_to_ram(page_to_mfn(&pg[i]), !(memflags & MEMF_no_icache_flush));
>>      }
>>  
>>      spin_unlock(&heap_lock);
>> diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
>> index 4cadb12646..3a375282f6 100644
>> --- a/xen/include/asm-x86/page.h
>> +++ b/xen/include/asm-x86/page.h
>> @@ -375,6 +375,10 @@ perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
>>  
>>  #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
>>  
>> +static inline void invalidate_icache(void)
>> +{
>> +}
>> +
>>  #endif /* __X86_PAGE_H__ */
>>  
>>  /*
>> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
>> index 88de3c1fa6..ee50d4cd7b 100644
>> --- a/xen/include/xen/mm.h
>> +++ b/xen/include/xen/mm.h
>> @@ -224,6 +224,8 @@ struct npfec {
>>  #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
>>  #define _MEMF_no_tlbflush 6
>>  #define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
>> +#define _MEMF_no_icache_flush 7
>> +#define  MEMF_no_icache_flush (1U<<_MEMF_no_icache_flush)
>>  #define _MEMF_node        8
>>  #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
>>  #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
>> -- 
>> 2.11.0
>> 
>

Patch

diff --git a/xen/common/memory.c b/xen/common/memory.c
index 52879e7438..34d2dda8b4 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -152,16 +152,26 @@  static void populate_physmap(struct memop_args *a)
                             max_order(curr_d)) )
         return;
 
-    /*
-     * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
-     * TLB-flushes. After VM creation, this is a security issue (it can
-     * make pages accessible to guest B, when guest A may still have a
-     * cached mapping to them). So we do this only during domain creation,
-     * when the domain itself has not yet been unpaused for the first
-     * time.
-     */
     if ( unlikely(!d->creation_finished) )
+    {
+        /*
+         * With MEMF_no_tlbflush set, alloc_heap_pages() will ignore
+         * TLB-flushes. After VM creation, this is a security issue (it can
+         * make pages accessible to guest B, when guest A may still have a
+         * cached mapping to them). So we do this only during domain creation,
+         * when the domain itself has not yet been unpaused for the first
+         * time.
+         */
         a->memflags |= MEMF_no_tlbflush;
+        /*
+         * With MEMF_no_icache_flush, alloc_heap_pages() will skip
+         * performing icache flushes. We do it only before domain
+         * creation as once the domain is running there is a danger of
+         * executing instructions from stale caches if icache flush is
+         * delayed.
+         */
+        a->memflags |= MEMF_no_icache_flush;
+    }
 
     for ( i = a->nr_done; i < a->nr_extents; i++ )
     {
@@ -211,7 +221,6 @@  static void populate_physmap(struct memop_args *a)
                 }
 
                 mfn = gpfn;
-                page = mfn_to_page(mfn);
             }
             else
             {
@@ -255,6 +264,10 @@  static void populate_physmap(struct memop_args *a)
 out:
     if ( need_tlbflush )
         filtered_flush_tlb_mask(tlbflush_timestamp);
+
+    if ( a->memflags & MEMF_no_icache_flush )
+        invalidate_icache();
+
     a->nr_done = i;
 }
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index eba78f1a3d..8bcef6a547 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -833,7 +833,7 @@  static struct page_info *alloc_heap_pages(
         /* Ensure cache and RAM are consistent for platforms where the
          * guest can control its own visibility of/through the cache.
          */
-        flush_page_to_ram(page_to_mfn(&pg[i]), true);
+        flush_page_to_ram(page_to_mfn(&pg[i]), !(memflags & MEMF_no_icache_flush));
     }
 
     spin_unlock(&heap_lock);
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 4cadb12646..3a375282f6 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -375,6 +375,10 @@  perms_strictly_increased(uint32_t old_flags, uint32_t new_flags)
 
 #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
 
+static inline void invalidate_icache(void)
+{
+}
+
 #endif /* __X86_PAGE_H__ */
 
 /*
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 88de3c1fa6..ee50d4cd7b 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -224,6 +224,8 @@  struct npfec {
 #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
 #define _MEMF_no_tlbflush 6
 #define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
+#define _MEMF_no_icache_flush 7
+#define  MEMF_no_icache_flush (1U<<_MEMF_no_icache_flush)
 #define _MEMF_node        8
 #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
 #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)