diff mbox

[RFC,4/6] ARM: kernel: update cpu_suspend code to use cache LoUIS operations

Message ID 1347531651-28218-5-git-send-email-lorenzo.pieralisi@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Lorenzo Pieralisi Sept. 13, 2012, 10:20 a.m. UTC
In processors like A15/A7 L2 cache is unified and integrated within the
processor cache hierarchy, so that it is not considered an outer cache
anymore. For processors like A15/A7 flush_cache_all() ends up cleaning
all cache levels up to Level of Coherency (LoC) that includes
the L2 unified cache.

When a single CPU is suspended (CPU idle) a complete L2 clean is not
required, so generic cpu_suspend code must clean the data cache using the
newly introduced cache LoUIS function.

The context and stack pointer (context pointer) are cleaned to main memory
using cache area functions that operate on MVA and guarantee that the data
is written back to main memory (perform cache cleaning up to the Point of
Coherency - PoC) so that the processor can fetch the context when the MMU
is off in the cpu_resume code path.

outer_cache management remains unchanged.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm/kernel/suspend.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

tip-bot for Dave Martin Sept. 13, 2012, 12:53 p.m. UTC | #1
On Thu, Sep 13, 2012 at 11:20:49AM +0100, Lorenzo Pieralisi wrote:
> In processors like A15/A7 L2 cache is unified and integrated within the
> processor cache hierarchy, so that it is not considered an outer cache
> anymore. For processors like A15/A7 flush_cache_all() ends up cleaning
> all cache levels up to Level of Coherency (LoC) that includes
> the L2 unified cache.
> 
> When a single CPU is suspended (CPU idle) a complete L2 clean is not
> required, so generic cpu_suspend code must clean the data cache using the
> newly introduced cache LoUIS function.
> 
> The context and stack pointer (context pointer) are cleaned to main memory
> using cache area functions that operate on MVA and guarantee that the data
> is written back to main memory (perform cache cleaning up to the Point of
> Coherency - PoC) so that the processor can fetch the context when the MMU
> is off in the cpu_resume code path.
> 
> outer_cache management remains unchanged.

LoUIS matches the power domain affected by turning a single CPU off
on most (all?) current v7 SoCs where this matters, but only by coincidence.
There is no guarantee of that.

The _louis() API is useful, because this is exactly what you need to to
I-/D-/TLB synchronisation in an SMP OS.  Using it as preparation for
powering a CPU off feels like misuse, at least in theory.

For powerdown, we would logically need a separate function,
flush_cache_cpu() or something, whose job is precisely to flush those
levels which will be affected by the powerdown if that single CPU.

In a multi-cluster system, it's possible that the architectural cache
level this corresponds to is not even the same across all clusters (though
for the foreseeable future it probably will be -- at least for all clusters
participating in SMP).


I don't know how urgent it is to fix this if there are just a few call
sites for flush_cache_louis().  My worry would be that misuse of these
functions propagates before we find that this needs cleaning up...

Cheers
---Dave

> 
> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  arch/arm/kernel/suspend.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/kernel/suspend.c b/arch/arm/kernel/suspend.c
> index 1794cc3..358bca3 100644
> --- a/arch/arm/kernel/suspend.c
> +++ b/arch/arm/kernel/suspend.c
> @@ -17,6 +17,8 @@ extern void cpu_resume_mmu(void);
>   */
>  void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
>  {
> +	u32 *ctx = ptr;
> +
>  	*save_ptr = virt_to_phys(ptr);
>  
>  	/* This must correspond to the LDM in cpu_resume() assembly */
> @@ -26,7 +28,20 @@ void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
>  
>  	cpu_do_suspend(ptr);
>  
> -	flush_cache_all();
> +	flush_cache_louis();
> +
> +	/*
> +	 * flush_cache_louis does not guarantee that
> +	 * save_ptr and ptr are cleaned to main memory,
> +	 * just up to the Level of Unification Inner Shareable.
> +	 * Since the context pointer and context itself
> +	 * are to be retrieved with the MMU off that
> +	 * data must be cleaned from all cache levels
> +	 * to main memory using "area" cache primitives.
> +	*/
> +	__cpuc_flush_dcache_area(ctx, ptrsz);
> +	__cpuc_flush_dcache_area(save_ptr, sizeof(*save_ptr));
> +
>  	outer_clean_range(*save_ptr, *save_ptr + ptrsz);
>  	outer_clean_range(virt_to_phys(save_ptr),
>  			  virt_to_phys(save_ptr) + sizeof(*save_ptr));
> -- 
> 1.7.12
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Santosh Shilimkar Sept. 13, 2012, 1:01 p.m. UTC | #2
On Thu, Sep 13, 2012 at 6:23 PM, Dave Martin <dave.martin@linaro.org> wrote:
> On Thu, Sep 13, 2012 at 11:20:49AM +0100, Lorenzo Pieralisi wrote:
>> In processors like A15/A7 L2 cache is unified and integrated within the
>> processor cache hierarchy, so that it is not considered an outer cache
>> anymore. For processors like A15/A7 flush_cache_all() ends up cleaning
>> all cache levels up to Level of Coherency (LoC) that includes
>> the L2 unified cache.
>>
>> When a single CPU is suspended (CPU idle) a complete L2 clean is not
>> required, so generic cpu_suspend code must clean the data cache using the
>> newly introduced cache LoUIS function.
>>
>> The context and stack pointer (context pointer) are cleaned to main memory
>> using cache area functions that operate on MVA and guarantee that the data
>> is written back to main memory (perform cache cleaning up to the Point of
>> Coherency - PoC) so that the processor can fetch the context when the MMU
>> is off in the cpu_resume code path.
>>
>> outer_cache management remains unchanged.
>
> LoUIS matches the power domain affected by turning a single CPU off
> on most (all?) current v7 SoCs where this matters, but only by coincidence.
> There is no guarantee of that.
>
> The _louis() API is useful, because this is exactly what you need to to
> I-/D-/TLB synchronisation in an SMP OS.  Using it as preparation for
> powering a CPU off feels like misuse, at least in theory.
>
> For powerdown, we would logically need a separate function,
> flush_cache_cpu() or something, whose job is precisely to flush those
> levels which will be affected by the power-down if that single CPU.
>
In the series, there is patch "[PATCH 3/6]" which adds an
API which let you operate on a specific level.

Regards
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Russell King - ARM Linux Sept. 13, 2012, 1:08 p.m. UTC | #3
On Thu, Sep 13, 2012 at 06:31:35PM +0530, Shilimkar, Santosh wrote:
> In the series, there is patch "[PATCH 3/6]" which adds an
> API which let you operate on a specific level.

Which is introduced but as far as I can see, is never used in the patch
set.  Therefore, it shouldn't be introduced.

We've been here before many many many times, where people introduce stuff
into the kernel, and then they never get around to using the damned stuff.
It's happened far too many times to permit on a "but I will use it in the
future" kind of arguments.

If you're going to introduce something new, include the users in the patch
set, or don't bother submitting the new function in the vague hope that
some day it will get used.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Santosh Shilimkar Sept. 13, 2012, 1:18 p.m. UTC | #4
On Thu, Sep 13, 2012 at 6:38 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Sep 13, 2012 at 06:31:35PM +0530, Shilimkar, Santosh wrote:
>> In the series, there is patch "[PATCH 3/6]" which adds an
>> API which let you operate on a specific level.
>
> Which is introduced but as far as I can see, is never used in the patch
> set.  Therefore, it shouldn't be introduced.
>
> We've been here before many many many times, where people introduce stuff
> into the kernel, and then they never get around to using the damned stuff.
> It's happened far too many times to permit on a "but I will use it in the
> future" kind of arguments.
>
> If you're going to introduce something new, include the users in the patch
> set, or don't bother submitting the new function in the vague hope that
> some day it will get used.

Fair enough. We can postpone adding that API now in this series and
add it along with the user. For the record, it was added to use in the A15
low power code to operate on L1 and L2 levels based on the power domain
states.

Regards
santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Sept. 13, 2012, 2:18 p.m. UTC | #5
On Thu, Sep 13, 2012 at 01:53:48PM +0100, Dave Martin wrote:
> On Thu, Sep 13, 2012 at 11:20:49AM +0100, Lorenzo Pieralisi wrote:
> > In processors like A15/A7 L2 cache is unified and integrated within the
> > processor cache hierarchy, so that it is not considered an outer cache
> > anymore. For processors like A15/A7 flush_cache_all() ends up cleaning
> > all cache levels up to Level of Coherency (LoC) that includes
> > the L2 unified cache.
> > 
> > When a single CPU is suspended (CPU idle) a complete L2 clean is not
> > required, so generic cpu_suspend code must clean the data cache using the
> > newly introduced cache LoUIS function.
> > 
> > The context and stack pointer (context pointer) are cleaned to main memory
> > using cache area functions that operate on MVA and guarantee that the data
> > is written back to main memory (perform cache cleaning up to the Point of
> > Coherency - PoC) so that the processor can fetch the context when the MMU
> > is off in the cpu_resume code path.
> > 
> > outer_cache management remains unchanged.
> 
> LoUIS matches the power domain affected by turning a single CPU off
> on most (all?) current v7 SoCs where this matters, but only by coincidence.
> There is no guarantee of that.
> 
> The _louis() API is useful, because this is exactly what you need to to
> I-/D-/TLB synchronisation in an SMP OS.  Using it as preparation for
> powering a CPU off feels like misuse, at least in theory.
> 
> For powerdown, we would logically need a separate function,
> flush_cache_cpu() or something, whose job is precisely to flush those
> levels which will be affected by the powerdown if that single CPU.

Yes, you are right, we discussed this at length and I think that
cleaning to LoUIS is a good trade-off for now. Detecting what cache levels are
affected by the power down in question implies linking caches to power
domains in some sensible (and ARM generic) way, let's not go that far for now.

Thanks,
Lorenzo

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lorenzo Pieralisi Sept. 13, 2012, 2:28 p.m. UTC | #6
On Thu, Sep 13, 2012 at 02:01:35PM +0100, Shilimkar, Santosh wrote:
> On Thu, Sep 13, 2012 at 6:23 PM, Dave Martin <dave.martin@linaro.org> wrote:

[...]

> > LoUIS matches the power domain affected by turning a single CPU off
> > on most (all?) current v7 SoCs where this matters, but only by coincidence.
> > There is no guarantee of that.
> >
> > The _louis() API is useful, because this is exactly what you need to to
> > I-/D-/TLB synchronisation in an SMP OS.  Using it as preparation for
> > powering a CPU off feels like misuse, at least in theory.
> >
> > For powerdown, we would logically need a separate function,
> > flush_cache_cpu() or something, whose job is precisely to flush those
> > levels which will be affected by the power-down if that single CPU.
> >
> In the series, there is patch "[PATCH 3/6]" which adds an
> API which let you operate on a specific level.

Yep, but that's not callable from __cpu_suspend_save in a generic way
(it is v7 specific and we are not going to add another API/macro for
that to be usable in generic code, at least for now).

Let's keep that patch in our trees and as Russell suggested we will repost it
with the PM BSP accordingly to raise the point and take it from there.

Thanks,
Lorenzo

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/arm/kernel/suspend.c b/arch/arm/kernel/suspend.c
index 1794cc3..358bca3 100644
--- a/arch/arm/kernel/suspend.c
+++ b/arch/arm/kernel/suspend.c
@@ -17,6 +17,8 @@  extern void cpu_resume_mmu(void);
  */
 void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
 {
+	u32 *ctx = ptr;
+
 	*save_ptr = virt_to_phys(ptr);
 
 	/* This must correspond to the LDM in cpu_resume() assembly */
@@ -26,7 +28,20 @@  void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
 
 	cpu_do_suspend(ptr);
 
-	flush_cache_all();
+	flush_cache_louis();
+
+	/*
+	 * flush_cache_louis does not guarantee that
+	 * save_ptr and ptr are cleaned to main memory,
+	 * just up to the Level of Unification Inner Shareable.
+	 * Since the context pointer and context itself
+	 * are to be retrieved with the MMU off that
+	 * data must be cleaned from all cache levels
+	 * to main memory using "area" cache primitives.
+	*/
+	__cpuc_flush_dcache_area(ctx, ptrsz);
+	__cpuc_flush_dcache_area(save_ptr, sizeof(*save_ptr));
+
 	outer_clean_range(*save_ptr, *save_ptr + ptrsz);
 	outer_clean_range(virt_to_phys(save_ptr),
 			  virt_to_phys(save_ptr) + sizeof(*save_ptr));