[1/4] ARM: vexpress/dcscb: fix cache disabling sequences

Message ID alpine.LFD.2.03.1307181231510.14924@syhkavp.arg (mailing list archive)
State New, archived

Commit Message

Nicolas Pitre July 18, 2013, 5:24 p.m. UTC
[ added Russell for his opinion on the patch below ]

On Thu, 18 Jul 2013, Dave Martin wrote:

> On Wed, Jul 17, 2013 at 11:28:33PM -0400, Nicolas Pitre wrote:
> > Unlike real A15/A7's, the RTSM simulation doesn't appear to hit the
> > cache when the CTRL.C bit is cleared.  Let's ensure there is no memory
> > access within the disable and flush cache sequence, including to the
> > stack.
> > 
> > Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > ---
> >  arch/arm/mach-vexpress/dcscb.c | 58 +++++++++++++++++++++++++++---------------
> >  1 file changed, 37 insertions(+), 21 deletions(-)
> > 
> > diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
> > index 16d57a8a9d..9f01c04d58 100644
> > --- a/arch/arm/mach-vexpress/dcscb.c
> > +++ b/arch/arm/mach-vexpress/dcscb.c
> > @@ -136,14 +136,29 @@ static void dcscb_power_down(void)
> >  		/*
> >  		 * Flush all cache levels for this cluster.
> >  		 *
> > -		 * A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
> > -		 * a preliminary flush here for those CPUs.  At least, that's
> > -		 * the theory -- without the extra flush, Linux explodes on
> > -		 * RTSM (to be investigated).
> > +		 * To do so we do:
> > +		 * - Clear the CTLR.C bit to prevent further cache allocations
> 
> SCTLR

Fixed.

> > +		 * - Flush the whole cache
> > +		 * - Disable local coherency by clearing the ACTLR "SMP" bit
> > +		 *
> > +		 * Let's do it in the safest possible way i.e. with
> > +		 * no memory access within the following sequence
> > +		 * including the stack.
> >  		 */
> > -		flush_cache_all();
> > -		set_cr(get_cr() & ~CR_C);
> > -		flush_cache_all();
> > +		asm volatile(
> > +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
> > +		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
> > +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
> > +		"isb	\n\t"
> > +		"bl	v7_flush_dcache_all \n\t"
> > +		"clrex	\n\t"
> > +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
> > +		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
> > +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
> > +		"isb	\n\t"
> > +		"dsb	"
> > +		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
> > +		      "r9","r10","r11","lr","memory");
> 
> Along with the TC2 support, we now have 4 copies of this code sequence.
> 
> This is basically the A15/A7 native "exit coherency and flush and
> disable some levels of dcache" operation, whose only parameter is which
> cache levels to flush.
> 
> That's a big mouthful -- we can probably come up with a better name --
> but we've pretty much concluded that there is no way to break this
> operation apart into bitesize pieces.  Nonetheless, any native
> powerdown sequence for these processors will need to do this, or
> something closely related.
> 
> Is it worth consolidating, or is that premature?

It is probably worth consolidating.

What about this:

commit 390cf8b9b83eeeebdfef51912f5003a6a9b84115
Author: Nicolas Pitre <nicolas.pitre@linaro.org>
Date:   Thu Jul 18 13:12:48 2013 -0400

    ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
    
    This code is becoming duplicated in many places.  So let's consolidate
    it into a handy macro that is known to be right and available for reuse.
    
    Signed-off-by: Nicolas Pitre <nico@linaro.org>

Comments

Dave Martin July 18, 2013, 6:03 p.m. UTC | #1
On Thu, Jul 18, 2013 at 01:24:48PM -0400, Nicolas Pitre wrote:
> [ added Russell for his opinion on the patch below ]

[...]

> commit 390cf8b9b83eeeebdfef51912f5003a6a9b84115
> Author: Nicolas Pitre <nicolas.pitre@linaro.org>
> Date:   Thu Jul 18 13:12:48 2013 -0400
> 
>     ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
>     
>     This code is becoming duplicated in many places.  So let's consolidate
>     it into a handy macro that is known to be right and available for reuse.
>     
>     Signed-off-by: Nicolas Pitre <nico@linaro.org>
> 
> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index 17d0ae8672..8a76933e80 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -436,4 +436,33 @@ static inline void __sync_cache_range_r(volatile void *p, size_t size)
>  #define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
>  #define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
>  
> +/*
> + * Disabling cache access for one CPU in an ARMv7 SMP system is tricky.
> + * To do so we must:
> + *
> + * - Clear the SCTLR.C bit to prevent further cache allocations
> + * - Flush the desired level of cache
> + * - Clear the ACTLR "SMP" bit to disable local coherency
> + *
> + * ... and so without any intervening memory access in between those steps,
> + * not even to the stack.
> + *
> + * The clobber list is dictated by the call to v7_flush_dcache_*.
> + */
> +#define v7_disable_flush_cache(level) \
> +	asm volatile( \
> +	"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t" \
> +	"bic	r0, r0, #"__stringify(CR_C)" \n\t" \
> +	"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t" \
> +	"isb	\n\t" \
> +	"bl	v7_flush_dcache_"__stringify(level)" \n\t" \
> +	"clrex	\n\t" \
> +	"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t" \
> +	"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t" \
> +	"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t" \
> +	"isb	\n\t" \
> +	"dsb	" \
> +	: : : "r0","r1","r2","r3","r4","r5","r6","r7", \
> +	      "r9","r10","r11","lr","memory" )
> +
>  #endif

That's roughly what I had in mind, though it might belong somewhere more
obscure than cacheflush.h(?)

"disable_flush_cache" sounds a too general-purpose for my liking.

"v7" isn't really right because we touch the ACTLR.  This only works
on certain CPUs and when Linux is running Secure.  Maybe saying "a15"
instead of "a7" is reasonable, since a7 is supposed to be an a15
workalike in most places.

I had other names in mind, like "coherency_exit" or "cache_exit".
Those are not very intelligible, but that might at least make people
pause and think before blindly using it.

Really, it's for the very specific purpose of doing native low-level
powerdown of an individual CPU in a multiprocessor system, when the
platform as a whole is not being shut down, on A15 or A7 (+ possibly
other future v7 CPUs)... it's not suitable for any other situation.
Even kexec probably shouldn't be disabling the SMP bit.

Cheers
---Dave
Nicolas Pitre July 18, 2013, 6:59 p.m. UTC | #2
On Thu, 18 Jul 2013, Dave Martin wrote:

> On Thu, Jul 18, 2013 at 01:24:48PM -0400, Nicolas Pitre wrote:
> > [ added Russell for his opinion on the patch below ]
> 
> [...]
> 
> > commit 390cf8b9b83eeeebdfef51912f5003a6a9b84115
> > Author: Nicolas Pitre <nicolas.pitre@linaro.org>
> > Date:   Thu Jul 18 13:12:48 2013 -0400
> > 
> >     ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
> >     
> >     This code is becoming duplicated in many places.  So let's consolidate
> >     it into a handy macro that is known to be right and available for reuse.
> >     
> >     Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > 
> > diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> > index 17d0ae8672..8a76933e80 100644
> > --- a/arch/arm/include/asm/cacheflush.h
> > +++ b/arch/arm/include/asm/cacheflush.h
> > @@ -436,4 +436,33 @@ static inline void __sync_cache_range_r(volatile void *p, size_t size)
> >  #define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
> >  #define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
> >  
> > +/*
> > + * Disabling cache access for one CPU in an ARMv7 SMP system is tricky.
> > + * To do so we must:
> > + *
> > + * - Clear the SCTLR.C bit to prevent further cache allocations
> > + * - Flush the desired level of cache
> > + * - Clear the ACTLR "SMP" bit to disable local coherency
> > + *
> > + * ... and so without any intervening memory access in between those steps,
> > + * not even to the stack.
> > + *
> > + * The clobber list is dictated by the call to v7_flush_dcache_*.
> > + */
> > +#define v7_disable_flush_cache(level) \
> > +	asm volatile( \
> > +	"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t" \
> > +	"bic	r0, r0, #"__stringify(CR_C)" \n\t" \
> > +	"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t" \
> > +	"isb	\n\t" \
> > +	"bl	v7_flush_dcache_"__stringify(level)" \n\t" \
> > +	"clrex	\n\t" \
> > +	"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t" \
> > +	"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t" \
> > +	"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t" \
> > +	"isb	\n\t" \
> > +	"dsb	" \
> > +	: : : "r0","r1","r2","r3","r4","r5","r6","r7", \
> > +	      "r9","r10","r11","lr","memory" )
> > +
> >  #endif
> 
> That's roughly what I had in mind, though it might belong somewhere more
> obscure than cacheflush.h(?)

Any suggestions for that?  Maybe asm/mcpm.h but that might be used in 
other circumstances as well.

> "disable_flush_cache" sounds a too general-purpose for my liking.
> 
> "v7" isn't really right because we touch the ACTLR.  This only works
> on certain CPUs and when Linux is running Secure.  Maybe saying "a15"
> instead of "a7" is reasonable, since a7 is supposed to be an a15
> workalike in most places.

Isn't this how this should be done on an A9 too?  Lorenzo asked me to do 
it this way, for fear of seeing people copy the previous implementation, 
so the same sequence works on A9 as well as A15.

> I had other names in mind, like "coherency_exit" or "cache_exit".
> Those are not very intelligible, but that might at least make people
> pause and think before blindly using it.

Good point.  It should still embody the architecture name for which it 
is valid though.


Nicolas
Dave Martin July 19, 2013, 10:28 a.m. UTC | #3
On Thu, Jul 18, 2013 at 02:59:06PM -0400, Nicolas Pitre wrote:
> On Thu, 18 Jul 2013, Dave Martin wrote:
> 
> > On Thu, Jul 18, 2013 at 01:24:48PM -0400, Nicolas Pitre wrote:
> > > [ added Russell for his opinion on the patch below ]
> > 
> > [...]
> > 
> > > commit 390cf8b9b83eeeebdfef51912f5003a6a9b84115
> > > Author: Nicolas Pitre <nicolas.pitre@linaro.org>
> > > Date:   Thu Jul 18 13:12:48 2013 -0400
> > > 
> > >     ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
> > >     
> > >     This code is becoming duplicated in many places.  So let's consolidate
> > >     it into a handy macro that is known to be right and available for reuse.
> > >     
> > >     Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > > 
> > > diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> > > index 17d0ae8672..8a76933e80 100644
> > > --- a/arch/arm/include/asm/cacheflush.h
> > > +++ b/arch/arm/include/asm/cacheflush.h
> > > @@ -436,4 +436,33 @@ static inline void __sync_cache_range_r(volatile void *p, size_t size)
> > >  #define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
> > >  #define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
> > >  
> > > +/*
> > > + * Disabling cache access for one CPU in an ARMv7 SMP system is tricky.
> > > + * To do so we must:
> > > + *
> > > + * - Clear the SCTLR.C bit to prevent further cache allocations
> > > + * - Flush the desired level of cache
> > > + * - Clear the ACTLR "SMP" bit to disable local coherency
> > > + *
> > > + * ... and so without any intervening memory access in between those steps,
> > > + * not even to the stack.
> > > + *
> > > + * The clobber list is dictated by the call to v7_flush_dcache_*.
> > > + */
> > > +#define v7_disable_flush_cache(level) \
> > > +	asm volatile( \
> > > +	"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t" \
> > > +	"bic	r0, r0, #"__stringify(CR_C)" \n\t" \
> > > +	"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t" \
> > > +	"isb	\n\t" \
> > > +	"bl	v7_flush_dcache_"__stringify(level)" \n\t" \
> > > +	"clrex	\n\t" \
> > > +	"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t" \
> > > +	"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t" \
> > > +	"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t" \
> > > +	"isb	\n\t" \
> > > +	"dsb	" \
> > > +	: : : "r0","r1","r2","r3","r4","r5","r6","r7", \
> > > +	      "r9","r10","r11","lr","memory" )
> > > +
> > >  #endif
> > 
> > That's roughly what I had in mind, though it might belong somewhere more
> > obscure than cacheflush.h(?)
> 
> Any suggestions for that?  Maybe asm/mcpm.h but that might be used in 
> other circumstances as well.

I don't really have a better idea...  if nobody objects strongly to
putting it in cacheflush.h, I'm happy to go with that.
> 
> > "disable_flush_cache" sounds a too general-purpose for my liking.
> > 
> > "v7" isn't really right because we touch the ACTLR.  This only works
> > on certain CPUs and when Linux is running Secure.  Maybe saying "a15"
> > instead of "a7" is reasonable, since a7 is supposed to be an a15
> > workalike in most places.
> 
> Isn't this how this should be done on an A9 too?  Lorenzo asked me to do 

Quite possibly -- the point is, it's not guaranteed that all v7 CPUs will
work the same way, especially the custom implementations.   Third-party v7
implementations like Snapdragon etc. need not be in any way the same as
ARM's CPUs with regard to details like the SMP bit.

> it this way, for fear of seeing people copy the previous implementation, 
> so the same sequence works on A9 as well as A15.

Would it make sense to have things like

#define a15_something() asm volatile( ... )
#define a7_something() a15_something()
#define a9_something() a15_something()

(where the correct one to call for a b.L system would probably be the
one corresponding to the big cluster)

... or just call it something like

cortex_a_something()

with a big comment (or alias macros, as above) documenting which CPUs we
know it works for, and warning people to stop and think before using it
on some other CPU.
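
Roughly, with the macro from the patch above as the one real definition
(a sketch only: the names are placeholders, and the a9 alias assumes
A9's SMP bit behaves like A15's, which would need checking):

#define a15_coherency_exit(level)	v7_disable_flush_cache(level)
#define a7_coherency_exit(level)	a15_coherency_exit(level)	/* a15 workalike */
#define a9_coherency_exit(level)	a15_coherency_exit(level)	/* assumed same SMP bit */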

> > I had other names in mind, like "coherency_exit" or "cache_exit".
> > Those are not very intelligible, but that might at least make people
> > pause and think before blindly using it.
> 
> Good point.  It should still embody the architecture name for which it 
> is valid though.

Sure, I was assuming something would be pasted on the start of the name.

Cheers
---Dave
Lorenzo Pieralisi July 19, 2013, 10:59 a.m. UTC | #4
On Fri, Jul 19, 2013 at 11:28:49AM +0100, Dave Martin wrote:
> On Thu, Jul 18, 2013 at 02:59:06PM -0400, Nicolas Pitre wrote:
> > On Thu, 18 Jul 2013, Dave Martin wrote:
> > 
> > > On Thu, Jul 18, 2013 at 01:24:48PM -0400, Nicolas Pitre wrote:
> > > > [ added Russell for his opinion on the patch below ]
> > > 
> > > [...]
> > > 
> > > > commit 390cf8b9b83eeeebdfef51912f5003a6a9b84115
> > > > Author: Nicolas Pitre <nicolas.pitre@linaro.org>
> > > > Date:   Thu Jul 18 13:12:48 2013 -0400
> > > > 
> > > >     ARM: cacheflush: consolidate single-CPU ARMv7 cache disabling code
> > > >     
> > > >     This code is becoming duplicated in many places.  So let's consolidate
> > > >     it into a handy macro that is known to be right and available for reuse.
> > > >     
> > > >     Signed-off-by: Nicolas Pitre <nico@linaro.org>
> > > > 
> > > > diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> > > > index 17d0ae8672..8a76933e80 100644
> > > > --- a/arch/arm/include/asm/cacheflush.h
> > > > +++ b/arch/arm/include/asm/cacheflush.h
> > > > @@ -436,4 +436,33 @@ static inline void __sync_cache_range_r(volatile void *p, size_t size)
> > > >  #define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
> > > >  #define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
> > > >  
> > > > +/*
> > > > + * Disabling cache access for one CPU in an ARMv7 SMP system is tricky.
> > > > + * To do so we must:
> > > > + *
> > > > + * - Clear the SCTLR.C bit to prevent further cache allocations
> > > > + * - Flush the desired level of cache
> > > > + * - Clear the ACTLR "SMP" bit to disable local coherency
> > > > + *
> > > > + * ... and so without any intervening memory access in between those steps,
> > > > + * not even to the stack.
> > > > + *
> > > > + * The clobber list is dictated by the call to v7_flush_dcache_*.
> > > > + */
> > > > +#define v7_disable_flush_cache(level) \
> > > > +	asm volatile( \
> > > > +	"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t" \
> > > > +	"bic	r0, r0, #"__stringify(CR_C)" \n\t" \
> > > > +	"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t" \
> > > > +	"isb	\n\t" \
> > > > +	"bl	v7_flush_dcache_"__stringify(level)" \n\t" \
> > > > +	"clrex	\n\t" \
> > > > +	"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t" \
> > > > +	"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t" \
> > > > +	"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t" \
> > > > +	"isb	\n\t" \
> > > > +	"dsb	" \
> > > > +	: : : "r0","r1","r2","r3","r4","r5","r6","r7", \
> > > > +	      "r9","r10","r11","lr","memory" )
> > > > +
> > > >  #endif
> > > 
> > > That's roughly what I had in mind, though it might belong somewhere more
> > > obscure than cacheflush.h(?)
> > 
> > Any suggestions for that?  Maybe asm/mcpm.h but that might be used in 
> > other circumstances as well.
> 
> I don't really have a better idea...  if nobody objects strongly to
> putting it in cacheflush.h, I'm happy to go with that.
> > 
> > > "disable_flush_cache" sounds a too general-purpose for my liking.
> > > 
> > > "v7" isn't really right because we touch the ACTLR.  This only works
> > > on certain CPUs and when Linux is running Secure.  Maybe saying "a15"
> > > instead of "a7" is reasonable, since a7 is supposed to be an a15
> > > workalike in most places.
> > 
> > Isn't this how this should be done on an A9 too?  Lorenzo asked me to do 
> 
> Quite possibly -- the point is, it's not guaranteed that all v7 CPUs will
> work the same way, especially the custom implementations.   Third-party v7
> implementations like Snapdragon etc. need not be in any way the same as
> ARM's CPUs with regard to details like the SMP bit.

Well, if they change the SMP bit behaviour from what the TRM describes,
they certainly know what they are doing. I am not sure we should cater
for that; they will just use their own function if their behaviour
differs from the "standard" one.

At least all v7 implementations I've run into in the kernel (mainline
and platform trees) could have used the sequence above (well, security
notwithstanding).

Also, we should point out that after clearing the C bit, ldr/str exclusive
must not be used anymore (even though this is implementation defined, since
it depends on a global monitor in DRAM). The macro is also a barrier in
terms of coherency: when we call it, we should know what to expect --
basically a CPU whose caches are not coherent anymore.
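
In caller terms that means something like this (just a sketch mirroring
the dcscb.c hunk below; "cluster_lock" and the trailing wfi() are
placeholders, not real API):

	arch_spin_unlock(&cluster_lock);  /* last use of ldrex/strex: spinlocks need exclusives */
	v7_disable_flush_cache(all);      /* from here caches are off and the CPU is incoherent */
	/* no locks, no atomics, nothing that relies on exclusives or cache hits */
	wfi();                            /* placeholder for the actual powerdown entry */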

> > it this way, for fear of seeing people copy the previous implementation, 
> > so the same sequence works on A9 as well as A15.
> 
> Would it make sense to have things like
> 
> #define a15_something() asm volatile( ... )
> #define a7_something() a15_something()
> #define a9_something() a15_something()
> 
> (where the correct one to call for a b.L system would probably be the
> one corresponding to the big cluster)
> 
> ... or just call it something like
> 
> cortex_a_something()
> 
> with a big comment (or alias macros, as above) documenting which CPUs we
> know it works for, and warning people to stop and think before using it
> on some other CPU.

It might be, even though a well-written comment would do, IMHO.

> > > I had other names in mind, like "coherency_exit" or "cache_exit".
> > > Those are not very intelligible, but that might at least make people
> > > pause and think before blindly using it.
> > 
> > Good point.  It should still embody the architecture name for which it 
> > is valid though.
> 
> Sure, I was assuming something would be pasted on the start of the name.

v7 :-) with a comment describing the assumptions (in particular, as Dave
mentioned, those related to the SMP bit behaviour)?

Lorenzo

Patch

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 17d0ae8672..8a76933e80 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -436,4 +436,33 @@  static inline void __sync_cache_range_r(volatile void *p, size_t size)
 #define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
 #define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
 
+/*
+ * Disabling cache access for one CPU in an ARMv7 SMP system is tricky.
+ * To do so we must:
+ *
+ * - Clear the SCTLR.C bit to prevent further cache allocations
+ * - Flush the desired level of cache
+ * - Clear the ACTLR "SMP" bit to disable local coherency
+ *
+ * ... and so without any intervening memory access in between those steps,
+ * not even to the stack.
+ *
+ * The clobber list is dictated by the call to v7_flush_dcache_*.
+ */
+#define v7_disable_flush_cache(level) \
+	asm volatile( \
+	"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t" \
+	"bic	r0, r0, #"__stringify(CR_C)" \n\t" \
+	"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t" \
+	"isb	\n\t" \
+	"bl	v7_flush_dcache_"__stringify(level)" \n\t" \
+	"clrex	\n\t" \
+	"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t" \
+	"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t" \
+	"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t" \
+	"isb	\n\t" \
+	"dsb	" \
+	: : : "r0","r1","r2","r3","r4","r5","r6","r7", \
+	      "r9","r10","r11","lr","memory" )
+
 #endif
diff --git a/arch/arm/mach-vexpress/dcscb.c b/arch/arm/mach-vexpress/dcscb.c
index 85fffa702f..145d8237d5 100644
--- a/arch/arm/mach-vexpress/dcscb.c
+++ b/arch/arm/mach-vexpress/dcscb.c
@@ -133,32 +133,8 @@  static void dcscb_power_down(void)
 	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
 		arch_spin_unlock(&dcscb_lock);
 
-		/*
-		 * Flush all cache levels for this cluster.
-		 *
-		 * To do so we do:
-		 * - Clear the SCTLR.C bit to prevent further cache allocations
-		 * - Flush the whole cache
-		 * - Clear the ACTLR "SMP" bit to disable local coherency
-		 *
-		 * Let's do it in the safest possible way i.e. with
-		 * no memory access within the following sequence
-		 * including to the stack.
-		 */
-		asm volatile(
-		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
-		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
-		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
-		"isb	\n\t"
-		"bl	v7_flush_dcache_all \n\t"
-		"clrex	\n\t"
-		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
-		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
-		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
-		"isb	\n\t"
-		"dsb	"
-		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
-		      "r9","r10","r11","lr","memory");
+		/* Flush all cache levels for this cluster. */
+		v7_disable_flush_cache(all);
 
 		/*
 		 * This is a harmless no-op.  On platforms with a real
@@ -177,24 +153,8 @@  static void dcscb_power_down(void)
 	} else {
 		arch_spin_unlock(&dcscb_lock);
 
-		/*
-		 * Flush the local CPU cache.
-		 * Let's do it in the safest possible way as above.
-		 */
-		asm volatile(
-		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
-		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
-		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
-		"isb	\n\t"
-		"bl	v7_flush_dcache_louis \n\t"
-		"clrex	\n\t"
-		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
-		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
-		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
-		"isb	\n\t"
-		"dsb	"
-		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
-		      "r9","r10","r11","lr","memory");
+		/* Flush the local CPU cache. */
+		v7_disable_flush_cache(louis);
 	}
 
 	__mcpm_cpu_down(cpu, cluster);
diff --git a/arch/arm/mach-vexpress/tc2_pm.c b/arch/arm/mach-vexpress/tc2_pm.c
index dfb55d45b6..fd8bc2d931 100644
--- a/arch/arm/mach-vexpress/tc2_pm.c
+++ b/arch/arm/mach-vexpress/tc2_pm.c
@@ -134,26 +134,7 @@  static void tc2_pm_down(u64 residency)
 			: : "r" (0x400) );
 		}
 
-		/*
-		 * We need to disable and flush the whole (L1 and L2) cache.
-		 * Let's do it in the safest possible way i.e. with
-		 * no memory access within the following sequence
-		 * including the stack.
-		 */
-		asm volatile(
-		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
-		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
-		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
-		"isb	\n\t"
-		"bl	v7_flush_dcache_all \n\t"
-		"clrex	\n\t"
-		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
-		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
-		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
-		"isb	\n\t"
-		"dsb	"
-		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
-		      "r9","r10","r11","lr","memory");
+		v7_disable_flush_cache(all);
 
 		cci_disable_port_by_cpu(mpidr);
 
@@ -169,24 +150,7 @@  static void tc2_pm_down(u64 residency)
 
 		arch_spin_unlock(&tc2_pm_lock);
 
-		/*
-		 * We need to disable and flush only the L1 cache.
-		 * Let's do it in the safest possible way as above.
-		 */
-		asm volatile(
-		"mrc	p15, 0, r0, c1, c0, 0	@ get CR \n\t"
-		"bic	r0, r0, #"__stringify(CR_C)" \n\t"
-		"mcr	p15, 0, r0, c1, c0, 0	@ set CR \n\t"
-		"isb	\n\t"
-		"bl	v7_flush_dcache_louis \n\t"
-		"clrex	\n\t"
-		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR \n\t"
-		"bic	r0, r0, #(1 << 6)	@ disable local coherency \n\t"
-		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR \n\t"
-		"isb	\n\t"
-		"dsb	"
-		: : : "r0","r1","r2","r3","r4","r5","r6","r7",
-		      "r9","r10","r11","lr","memory");
+		v7_disable_flush_cache(louis);
 	}
 
 	__mcpm_cpu_down(cpu, cluster);