diff mbox

[v4,00/15] multi-cluster power management

Message ID alpine.LFD.2.03.1304231657150.17375@syhkavp.arg (mailing list archive)
State New, archived
Headers show

Commit Message

Nicolas Pitre April 23, 2013, 9:03 p.m. UTC
On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:

> What I suggest for the time being is to provide new inline function(s)
> in arch/arm/include/cacheflush.h which are purposed for your application,
> document them in that file, and call the implementation you're currently
> using.  That means if we do have to change it in the future (for example,
> we don't need to do anything in the dcache flushing stuff) we don't have
> to hunt through all the code to find _your_ different use of that function
> and fix it - we can just fix it in one place and we have the reference
> there for what your code expects.

What about the patch below?  Once you tell me it is fine to you I'll 
adapt the MCPM series to it.

----- >8
From: Nicolas Pitre <nicolas.pitre@linaro.org>
Date: Tue, 23 Apr 2013 16:45:40 -0400
Subject: [PATCH] ARM: cacheflush: add synchronization helpers for mixed cache state accesses

Algorithms used by the MCPM layer rely on state variables which are
accessed while the cache is either active or inactive, depending
on the code path and the active state.

This patch introduces generic cache maintenance helpers to provide the
necessary cache synchronization for such state variables to always hit
main memory in a ordered way.

Signed-off-by: Nicolas Pitre <nico@linaro.org>

Comments

Russell King - ARM Linux April 23, 2013, 9:46 p.m. UTC | #1
On Tue, Apr 23, 2013 at 05:03:06PM -0400, Nicolas Pitre wrote:
> On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> 
> > What I suggest for the time being is to provide new inline function(s)
> > in arch/arm/include/cacheflush.h which are purposed for your application,
> > document them in that file, and call the implementation you're currently
> > using.  That means if we do have to change it in the future (for example,
> > we don't need to do anything in the dcache flushing stuff) we don't have
> > to hunt through all the code to find _your_ different use of that function
> > and fix it - we can just fix it in one place and we have the reference
> > there for what your code expects.
> 
> What about the patch below?  Once you tell me it is fine to you I'll 
> adapt the MCPM series to it.

Yes, that looks like a saner solution.  Thanks.
Nicolas Pitre April 23, 2013, 9:56 p.m. UTC | #2
On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:

> On Tue, Apr 23, 2013 at 05:03:06PM -0400, Nicolas Pitre wrote:
> > On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> > 
> > > What I suggest for the time being is to provide new inline function(s)
> > > in arch/arm/include/cacheflush.h which are purposed for your application,
> > > document them in that file, and call the implementation you're currently
> > > using.  That means if we do have to change it in the future (for example,
> > > we don't need to do anything in the dcache flushing stuff) we don't have
> > > to hunt through all the code to find _your_ different use of that function
> > > and fix it - we can just fix it in one place and we have the reference
> > > there for what your code expects.
> > 
> > What about the patch below?  Once you tell me it is fine to you I'll 
> > adapt the MCPM series to it.
> 
> Yes, that looks like a saner solution.  Thanks.

May I add your ACK to the patch?


Nicolas
Russell King - ARM Linux April 23, 2013, 10:44 p.m. UTC | #3
On Tue, Apr 23, 2013 at 05:56:03PM -0400, Nicolas Pitre wrote:
> On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> 
> > On Tue, Apr 23, 2013 at 05:03:06PM -0400, Nicolas Pitre wrote:
> > > On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> > > 
> > > > What I suggest for the time being is to provide new inline function(s)
> > > > in arch/arm/include/cacheflush.h which are purposed for your application,
> > > > document them in that file, and call the implementation you're currently
> > > > using.  That means if we do have to change it in the future (for example,
> > > > we don't need to do anything in the dcache flushing stuff) we don't have
> > > > to hunt through all the code to find _your_ different use of that function
> > > > and fix it - we can just fix it in one place and we have the reference
> > > > there for what your code expects.
> > > 
> > > What about the patch below?  Once you tell me it is fine to you I'll 
> > > adapt the MCPM series to it.
> > 
> > Yes, that looks like a saner solution.  Thanks.
> 
> May I add your ACK to the patch?

Yes, sure.

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre April 24, 2013, 4:11 a.m. UTC | #4
On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:

> On Tue, Apr 23, 2013 at 05:56:03PM -0400, Nicolas Pitre wrote:
> > On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> > 
> > > On Tue, Apr 23, 2013 at 05:03:06PM -0400, Nicolas Pitre wrote:
> > > > On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> > > > 
> > > > > What I suggest for the time being is to provide new inline function(s)
> > > > > in arch/arm/include/cacheflush.h which are purposed for your application,
> > > > > document them in that file, and call the implementation you're currently
> > > > > using.  That means if we do have to change it in the future (for example,
> > > > > we don't need to do anything in the dcache flushing stuff) we don't have
> > > > > to hunt through all the code to find _your_ different use of that function
> > > > > and fix it - we can just fix it in one place and we have the reference
> > > > > there for what your code expects.
> > > > 
> > > > What about the patch below?  Once you tell me it is fine to you I'll 
> > > > adapt the MCPM series to it.
> > > 
> > > Yes, that looks like a saner solution.  Thanks.
> > 
> > May I add your ACK to the patch?
> 
> Yes, sure.
> 
> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>

Thanks.  I've adapted the series to this, and mcpm_set_entry_vector() 
now looks like this:

+extern unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
+
+void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
+{
+	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
+	mcpm_entry_vectors[cluster][cpu] = val;
+	sync_cache_w(&mcpm_entry_vectors[cluster][cpu]);
+}

Similarly for the CCI driver:

+	/*
+	 * Multi-cluster systems may need this data when non-coherent, during
+	 * cluster power-up/power-down. Make sure it reaches main memory:
+	 */
+	sync_cache_w(info);
+	sync_cache_w(&info);

This is also much prettier.

Assorted small changes for comments that followed the v4 post have been 
integrated as well.  I don't think they justify a repost of the whole 
thing though.  They are:

- Replace custom cache sync helpers with the sync_cache_*() ones from 
  cacheflush.h.

- Remove volatile annotations since explicit cache ops act as memory 
  barriers now.

- Rename mcpm_entry.h to mcpm.h since this is not solely about entry but
  more general.

- Export the __CACHE_WRITEBACK_* constants via asm_offsets.h since 
  they're needed by assembly code.

- Add some more comments and clarify some existing ones.

I managed to retest this on RTSM after fighting with the license system.

The latest MCPM patches are available here:

  git://git.linaro.org/people/nico/linux mcpm

Diffstat:

 Documentation/arm/cluster-pm-race-avoidance.txt | 498 ++++++++++++++++++
 Documentation/arm/vlocks.txt                    | 211 ++++++++
 arch/arm/Kconfig                                |   8 +
 arch/arm/common/Makefile                        |   1 +
 arch/arm/common/mcpm_entry.c                    | 263 +++++++++
 arch/arm/common/mcpm_head.S                     | 219 ++++++++
 arch/arm/common/mcpm_platsmp.c                  |  92 ++++
 arch/arm/common/vlock.S                         | 108 ++++
 arch/arm/common/vlock.h                         |  29 +
 arch/arm/include/asm/cacheflush.h               |  75 +++
 arch/arm/include/asm/mcpm.h                     | 209 ++++++++
 arch/arm/kernel/asm-offsets.c                   |   4 +
 12 files changed, 1717 insertions(+)

The corresponding DCSCB support for RTSM is here:

  git://git.linaro.org/people/nico/linux mcpm+dcscb

although that part is probably more suitable for ARM-SOC.

Do you want those patches to be posted again, or should I send a new 
pull request?


Nicolas
tip-bot for Dave Martin April 24, 2013, 2:25 p.m. UTC | #5
On Tue, Apr 23, 2013 at 05:03:06PM -0400, Nicolas Pitre wrote:
> On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> 
> > What I suggest for the time being is to provide new inline function(s)
> > in arch/arm/include/cacheflush.h which are purposed for your application,
> > document them in that file, and call the implementation you're currently
> > using.  That means if we do have to change it in the future (for example,
> > we don't need to do anything in the dcache flushing stuff) we don't have
> > to hunt through all the code to find _your_ different use of that function
> > and fix it - we can just fix it in one place and we have the reference
> > there for what your code expects.
> 
> What about the patch below?  Once you tell me it is fine to you I'll 
> adapt the MCPM series to it.

This looks appropriate for mcpm's needs, and generic enough for other
code to reuse.  Thanks for that.

Acked-by: Dave Martin <dave.martin@linaro.org>

> 
> ----- >8
> From: Nicolas Pitre <nicolas.pitre@linaro.org>
> Date: Tue, 23 Apr 2013 16:45:40 -0400
> Subject: [PATCH] ARM: cacheflush: add synchronization helpers for mixed cache state accesses
> 
> Algorithms used by the MCPM layer rely on state variables which are
> accessed while the cache is either active or inactive, depending
> on the code path and the active state.
> 
> This patch introduces generic cache maintenance helpers to provide the
> necessary cache synchronization for such state variables to always hit
> main memory in a ordered way.
> 
> Signed-off-by: Nicolas Pitre <nico@linaro.org>
> 
> diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
> index e1489c54cd..bff71388e7 100644
> --- a/arch/arm/include/asm/cacheflush.h
> +++ b/arch/arm/include/asm/cacheflush.h
> @@ -363,4 +363,79 @@ static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
>  		flush_cache_all();
>  }
>  
> +/*
> + * Memory synchronization helpers for mixed cached vs non cached accesses.
> + *
> + * Some synchronization algorithms have to set states in memory with the
> + * cache enabled or disabled depending on the code path.  It is crucial
> + * to always ensure proper cache maintenance to update main memory right
> + * away in that case.
> + *
> + * Any cached write must be followed by a cache clean operation.
> + * Any cached read must be preceded by a cache invalidate operation.
> + * Yet, in the read case, a cache flush i.e. atomic clean+invalidate
> + * operation is needed to avoid discarding possible concurrent writes to the
> + * accessed memory.
> + *
> + * Also, in order to prevent a cached writer from interfering with an
> + * adjacent non-cached writer, each state variable must be located to
> + * a separate cache line.
> + */
> +
> +/*
> + * This needs to be >= the max cache writeback size of all
> + * supported platforms included in the current kernel configuration.
> + * This is used to align state variables to their own cache lines.
> + */
> +#define __CACHE_WRITEBACK_ORDER 6  /* guessed from existing platforms */
> +#define __CACHE_WRITEBACK_GRANULE (1 << __CACHE_WRITEBACK_ORDER)
> +
> +/*
> + * There is no __cpuc_clean_dcache_area but we use it anyway for
> + * code intent clarity, and alias it to __cpuc_flush_dcache_area.
> + */
> +#define __cpuc_clean_dcache_area __cpuc_flush_dcache_area
> +
> +/*
> + * Ensure preceding writes to *p by this CPU are visible to
> + * subsequent reads by other CPUs:
> + */
> +static inline void __sync_cache_range_w(volatile void *p, size_t size)
> +{
> +	char *_p = (char *)p;
> +
> +	__cpuc_clean_dcache_area(_p, size);
> +	outer_clean_range(__pa(_p), __pa(_p + size));
> +}
> +
> +/*
> + * Ensure preceding writes to *p by other CPUs are visible to
> + * subsequent reads by this CPU.  We must be careful not to
> + * discard data simultaneously written by another CPU, hence the
> + * usage of flush rather than invalidate operations.
> + */
> +static inline void __sync_cache_range_r(volatile void *p, size_t size)
> +{
> +	char *_p = (char *)p;
> +
> +#ifdef CONFIG_OUTER_CACHE
> +	if (outer_cache.flush_range) {
> +		/*
> +		 * Ensure dirty data migrated from other CPUs into our cache
> +		 * are cleaned out safely before the outer cache is cleaned:
> +		 */
> +		__cpuc_clean_dcache_area(_p, size);
> +
> +		/* Clean and invalidate stale data for *p from outer ... */
> +		outer_flush_range(__pa(_p), __pa(_p + size));
> +	}
> +#endif
> +
> +	/* ... and inner cache: */
> +	__cpuc_flush_dcache_area(_p, size);
> +}
> +
> +#define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
> +#define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
> +
>  #endif
Russell King - ARM Linux April 24, 2013, 8:25 p.m. UTC | #6
On Wed, Apr 24, 2013 at 12:11:26AM -0400, Nicolas Pitre wrote:
> The latest MCPM patches are available here:
> 
>   git://git.linaro.org/people/nico/linux mcpm
> 
> Diffstat:
> 
>  Documentation/arm/cluster-pm-race-avoidance.txt | 498 ++++++++++++++++++
>  Documentation/arm/vlocks.txt                    | 211 ++++++++
>  arch/arm/Kconfig                                |   8 +
>  arch/arm/common/Makefile                        |   1 +
>  arch/arm/common/mcpm_entry.c                    | 263 +++++++++
>  arch/arm/common/mcpm_head.S                     | 219 ++++++++
>  arch/arm/common/mcpm_platsmp.c                  |  92 ++++
>  arch/arm/common/vlock.S                         | 108 ++++
>  arch/arm/common/vlock.h                         |  29 +
>  arch/arm/include/asm/cacheflush.h               |  75 +++
>  arch/arm/include/asm/mcpm.h                     | 209 ++++++++
>  arch/arm/kernel/asm-offsets.c                   |   4 +
>  12 files changed, 1717 insertions(+)
> 
> The corresponding DCSCB support for RTSM is here:
> 
>   git://git.linaro.org/people/nico/linux mcpm+dcscb
> 
> although that part is probably more suitable for ARM-SOC.
> 
> Do you want those patches to be posted again, or should I send a new 
> pull request?

I've been doing some further digging in these patches this evening, via
https://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/mcpm

I think they're now in pretty good shape and are ready to be pulled.
Nicolas Pitre April 24, 2013, 11:31 p.m. UTC | #7
On Wed, 24 Apr 2013, Russell King - ARM Linux wrote:

> On Wed, Apr 24, 2013 at 12:11:26AM -0400, Nicolas Pitre wrote:
> > The latest MCPM patches are available here:
> > 
> >   git://git.linaro.org/people/nico/linux mcpm
> > 
> > Diffstat:
> > 
> >  Documentation/arm/cluster-pm-race-avoidance.txt | 498 ++++++++++++++++++
> >  Documentation/arm/vlocks.txt                    | 211 ++++++++
> >  arch/arm/Kconfig                                |   8 +
> >  arch/arm/common/Makefile                        |   1 +
> >  arch/arm/common/mcpm_entry.c                    | 263 +++++++++
> >  arch/arm/common/mcpm_head.S                     | 219 ++++++++
> >  arch/arm/common/mcpm_platsmp.c                  |  92 ++++
> >  arch/arm/common/vlock.S                         | 108 ++++
> >  arch/arm/common/vlock.h                         |  29 +
> >  arch/arm/include/asm/cacheflush.h               |  75 +++
> >  arch/arm/include/asm/mcpm.h                     | 209 ++++++++
> >  arch/arm/kernel/asm-offsets.c                   |   4 +
> >  12 files changed, 1717 insertions(+)
> > 
> > The corresponding DCSCB support for RTSM is here:
> > 
> >   git://git.linaro.org/people/nico/linux mcpm+dcscb
> > 
> > although that part is probably more suitable for ARM-SOC.
> > 
> > Do you want those patches to be posted again, or should I send a new 
> > pull request?
> 
> I've been doing some further digging in these patches this evening, via
> https://git.linaro.org/gitweb?p=people/nico/linux.git;a=shortlog;h=refs/heads/mcpm
> 
> I think they're now in pretty good shape and are ready to be pulled.

Great!  Please feel free to pull the mcpm branch.  I've nothing to add 
to it.


Nicolas
diff mbox

Patch

diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index e1489c54cd..bff71388e7 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -363,4 +363,79 @@  static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
 		flush_cache_all();
 }
 
+/*
+ * Memory synchronization helpers for mixed cached vs non cached accesses.
+ *
+ * Some synchronization algorithms have to set states in memory with the
+ * cache enabled or disabled depending on the code path.  It is crucial
+ * to always ensure proper cache maintenance to update main memory right
+ * away in that case.
+ *
+ * Any cached write must be followed by a cache clean operation.
+ * Any cached read must be preceded by a cache invalidate operation.
+ * Yet, in the read case, a cache flush i.e. atomic clean+invalidate
+ * operation is needed to avoid discarding possible concurrent writes to the
+ * accessed memory.
+ *
+ * Also, in order to prevent a cached writer from interfering with an
+ * adjacent non-cached writer, each state variable must be located to
+ * a separate cache line.
+ */
+
+/*
+ * This needs to be >= the max cache writeback size of all
+ * supported platforms included in the current kernel configuration.
+ * This is used to align state variables to their own cache lines.
+ */
+#define __CACHE_WRITEBACK_ORDER 6  /* guessed from existing platforms */
+#define __CACHE_WRITEBACK_GRANULE (1 << __CACHE_WRITEBACK_ORDER)
+
+/*
+ * There is no __cpuc_clean_dcache_area but we use it anyway for
+ * code intent clarity, and alias it to __cpuc_flush_dcache_area.
+ */
+#define __cpuc_clean_dcache_area __cpuc_flush_dcache_area
+
+/*
+ * Ensure preceding writes to *p by this CPU are visible to
+ * subsequent reads by other CPUs:
+ */
+static inline void __sync_cache_range_w(volatile void *p, size_t size)
+{
+	char *_p = (char *)p;
+
+	__cpuc_clean_dcache_area(_p, size);
+	outer_clean_range(__pa(_p), __pa(_p + size));
+}
+
+/*
+ * Ensure preceding writes to *p by other CPUs are visible to
+ * subsequent reads by this CPU.  We must be careful not to
+ * discard data simultaneously written by another CPU, hence the
+ * usage of flush rather than invalidate operations.
+ */
+static inline void __sync_cache_range_r(volatile void *p, size_t size)
+{
+	char *_p = (char *)p;
+
+#ifdef CONFIG_OUTER_CACHE
+	if (outer_cache.flush_range) {
+		/*
+		 * Ensure dirty data migrated from other CPUs into our cache
+		 * are cleaned out safely before the outer cache is cleaned:
+		 */
+		__cpuc_clean_dcache_area(_p, size);
+
+		/* Clean and invalidate stale data for *p from outer ... */
+		outer_flush_range(__pa(_p), __pa(_p + size));
+	}
+#endif
+
+	/* ... and inner cache: */
+	__cpuc_flush_dcache_area(_p, size);
+}
+
+#define sync_cache_w(ptr) __sync_cache_range_w(ptr, sizeof *(ptr))
+#define sync_cache_r(ptr) __sync_cache_range_r(ptr, sizeof *(ptr))
+
 #endif