diff mbox

arm64: Make L1_CACHE_SHIFT configurable

Message ID 1518479125-14428-1-git-send-email-f.fainelli@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Florian Fainelli Feb. 12, 2018, 11:45 p.m. UTC
On many platforms, including, but not limited to Brahma-B53 platforms,
the L1 cache line size is 64bytes. Increasing the value to 128bytes
appears to be creating performance problems for workloads involving
network drivers and lots of data movement. In order to keep what was
introduced with 97303480753e ("arm64: Increase the max granular size"),
a kernel built for ARCH_THUNDER or ARCH_THUNDER2 will get a 128bytes
cache line size definition.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 arch/arm64/Kconfig             | 10 ++++++++++
 arch/arm64/Kconfig.platforms   |  2 ++
 arch/arm64/include/asm/cache.h |  2 +-
 3 files changed, 13 insertions(+), 1 deletion(-)

Comments

Timur Tabi Feb. 12, 2018, 11:52 p.m. UTC | #1
On 02/12/2018 05:45 PM, Florian Fainelli wrote:
> +config ARM64_L1_CACHE_SHIFT
> +	int
> +	default 7 if ARM64_L1_CACHE_SHIFT_7
> +	default 6

Shouldn't this be the other way around?  Everyone is used to 7 now, so 
you're changing the default back to 6.  I would think that it should be 
7 by default, and platforms like Brahma-B53 should force it to 6.
Florian Fainelli Feb. 12, 2018, 11:57 p.m. UTC | #2
On 02/12/2018 03:52 PM, Timur Tabi wrote:
> On 02/12/2018 05:45 PM, Florian Fainelli wrote:
>> +config ARM64_L1_CACHE_SHIFT
>> +    int
>> +    default 7 if ARM64_L1_CACHE_SHIFT_7
>> +    default 6
> 
> Shouldn't this be the other way around?  Everyone is used to 7 now, so
> you're changing the default back to 6.  I would think that it should be
> 7 by default, and platforms like Brahma-B53 should force it to 6.

That is debatable, is there a good publicly available table of what the
typical L1 cache line size is on ARMv8 platforms?
Timur Tabi Feb. 13, 2018, 12:10 a.m. UTC | #3
On 02/12/2018 05:57 PM, Florian Fainelli wrote:
> That is debatable, is there a good publicly available table of what the
> typical L1 cache line size is on ARMv8 platforms?

I don't have that, but I was under the impression that we moved from 6 
to 7 because more and more ARMv8 platforms have 128-byte caches, so that 
is the "new normal".
Florian Fainelli Feb. 13, 2018, 12:17 a.m. UTC | #4
On 02/12/2018 04:10 PM, Timur Tabi wrote:
> On 02/12/2018 05:57 PM, Florian Fainelli wrote:
>> That is debatable, is there a good publicly available table of what the
>> typical L1 cache line size is on ARMv8 platforms?
> 
> I don't have that, but I was under the impression that we moved from 6
> to 7 because more and more ARMv8 platforms have 128-byte caches, so that
> is the "new normal".
> 

That does not seem to be the data that I am collecting from ARM's
website and some quick googling:

The following cores appear to have a 64bytes L1D cache line size: A55,
A73 (fixed), A35, A32, A53, A57 (fixed), A72 (fixed) even the Falkor
seems to be that way according to [1].

APM Mustang also seems to be 64b L1D according to [2].

[1]: https://en.wikichip.org/wiki/qualcomm/microarchitectures/falkor
[2]: http://www.7-cpu.com/cpu/X-Gene.html

And then we seem to covering what the ARM64 mainline kernel knows about
non-ARM implementations: ThunderX and ThunderX2 (formerly Broadcom
Vulcan). There is possibly the Qualcomm Kryo is different, but wikipedia
seems to suggest it is a derivative of existing Cortex-A CPUs which have
a 64b cache line size.

Let's see what Catalin and Will think about what the default should be.

Thanks!
Catalin Marinas Feb. 13, 2018, 11:57 a.m. UTC | #5
On Mon, Feb 12, 2018 at 03:45:23PM -0800, Florian Fainelli wrote:
> On many platforms, including, but not limited to Brahma-B53 platforms,
> the L1 cache line size is 64bytes. Increasing the value to 128bytes
> appears to be creating performance problems for workloads involving
> network drivers and lots of data movement. In order to keep what was
> introduced with 97303480753e ("arm64: Increase the max granular size"),
> a kernel built for ARCH_THUNDER or ARCH_THUNDER2 will get a 128bytes
> cache line size definition.

This approach has been raised before ([1] as an example but you can
probably find other threads) and NAK'ed. I really don't want this macro
to be configurable as we aim for a single kernel Image.

My proposal was to move L1_CACHE_SHIFT back to 6 and ARCH_DMA_MIN_ALIGN
to 128 as this is the largest known CWG. The networking code is wrong in
assuming SKB_DATA_ALIGN only needs SMP_CACHE_BYTES for DMA alignment but
we can add some safety checks (i.e. WARN_ON) in the arch dma ops code if
the device is non-coherent.

I'll send a patch to the list (hopefully later today).

Catalin

[1] https://patchwork.kernel.org/patch/8634481/
Jon Masters Feb. 19, 2018, 11:46 p.m. UTC | #6
On 02/12/2018 07:17 PM, Florian Fainelli wrote:
> On 02/12/2018 04:10 PM, Timur Tabi wrote:
>> On 02/12/2018 05:57 PM, Florian Fainelli wrote:
>>> That is debatable, is there a good publicly available table of what the
>>> typical L1 cache line size is on ARMv8 platforms?

With a server hat on...

There are many ARMv8 server platforms that do 64b today, but future
designs are likely to head toward 128b (for a variety of reasons). Many
of the earlier designs were 64b because that's what certain other arches
were using in their server cores. I doubt Vulcan will remain a unique
and special case for very long. On the CCIX side of things, I've been
trying to push people to go with 128b lines in future designs too.

Jon.
diff mbox

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b488076d63c2..8060cbbbfd77 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -782,6 +782,16 @@  config ARCH_WANT_HUGE_PMD_SHARE
 config ARCH_HAS_CACHE_LINE_SIZE
 	def_bool y
 
+config ARM64_L1_CACHE_SHIFT_7
+	bool
+	help
+	  Setting ARM64 L1 cache line size to 128 bytes.
+
+config ARM64_L1_CACHE_SHIFT
+	int
+	default 7 if ARM64_L1_CACHE_SHIFT_7
+	default 6
+
 source "mm/Kconfig"
 
 config SECCOMP
diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index 2401373565ff..b595f5624f75 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -228,11 +228,13 @@  config ARCH_SPRD
 
 config ARCH_THUNDER
 	bool "Cavium Inc. Thunder SoC Family"
+	select ARM64_L1_CACHE_SHIFT_7
 	help
 	  This enables support for Cavium's Thunder Family of SoCs.
 
 config ARCH_THUNDER2
 	bool "Cavium ThunderX2 Server Processors"
+	select ARM64_L1_CACHE_SHIFT_7
 	select GPIOLIB
 	help
 	  This enables support for Cavium's ThunderX2 CN99XX family of
diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index ea9bb4e0e9bb..2ff64929e6bd 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -29,7 +29,7 @@ 
 #define ICACHE_POLICY_VIPT	2
 #define ICACHE_POLICY_PIPT	3
 
-#define L1_CACHE_SHIFT		7
+#define L1_CACHE_SHIFT		CONFIG_ARM64_L1_CACHE_SHIFT
 #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
 
 /*