Message ID: 20191126224446.15145-3-consult-mg@gstardust.com (mailing list archive)
State: New, archived
Series: riscv: Align shared mappings to avoid cache aliasing
On Tue, 26 Nov 2019 14:44:46 PST (-0800), consult-mg@gstardust.com wrote:
> Set SHMLBA to the maximum cache "span" (line size * number of sets) of
> all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
> This avoids VIPT cache aliasing with minimal alignment constraints.
>
> If the device tree does not provide cache parameters, use a conservative
> default of 16 KB: only large enough to avoid aliasing in most VIPT caches.
>
> Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
> ---
>  arch/riscv/include/asm/Kbuild     |  1 -
>  arch/riscv/include/asm/shmparam.h | 12 +++++++
>  arch/riscv/kernel/cacheinfo.c     | 52 +++++++++++++++++++++++++++++++
>  3 files changed, 64 insertions(+), 1 deletion(-)
>  create mode 100644 arch/riscv/include/asm/shmparam.h
>
> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
> index 16970f246860..3905765807af 100644
> --- a/arch/riscv/include/asm/Kbuild
> +++ b/arch/riscv/include/asm/Kbuild
> @@ -27,7 +27,6 @@ generic-y += percpu.h
>  generic-y += preempt.h
>  generic-y += sections.h
>  generic-y += serial.h
> -generic-y += shmparam.h
>  generic-y += topology.h
>  generic-y += trace_clock.h
>  generic-y += unaligned.h
> diff --git a/arch/riscv/include/asm/shmparam.h b/arch/riscv/include/asm/shmparam.h
> new file mode 100644
> index 000000000000..9b6a98153648
> --- /dev/null
> +++ b/arch/riscv/include/asm/shmparam.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_RISCV_SHMPARAM_H
> +#define _ASM_RISCV_SHMPARAM_H
> +
> +/*
> + * Minimum alignment of shared memory segments as a function of cache geometry.
> + */
> +#define SHMLBA arch_shmlba()

I'd prefer if we inline the memoization, which would avoid the cost of a
function call in the general case.  You can also avoid that 0 test by
initializing the variable to PAGE_SIZE and then filling it out in our early
init code -- maybe setup_vm()?  That's what SPARC is doing.

> +
> +long arch_shmlba(void);
> +
> +#endif /* _ASM_RISCV_SHMPARAM_H */
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index 4c90c07d8c39..1bc7df8577d6 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -1,12 +1,61 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  /*
>   * Copyright (C) 2017 SiFive
> + * Copyright (C) 2019 Aril Inc
>   */
>
>  #include <linux/cacheinfo.h>
>  #include <linux/cpu.h>
>  #include <linux/of.h>
>  #include <linux/of_device.h>
> +#include <linux/mm.h>
> +
> +static long shmlba;
> +
> +
> +/*
> + * Assuming cache size = line size * #sets * N for N-way associative caches,
> + * return the max cache "span" == (line size * #sets) == (cache size / N)
> + * across all L1 caches, or 0 if cache parameters are not available.
> + * VIPT caches with span > min page size are susceptible to aliasing.
> + */
> +static long get_max_cache_span(void)
> +{
> +        struct cpu_cacheinfo *this_cpu_ci;
> +        struct cacheinfo *this_leaf;
> +        long span, max_span = 0;
> +        int cpu, leaf;
> +
> +        for_each_possible_cpu(cpu) {
> +                this_cpu_ci = get_cpu_cacheinfo(cpu);
> +                this_leaf = this_cpu_ci->info_list;
> +                for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
> +                        if (this_leaf->level > 1)
> +                                break;
> +                        span = this_leaf->coherency_line_size *
> +                               this_leaf->number_of_sets;
> +                        if (span > max_span)
> +                                max_span = span;
> +                        this_leaf++;
> +                }
> +        }
> +        return max_span;
> +}
> +
> +/*
> + * Align shared mappings to the maximum cache "span" to avoid aliasing
> + * in VIPT caches, for performance.
> + * The returned SHMLBA value is always a power-of-two multiple of PAGE_SIZE.
> + */
> +long arch_shmlba(void)
> +{
> +        if (shmlba == 0) {
> +                long max_span = get_max_cache_span();
> +
> +                shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;

I'd prefer to avoid sneaking in a default 4*PAGE_SIZE here, just default to
PAGE_SIZE and rely on systems with this behavior specifying the correct
tuning value in the device tree.  This avoids changing the behavior for
existing systems, which is a slight regression as the alignment uses more
memory.  It's not a big deal, but on systems that don't require alignment
for high performance there's no reason to just throw away memory --
particularly as we have some RISC-V systems with pretty limited memory (I'm
thinking of the Kendryte boards, though I don't know how SHMLBA interacts
with NOMMU so it might not matter).

> +        }
> +        return shmlba;
> +}
>
>  static void ci_leaf_init(struct cacheinfo *this_leaf,
>                           struct device_node *node,
> @@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
>          }
>          of_node_put(np);
>
> +        /* Force recalculating SHMLBA if cache parameters are updated. */
> +        shmlba = 0;
> +
>          return 0;
>  }
Palmer Dabbelt wrote on 2019-12-05 18:03:
> On Tue, 26 Nov 2019 14:44:46 PST (-0800), consult-mg@gstardust.com wrote:
>> Set SHMLBA to the maximum cache "span" (line size * number of sets) of
>> all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
>> This avoids VIPT cache aliasing with minimal alignment constraints.
>>
>> If the device tree does not provide cache parameters, use a conservative
>> default of 16 KB: only large enough to avoid aliasing in most VIPT
>> caches.
>>
>> Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
>> ---
>>  arch/riscv/include/asm/Kbuild     |  1 -
>>  arch/riscv/include/asm/shmparam.h | 12 +++++++
>>  arch/riscv/kernel/cacheinfo.c     | 52 +++++++++++++++++++++++++++++++
>>  3 files changed, 64 insertions(+), 1 deletion(-)
>>  create mode 100644 arch/riscv/include/asm/shmparam.h
>>
>> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
>> index 16970f246860..3905765807af 100644
>> --- a/arch/riscv/include/asm/Kbuild
>> +++ b/arch/riscv/include/asm/Kbuild
>> @@ -27,7 +27,6 @@ generic-y += percpu.h
>>  generic-y += preempt.h
>>  generic-y += sections.h
>>  generic-y += serial.h
>> -generic-y += shmparam.h
>>  generic-y += topology.h
>>  generic-y += trace_clock.h
>>  generic-y += unaligned.h
>> diff --git a/arch/riscv/include/asm/shmparam.h b/arch/riscv/include/asm/shmparam.h
>> new file mode 100644
>> index 000000000000..9b6a98153648
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/shmparam.h
>> @@ -0,0 +1,12 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_RISCV_SHMPARAM_H
>> +#define _ASM_RISCV_SHMPARAM_H
>> +
>> +/*
>> + * Minimum alignment of shared memory segments as a function of cache geometry.
>> + */
>> +#define SHMLBA arch_shmlba()
>
> I'd prefer if we inline the memoization, which would avoid the cost of a
> function call in the general case.  You can also avoid that 0 test by
> initializing the variable to PAGE_SIZE and then filling it out in our
> early init code -- maybe setup_vm()?  That's what SPARC is doing.

Good point.
Unlike SPARC, this patch re-uses existing code in drivers/base/cacheinfo.c
to compute cache parameters.  To preserve that, it'll be more robust to
initialize shmlba at a point certain to have those parameters -- at the
comment far below, "Force recalculating SHMLBA if cache parameters are
updated."  That way it keeps working if that point in time changes.

>> +
>> +long arch_shmlba(void);
>> +
>> +#endif /* _ASM_RISCV_SHMPARAM_H */
>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>> index 4c90c07d8c39..1bc7df8577d6 100644
>> --- a/arch/riscv/kernel/cacheinfo.c
>> +++ b/arch/riscv/kernel/cacheinfo.c
>> @@ -1,12 +1,61 @@
>>  // SPDX-License-Identifier: GPL-2.0-only
>>  /*
>>   * Copyright (C) 2017 SiFive
>> + * Copyright (C) 2019 Aril Inc
>>   */
>>
>>  #include <linux/cacheinfo.h>
>>  #include <linux/cpu.h>
>>  #include <linux/of.h>
>>  #include <linux/of_device.h>
>> +#include <linux/mm.h>
>> +
>> +static long shmlba;
>> +
>> +
>> +/*
>> + * Assuming cache size = line size * #sets * N for N-way associative caches,
>> + * return the max cache "span" == (line size * #sets) == (cache size / N)
>> + * across all L1 caches, or 0 if cache parameters are not available.
>> + * VIPT caches with span > min page size are susceptible to aliasing.
>> + */
>> +static long get_max_cache_span(void)
>> +{
>> +        struct cpu_cacheinfo *this_cpu_ci;
>> +        struct cacheinfo *this_leaf;
>> +        long span, max_span = 0;
>> +        int cpu, leaf;
>> +
>> +        for_each_possible_cpu(cpu) {
>> +                this_cpu_ci = get_cpu_cacheinfo(cpu);
>> +                this_leaf = this_cpu_ci->info_list;
>> +                for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
>> +                        if (this_leaf->level > 1)
>> +                                break;
>> +                        span = this_leaf->coherency_line_size *
>> +                               this_leaf->number_of_sets;
>> +                        if (span > max_span)
>> +                                max_span = span;
>> +                        this_leaf++;
>> +                }
>> +        }
>> +        return max_span;
>> +}
>> +
>> +/*
>> + * Align shared mappings to the maximum cache "span" to avoid aliasing
>> + * in VIPT caches, for performance.
>> + * The returned SHMLBA value is always a power-of-two multiple of
>> PAGE_SIZE.
>> + */
>> +long arch_shmlba(void)
>> +{
>> +        if (shmlba == 0) {
>> +                long max_span = get_max_cache_span();
>> +
>> +                shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;
>
> I'd prefer to avoid sneaking in a default 4*PAGE_SIZE here, just default
> to PAGE_SIZE and rely on systems with this behavior specifying the
> correct tuning value in the device tree.

Fair enough.

> This avoids changing the behavior for existing systems, which is a
> slight regression as the alignment uses more memory.  It's not a big
> deal, but on systems that don't require alignment for high performance
> there's no reason to just throw away memory -- particularly as we have
> some RISC-V systems with pretty limited memory

Greater alignment takes up more virtual memory, not more physical memory.

> (I'm thinking of the Kendryte boards, though I don't know how SHMLBA
> interacts with NOMMU so it might not matter).

There's no virtual memory in NOMMU, so indeed it doesn't matter.

M

>> +        }
>> +        return shmlba;
>> +}
>>
>>  static void ci_leaf_init(struct cacheinfo *this_leaf,
>>                           struct device_node *node,
>> @@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
>>          }
>>          of_node_put(np);
>>
>> +        /* Force recalculating SHMLBA if cache parameters are updated. */
>> +        shmlba = 0;
>> +
>>          return 0;
>>  }
On Thu, 05 Dec 2019 15:58:25 PST (-0800), consult-mg@gstardust.com wrote:
> Palmer Dabbelt wrote on 2019-12-05 18:03:
>> On Tue, 26 Nov 2019 14:44:46 PST (-0800), consult-mg@gstardust.com wrote:
>>> Set SHMLBA to the maximum cache "span" (line size * number of sets) of
>>> all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
>>> This avoids VIPT cache aliasing with minimal alignment constraints.
>>>
>>> If the device tree does not provide cache parameters, use a conservative
>>> default of 16 KB: only large enough to avoid aliasing in most VIPT
>>> caches.
>>>
>>> Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
>>> ---
>>>  arch/riscv/include/asm/Kbuild     |  1 -
>>>  arch/riscv/include/asm/shmparam.h | 12 +++++++
>>>  arch/riscv/kernel/cacheinfo.c     | 52 +++++++++++++++++++++++++++++++
>>>  3 files changed, 64 insertions(+), 1 deletion(-)
>>>  create mode 100644 arch/riscv/include/asm/shmparam.h
>>>
>>> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
>>> index 16970f246860..3905765807af 100644
>>> --- a/arch/riscv/include/asm/Kbuild
>>> +++ b/arch/riscv/include/asm/Kbuild
>>> @@ -27,7 +27,6 @@ generic-y += percpu.h
>>>  generic-y += preempt.h
>>>  generic-y += sections.h
>>>  generic-y += serial.h
>>> -generic-y += shmparam.h
>>>  generic-y += topology.h
>>>  generic-y += trace_clock.h
>>>  generic-y += unaligned.h
>>> diff --git a/arch/riscv/include/asm/shmparam.h b/arch/riscv/include/asm/shmparam.h
>>> new file mode 100644
>>> index 000000000000..9b6a98153648
>>> --- /dev/null
>>> +++ b/arch/riscv/include/asm/shmparam.h
>>> @@ -0,0 +1,12 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef _ASM_RISCV_SHMPARAM_H
>>> +#define _ASM_RISCV_SHMPARAM_H
>>> +
>>> +/*
>>> + * Minimum alignment of shared memory segments as a function of
>>> cache geometry.
>>> + */
>>> +#define SHMLBA arch_shmlba()
>>
>> I'd prefer if we inline the memoization, which would avoid the cost of a
>> function call in the general case.  You can also avoid that 0 test by
>> initializing the variable to PAGE_SIZE and then filling it out in our
>> early init code -- maybe setup_vm()?  That's what SPARC is doing.
>
> Good point.
> Unlike SPARC, this patch re-uses existing code in
> drivers/base/cacheinfo.c to compute cache parameters.  To preserve that,
> it'll be more robust to initialize shmlba at a point certain to have
> those parameters -- at the comment far below, "Force recalculating
> SHMLBA if cache parameters are updated."  That way it keeps working if
> that point in time changes.

Works for me.

>>> +
>>> +long arch_shmlba(void);
>>> +
>>> +#endif /* _ASM_RISCV_SHMPARAM_H */
>>> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
>>> index 4c90c07d8c39..1bc7df8577d6 100644
>>> --- a/arch/riscv/kernel/cacheinfo.c
>>> +++ b/arch/riscv/kernel/cacheinfo.c
>>> @@ -1,12 +1,61 @@
>>>  // SPDX-License-Identifier: GPL-2.0-only
>>>  /*
>>>   * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2019 Aril Inc
>>>   */
>>>
>>>  #include <linux/cacheinfo.h>
>>>  #include <linux/cpu.h>
>>>  #include <linux/of.h>
>>>  #include <linux/of_device.h>
>>> +#include <linux/mm.h>
>>> +
>>> +static long shmlba;
>>> +
>>> +
>>> +/*
>>> + * Assuming cache size = line size * #sets * N for N-way
>>> associative caches,
>>> + * return the max cache "span" == (line size * #sets) == (cache size
>>> / N)
>>> + * across all L1 caches, or 0 if cache parameters are not available.
>>> + * VIPT caches with span > min page size are susceptible to aliasing.
>>> + */
>>> +static long get_max_cache_span(void)
>>> +{
>>> +        struct cpu_cacheinfo *this_cpu_ci;
>>> +        struct cacheinfo *this_leaf;
>>> +        long span, max_span = 0;
>>> +        int cpu, leaf;
>>> +
>>> +        for_each_possible_cpu(cpu) {
>>> +                this_cpu_ci = get_cpu_cacheinfo(cpu);
>>> +                this_leaf = this_cpu_ci->info_list;
>>> +                for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
>>> +                        if (this_leaf->level > 1)
>>> +                                break;
>>> +                        span = this_leaf->coherency_line_size *
>>> +                               this_leaf->number_of_sets;
>>> +                        if (span > max_span)
>>> +                                max_span = span;
>>> +                        this_leaf++;
>>> +                }
>>> +        }
>>> +        return max_span;
>>> +}
>>> +
>>> +/*
>>> + * Align shared mappings to the maximum cache "span" to avoid aliasing
>>> + * in VIPT caches, for performance.
>>> + * The returned SHMLBA value is always a power-of-two multiple of
>>> PAGE_SIZE.
>>> + */
>>> +long arch_shmlba(void)
>>> +{
>>> +        if (shmlba == 0) {
>>> +                long max_span = get_max_cache_span();
>>> +
>>> +                shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;
>>
>> I'd prefer to avoid sneaking in a default 4*PAGE_SIZE here, just
>> default to PAGE_SIZE and rely on systems with this behavior specifying
>> the correct tuning value in the device tree.
>
> Fair enough.
>
>> This avoids changing the behavior for existing systems, which is a
>> slight regression as the alignment uses more memory.  It's not a big
>> deal, but on systems that don't require alignment for high performance
>> there's no reason to just throw away memory -- particularly as we have
>> some RISC-V systems with pretty limited memory
>
> Greater alignment takes up more virtual memory, not more physical memory.
>
>> (I'm thinking of the Kendryte boards, though I don't know how SHMLBA
>> interacts with NOMMU so it might not matter).
>
> There's no virtual memory in NOMMU, so indeed it doesn't matter.

Of course :).  I'd still like to leave the default alone, if only to
prevent people from relying on an arbitrary default decision.

>
> M
>
>>> +        }
>>> +        return shmlba;
>>> +}
>>>
>>>  static void ci_leaf_init(struct cacheinfo *this_leaf,
>>>                           struct device_node *node,
>>> @@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
>>>          }
>>>          of_node_put(np);
>>>
>>> +        /* Force recalculating SHMLBA if cache parameters are updated. */
>>> +        shmlba = 0;
>>> +
>>>          return 0;
>>>  }
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 16970f246860..3905765807af 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -27,7 +27,6 @@ generic-y += percpu.h
 generic-y += preempt.h
 generic-y += sections.h
 generic-y += serial.h
-generic-y += shmparam.h
 generic-y += topology.h
 generic-y += trace_clock.h
 generic-y += unaligned.h
diff --git a/arch/riscv/include/asm/shmparam.h b/arch/riscv/include/asm/shmparam.h
new file mode 100644
index 000000000000..9b6a98153648
--- /dev/null
+++ b/arch/riscv/include/asm/shmparam.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_SHMPARAM_H
+#define _ASM_RISCV_SHMPARAM_H
+
+/*
+ * Minimum alignment of shared memory segments as a function of cache geometry.
+ */
+#define SHMLBA arch_shmlba()
+
+long arch_shmlba(void);
+
+#endif /* _ASM_RISCV_SHMPARAM_H */
diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 4c90c07d8c39..1bc7df8577d6 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -1,12 +1,61 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
  * Copyright (C) 2017 SiFive
+ * Copyright (C) 2019 Aril Inc
  */

 #include <linux/cacheinfo.h>
 #include <linux/cpu.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
+#include <linux/mm.h>
+
+static long shmlba;
+
+
+/*
+ * Assuming cache size = line size * #sets * N for N-way associative caches,
+ * return the max cache "span" == (line size * #sets) == (cache size / N)
+ * across all L1 caches, or 0 if cache parameters are not available.
+ * VIPT caches with span > min page size are susceptible to aliasing.
+ */
+static long get_max_cache_span(void)
+{
+        struct cpu_cacheinfo *this_cpu_ci;
+        struct cacheinfo *this_leaf;
+        long span, max_span = 0;
+        int cpu, leaf;
+
+        for_each_possible_cpu(cpu) {
+                this_cpu_ci = get_cpu_cacheinfo(cpu);
+                this_leaf = this_cpu_ci->info_list;
+                for (leaf = 0; leaf < this_cpu_ci->num_leaves; leaf++) {
+                        if (this_leaf->level > 1)
+                                break;
+                        span = this_leaf->coherency_line_size *
+                               this_leaf->number_of_sets;
+                        if (span > max_span)
+                                max_span = span;
+                        this_leaf++;
+                }
+        }
+        return max_span;
+}
+
+/*
+ * Align shared mappings to the maximum cache "span" to avoid aliasing
+ * in VIPT caches, for performance.
+ * The returned SHMLBA value is always a power-of-two multiple of PAGE_SIZE.
+ */
+long arch_shmlba(void)
+{
+        if (shmlba == 0) {
+                long max_span = get_max_cache_span();
+
+                shmlba = max_span ? PAGE_ALIGN(max_span) : 4 * PAGE_SIZE;
+        }
+        return shmlba;
+}

 static void ci_leaf_init(struct cacheinfo *this_leaf,
                          struct device_node *node,
@@ -93,6 +142,9 @@ static int __populate_cache_leaves(unsigned int cpu)
         }
         of_node_put(np);

+        /* Force recalculating SHMLBA if cache parameters are updated. */
+        shmlba = 0;
+
         return 0;
 }
Set SHMLBA to the maximum cache "span" (line size * number of sets) of
all CPU L1 instruction and data caches (L2 and up are rarely VIPT).
This avoids VIPT cache aliasing with minimal alignment constraints.

If the device tree does not provide cache parameters, use a conservative
default of 16 KB: only large enough to avoid aliasing in most VIPT caches.

Signed-off-by: Marc Gauthier <consult-mg@gstardust.com>
---
 arch/riscv/include/asm/Kbuild     |  1 -
 arch/riscv/include/asm/shmparam.h | 12 +++++++
 arch/riscv/kernel/cacheinfo.c     | 52 +++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/include/asm/shmparam.h