[v4,2/5] irqchip, gicv3: Workaround for Cavium ThunderX erratum 23154

Message ID 1439576885-15621-3-git-send-email-rric@kernel.org (mailing list archive)
State New, archived

Commit Message

Robert Richter Aug. 14, 2015, 6:28 p.m. UTC
From: Robert Richter <rrichter@cavium.com>

This patch implements Cavium ThunderX erratum 23154.

The gicv3 of ThunderX requires a modified version for reading the IAR
status to ensure data synchronization. Since this is in the fast-path
and called with each interrupt, runtime patching is used using jump
label patching for smallest overhead (no-op). This is the same
technique as used for tracepoints.

v4:
 * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()

v3:
 * fix erratum to depend on MIDR
 * use arm64 errata framework

v2:
 * implement code in a single asm() to keep instruction sequence
 * added comment to the code that explains the erratum
 * apply workaround also if running as guest, thus check MIDR

Signed-off-by: Robert Richter <rrichter@cavium.com>
---
 arch/arm64/Kconfig                  | 11 ++++++++++
 arch/arm64/include/asm/cpufeature.h |  3 ++-
 arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
 arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
 drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
 5 files changed, 74 insertions(+), 9 deletions(-)

Comments

Catalin Marinas Aug. 17, 2015, 4:40 p.m. UTC | #1
On Fri, Aug 14, 2015 at 08:28:02PM +0200, Robert Richter wrote:
> +struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;

This could also be "static struct ...". BTW, the use of static_key
directly is deprecated, so just do:

static DEFINE_STATIC_KEY_FALSE(is_cavium_thunderx);
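
As a rough illustration of what the key gates, here is a userspace model of
the dispatch in gic_read_iar(). It is only a sketch under loud assumptions:
a plain bool stands in for the static key (the kernel patches a branch in
place rather than testing a variable), and every model_* name is invented
for the example. It uses neither struct static_key nor
DEFINE_STATIC_KEY_FALSE.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the static key; it starts out false. */
static bool model_is_cavium_thunderx = false;

/* Which accessor the model would pick; names are illustrative only. */
enum model_iar_path { MODEL_IAR_COMMON, MODEL_IAR_THUNDERX };

/* Model of gicv3_enable_quirks(): flip the key once if the erratum applies. */
static void model_enable_quirks(bool cpu_has_erratum_23154)
{
	if (cpu_has_erratum_23154)
		model_is_cavium_thunderx = true;
}

/* Model of gic_read_iar(): report which accessor would be used. */
static enum model_iar_path model_read_iar_path(void)
{
	if (model_is_cavium_thunderx)
		return MODEL_IAR_THUNDERX;	/* padded read plus barrier */
	return MODEL_IAR_COMMON;		/* plain ICC_IAR1_EL1 read */
}
```

The point of the model is the polarity: with the key in its initial (false)
state, unaffected systems take the common path, and only an affected system
switches to the ThunderX accessor.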
David Daney Aug. 17, 2015, 5 p.m. UTC | #2
On 08/14/2015 11:28 AM, Robert Richter wrote:
> From: Robert Richter <rrichter@cavium.com>
>
> This patch implements Cavium ThunderX erratum 23154.
>
> The gicv3 of ThunderX requires a modified version for reading the IAR
> status to ensure data synchronization. Since this is in the fast-path
> and called with each interrupt, runtime patching is used using jump
> label patching for smallest overhead (no-op). This is the same
> technique as used for tracepoints.
>
> v4:
>   * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>
> v3:
>   * fix erratum to depend on MIDR
>   * use arm64 errata framework
>
> v2:
>   * implement code in a single asm() to keep instruction sequence
>   * added comment to the code that explains the erratum
>   * apply workaround also if running as guest, thus check MIDR
>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>   arch/arm64/Kconfig                  | 11 ++++++++++
>   arch/arm64/include/asm/cpufeature.h |  3 ++-
>   arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>   arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>   drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>   5 files changed, 74 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0f6edb14b7e4..4f866a4c6536 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -417,6 +417,17 @@ config ARM64_ERRATUM_845719
>
>   	  If unsure, say Y.
>
> +config CAVIUM_ERRATUM_23154
> +	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
> +	depends on ARCH_THUNDER

None of the other errata depend on a specific ARCH_*.  I think we should 
remove this 'depends on', so that a generic kernel can be configured to 
work on Thunder without having to first select ARCH_THUNDER.

David Daney


> +	default y
> +	help
> +	  The gicv3 of ThunderX requires a modified version for
> +	  reading the IAR status to ensure data synchronization
> +	  (access to icc_iar1_el1 is not sync'ed before and after).
> +
> +	  If unsure, say Y.
> +
>   endmenu
>
>
>
Robert Richter Aug. 19, 2015, 3:43 p.m. UTC | #3
On 17.08.15 17:40:03, Catalin Marinas wrote:
> On Fri, Aug 14, 2015 at 08:28:02PM +0200, Robert Richter wrote:
> > +struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;

Will add the static ...

> This could also be "static struct ...". BTW, the use of static_key
> directly is deprecated, so just do:
> 
> static DEFINE_STATIC_KEY_FALSE(is_cavium_thunderx);

... and, for simplicity, send a follow-up patch with this once the jump
label bits are merged upstream.

-Robert
Robert Richter Aug. 19, 2015, 4:05 p.m. UTC | #4
On 17.08.15 10:00:53, David Daney wrote:
> On 08/14/2015 11:28 AM, Robert Richter wrote:
> >+config CAVIUM_ERRATUM_23154
> >+	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
> >+	depends on ARCH_THUNDER
> 
> None of the other errata depend on a specific ARCH_*.  I think we should
> remove this 'depends on', so that a generic kernel can be configured to work
> on Thunder without having to first select ARCH_THUNDER.

Right, will remove the dependency. Same as for the other errata then.

Thanks,

-Robert

> >+	default y
> >+	help
> >+	  The gicv3 of ThunderX requires a modified version for
> >+	  reading the IAR status to ensure data synchronization
> >+	  (access to icc_iar1_el1 is not sync'ed before and after).
> >+
> >+	  If unsure, say Y.
Suzuki K Poulose Sept. 7, 2015, 4:54 p.m. UTC | #5
On 14/08/15 19:28, Robert Richter wrote:
> From: Robert Richter <rrichter@cavium.com>
>
> This patch implements Cavium ThunderX erratum 23154.
>
> The gicv3 of ThunderX requires a modified version for reading the IAR
> status to ensure data synchronization. Since this is in the fast-path
> and called with each interrupt, runtime patching is used using jump
> label patching for smallest overhead (no-op). This is the same
> technique as used for tracepoints.
>
> v4:
>   * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>
> v3:
>   * fix erratum to depend on MIDR
>   * use arm64 errata framework
>
> v2:
>   * implement code in a single asm() to keep instruction sequence
>   * added comment to the code that explains the erratum
>   * apply workaround also if running as guest, thus check MIDR
>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>   arch/arm64/Kconfig                  | 11 ++++++++++
>   arch/arm64/include/asm/cpufeature.h |  3 ++-
>   arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>   arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>   drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>   5 files changed, 74 insertions(+), 9 deletions(-)
>

...

>   };
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index c52f7ba205b4..4211c39b8744 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)

...

> +}
> +
>   static void __maybe_unused gic_write_pmr(u64 val)
>   {
>   	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>   	.free = gic_irq_domain_free,
>   };
>
> +static void gicv3_enable_quirks(void)
> +{
> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> +		static_key_slow_inc(&is_cavium_thunderx);

Maybe you could use the enable() method added to struct arm64_cpu_capabilities
here to perform the above operation, added by James :

commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
Author: James Morse <james.morse@arm.com>
Date:   Tue Jul 21 13:23:28 2015 +0100

     arm64: kernel: Add cpufeature 'enable' callback


> +}
> +
>   static int __init gic_of_init(struct device_node *node, struct device_node *parent)
>   {
>   	void __iomem *dist_base;
> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>   	gic_data.nr_redist_regions = nr_redist_regions;
>   	gic_data.redist_stride = redist_stride;
>
> +	gicv3_enable_quirks();
> +

than adding a hook here ?

Cheers
Suzuki
Marc Zyngier Sept. 7, 2015, 5:09 p.m. UTC | #6
On 07/09/15 17:54, Suzuki K. Poulose wrote:
> On 14/08/15 19:28, Robert Richter wrote:
>> From: Robert Richter <rrichter@cavium.com>
>>
>> This patch implements Cavium ThunderX erratum 23154.
>>
>> The gicv3 of ThunderX requires a modified version for reading the IAR
>> status to ensure data synchronization. Since this is in the fast-path
>> and called with each interrupt, runtime patching is used using jump
>> label patching for smallest overhead (no-op). This is the same
>> technique as used for tracepoints.
>>
>> v4:
>>   * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>>
>> v3:
>>   * fix erratum to depend on MIDR
>>   * use arm64 errata framework
>>
>> v2:
>>   * implement code in a single asm() to keep instruction sequence
>>   * added comment to the code that explains the erratum
>>   * apply workaround also if running as guest, thus check MIDR
>>
>> Signed-off-by: Robert Richter <rrichter@cavium.com>
>> ---
>>   arch/arm64/Kconfig                  | 11 ++++++++++
>>   arch/arm64/include/asm/cpufeature.h |  3 ++-
>>   arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>>   arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>>   drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>>   5 files changed, 74 insertions(+), 9 deletions(-)
>>
> 
> ...
> 
>>   };
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index c52f7ba205b4..4211c39b8744 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
> 
> ...
> 
>> +}
>> +
>>   static void __maybe_unused gic_write_pmr(u64 val)
>>   {
>>   	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
>> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>>   	.free = gic_irq_domain_free,
>>   };
>>
>> +static void gicv3_enable_quirks(void)
>> +{
>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>> +		static_key_slow_inc(&is_cavium_thunderx);
> 
> May be you could use the enable() method added to struct arm64_cpu_capability
> here to perform the above operation, added by James :
> 
> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> Author: James Morse <james.morse@arm.com>
> Date:   Tue Jul 21 13:23:28 2015 +0100
> 
>      arm64: kernel: Add cpufeature 'enable' callback
> 
> 
>> +}
>> +
>>   static int __init gic_of_init(struct device_node *node, struct device_node *parent)
>>   {
>>   	void __iomem *dist_base;
>> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>>   	gic_data.nr_redist_regions = nr_redist_regions;
>>   	gic_data.redist_stride = redist_stride;
>>
>> +	gicv3_enable_quirks();
>> +
> 
> than adding a hook here ?

It feels like a good idea, except that it creates a weird dependency
between the core arch code and the GIC driver. Can you think of an
elegant way to deal with this?

Thanks,

	M.
Catalin Marinas Sept. 7, 2015, 5:15 p.m. UTC | #7
On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> On 14/08/15 19:28, Robert Richter wrote:
> >diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> >index c52f7ba205b4..4211c39b8744 100644
> >--- a/drivers/irqchip/irq-gic-v3.c
> >+++ b/drivers/irqchip/irq-gic-v3.c
> >@@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
> 
> ...
> 
> >+}
> >+
> >  static void __maybe_unused gic_write_pmr(u64 val)
> >  {
> >  	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
> >@@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
> >  	.free = gic_irq_domain_free,
> >  };
> >
> >+static void gicv3_enable_quirks(void)
> >+{
> >+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >+		static_key_slow_inc(&is_cavium_thunderx);
> 
> May be you could use the enable() method added to struct arm64_cpu_capability
> here to perform the above operation, added by James :
> 
> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> Author: James Morse <james.morse@arm.com>
> Date:   Tue Jul 21 13:23:28 2015 +0100
> 
>     arm64: kernel: Add cpufeature 'enable' callback

I thought about this as well when looking at the patch but decided it's
better as it is. The "enable" method is meant to enable per-CPU features
(or workarounds) but here it is about GICv3, so we don't want to enable
for every CPU.
Robert Richter Sept. 7, 2015, 5:32 p.m. UTC | #8
On 07.09.15 18:09:48, Marc Zyngier wrote:
> On 07/09/15 17:54, Suzuki K. Poulose wrote:
> > On 14/08/15 19:28, Robert Richter wrote:
> >> From: Robert Richter <rrichter@cavium.com>

> >> +static void gicv3_enable_quirks(void)
> >> +{
> >> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >> +		static_key_slow_inc(&is_cavium_thunderx);
> > 
> > May be you could use the enable() method added to struct arm64_cpu_capability
> > here to perform the above operation, added by James :
> > 
> > commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> > Author: James Morse <james.morse@arm.com>
> > Date:   Tue Jul 21 13:23:28 2015 +0100
> > 
> >      arm64: kernel: Add cpufeature 'enable' callback
> > 
> > 
> >> +}
> >> +
> >>   static int __init gic_of_init(struct device_node *node, struct device_node *parent)
> >>   {
> >>   	void __iomem *dist_base;
> >> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
> >>   	gic_data.nr_redist_regions = nr_redist_regions;
> >>   	gic_data.redist_stride = redist_stride;
> >>
> >> +	gicv3_enable_quirks();
> >> +
> > 
> > than adding a hook here ?
> 
> It feels like a good idea, except that it creates a weird dependency
> between the core arch code and the GIC driver. Can you think of an
> elegant way to deal with this?

The only chance I see is to move it all to the driver and adding
enable() calls to the caps in gicv3_errata:

static void gicv3_enable_quirks(void)
{
        check_cpu_capabilities(gicv3_errata, "enabling workaround for");
}

Here the code is kept in the driver and called during driver init.

But current solution looks more elegant and simpler to me. So I would
not change it.
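
For comparison, the driver-private table idea could look roughly like the
userspace model below. This is a sketch, not the kernel code: the model_*
names and the struct layout are invented, the MIDR value is built by hand
from the implementer (0x43) and part number (0x0A1) in the patch, and the
range check only approximates what MIDR_RANGE() expresses. It does not call
check_cpu_capabilities().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 fields: implementer[31:24], variant[23:20], arch[19:16],
 * part[15:4], revision[3:0]. */
#define MODEL_MIDR_MODEL_MASK	0xff0ffff0u	/* implementer, arch, part */
#define MODEL_MIDR_REV_VAR_MASK	0x00f0000fu	/* variant and revision */
#define MODEL_MIDR_THUNDERX	0x430f0a10u	/* 0x43, arch 0xf, part 0x0A1 */

/* Hypothetical driver-private capability entry. */
struct model_cap {
	const char *desc;
	uint32_t midr_model;
	uint32_t range_min, range_max;	/* compared to variant|revision bits */
	void (*enable)(void);
};

static bool model_quirk_enabled;

static void model_enable_23154(void)
{
	model_quirk_enabled = true;
}

static const struct model_cap model_gicv3_errata[] = {
	{ "Cavium erratum 23154", MODEL_MIDR_THUNDERX, 0x00, 0x01,
	  model_enable_23154 },
	{ NULL, 0, 0, 0, NULL },	/* sentinel */
};

static bool model_midr_in_range(const struct model_cap *cap, uint32_t midr)
{
	if ((midr & MODEL_MIDR_MODEL_MASK) != cap->midr_model)
		return false;
	midr &= MODEL_MIDR_REV_VAR_MASK;
	return midr >= cap->range_min && midr <= cap->range_max;
}

/* Walk the private table once at driver init and run matching callbacks. */
static void model_gicv3_enable_quirks(uint32_t midr)
{
	const struct model_cap *cap;

	for (cap = model_gicv3_errata; cap->desc; cap++)
		if (model_midr_in_range(cap, midr) && cap->enable)
			cap->enable();
}
```

The table walk runs once at driver init, which is what keeps the quirk
handling out of the per-CPU enable path.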

-Robert
Suzuki K Poulose Sept. 7, 2015, 5:41 p.m. UTC | #9
On 07/09/15 18:15, Catalin Marinas wrote:
> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>> On 14/08/15 19:28, Robert Richter wrote:
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index c52f7ba205b4..4211c39b8744 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
>>
>> ...
>>
>>> +}
>>> +
>>>   static void __maybe_unused gic_write_pmr(u64 val)
>>>   {
>>>   	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
>>> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>>>   	.free = gic_irq_domain_free,
>>>   };
>>>
>>> +static void gicv3_enable_quirks(void)
>>> +{
>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>
>> May be you could use the enable() method added to struct arm64_cpu_capability
>> here to perform the above operation, added by James :
>>
>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>> Author: James Morse <james.morse@arm.com>
>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>
>>      arm64: kernel: Add cpufeature 'enable' callback
>
> I thought about this as well when looking at the patch but decided it's
> better as it is. The "enable" method is meant to enable per-CPU features
> (or workarounds) but here it is about GICv3, so we don't want to enable
> for every CPU.

Right. I have been playing with a series where the checks are delayed until
all CPUs are brought up. But yes, I understand this usecase is slightly different
and may not match what I was thinking about.

Maybe the GIC driver could have its own private list of arm64_cpu_capabilities
and run check_cpu_capabilities() over it, but that would fall back to what we
have at the moment. So maybe what we have here is as good as we can get.

Cheers
Suzuki
Catalin Marinas Sept. 8, 2015, 9 a.m. UTC | #10
On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
> On 07/09/15 18:15, Catalin Marinas wrote:
> >On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> >>On 14/08/15 19:28, Robert Richter wrote:
> >>>+static void gicv3_enable_quirks(void)
> >>>+{
> >>>+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >>>+		static_key_slow_inc(&is_cavium_thunderx);
> >>
> >>May be you could use the enable() method added to struct arm64_cpu_capability
> >>here to perform the above operation, added by James :
> >>
> >>commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> >>Author: James Morse <james.morse@arm.com>
> >>Date:   Tue Jul 21 13:23:28 2015 +0100
> >>
> >>     arm64: kernel: Add cpufeature 'enable' callback
> >
> >I thought about this as well when looking at the patch but decided it's
> >better as it is. The "enable" method is meant to enable per-CPU features
> >(or workarounds) but here it is about GICv3, so we don't want to enable
> >for every CPU.
> 
> Right. I have been playing with a series where the checks are delayed until
> all CPUs are brought up.

Unrelated to the GIC workaround, delaying the enable feature until the
CPUs are brought up is not always feasible. At some point we may
implement support to defer CPU bring-up to user space (I already have a
patch that does this when no DT enable-method is specified, but I won't
publish it before Qualcomm fixes its firmware ;)). But we may have other
reasons to start with CPUs hot-unplugged by default and turn them on
later.
Suzuki K Poulose Sept. 8, 2015, 9:09 a.m. UTC | #11
On 08/09/15 10:00, Catalin Marinas wrote:
> On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
>> On 07/09/15 18:15, Catalin Marinas wrote:
>>> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>>>> On 14/08/15 19:28, Robert Richter wrote:
>>>>> +static void gicv3_enable_quirks(void)
>>>>> +{
>>>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>>>
>>>> May be you could use the enable() method added to struct arm64_cpu_capability
>>>> here to perform the above operation, added by James :
>>>>
>>>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>>>> Author: James Morse <james.morse@arm.com>
>>>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>>>
>>>>      arm64: kernel: Add cpufeature 'enable' callback
>>>
>>> I thought about this as well when looking at the patch but decided it's
>>> better as it is. The "enable" method is meant to enable per-CPU features
>>> (or workarounds) but here it is about GICv3, so we don't want to enable
>>> for every CPU.
>>
>> Right. I have been playing with a series where the checks are delayed until
>> all CPUs are brought up.
>
> Unrelated to the GIC workaround, delaying the enable feature until the
> CPUs are brought up is not always feasible.

Right. But then, enabling a feature (and applying the alternatives) based on
a single CPU may not always be safe, e.g. for PAN. If one of the boot-time
CPUs doesn't have it, then we are in trouble (even though we WARN about it
from the SANITY check).

> At some point we may
> implement support to defer the CPU on to user space (I already have a
> patch that does this when no DT enable-method is specified, but I won't
> publish it before Qualcomm fixes its firmware ;)). But we may have other
> reasons to start with CPUs hot-unplugged by default and turn them on
> later.

We have the SANITY check infrastructure that WARNs in such cases, if the
features don't match. But still, wouldn't it be better to enable a feature
only if all the boot-time enabled CPUs have it? (Errata are an exception
though, since they only depend on whether one of the CPUs needs it.)

Thanks
Suzuki

>
Catalin Marinas Sept. 8, 2015, 9:37 a.m. UTC | #12
On Tue, Sep 08, 2015 at 10:09:30AM +0100, Suzuki K. Poulose wrote:
> On 08/09/15 10:00, Catalin Marinas wrote:
> >On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
> >>On 07/09/15 18:15, Catalin Marinas wrote:
> >>>On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> >>>>On 14/08/15 19:28, Robert Richter wrote:
> >>>>>+static void gicv3_enable_quirks(void)
> >>>>>+{
> >>>>>+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >>>>>+		static_key_slow_inc(&is_cavium_thunderx);
> >>>>
> >>>>May be you could use the enable() method added to struct arm64_cpu_capability
> >>>>here to perform the above operation, added by James :
> >>>>
> >>>>commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> >>>>Author: James Morse <james.morse@arm.com>
> >>>>Date:   Tue Jul 21 13:23:28 2015 +0100
> >>>>
> >>>>     arm64: kernel: Add cpufeature 'enable' callback
> >>>
> >>>I thought about this as well when looking at the patch but decided it's
> >>>better as it is. The "enable" method is meant to enable per-CPU features
> >>>(or workarounds) but here it is about GICv3, so we don't want to enable
> >>>for every CPU.
> >>
> >>Right. I have been playing with a series where the checks are delayed until
> >>all CPUs are brought up.
> >
> >Unrelated to the GIC workaround, delaying the enable feature until the
> >CPUs are brought up is not always feasible.
> 
> Right. But then, enabling a feature(and applying the alternatives) based on
> a single CPU may not be safe, always, like PAN. If one of the boot time CPU
> doesn't have it, then we are in trouble (even though we WARN about it from
> SANITY check)

I see your point but there's a trade-off. For some features it's not
feasible to postpone until user space (e.g. errata workarounds). But if
a CPU coming up late doesn't have compatible features, just keep it in a
loop (or park it back if possible or even refuse to boot any further). I
don't think we should cater for insane hardware configurations (e.g. mix
of PAN/no-PAN as we already do the code patching). Do you plan to defer
code patching as well?

Note that we may have to use the .enable function for errata workarounds
as well, not just features like PAN (we currently only do code patching
but we may have to do other things like issuing SMC calls, you never
know what's going to hit us).

> >At some point we may
> >implement support to defer the CPU on to user space (I already have a
> >patch that does this when no DT enable-method is specified, but I won't
> >publish it before Qualcomm fixes its firmware ;)). But we may have other
> >reasons to start with CPUs hot-unplugged by default and turn them on
> >later.
> 
> We have SANITY check infrastructure that WARNs in such cases, if the features
> don't match.  But still, wouldn't it be better to enable a feature
> only if all the boot-time enabled CPUs have it ? (Errata is an exception though,
> which only depends on whether one of the CPU needs it).

If we ever need this, I think we should implement a separate late_enable
function as just deferring all features enabling is not generic enough.
But in the meantime, I don't think we should worry about this case,
let's wait and see whether we ever get such configurations (panicking
the kernel on incompatible features is a good starting point -
FPSIMD/no-FPSIMD, PAN/no-PAN etc.)
Suzuki K Poulose Sept. 8, 2015, 10:30 a.m. UTC | #13
On 08/09/15 10:37, Catalin Marinas wrote:
> On Tue, Sep 08, 2015 at 10:09:30AM +0100, Suzuki K. Poulose wrote:
>> On 08/09/15 10:00, Catalin Marinas wrote:
>>> On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
>>>> On 07/09/15 18:15, Catalin Marinas wrote:
>>>>> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>>>>>> On 14/08/15 19:28, Robert Richter wrote:
>>>>>>> +static void gicv3_enable_quirks(void)
>>>>>>> +{
>>>>>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>>>>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>>>>>
>>>>>> May be you could use the enable() method added to struct arm64_cpu_capability
>>>>>> here to perform the above operation, added by James :
>>>>>>
>>>>>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>>>>>> Author: James Morse <james.morse@arm.com>
>>>>>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>>>>>
>>>>>>      arm64: kernel: Add cpufeature 'enable' callback
>>>>>
>>>>> I thought about this as well when looking at the patch but decided it's
>>>>> better as it is. The "enable" method is meant to enable per-CPU features
>>>>> (or workarounds) but here it is about GICv3, so we don't want to enable
>>>>> for every CPU.
>>>>
>>>> Right. I have been playing with a series where the checks are delayed until
>>>> all CPUs are brought up.
>>>
>>> Unrelated to the GIC workaround, delaying the enable feature until the
>>> CPUs are brought up is not always feasible.
>>
>> Right. But then, enabling a feature(and applying the alternatives) based on
>> a single CPU may not be safe, always, like PAN. If one of the boot time CPU
>> doesn't have it, then we are in trouble (even though we WARN about it from
>> SANITY check)
>
> I see your point but there's a trade-off. For some features it's not
> feasible to postpone until user space (e.g. errata workarounds). But if

Right, I agree. I should have been more descriptive. Here is my plan :

Classify the capabilities / workarounds as two different types.

1) Errata workaround capability checks are triggered for each booting
   CPU.
2) CPU feature capability checks are delayed until all boot-time enabled
   CPUs are active, i.e. in smp_cpus_done() and before
   apply_alternatives_all().

(We could even classify some of the capabilities as CPU_LOCAL and check it
  per-CPU).

Delay the feature/capability detection to smp_cpus_done() and before
apply_alternatives_all().

i.e, :

  void __init smp_cpus_done(unsigned int max_cpus)
  {
         pr_info("SMP: Total of %d processors activated.\n", num_online_cpus());
+       setup_cpu_features();
         hyp_mode_check();
         apply_alternatives_all();
  }

Where setup_cpu_features() will do all the CPU feature related processing
based on the system wide safe value(will be available from the new infrastructure) :

1) cpu capability based on feature registers (e.g, GIC SYSREG, PAN, ATOMICS )
2) ELF_HWCAP
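
The difference between the two classes above can be sketched as a pair of
reductions over per-CPU results. This is a userspace illustration only: the
model_* helpers and the bool arrays are invented, and the real "system wide
safe value" infrastructure is not part of this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* A feature is safe system-wide only if every boot-time CPU has it. */
static bool model_feature_system_wide(const bool *cpu_has, size_t ncpus)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (!cpu_has[i])
			return false;
	return ncpus > 0;
}

/* An erratum workaround is needed as soon as any CPU is affected. */
static bool model_erratum_needed(const bool *cpu_affected, size_t ncpus)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (cpu_affected[i])
			return true;
	return false;
}
```

In other words, features reduce with a logical AND across the boot-time
CPUs, while errata reduce with a logical OR, which is why errata do not need
to wait for smp_cpus_done().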


> a CPU coming up late doesn't have compatible features, just keep it in a
> loop (or park it back if possible or even refuse to boot any further). I
> don't think we should cater for insane hardware configurations (e.g. mix


Any other new CPU, which is missing an available system capability, could be
made to loop, as you mentioned.
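
A late CPU check of that kind could be as simple as the following sketch.
It assumes capabilities are tracked as bits in a mask, which is an
illustration choice here; model_late_cpu_compatible() is a hypothetical
helper, not an existing kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the late CPU provides every capability the system has already
 * enabled; a CPU failing this check would be parked or left looping.
 */
static bool model_late_cpu_compatible(uint64_t system_caps, uint64_t cpu_caps)
{
	return (system_caps & ~cpu_caps) == 0;
}
```

Extra capabilities on the late CPU are harmless in this model; only a
missing one makes the CPU incompatible.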

> of PAN/no-PAN as we already do the code patching). Do you plan to defer
> code patching as well?

As shown above, the apply_alternatives_all() is already done from smp_cpus_done(),
which will stay there.

>
> Note that we may have to use the .enable function for errata workarounds
> as well, not just features like PAN (we currently only do code patching
> but we may have to do other things like issuing SMC calls, you never
> know what's going to hit us).

Given that ERRATAs are checked for each CPU and are not delayed, we need not
worry about them. But yes, we could have flags to indicate how/when the enable
methods should be invoked, e.g. per CPU (like PAN) or per SYSTEM (once for the
entire system).

>>> At some point we may
>>> implement support to defer the CPU on to user space (I already have a
>>> patch that does this when no DT enable-method is specified, but I won't
>>> publish it before Qualcomm fixes its firmware ;)). But we may have other
>>> reasons to start with CPUs hot-unplugged by default and turn them on
>>> later.
>>
>> We have SANITY check infrastructure that WARNs in such cases, if the features
>> don't match.  But still, wouldn't it be better to enable a feature
>> only if all the boot-time enabled CPUs have it ? (Errata is an exception though,
>> which only depends on whether one of the CPU needs it).
>
> If we ever need this, I think we should implement a separate late_enable
> function as just deferring all features enabling is not generic enough.
> But in the meantime, I don't think we should worry about this case,
> let's wait and see whether we ever get such configurations (panicking
> the kernel on incompatible features is a good starting point -
> FPSIMD/no-FPSIMD, PAN/no-PAN etc.)

OK. I will post the series after the merge window. We can discuss further
then.

Cheers
Suzuki

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0f6edb14b7e4..4f866a4c6536 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -417,6 +417,17 @@  config ARM64_ERRATUM_845719
 
 	  If unsure, say Y.
 
+config CAVIUM_ERRATUM_23154
+	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
+	depends on ARCH_THUNDER
+	default y
+	help
+	  The gicv3 of ThunderX requires a modified version for
+	  reading the IAR status to ensure data synchronization
+	  (access to icc_iar1_el1 is not sync'ed before and after).
+
+	  If unsure, say Y.
+
 endmenu
 
 
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index c1044218a63a..2a5e4c163ee5 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -25,8 +25,9 @@ 
 #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE	1
 #define ARM64_WORKAROUND_845719			2
 #define ARM64_HAS_SYSREG_GIC_CPUIF		3
+#define ARM64_WORKAROUND_CAVIUM_23154		4
 
-#define ARM64_NCAPS				4
+#define ARM64_NCAPS				5
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index a84ec605bed8..3f0c7683f252 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -62,15 +62,19 @@ 
 	(0xf			<< MIDR_ARCHITECTURE_SHIFT) | \
 	((partnum)		<< MIDR_PARTNUM_SHIFT))
 
-#define ARM_CPU_IMP_ARM		0x41
-#define ARM_CPU_IMP_APM		0x50
+#define ARM_CPU_IMP_ARM			0x41
+#define ARM_CPU_IMP_APM			0x50
+#define ARM_CPU_IMP_CAVIUM		0x43
 
-#define ARM_CPU_PART_AEM_V8	0xD0F
-#define ARM_CPU_PART_FOUNDATION	0xD00
-#define ARM_CPU_PART_CORTEX_A57	0xD07
-#define ARM_CPU_PART_CORTEX_A53	0xD03
+#define ARM_CPU_PART_AEM_V8		0xD0F
+#define ARM_CPU_PART_FOUNDATION		0xD00
+#define ARM_CPU_PART_CORTEX_A57		0xD07
+#define ARM_CPU_PART_CORTEX_A53		0xD03
+
+#define APM_CPU_PART_POTENZA		0x000
+
+#define CAVIUM_CPU_PART_THUNDERX	0x0A1
 
-#define APM_CPU_PART_POTENZA	0x000
 
 #define ID_AA64MMFR0_BIGENDEL0_SHIFT	16
 #define ID_AA64MMFR0_BIGENDEL0_MASK	(0xf << ID_AA64MMFR0_BIGENDEL0_SHIFT)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 6ffd91438560..574450c257a4 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -23,6 +23,7 @@ 
 
 #define MIDR_CORTEX_A53 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_THUNDERX	MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 
 #define CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
 			MIDR_ARCHITECTURE_MASK)
@@ -82,6 +83,14 @@  const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x04),
 	},
 #endif
+#ifdef CONFIG_CAVIUM_ERRATUM_23154
+	{
+	/* Cavium ThunderX, pass 1.x */
+		.desc = "Cavium erratum 23154",
+		.capability = ARM64_WORKAROUND_CAVIUM_23154,
+		MIDR_RANGE(MIDR_THUNDERX, 0x00, 0x01),
+	},
+#endif
 	{
 	}
 };
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index c52f7ba205b4..4211c39b8744 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -107,7 +107,7 @@  static void gic_redist_wait_for_rwp(void)
 }
 
 /* Low level accessors */
-static u64 __maybe_unused gic_read_iar(void)
+static u64 gic_read_iar_common(void)
 {
 	u64 irqstat;
 
@@ -115,6 +115,38 @@  static u64 __maybe_unused gic_read_iar(void)
 	return irqstat;
 }
 
+/*
+ * Cavium ThunderX erratum 23154
+ *
+ * The gicv3 of ThunderX requires a modified version for reading the
+ * IAR status to ensure data synchronization (access to icc_iar1_el1
+ * is not sync'ed before and after).
+ */
+static u64 gic_read_iar_cavium_thunderx(void)
+{
+	u64 irqstat;
+
+	asm volatile(
+		"nop;nop;nop;nop\n\t"
+		"nop;nop;nop;nop\n\t"
+		"mrs_s %0, " __stringify(ICC_IAR1_EL1) "\n\t"
+		"nop;nop;nop;nop"
+		: "=r" (irqstat));
+	mb();
+
+	return irqstat;
+}
+
+struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;
+
+static u64 __maybe_unused gic_read_iar(void)
+{
+	if (static_key_false(&is_cavium_thunderx))
+		return gic_read_iar_cavium_thunderx();
+	else
+		return gic_read_iar_common();
+}
+
 static void __maybe_unused gic_write_pmr(u64 val)
 {
 	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
@@ -766,6 +798,12 @@  static const struct irq_domain_ops gic_irq_domain_ops = {
 	.free = gic_irq_domain_free,
 };
 
+static void gicv3_enable_quirks(void)
+{
+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
+		static_key_slow_inc(&is_cavium_thunderx);
+}
+
 static int __init gic_of_init(struct device_node *node, struct device_node *parent)
 {
 	void __iomem *dist_base;
@@ -825,6 +863,8 @@  static int __init gic_of_init(struct device_node *node, struct device_node *pare
 	gic_data.nr_redist_regions = nr_redist_regions;
 	gic_data.redist_stride = redist_stride;
 
+	gicv3_enable_quirks();
+
 	/*
 	 * Find out how many interrupts are supported.
 	 * The GIC only supports up to 1020 interrupt sources (SGI+PPI+SPI)