Message ID | 1439576885-15621-3-git-send-email-rric@kernel.org (mailing list archive) |
---|---|
State | New, archived |
On Fri, Aug 14, 2015 at 08:28:02PM +0200, Robert Richter wrote:
> +struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;
This could also be "static struct ...". BTW, the use of static_key
directly is deprecated, so just do:
static DEFINE_STATIC_KEY_FALSE(is_cavium_thunderx);
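The key under discussion guards a fast-path branch that is off by default and switched on once at init. A minimal userspace sketch of that shape follows. It is a model only: a plain bool stands in for the kernel's jump label, so nothing is patched at runtime, and every name ending in _model is hypothetical rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model only: a plain bool stands in for the kernel's jump
 * label. A real static key patches the branch site; this flag is simply
 * read on every call.
 */
static bool is_cavium_thunderx_model;	/* file-local, false by default */

/* Hypothetical stand-ins for the two register accessors. */
static uint64_t read_iar_common_model(void)
{
	return 1;
}

static uint64_t read_iar_workaround_model(void)
{
	return 2;
}

/* Enable the quirk once, at init time. */
void enable_thunderx_quirk_model(void)
{
	is_cavium_thunderx_model = true;
}

/* Fast path: take the workaround only when the key has been enabled. */
uint64_t read_iar_model(void)
{
	if (is_cavium_thunderx_model)
		return read_iar_workaround_model();
	return read_iar_common_model();
}
```

Declaring the flag static keeps it private to the file, which is the point of the "static struct ..." remark above.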
On 08/14/2015 11:28 AM, Robert Richter wrote:
> From: Robert Richter <rrichter@cavium.com>
>
> This patch implements Cavium ThunderX erratum 23154.
>
> The gicv3 of ThunderX requires a modified version for reading the IAR
> status to ensure data synchronization. Since this is in the fast-path
> and called with each interrupt, runtime patching is used using jump
> label patching for smallest overhead (no-op). This is the same
> technique as used for tracepoints.
>
> v4:
>  * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>
> v3:
>  * fix erratum to be dependent on MIDR
>  * use arm64 errata framework
>
> v2:
>  * implement code in a single asm() to keep instruction sequence
>  * added comment to the code that explains the erratum
>  * apply workaround also if running as guest, thus check MIDR
>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>  arch/arm64/Kconfig                  | 11 ++++++++++
>  arch/arm64/include/asm/cpufeature.h |  3 ++-
>  arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>  arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>  drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>  5 files changed, 74 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0f6edb14b7e4..4f866a4c6536 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -417,6 +417,17 @@ config ARM64_ERRATUM_845719
>
>  	  If unsure, say Y.
>
> +config CAVIUM_ERRATUM_23154
> +	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
> +	depends on ARCH_THUNDER

None of the other errata depend on a specific ARCH_*. I think we should
remove this 'depends on', so that a generic kernel can be configured to
work on Thunder without having to first select ARCH_THUNDER.

David Daney

> +	default y
> +	help
> +	  The gicv3 of ThunderX requires a modified version for
> +	  reading the IAR status to ensure data synchronization
> +	  (access to icc_iar1_el1 is not sync'ed before and after).
> +
> +	  If unsure, say Y.
> +
>  endmenu
>
>
On 17.08.15 17:40:03, Catalin Marinas wrote:
> On Fri, Aug 14, 2015 at 08:28:02PM +0200, Robert Richter wrote:
> > +struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;

Will add the static ...

> This could also be "static struct ...". BTW, the use of static_key
> directly is deprecated, so just do:
>
> static DEFINE_STATIC_KEY_FALSE(is_cavium_thunderx);

... and for simplicity a patch with this after the jump label bits are
merged upstream.

-Robert
On 17.08.15 10:00:53, David Daney wrote:
> On 08/14/2015 11:28 AM, Robert Richter wrote:
> >+config CAVIUM_ERRATUM_23154
> >+	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
> >+	depends on ARCH_THUNDER
>
> None of the other errata depend on a specific ARCH_*. I think we should
> remove this 'depends on', so that a generic kernel can be configured to work
> on Thunder without having to first select ARCH_THUNDER.

Right, will remove the dependency. Same as for the other errata then.

Thanks,

-Robert

> >+	default y
> >+	help
> >+	  The gicv3 of ThunderX requires a modified version for
> >+	  reading the IAR status to ensure data synchronization
> >+	  (access to icc_iar1_el1 is not sync'ed before and after).
> >+
> >+	  If unsure, say Y.
On 14/08/15 19:28, Robert Richter wrote:
> From: Robert Richter <rrichter@cavium.com>
>
> This patch implements Cavium ThunderX erratum 23154.
>
> The gicv3 of ThunderX requires a modified version for reading the IAR
> status to ensure data synchronization. Since this is in the fast-path
> and called with each interrupt, runtime patching is used using jump
> label patching for smallest overhead (no-op). This is the same
> technique as used for tracepoints.
>
> v4:
>  * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>
> v3:
>  * fix erratum to be dependent on MIDR
>  * use arm64 errata framework
>
> v2:
>  * implement code in a single asm() to keep instruction sequence
>  * added comment to the code that explains the erratum
>  * apply workaround also if running as guest, thus check MIDR
>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
>  arch/arm64/Kconfig                  | 11 ++++++++++
>  arch/arm64/include/asm/cpufeature.h |  3 ++-
>  arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>  arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>  drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>  5 files changed, 74 insertions(+), 9 deletions(-)
>

...

>  };
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index c52f7ba205b4..4211c39b8744 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)

...

> +}
> +
>  static void __maybe_unused gic_write_pmr(u64 val)
>  {
>  	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>  	.free = gic_irq_domain_free,
>  };
>
> +static void gicv3_enable_quirks(void)
> +{
> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> +		static_key_slow_inc(&is_cavium_thunderx);

Maybe you could use the enable() method added to struct arm64_cpu_capability
here to perform the above operation, added by James :

commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
Author: James Morse <james.morse@arm.com>
Date:   Tue Jul 21 13:23:28 2015 +0100

    arm64: kernel: Add cpufeature 'enable' callback

> +}
> +
>  static int __init gic_of_init(struct device_node *node, struct device_node *parent)
>  {
>  	void __iomem *dist_base;
> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>  	gic_data.nr_redist_regions = nr_redist_regions;
>  	gic_data.redist_stride = redist_stride;
>
> +	gicv3_enable_quirks();
> +

than adding a hook here ?

Cheers
Suzuki
On 07/09/15 17:54, Suzuki K. Poulose wrote:
> On 14/08/15 19:28, Robert Richter wrote:
>> From: Robert Richter <rrichter@cavium.com>
>>
>> This patch implements Cavium ThunderX erratum 23154.
>>
>> The gicv3 of ThunderX requires a modified version for reading the IAR
>> status to ensure data synchronization. Since this is in the fast-path
>> and called with each interrupt, runtime patching is used using jump
>> label patching for smallest overhead (no-op). This is the same
>> technique as used for tracepoints.
>>
>> v4:
>>  * simplify code to only use cpus_have_cap() in gicv3_enable_quirks()
>>
>> v3:
>>  * fix erratum to be dependent on MIDR
>>  * use arm64 errata framework
>>
>> v2:
>>  * implement code in a single asm() to keep instruction sequence
>>  * added comment to the code that explains the erratum
>>  * apply workaround also if running as guest, thus check MIDR
>>
>> Signed-off-by: Robert Richter <rrichter@cavium.com>
>> ---
>>  arch/arm64/Kconfig                  | 11 ++++++++++
>>  arch/arm64/include/asm/cpufeature.h |  3 ++-
>>  arch/arm64/include/asm/cputype.h    | 18 +++++++++-------
>>  arch/arm64/kernel/cpu_errata.c      |  9 ++++++++
>>  drivers/irqchip/irq-gic-v3.c        | 42 ++++++++++++++++++++++++++++++++++++-
>>  5 files changed, 74 insertions(+), 9 deletions(-)
>>
>
> ...
>
>>  };
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index c52f7ba205b4..4211c39b8744 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
>
> ...
>
>> +}
>> +
>>  static void __maybe_unused gic_write_pmr(u64 val)
>>  {
>>  	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
>> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>>  	.free = gic_irq_domain_free,
>>  };
>>
>> +static void gicv3_enable_quirks(void)
>> +{
>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>> +		static_key_slow_inc(&is_cavium_thunderx);
>
> Maybe you could use the enable() method added to struct arm64_cpu_capability
> here to perform the above operation, added by James :
>
> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> Author: James Morse <james.morse@arm.com>
> Date:   Tue Jul 21 13:23:28 2015 +0100
>
>     arm64: kernel: Add cpufeature 'enable' callback
>
>
>> +}
>> +
>>  static int __init gic_of_init(struct device_node *node, struct device_node *parent)
>>  {
>>  	void __iomem *dist_base;
>> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
>>  	gic_data.nr_redist_regions = nr_redist_regions;
>>  	gic_data.redist_stride = redist_stride;
>>
>> +	gicv3_enable_quirks();
>> +
>
> than adding a hook here ?

It feels like a good idea, except that it creates a weird dependency
between the core arch code and the GIC driver. Can you think of an
elegant way to deal with this?

Thanks,

	M.
On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> On 14/08/15 19:28, Robert Richter wrote:
> >diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> >index c52f7ba205b4..4211c39b8744 100644
> >--- a/drivers/irqchip/irq-gic-v3.c
> >+++ b/drivers/irqchip/irq-gic-v3.c
> >@@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
>
> ...
>
> >+}
> >+
> > static void __maybe_unused gic_write_pmr(u64 val)
> > {
> > 	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
> >@@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
> > 	.free = gic_irq_domain_free,
> > };
> >
> >+static void gicv3_enable_quirks(void)
> >+{
> >+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >+		static_key_slow_inc(&is_cavium_thunderx);
>
> Maybe you could use the enable() method added to struct arm64_cpu_capability
> here to perform the above operation, added by James :
>
> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> Author: James Morse <james.morse@arm.com>
> Date:   Tue Jul 21 13:23:28 2015 +0100
>
>     arm64: kernel: Add cpufeature 'enable' callback

I thought about this as well when looking at the patch but decided it's
better as it is. The "enable" method is meant to enable per-CPU features
(or workarounds) but here it is about GICv3, so we don't want to enable
for every CPU.
On 07.09.15 18:09:48, Marc Zyngier wrote:
> On 07/09/15 17:54, Suzuki K. Poulose wrote:
> > On 14/08/15 19:28, Robert Richter wrote:
> >> From: Robert Richter <rrichter@cavium.com>

> >> +static void gicv3_enable_quirks(void)
> >> +{
> >> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >> +		static_key_slow_inc(&is_cavium_thunderx);
> >
> > Maybe you could use the enable() method added to struct arm64_cpu_capability
> > here to perform the above operation, added by James :
> >
> > commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> > Author: James Morse <james.morse@arm.com>
> > Date:   Tue Jul 21 13:23:28 2015 +0100
> >
> >     arm64: kernel: Add cpufeature 'enable' callback
> >
> >
> >> +}
> >> +
> >>  static int __init gic_of_init(struct device_node *node, struct device_node *parent)
> >>  {
> >>  	void __iomem *dist_base;
> >> @@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
> >>  	gic_data.nr_redist_regions = nr_redist_regions;
> >>  	gic_data.redist_stride = redist_stride;
> >>
> >> +	gicv3_enable_quirks();
> >> +
> >
> > than adding a hook here ?
>
> It feels like a good idea, except that it creates a weird dependency
> between the core arch code and the GIC driver. Can you think of an
> elegant way to deal with this?

The only chance I see is to move it all to the driver and add enable()
calls to the caps in gicv3_errata:

static void gicv3_enable_quirks(void)
{
	check_cpu_capabilities(gicv3_errata, "enabling workaround for");
}

Here the code is kept in the driver and called during driver init.

But the current solution looks more elegant and simpler to me. So I
would not change it.

-Robert
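The driver-private table idea above can be pictured with a small userspace sketch. This assumes a simplified entry layout (description, match predicate, optional enable callback); struct cap_model, check_caps_model() and the two sample entries are hypothetical and are not the kernel's struct arm64_cpu_capabilities or check_cpu_capabilities().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified capability entry; not the kernel's struct. */
struct cap_model {
	const char *desc;
	bool (*matches)(void);	/* does the running system need this? */
	void (*enable)(void);	/* optional action to run on a match */
};

static int quirks_enabled;	/* counts enable() calls, for the demo */

static bool always_matches(void) { return true; }
static bool never_matches(void) { return false; }
static void enable_quirk(void) { quirks_enabled++; }

/* Driver-private list, terminated by an entry with no description. */
static const struct cap_model gicv3_errata_model[] = {
	{ "model erratum A", always_matches, enable_quirk },
	{ "model erratum B", never_matches, enable_quirk },
	{ NULL, NULL, NULL },
};

/* Walk the list and run enable() for each entry that matches. */
int check_caps_model(const struct cap_model *caps)
{
	int matched = 0;

	for (; caps->desc; caps++) {
		if (!caps->matches())
			continue;
		if (caps->enable)
			caps->enable();
		matched++;
	}
	return matched;
}
```

Calling check_caps_model(gicv3_errata_model) once from driver init keeps both the table and the side effects inside the driver, which is the trade-off being weighed in the reply above.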
On 07/09/15 18:15, Catalin Marinas wrote:
> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>> On 14/08/15 19:28, Robert Richter wrote:
>>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>>> index c52f7ba205b4..4211c39b8744 100644
>>> --- a/drivers/irqchip/irq-gic-v3.c
>>> +++ b/drivers/irqchip/irq-gic-v3.c
>>> @@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
>>
>> ...
>>
>>> +}
>>> +
>>>  static void __maybe_unused gic_write_pmr(u64 val)
>>>  {
>>>  	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
>>> @@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
>>>  	.free = gic_irq_domain_free,
>>>  };
>>>
>>> +static void gicv3_enable_quirks(void)
>>> +{
>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>
>> Maybe you could use the enable() method added to struct arm64_cpu_capability
>> here to perform the above operation, added by James :
>>
>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>> Author: James Morse <james.morse@arm.com>
>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>
>>     arm64: kernel: Add cpufeature 'enable' callback
>
> I thought about this as well when looking at the patch but decided it's
> better as it is. The "enable" method is meant to enable per-CPU features
> (or workarounds) but here it is about GICv3, so we don't want to enable
> for every CPU.

Right. I have been playing with a series where the checks are delayed
until all CPUs are brought up. But yes, I understand this usecase is
slightly different and may not match what I was thinking about. Maybe
the GIC driver can have its own private list of _cpu_capability entries,
which it can run check_cpu_capabilities() over, and that will fall back
to what we have at the moment. So maybe what we have here is as good as
we can get.

Cheers
Suzuki
On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
> On 07/09/15 18:15, Catalin Marinas wrote:
> >On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> >>On 14/08/15 19:28, Robert Richter wrote:
> >>>+static void gicv3_enable_quirks(void)
> >>>+{
> >>>+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >>>+		static_key_slow_inc(&is_cavium_thunderx);
> >>
> >>Maybe you could use the enable() method added to struct arm64_cpu_capability
> >>here to perform the above operation, added by James :
> >>
> >>commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> >>Author: James Morse <james.morse@arm.com>
> >>Date:   Tue Jul 21 13:23:28 2015 +0100
> >>
> >>    arm64: kernel: Add cpufeature 'enable' callback
> >
> >I thought about this as well when looking at the patch but decided it's
> >better as it is. The "enable" method is meant to enable per-CPU features
> >(or workarounds) but here it is about GICv3, so we don't want to enable
> >for every CPU.
>
> Right. I have been playing with a series where the checks are delayed until
> all CPUs are brought up.

Unrelated to the GIC workaround, delaying the enable feature until the
CPUs are brought up is not always feasible. At some point we may
implement support to defer the CPU on to user space (I already have a
patch that does this when no DT enable-method is specified, but I won't
publish it before Qualcomm fixes its firmware ;)). But we may have other
reasons to start with CPUs hot-unplugged by default and turn them on
later.
On 08/09/15 10:00, Catalin Marinas wrote:
> On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
>> On 07/09/15 18:15, Catalin Marinas wrote:
>>> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>>>> On 14/08/15 19:28, Robert Richter wrote:
>>>>> +static void gicv3_enable_quirks(void)
>>>>> +{
>>>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>>>
>>>> Maybe you could use the enable() method added to struct arm64_cpu_capability
>>>> here to perform the above operation, added by James :
>>>>
>>>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>>>> Author: James Morse <james.morse@arm.com>
>>>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>>>
>>>>     arm64: kernel: Add cpufeature 'enable' callback
>>>
>>> I thought about this as well when looking at the patch but decided it's
>>> better as it is. The "enable" method is meant to enable per-CPU features
>>> (or workarounds) but here it is about GICv3, so we don't want to enable
>>> for every CPU.
>>
>> Right. I have been playing with a series where the checks are delayed until
>> all CPUs are brought up.
>
> Unrelated to the GIC workaround, delaying the enable feature until the
> CPUs are brought up is not always feasible.

Right. But then, enabling a feature (and applying the alternatives) based
on a single CPU may not always be safe, like PAN. If one of the boot time
CPU doesn't have it, then we are in trouble (even though we WARN about it
from SANITY check)

> At some point we may
> implement support to defer the CPU on to user space (I already have a
> patch that does this when no DT enable-method is specified, but I won't
> publish it before Qualcomm fixes its firmware ;)). But we may have other
> reasons to start with CPUs hot-unplugged by default and turn them on
> later.

We have SANITY check infrastructure that WARNs in such cases, if the
features don't match. But still, wouldn't it be better to enable a feature
only if all the boot-time enabled CPUs have it ? (Errata is an exception
though, which only depends on whether one of the CPU needs it).

Thanks
Suzuki
On Tue, Sep 08, 2015 at 10:09:30AM +0100, Suzuki K. Poulose wrote:
> On 08/09/15 10:00, Catalin Marinas wrote:
> >On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
> >>On 07/09/15 18:15, Catalin Marinas wrote:
> >>>On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> >>>>On 14/08/15 19:28, Robert Richter wrote:
> >>>>>+static void gicv3_enable_quirks(void)
> >>>>>+{
> >>>>>+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >>>>>+		static_key_slow_inc(&is_cavium_thunderx);
> >>>>
> >>>>Maybe you could use the enable() method added to struct arm64_cpu_capability
> >>>>here to perform the above operation, added by James :
> >>>>
> >>>>commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> >>>>Author: James Morse <james.morse@arm.com>
> >>>>Date:   Tue Jul 21 13:23:28 2015 +0100
> >>>>
> >>>>    arm64: kernel: Add cpufeature 'enable' callback
> >>>
> >>>I thought about this as well when looking at the patch but decided it's
> >>>better as it is. The "enable" method is meant to enable per-CPU features
> >>>(or workarounds) but here it is about GICv3, so we don't want to enable
> >>>for every CPU.
> >>
> >>Right. I have been playing with a series where the checks are delayed until
> >>all CPUs are brought up.
> >
> >Unrelated to the GIC workaround, delaying the enable feature until the
> >CPUs are brought up is not always feasible.
>
> Right. But then, enabling a feature(and applying the alternatives) based on
> a single CPU may not be safe, always, like PAN. If one of the boot time CPU
> doesn't have it, then we are in trouble (even though we WARN about it from
> SANITY check)

I see your point but there's a trade-off. For some features it's not
feasible to postpone until user space (e.g. errata workarounds). But if
a CPU coming up late doesn't have compatible features, just keep it in a
loop (or park it back if possible or even refuse to boot any further). I
don't think we should cater for insane hardware configurations (e.g. mix
of PAN/no-PAN as we already do the code patching). Do you plan to defer
code patching as well?

Note that we may have to use the .enable function for errata workarounds
as well, not just features like PAN (we currently only do code patching
but we may have to do other things like issuing SMC calls, you never
know what's going to hit us).

> >At some point we may
> >implement support to defer the CPU on to user space (I already have a
> >patch that does this when no DT enable-method is specified, but I won't
> >publish it before Qualcomm fixes its firmware ;)). But we may have other
> >reasons to start with CPUs hot-unplugged by default and turn them on
> >later.
>
> We have SANITY check infrastructure that WARNs in such cases, if the features
> don't match. But still, wouldn't it be better to enable a feature
> only if all the boot-time enabled CPUs have it ? (Errata is an exception though,
> which only depends on whether one of the CPU needs it).

If we ever need this, I think we should implement a separate late_enable
function as just deferring all features enabling is not generic enough.
But in the meantime, I don't think we should worry about this case,
let's wait and see whether we ever get such configurations (panicking
the kernel on incompatible features is a good starting point -
FPSIMD/no-FPSIMD, PAN/no-PAN etc.)
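The rule for late CPUs described above (park a CPU that lacks a feature the system has already enabled) reduces to a subset test on capability bitmaps. A minimal sketch, assuming features are represented as bits in a 32-bit word; late_cpu_may_boot() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A late CPU may join only if it has every feature that is already
 * enabled system-wide. Extra features on the late CPU are harmless
 * here; missing ones are not.
 */
bool late_cpu_may_boot(uint32_t system_features, uint32_t cpu_features)
{
	return (system_features & ~cpu_features) == 0;
}
```

A caller would keep the CPU in a loop, park it, or refuse to boot further when this returns false; which of those to pick is left open in the discussion.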
On 08/09/15 10:37, Catalin Marinas wrote:
> On Tue, Sep 08, 2015 at 10:09:30AM +0100, Suzuki K. Poulose wrote:
>> On 08/09/15 10:00, Catalin Marinas wrote:
>>> On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
>>>> On 07/09/15 18:15, Catalin Marinas wrote:
>>>>> On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
>>>>>> On 14/08/15 19:28, Robert Richter wrote:
>>>>>>> +static void gicv3_enable_quirks(void)
>>>>>>> +{
>>>>>>> +	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
>>>>>>> +		static_key_slow_inc(&is_cavium_thunderx);
>>>>>>
>>>>>> Maybe you could use the enable() method added to struct arm64_cpu_capability
>>>>>> here to perform the above operation, added by James :
>>>>>>
>>>>>> commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
>>>>>> Author: James Morse <james.morse@arm.com>
>>>>>> Date:   Tue Jul 21 13:23:28 2015 +0100
>>>>>>
>>>>>>     arm64: kernel: Add cpufeature 'enable' callback
>>>>>
>>>>> I thought about this as well when looking at the patch but decided it's
>>>>> better as it is. The "enable" method is meant to enable per-CPU features
>>>>> (or workarounds) but here it is about GICv3, so we don't want to enable
>>>>> for every CPU.
>>>>
>>>> Right. I have been playing with a series where the checks are delayed until
>>>> all CPUs are brought up.
>>>
>>> Unrelated to the GIC workaround, delaying the enable feature until the
>>> CPUs are brought up is not always feasible.
>>
>> Right. But then, enabling a feature(and applying the alternatives) based on
>> a single CPU may not be safe, always, like PAN. If one of the boot time CPU
>> doesn't have it, then we are in trouble (even though we WARN about it from
>> SANITY check)
>
> I see your point but there's a trade-off. For some features it's not
> feasible to postpone until user space (e.g. errata workarounds). But if

Right, I agree. I should have been more descriptive. Here is my plan:

Classify the capabilities / workarounds as two different types.

 1) Errata workaround capability checks are triggered for each booting CPU.
 2) CPU Feature capabilities are checked once all boot-time enabled CPUs
    are active, in smp_cpus_done() and before apply_alternatives_all().
    (We could even classify some of the capabilities as CPU_LOCAL and
    check it per-CPU).

Delay the feature/capability detection to smp_cpus_done() and before
apply_alternatives_all(). i.e, :

 void __init smp_cpus_done(unsigned int max_cpus)
 {
 	pr_info("SMP: Total of %d processors activated.\n", num_online_cpus());
+	setup_cpu_features();
 	hyp_mode_check();
 	apply_alternatives_all();
 }

Where setup_cpu_features() will do all the CPU feature related processing
based on the system wide safe value (will be available from the new
infrastructure) :

 1) cpu capability based on feature registers (e.g, GIC SYSREG, PAN, ATOMICS )
 2) ELF_HWCAP

> a CPU coming up late doesn't have compatible features, just keep it in a
> loop (or park it back if possible or even refuse to boot any further). I
> don't think we should cater for insane hardware configurations (e.g. mix

Any other new CPU, which is missing an available system capability, could
be made to loop, as you mentioned.

> of PAN/no-PAN as we already do the code patching). Do you plan to defer
> code patching as well?

As shown above, the apply_alternatives_all() is already done from
smp_cpus_done(), which will stay there.

>
> Note that we may have to use the .enable function for errata workarounds
> as well, not just features like PAN (we currently only do code patching
> but we may have to do other things like issuing SMC calls, you never
> know what's going to hit us).

Given that ERRATAs are checked for each CPU and are not delayed, we need
not worry about that. But yes, we could have flags to indicate how/when
the enable methods should be invoked ? e.g, per CPU (like PAN), or per
SYSTEM (once for the entire system)

>>> At some point we may
>>> implement support to defer the CPU on to user space (I already have a
>>> patch that does this when no DT enable-method is specified, but I won't
>>> publish it before Qualcomm fixes its firmware ;)). But we may have other
>>> reasons to start with CPUs hot-unplugged by default and turn them on
>>> later.
>>
>> We have SANITY check infrastructure that WARNs in such cases, if the features
>> don't match. But still, wouldn't it be better to enable a feature
>> only if all the boot-time enabled CPUs have it ? (Errata is an exception though,
>> which only depends on whether one of the CPU needs it).
>
> If we ever need this, I think we should implement a separate late_enable
> function as just deferring all features enabling is not generic enough.
> But in the meantime, I don't think we should worry about this case,
> let's wait and see whether we ever get such configurations (panicking
> the kernel on incompatible features is a good starting point -
> FPSIMD/no-FPSIMD, PAN/no-PAN etc.)

OK. I will post the series after the merge window. We can discuss further
then.

Cheers
Suzuki
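The two classes in the plan above differ only in how per-CPU results are combined: an erratum workaround is needed if any CPU reports it, while a feature is safe to enable only if all boot-time CPUs report it. A minimal sketch of that combination step, assuming each CPU's capabilities are summarized as a 32-bit bitmap; both function names are hypothetical and do not come from the planned series.

```c
#include <stddef.h>
#include <stdint.h>

/* Errata: a bit is set system-wide if ANY CPU needs the workaround. */
uint32_t system_errata(const uint32_t *cpu_caps, size_t ncpus)
{
	uint32_t any = 0;

	for (size_t i = 0; i < ncpus; i++)
		any |= cpu_caps[i];
	return any;
}

/* Features: a bit is set system-wide only if ALL CPUs have the feature. */
uint32_t system_features(const uint32_t *cpu_caps, size_t ncpus)
{
	uint32_t all = ncpus ? UINT32_MAX : 0;

	for (size_t i = 0; i < ncpus; i++)
		all &= cpu_caps[i];
	return all;
}
```

With two CPUs reporting 0x3 and 0x1, the errata view is 0x3 and the feature view is 0x1, so a feature present on only one CPU is not enabled.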
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0f6edb14b7e4..4f866a4c6536 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -417,6 +417,17 @@ config ARM64_ERRATUM_845719
 
 	  If unsure, say Y.
 
+config CAVIUM_ERRATUM_23154
+	bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
+	depends on ARCH_THUNDER
+	default y
+	help
+	  The gicv3 of ThunderX requires a modified version for
+	  reading the IAR status to ensure data synchronization
+	  (access to icc_iar1_el1 is not sync'ed before and after).
+
+	  If unsure, say Y.
+
 endmenu
 
 
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index c1044218a63a..2a5e4c163ee5 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -25,8 +25,9 @@
 #define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE	1
 #define ARM64_WORKAROUND_845719			2
 #define ARM64_HAS_SYSREG_GIC_CPUIF		3
+#define ARM64_WORKAROUND_CAVIUM_23154		4
 
-#define ARM64_NCAPS				4
+#define ARM64_NCAPS				5
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index a84ec605bed8..3f0c7683f252 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -62,15 +62,19 @@
 	(0xf			<< MIDR_ARCHITECTURE_SHIFT) | \
 	((partnum)		<< MIDR_PARTNUM_SHIFT))
 
-#define ARM_CPU_IMP_ARM		0x41
-#define ARM_CPU_IMP_APM		0x50
+#define ARM_CPU_IMP_ARM			0x41
+#define ARM_CPU_IMP_APM			0x50
+#define ARM_CPU_IMP_CAVIUM		0x43
 
-#define ARM_CPU_PART_AEM_V8	0xD0F
-#define ARM_CPU_PART_FOUNDATION	0xD00
-#define ARM_CPU_PART_CORTEX_A57	0xD07
-#define ARM_CPU_PART_CORTEX_A53	0xD03
+#define ARM_CPU_PART_AEM_V8		0xD0F
+#define ARM_CPU_PART_FOUNDATION		0xD00
+#define ARM_CPU_PART_CORTEX_A57		0xD07
+#define ARM_CPU_PART_CORTEX_A53		0xD03
+
+#define APM_CPU_PART_POTENZA		0x000
+
+#define CAVIUM_CPU_PART_THUNDERX	0x0A1
 
-#define APM_CPU_PART_POTENZA	0x000
 
 #define ID_AA64MMFR0_BIGENDEL0_SHIFT	16
 #define ID_AA64MMFR0_BIGENDEL0_MASK	(0xf << ID_AA64MMFR0_BIGENDEL0_SHIFT)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 6ffd91438560..574450c257a4 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -23,6 +23,7 @@
 
 #define MIDR_CORTEX_A53 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_PART(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_THUNDERX	MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 
 #define CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
 			MIDR_ARCHITECTURE_MASK)
@@ -82,6 +83,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_RANGE(MIDR_CORTEX_A53, 0x00, 0x04),
 	},
 #endif
+#ifdef CONFIG_CAVIUM_ERRATUM_23154
+	{
+	/* Cavium ThunderX, pass 1.x */
+		.desc = "Cavium erratum 23154",
+		.capability = ARM64_WORKAROUND_CAVIUM_23154,
+		MIDR_RANGE(MIDR_THUNDERX, 0x00, 0x01),
+	},
+#endif
 	{
 	}
 };
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index c52f7ba205b4..4211c39b8744 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -107,7 +107,7 @@ static void gic_redist_wait_for_rwp(void)
 }
 
 /* Low level accessors */
-static u64 __maybe_unused gic_read_iar(void)
+static u64 gic_read_iar_common(void)
 {
 	u64 irqstat;
 
@@ -115,6 +115,38 @@ static u64 __maybe_unused gic_read_iar(void)
 	return irqstat;
 }
 
+/*
+ * Cavium ThunderX erratum 23154
+ *
+ * The gicv3 of ThunderX requires a modified version for reading the
+ * IAR status to ensure data synchronization (access to icc_iar1_el1
+ * is not sync'ed before and after).
+ */
+static u64 gic_read_iar_cavium_thunderx(void)
+{
+	u64 irqstat;
+
+	asm volatile(
+		"nop;nop;nop;nop\n\t"
+		"nop;nop;nop;nop\n\t"
+		"mrs_s %0, " __stringify(ICC_IAR1_EL1) "\n\t"
+		"nop;nop;nop;nop"
+		: "=r" (irqstat));
+	mb();
+
+	return irqstat;
+}
+
+struct static_key is_cavium_thunderx = STATIC_KEY_INIT_FALSE;
+
+static u64 __maybe_unused gic_read_iar(void)
+{
+	if (static_key_false(&is_cavium_thunderx))
+		return gic_read_iar_cavium_thunderx();
+	else
+		return gic_read_iar_common();
+}
+
 static void __maybe_unused gic_write_pmr(u64 val)
 {
 	asm volatile("msr_s " __stringify(ICC_PMR_EL1) ", %0" : : "r" (val));
@@ -766,6 +798,12 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
 	.free = gic_irq_domain_free,
 };
 
+static void gicv3_enable_quirks(void)
+{
+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
+		static_key_slow_inc(&is_cavium_thunderx);
+}
+
 static int __init gic_of_init(struct device_node *node, struct device_node *parent)
 {
 	void __iomem *dist_base;
@@ -825,6 +863,8 @@ static int __init gic_of_init(struct device_node *node, struct device_node *pare
 	gic_data.nr_redist_regions = nr_redist_regions;
 	gic_data.redist_stride = redist_stride;
 
+	gicv3_enable_quirks();
+
 	/*
 	 * Find out how many interrupts are supported.
 	 * The GIC only supports up to 1020 interrupt sources (SGI+PPI+SPI)
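The errata entry in the patch selects affected parts by MIDR. A userspace sketch of such a match follows, using the architectural MIDR_EL1 field layout and the implementer and part numbers from the diff. The range comparison on the combined variant and revision bits is an assumption about how MIDR_RANGE() is evaluated, and midr_in_range() is a hypothetical helper, not the kernel's function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * MIDR_EL1 fields: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0].
 */
#define MIDR_REVISION_MASK	0x0000000fu
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_PARTNUM_MASK	(0xfffu << MIDR_PARTNUM_SHIFT)
#define MIDR_ARCHITECTURE_SHIFT	16
#define MIDR_ARCHITECTURE_MASK	(0xfu << MIDR_ARCHITECTURE_SHIFT)
#define MIDR_VARIANT_MASK	(0xfu << 20)
#define MIDR_IMPLEMENTOR_SHIFT	24
#define MIDR_IMPLEMENTOR_MASK	(0xffu << MIDR_IMPLEMENTOR_SHIFT)

#define MIDR_CPU_PART(imp, partnum) \
	(((uint32_t)(imp) << MIDR_IMPLEMENTOR_SHIFT) | \
	 (0xfu << MIDR_ARCHITECTURE_SHIFT) | \
	 ((uint32_t)(partnum) << MIDR_PARTNUM_SHIFT))

#define CPU_MODEL_MASK (MIDR_IMPLEMENTOR_MASK | MIDR_PARTNUM_MASK | \
			MIDR_ARCHITECTURE_MASK)

/* Values taken from the diff above. */
#define ARM_CPU_IMP_CAVIUM		0x43
#define CAVIUM_CPU_PART_THUNDERX	0x0A1
#define MIDR_THUNDERX \
	MIDR_CPU_PART(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)

/*
 * Assumed matching rule: the model bits must be equal, and the
 * variant/revision bits must fall within [min, max] inclusive.
 */
bool midr_in_range(uint32_t midr, uint32_t model, uint32_t min, uint32_t max)
{
	uint32_t rev;

	if ((midr & CPU_MODEL_MASK) != model)
		return false;
	rev = midr & (MIDR_VARIANT_MASK | MIDR_REVISION_MASK);
	return rev >= min && rev <= max;
}
```

Under this rule, midr_in_range(midr, MIDR_THUNDERX, 0x00, 0x01) accepts only variant 0 with revision 0 or 1, matching the "pass 1.x" comment in the errata table.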