diff mbox series

[v1] ARM: A9: Add ARM ERRATA 764319 workaround

Message ID 20220506192957.24889-1-nick.hawkins@hpe.com (mailing list archive)
State New, archived
Headers show
Series [v1] ARM: A9: Add ARM ERRATA 764319 workaround | expand

Commit Message

Hawkins, Nick May 6, 2022, 7:29 p.m. UTC
From: Nick Hawkins <nick.hawkins@hpe.com>

Enable the workaround for the 764319 Cortex A-9 erratum.
CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
unexpected Undefined Instruction exception when the DBGSWENABLE external
pin is set to 0, even when the CP14 accesses are performed from a
privileged mode. The work around catches the exception in a way
the kernel does not stop execution with the use of undef_hook. This
has been found to effect the HPE GXP SoC.

Signed-off-by: Nick Hawkins <nick.hawkins@hpe.com>
---
 arch/arm/Kconfig                | 11 +++++++++++
 arch/arm/kernel/hw_breakpoint.c | 26 ++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

Comments

Arnd Bergmann May 10, 2022, 7:19 a.m. UTC | #1
On Fri, May 6, 2022 at 9:29 PM <nick.hawkins@hpe.com> wrote:
>
> From: Nick Hawkins <nick.hawkins@hpe.com>
>
> Enable the workaround for the 764319 Cortex A-9 erratum.
> CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
> unexpected Undefined Instruction exception when the DBGSWENABLE external
> pin is set to 0, even when the CP14 accesses are performed from a
> privileged mode. The work around catches the exception in a way
> the kernel does not stop execution with the use of undef_hook. This
> has been found to effect the HPE GXP SoC.
>
> Signed-off-by: Nick Hawkins <nick.hawkins@hpe.com>
> ---
>  arch/arm/Kconfig                | 11 +++++++++++
>  arch/arm/kernel/hw_breakpoint.c | 26 ++++++++++++++++++++++++++
>  2 files changed, 37 insertions(+)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 13f77eec7c40..6944adfb0fae 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -974,6 +974,17 @@ config ARM_ERRATA_764369
>           relevant cache maintenance functions and sets a specific bit
>           in the diagnostic control register of the SCU.
>
> +config ARM_ERRATA_764319
> +       bool "ARM errata: Read to DBGPRSR and DBGOSLSR may generate Undefined instruction"
> +       depends on CPU_V7
> +       help
> +         This option enables the workaround for the 764319 Cortex A-9 erratum.
> +         CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
> +         unexpected Undefined Instruction exception when the DBGSWENABLE
> +         external pin is set to 0, even when the CP14 accesses are performed
> +         from a privileged mode. This work around catches the exception in a
> +         way the kernel does not stop execution.
> +
>  config ARM_ERRATA_775420
>         bool "ARM errata: A data cache maintenance operation which aborts, might lead to deadlock"
>         depends on CPU_V7
> diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
> index b1423fb130ea..c41a8436a796 100644
> --- a/arch/arm/kernel/hw_breakpoint.c
> +++ b/arch/arm/kernel/hw_breakpoint.c
> @@ -941,6 +941,23 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
>         return ret;
>  }
>
> +#ifdef CONFIG_ARM_ERRATA_764319
> +int oslsr_fault;
> +
> +static int debug_oslsr_trap(struct pt_regs *regs, unsigned int instr)
> +{
> +       oslsr_fault = 1;
> +       instruction_pointer(regs) += 4;
> +       return 0;
> +}
> +
> +static struct undef_hook debug_oslsr_hook = {
> +       .instr_mask  = 0xffffffff,
> +       .instr_val = 0xee115e91,
> +       .fn = debug_oslsr_trap,
> +};
> +#endif
> +

Hi Nick,

This seems a bit more complex than necessary. Can't you just use a custom
inline asm with an ex_table entry to catch the fault? Have a look at
__get_user_asm() for an example.

       Arnd
Robin Murphy May 10, 2022, 9:47 a.m. UTC | #2
On 2022-05-10 08:19, Arnd Bergmann wrote:
> On Fri, May 6, 2022 at 9:29 PM <nick.hawkins@hpe.com> wrote:
>>
>> From: Nick Hawkins <nick.hawkins@hpe.com>
>>
>> Enable the workaround for the 764319 Cortex A-9 erratum.
>> CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
>> unexpected Undefined Instruction exception when the DBGSWENABLE external
>> pin is set to 0, even when the CP14 accesses are performed from a
>> privileged mode. The work around catches the exception in a way
>> the kernel does not stop execution with the use of undef_hook. This
>> has been found to effect the HPE GXP SoC.
>>
>> Signed-off-by: Nick Hawkins <nick.hawkins@hpe.com>
>> ---
>>   arch/arm/Kconfig                | 11 +++++++++++
>>   arch/arm/kernel/hw_breakpoint.c | 26 ++++++++++++++++++++++++++
>>   2 files changed, 37 insertions(+)
>>
>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index 13f77eec7c40..6944adfb0fae 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -974,6 +974,17 @@ config ARM_ERRATA_764369
>>            relevant cache maintenance functions and sets a specific bit
>>            in the diagnostic control register of the SCU.
>>
>> +config ARM_ERRATA_764319
>> +       bool "ARM errata: Read to DBGPRSR and DBGOSLSR may generate Undefined instruction"
>> +       depends on CPU_V7
>> +       help
>> +         This option enables the workaround for the 764319 Cortex A-9 erratum.
>> +         CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
>> +         unexpected Undefined Instruction exception when the DBGSWENABLE
>> +         external pin is set to 0, even when the CP14 accesses are performed
>> +         from a privileged mode. This work around catches the exception in a
>> +         way the kernel does not stop execution.
>> +
>>   config ARM_ERRATA_775420
>>          bool "ARM errata: A data cache maintenance operation which aborts, might lead to deadlock"
>>          depends on CPU_V7
>> diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
>> index b1423fb130ea..c41a8436a796 100644
>> --- a/arch/arm/kernel/hw_breakpoint.c
>> +++ b/arch/arm/kernel/hw_breakpoint.c
>> @@ -941,6 +941,23 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
>>          return ret;
>>   }
>>
>> +#ifdef CONFIG_ARM_ERRATA_764319
>> +int oslsr_fault;
>> +
>> +static int debug_oslsr_trap(struct pt_regs *regs, unsigned int instr)
>> +{
>> +       oslsr_fault = 1;
>> +       instruction_pointer(regs) += 4;
>> +       return 0;
>> +}
>> +
>> +static struct undef_hook debug_oslsr_hook = {
>> +       .instr_mask  = 0xffffffff,
>> +       .instr_val = 0xee115e91,
>> +       .fn = debug_oslsr_trap,
>> +};
>> +#endif
>> +
> 
> Hi Nick,
> 
> This seems a bit more complex than necessary. Can't you just use a custom
> inline asm with an ex_table entry to catch the fault? Have a look at
> __get_user_asm() for an example.

Indeed, according to the Cortex-A9 documentation the register should 
always read as zero anyway, so all we should need to do is initialise 
the local oslsr variable to 0 and have the fault handler just return to 
the next instruction.

Robin.
Verdun, Jean-Marie May 10, 2022, 11:53 a.m. UTC | #3
Hi Arnd,

>    Hi Nick,

>    This seems a bit more complex than necessary. Can't you just use a custom
>    inline asm with an ex_table entry to catch the fault? Have a look at
>    __get_user_asm() for an example.
>
>           Arnd

We got inspired from debug_reg_hook within the same source file ( ./arch/arm/kernel/hw_breakpoint.c ). We chose that path to keep coherency within the source code. We can implement the same fix by using an ex_table entry, but this will create two different ways at catching unknown instruction within the same source file. Will that be ok ?

vejmarie
Arnd Bergmann May 10, 2022, 1:01 p.m. UTC | #4
On Tue, May 10, 2022 at 1:53 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
>
> Hi Arnd,
>
> >    Hi Nick,
>
> >    This seems a bit more complex than necessary. Can't you just use a custom
> >    inline asm with an ex_table entry to catch the fault? Have a look at
> >    __get_user_asm() for an example.
> >
> >           Arnd
>
> We got inspired from debug_reg_hook within the same source file
>( ./arch/arm/kernel/hw_breakpoint.c ). We chose that path to keep coherency
> within the source code. We can implement the same fix by using an ex_table
> entry, but this will create two different ways at catching unknown instruction
> within the same source file. Will that be ok ?

I got a little lost trying to find where the breakpoint instruction comes
from that gets trapped here, but I would guess that they had to do this
using an undef_hook because the ex_table approach does not work
there for some reason.

I would still pick the ex_table method here if that works.

       Arnd
Hawkins, Nick May 10, 2022, 1:41 p.m. UTC | #5
On Tue, May 10, 2022 at 1:53 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
> >
> > Hi Arnd,
> >
> > >    Hi Nick,
> >
> > >    This seems a bit more complex than necessary. Can't you just use a custom
> >    inline asm with an ex_table entry to catch the fault? Have a look at
> >    __get_user_asm() for an example.
> >
> >           Arnd
> >
> > We got inspired from debug_reg_hook within the same source file ( 
> >./arch/arm/kernel/hw_breakpoint.c ). We chose that path to keep 
> >coherency  within the source code. We can implement the same fix by 
> >using an ex_table  entry, but this will create two different ways at 
> >catching unknown instruction  within the same source file. Will that be ok ?

> I got a little lost trying to find where the breakpoint instruction comes from that gets trapped here, but I would guess that they had to do this using an undef_hook because the ex_table approach does not work there for some reason.

> I would still pick the ex_table method here if that works.

I will pursue the method you have recommended Arnd.

Thank you for the feedback as always.

-Nick Hawkins
Will Deacon May 10, 2022, 2:11 p.m. UTC | #6
On Tue, May 10, 2022 at 03:01:26PM +0200, Arnd Bergmann wrote:
> On Tue, May 10, 2022 at 1:53 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
> >
> > Hi Arnd,
> >
> > >    Hi Nick,
> >
> > >    This seems a bit more complex than necessary. Can't you just use a custom
> > >    inline asm with an ex_table entry to catch the fault? Have a look at
> > >    __get_user_asm() for an example.
> > >
> > >           Arnd
> >
> > We got inspired from debug_reg_hook within the same source file
> >( ./arch/arm/kernel/hw_breakpoint.c ). We chose that path to keep coherency
> > within the source code. We can implement the same fix by using an ex_table
> > entry, but this will create two different ways at catching unknown instruction
> > within the same source file. Will that be ok ?
> 
> I got a little lost trying to find where the breakpoint instruction comes
> from that gets trapped here, but I would guess that they had to do this
> using an undef_hook because the ex_table approach does not work
> there for some reason.
> 
> I would still pick the ex_table method here if that works.

IIRC, the ex_table handlers are called only for data aborts and are intended
to be used to handle cases where we take a fault on a memory access (e.g.
translation fault). In this case, we're taking an undefined instruction
exception on a cp14 access and so the undef_hook is the right thing to use.

Will
Hawkins, Nick May 11, 2022, 12:23 p.m. UTC | #7
On Tue, May 10, 2022 at 03:01:26PM +0200, Arnd Bergmann wrote:
> > On Tue, May 10, 2022 at 1:53 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
> > >
> > > Hi Arnd,
> > >
> > > >    Hi Nick,
> > >
> > > >    This seems a bit more complex than necessary. Can't you just use a custom
> > > >    inline asm with an ex_table entry to catch the fault? Have a look at
> > > >    __get_user_asm() for an example.
> > > >
> > > >           Arnd
> > >
> > > We got inspired from debug_reg_hook within the same source file ( 
> > >./arch/arm/kernel/hw_breakpoint.c ). We chose that path to keep 
> > >coherency  within the source code. We can implement the same fix by 
> > >using an ex_table  entry, but this will create two different ways at 
> > >catching unknown instruction  within the same source file. Will that be ok ?
> > 
> > I got a little lost trying to find where the breakpoint instruction 
> > comes from that gets trapped here, but I would guess that they had to 
> > do this using an undef_hook because the ex_table approach does not 
> > work there for some reason.
> > 
> > I would still pick the ex_table method here if that works.

> IIRC, the ex_table handlers are called only for data aborts and are intended to be used to handle cases where we take a fault on a memory access (e.g.
translation fault). In this case, we're taking an undefined instruction exception on a cp14 access and so the undef_hook is the right thing to use.

Hello Arnd,

Given Will's input would you like me to still use the ex_table method?

Thanks,

-Nick Hawkins
Arnd Bergmann May 11, 2022, 12:45 p.m. UTC | #8
On Wed, May 11, 2022 at 2:23 PM Hawkins, Nick <nick.hawkins@hpe.com> wrote:
> On Tue, May 10, 2022 at 03:01:26PM +0200, Arnd Bergmann wrote:
> > > On Tue, May 10, 2022 at 1:53 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
> > >
> > > I got a little lost trying to find where the breakpoint instruction
> > > comes from that gets trapped here, but I would guess that they had to
> > > do this using an undef_hook because the ex_table approach does not
> > > work there for some reason.
> > >
> > > I would still pick the ex_table method here if that works.
>
> > IIRC, the ex_table handlers are called only for data aborts and are intended to be used to handle cases where we take a fault on a memory access (e.g.
> translation fault). In this case, we're taking an undefined instruction exception on a cp14 access and so the undef_hook is the right thing to use.
>
> Given Will's input would you like me to still use the ex_table method?

If it doesn't work, then there is no point trying. You could try
changing the exception
handling so it searches the ex_table for Undefined Instruction
exceptions as well,
but that's probably more complicated.

       Arnd
Verdun, Jean-Marie May 11, 2022, 12:59 p.m. UTC | #9
Hi Arnd,

>    If it doesn't work, then there is no point trying. You could try
>    changing the exception
>    handling so it searches the ex_table for Undefined Instruction
>    exceptions as well,
>    but that's probably more complicated.

I looked yesterday evening and share Will's point of view, I don't 
really see how it could work and believe unfortunately that 
we have to stay with the undef_hook approach. Let's us know if 
you believe some improvements are required to the current patch.

Let's see the positive aspect we learnt something, and if we have
to face additional issue, we have that option open.

Side question: Does any of you will attend Open Source Summit North America ?
I will be there with Nick and HPE team to introduce our GXP Asic during a couple
of talks. Would be great if we could spend some time together.

vejmarie

>           Arnd
Arnd Bergmann May 11, 2022, 1:16 p.m. UTC | #10
On Wed, May 11, 2022 at 2:59 PM Verdun, Jean-Marie <verdun@hpe.com> wrote:
>
> Hi Arnd,
>
> >    If it doesn't work, then there is no point trying. You could try
> >    changing the exception
> >    handling so it searches the ex_table for Undefined Instruction
> >    exceptions as well,
> >    but that's probably more complicated.
>
> I looked yesterday evening and share Will's point of view, I don't
> really see how it could work and believe unfortunately that
> we have to stay with the undef_hook approach. Let's us know if
> you believe some improvements are required to the current patch.
>
> Let's see the positive aspect we learnt something, and if we have
> to face additional issue, we have that option open.

The current patch looks reaonable to me then, please add

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

and submit it to Russell's patch tracker at
https://www.arm.linux.org.uk/developer/patches/info.php

with one small change applied: the 'oslsr_fault' variable should
be marked "static".

> Side question: Does any of you will attend Open Source Summit North America ?
> I will be there with Nick and HPE team to introduce our GXP Asic during a couple
> of talks. Would be great if we could spend some time together.

I have no plans to attend at the moment.

        Arnd
Russell King (Oracle) May 11, 2022, 2:28 p.m. UTC | #11
On Wed, May 11, 2022 at 02:45:27PM +0200, Arnd Bergmann wrote:
> If it doesn't work, then there is no point trying. You could try
> changing the exception handling so it searches the ex_table for
> Undefined Instruction exceptions as well, but that's probably more
> complicated.

What's the point when we have the undef hook?
Arnd Bergmann May 11, 2022, 2:31 p.m. UTC | #12
On Wed, May 11, 2022 at 4:28 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Wed, May 11, 2022 at 02:45:27PM +0200, Arnd Bergmann wrote:
> > If it doesn't work, then there is no point trying. You could try
> > changing the exception handling so it searches the ex_table for
> > Undefined Instruction exceptions as well, but that's probably more
> > complicated.
>
> What's the point when we have the undef hook?

It seemed a little easier to follow the code flow if things are in
one place.

        Arnd
diff mbox series

Patch

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 13f77eec7c40..6944adfb0fae 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -974,6 +974,17 @@  config ARM_ERRATA_764369
 	  relevant cache maintenance functions and sets a specific bit
 	  in the diagnostic control register of the SCU.
 
+config ARM_ERRATA_764319
+	bool "ARM errata: Read to DBGPRSR and DBGOSLSR may generate Undefined instruction"
+	depends on CPU_V7
+	help
+	  This option enables the workaround for the 764319 Cortex A-9 erratum.
+	  CP14 read accesses to the DBGPRSR and DBGOSLSR registers generate an
+	  unexpected Undefined Instruction exception when the DBGSWENABLE
+	  external pin is set to 0, even when the CP14 accesses are performed
+	  from a privileged mode. This work around catches the exception in a
+	  way the kernel does not stop execution.
+
 config ARM_ERRATA_775420
        bool "ARM errata: A data cache maintenance operation which aborts, might lead to deadlock"
        depends on CPU_V7
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index b1423fb130ea..c41a8436a796 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -941,6 +941,23 @@  static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr,
 	return ret;
 }
 
+#ifdef CONFIG_ARM_ERRATA_764319
+int oslsr_fault;
+
+static int debug_oslsr_trap(struct pt_regs *regs, unsigned int instr)
+{
+	oslsr_fault = 1;
+	instruction_pointer(regs) += 4;
+	return 0;
+}
+
+static struct undef_hook debug_oslsr_hook = {
+	.instr_mask  = 0xffffffff,
+	.instr_val = 0xee115e91,
+	.fn = debug_oslsr_trap,
+};
+#endif
+
 /*
  * One-time initialisation.
  */
@@ -974,7 +991,16 @@  static bool core_has_os_save_restore(void)
 	case ARM_DEBUG_ARCH_V7_1:
 		return true;
 	case ARM_DEBUG_ARCH_V7_ECP14:
+#ifdef CONFIG_ARM_ERRATA_764319
+		oslsr_fault = 0;
+		register_undef_hook(&debug_oslsr_hook);
 		ARM_DBG_READ(c1, c1, 4, oslsr);
+		unregister_undef_hook(&debug_oslsr_hook);
+		if (oslsr_fault)
+			return false;
+#else
+		ARM_DBG_READ(c1, c1, 4, oslsr);
+#endif
 		if (oslsr & ARM_OSLSR_OSLM0)
 			return true;
 		fallthrough;