Message ID | 1403432119-1409-1-git-send-email-shawn.guo@freescale.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Shawn, On 22.06.2014 12:15, Shawn Guo wrote: > The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it > needs to be saved/restored on suspend/resume. Otherwise, the > effectiveness of errata workaround gets lost together with diagnostic > register bit across suspend/resume cycle. > > The patch adds a couple of Cortex-A9 specific suspend and resume > functions to handle the diagnostic register across suspend/resume cycle. [snip] > +ENTRY(cpu_ca9mp_do_resume) > + ldmia r0!, {r4} > + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register What about platforms running in non-secure mode in which the register is read-only? Best regards, Tomasz
On Tue, Jun 24, 2014 at 05:28:54PM +0100, Tomasz Figa wrote: > Hi Shawn, > > On 22.06.2014 12:15, Shawn Guo wrote: > > The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it > > needs to be saved/restored on suspend/resume. Otherwise, the > > effectiveness of errata workaround gets lost together with diagnostic > > register bit across suspend/resume cycle. > > > > The patch adds a couple of Cortex-A9 specific suspend and resume > > functions to handle the diagnostic register across suspend/resume cycle. > > [snip] > > > +ENTRY(cpu_ca9mp_do_resume) > > + ldmia r0!, {r4} > > + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register > > What about platforms running in non-secure mode in which the register is > read-only? On A9, it should be write-ignore. Are you seeing problems on a real SoC? Will
Hi Will, On 24.06.2014 18:33, Will Deacon wrote: > On Tue, Jun 24, 2014 at 05:28:54PM +0100, Tomasz Figa wrote: >> Hi Shawn, >> >> On 22.06.2014 12:15, Shawn Guo wrote: >>> The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it >>> needs to be saved/restored on suspend/resume. Otherwise, the >>> effectiveness of errata workaround gets lost together with diagnostic >>> register bit across suspend/resume cycle. >>> >>> The patch adds a couple of Cortex-A9 specific suspend and resume >>> functions to handle the diagnostic register across suspend/resume cycle. >> >> [snip] >> >>> +ENTRY(cpu_ca9mp_do_resume) >>> + ldmia r0!, {r4} >>> + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register >> >> What about platforms running in non-secure mode in which the register is >> read-only? > > On A9, it should be write-ignore. Are you seeing problems on a real SoC? I'm observing a complete system hang on Exynos4412-based Trats2 board if I try to write this register in resume from system-wide sleep. Note that the board is running under secure firmware, but there is no support for suspend/resume of such boards in mainline yet, so I'm testing on a work in progress (but mostly finished) series that is yet to be sent. Best regards, Tomasz
On Tue, Jun 24, 2014 at 06:40:27PM +0100, Tomasz Figa wrote: > Hi Will, Hello, > On 24.06.2014 18:33, Will Deacon wrote: > > On Tue, Jun 24, 2014 at 05:28:54PM +0100, Tomasz Figa wrote: > >> Hi Shawn, > >> > >> On 22.06.2014 12:15, Shawn Guo wrote: > >>> The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it > >>> needs to be saved/restored on suspend/resume. Otherwise, the > >>> effectiveness of errata workaround gets lost together with diagnostic > >>> register bit across suspend/resume cycle. > >>> > >>> The patch adds a couple of Cortex-A9 specific suspend and resume > >>> functions to handle the diagnostic register across suspend/resume cycle. > >> > >> [snip] > >> > >>> +ENTRY(cpu_ca9mp_do_resume) > >>> + ldmia r0!, {r4} > >>> + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register > >> > >> What about platforms running in non-secure mode in which the register is > >> read-only? > > > > On A9, it should be write-ignore. Are you seeing problems on a real SoC? > > I'm observing a complete system hang on Exynos4412-based Trats2 board if > I try to write this register in resume from system-wide sleep. That's certainly unexpected; I just double-checked that non-secure accesses are treated as read-only/write-ignore. > Note that the board is running under secure firmware, but there is no > support for suspend/resume of such boards in mainline yet, so I'm > testing on a work in progress (but mostly finished) series that is yet > to be sent. Maybe it could be unrelated to this register. What happens if you change the code to read/write a different (architected) cp15 register? Will
On Tue, Jun 24, 2014 at 07:40:27PM +0200, Tomasz Figa wrote: > Hi Will, > > On 24.06.2014 18:33, Will Deacon wrote: > > On Tue, Jun 24, 2014 at 05:28:54PM +0100, Tomasz Figa wrote: > >> Hi Shawn, > >> > >> On 22.06.2014 12:15, Shawn Guo wrote: > >>> The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it > >>> needs to be saved/restored on suspend/resume. Otherwise, the > >>> effectiveness of errata workaround gets lost together with diagnostic > >>> register bit across suspend/resume cycle. > >>> > >>> The patch adds a couple of Cortex-A9 specific suspend and resume > >>> functions to handle the diagnostic register across suspend/resume cycle. > >> > >> [snip] > >> > >>> +ENTRY(cpu_ca9mp_do_resume) > >>> + ldmia r0!, {r4} > >>> + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register > >> > >> What about platforms running in non-secure mode in which the register is > >> read-only? > > > > On A9, it should be write-ignore. Are you seeing problems on a real SoC? > > I'm observing a complete system hang on Exynos4412-based Trats2 board if > I try to write this register in resume from system-wide sleep. > > Note that the board is running under secure firmware, but there is no > support for suspend/resume of such boards in mainline yet, so I'm > testing on a work in progress (but mostly finished) series that is yet > to be sent. Do we have any progress on this? There's patches being generated by the Exynos people which depend on this, so we need to decide whether we want this patch or not.
On 01.07.2014 16:44, Russell King - ARM Linux wrote: > On Tue, Jun 24, 2014 at 07:40:27PM +0200, Tomasz Figa wrote: >> Hi Will, >> >> On 24.06.2014 18:33, Will Deacon wrote: >>> On Tue, Jun 24, 2014 at 05:28:54PM +0100, Tomasz Figa wrote: >>>> Hi Shawn, >>>> >>>> On 22.06.2014 12:15, Shawn Guo wrote: >>>>> The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it >>>>> needs to be saved/restored on suspend/resume. Otherwise, the >>>>> effectiveness of errata workaround gets lost together with diagnostic >>>>> register bit across suspend/resume cycle. >>>>> >>>>> The patch adds a couple of Cortex-A9 specific suspend and resume >>>>> functions to handle the diagnostic register across suspend/resume cycle. >>>> >>>> [snip] >>>> >>>>> +ENTRY(cpu_ca9mp_do_resume) >>>>> + ldmia r0!, {r4} >>>>> + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register >>>> >>>> What about platforms running in non-secure mode in which the register is >>>> read-only? >>> >>> On A9, it should be write-ignore. Are you seeing problems on a real SoC? >> >> I'm observing a complete system hang on Exynos4412-based Trats2 board if >> I try to write this register in resume from system-wide sleep. >> >> Note that the board is running under secure firmware, but there is no >> support for suspend/resume of such boards in mainline yet, so I'm >> testing on a work in progress (but mostly finished) series that is yet >> to be sent. > > Do we have any progress on this? There's patches being generated by > the Exynos people which depend on this, so we need to decide whether > we want this patch or not. > I need to handle some other duties right now, but I'll get back to this as soon as possible and carry on with further testing. However I've done few more tests checking non-secure access to both diagnostic and power control registers and both always crash on Exynos4412. Best regards, Tomasz
Hi Tomasz, I asked the hardware guys to take a closer look at your issue and they've uncovered the problem with trying to document an undocumented register. On Tue, Jul 01, 2014 at 05:26:05PM +0100, Tomasz Figa wrote: > On 01.07.2014 16:44, Russell King - ARM Linux wrote: > > On Tue, Jun 24, 2014 at 07:40:27PM +0200, Tomasz Figa wrote: > >> On 24.06.2014 18:33, Will Deacon wrote: > >>> On A9, it should be write-ignore. Are you seeing problems on a real SoC? > >> > >> I'm observing a complete system hang on Exynos4412-based Trats2 board if > >> I try to write this register in resume from system-wide sleep. In actual fact, this register causes an undefined instruction exception if you try to write it from a non-secure mode (although reads work fine, for whatever that's worth). That nicely explains what you're seeing, but it leaves us in a right old two and eight with respect to the suspend/resume code. Given that we don't know whether we're running secure or non-secure, I'm not sure about the best approach to solve this. For arm64, we simply mandate booting non-secure and require firmware to do everything that requires secure access, but we're not in a position to attempt imposing these sorts of restrictions for arch/arm/. Does anybody have any ideas? We could try `handling' the undef, but it's ugly and fragile. Anyway, apologies for the mistake earlier on -- I assumed the documentation I had was correct. Will
On Wed, Jul 02, 2014 at 01:34:44PM +0100, Will Deacon wrote: > Hi Tomasz, > > I asked the hardware guys to take a closer look at your issue and they've > uncovered the problem with trying to document an undocumented register. > > On Tue, Jul 01, 2014 at 05:26:05PM +0100, Tomasz Figa wrote: > > On 01.07.2014 16:44, Russell King - ARM Linux wrote: > > > On Tue, Jun 24, 2014 at 07:40:27PM +0200, Tomasz Figa wrote: > > >> On 24.06.2014 18:33, Will Deacon wrote: > > >>> On A9, it should be write-ignore. Are you seeing problems on a real SoC? > > >> > > >> I'm observing a complete system hang on Exynos4412-based Trats2 board if > > >> I try to write this register in resume from system-wide sleep. > > In actual fact, this register causes an undefined instruction exception if > you try to write it from a non-secure mode (although reads work fine, for > whatever that's worth). That nicely explains what you're seeing, but it > leaves us in a right old two and eight with respect to the suspend/resume > code. > > Given that we don't know whether we're running secure or non-secure, I'm > not sure about the best approach to solve this. For arm64, we simply mandate > booting non-secure and require firmware to do everything that requires > secure access, but we're not in a position to attempt imposing these sorts > of restrictions for arch/arm/. > > Does anybody have any ideas? We could try `handling' the undef, but it's > ugly and fragile. You could try to treat it the same way as ACTLR (see cpu_v7_do_resume, where the ACTLR is first read, if it is already set, ie equal to the saved value, skip the write); ACTLR suffers from the same issue and IIRC that's why Russell added the "teq" then "write" test at the time cpu_suspend was consolidated. Lorenzo
On Wed, Jul 02, 2014 at 03:01:34PM +0100, Lorenzo Pieralisi wrote: > On Wed, Jul 02, 2014 at 01:34:44PM +0100, Will Deacon wrote: > > Does anybody have any ideas? We could try `handling' the undef, but it's > > ugly and fragile. > > You could try to treat it the same way as ACTLR (see cpu_v7_do_resume, > where the ACTLR is first read, if it is already set, ie equal to the > saved value, skip the write); ACTLR suffers from the same issue and IIRC > that's why Russell added the "teq" then "write" test at the time > cpu_suspend was consolidated. We can't do that now for these, because if they were already correct, then by definition we wouldn't need to restore them except on secure parts. From the comments that have been made on the subject of these registers, it is already the case that non-secure platforms don't restore this register, instead relying on the platform to jump through hoops to update it _after_ the MMU has been enabled. Will Deacon raised a point when I discussed this that once the MMU is enabled, the hardware doesn't expect the diagnostic register to be changed. So, having platform code (which runs after the MMU has been enabled) sounds dodgy to me. Of course, prior to DT and single zImage, this wouldn't be that big a deal, we _could_ work around it via the kernel config or similar, insisting that we have a secure and a non-secure build mode. We don't have that "luxury" anymore though. What this all means is that we can't even augment the suspend/resume paths to deal with this register even for secure mode parts. I, too, don't know what the answer is here. This is really a totally fubar'd situation where there is no solution which will work for everyone. I guess what this ultimately comes down to is that we _did_ accept the work-arounds into the kernel source for secure parts - had we not done that and set a clear message like ARM64 does that work-arounds must be done prior to calling the kernel (irrespective of how that happens, iow, whether boot or resume) then we wouldn't have this problem right now. Such a statement would have raised lots of objections though, but with hindsight, it would have been the right thing to do to overrule those objections and just mandate it. It would've been a pain for some people, but we would not be in this situation now where there is no proper solution which works for everyone.
On Wed, Jul 02, 2014 at 03:26:21PM +0100, Russell King - ARM Linux wrote: > On Wed, Jul 02, 2014 at 03:01:34PM +0100, Lorenzo Pieralisi wrote: > > On Wed, Jul 02, 2014 at 01:34:44PM +0100, Will Deacon wrote: > > > Does anybody have any ideas? We could try `handling' the undef, but it's > > > ugly and fragile. > > > > You could try to treat it the same way as ACTLR (see cpu_v7_do_resume, > > where the ACTLR is first read, if it is already set, ie equal to the > > saved value, skip the write); ACTLR suffers from the same issue and IIRC > > that's why Russell added the "teq" then "write" test at the time > > cpu_suspend was consolidated. > > We can't do that now for these, because if they were already correct, > then by definition we wouldn't need to restore them except on secure > parts. > > From the comments that have been made on the subject of these registers, > it is already the case that non-secure platforms don't restore this > register, instead relying on the platform to jump through hoops to > update it _after_ the MMU has been enabled. > > Will Deacon raised a point when I discussed this that once the MMU is > enabled, the hardware doesn't expect the diagnostic register to be > changed. So, having platform code (which runs after the MMU has been > enabled) sounds dodgy to me. Though it sounds dodgy, restoring the registers at platform level is the only choice we have now to get the workaround effectiveness back, due to this secure vs. non-secure pain. Also it seems Exynos has been doing this for years (see exynos_cpu_restore_register()). Shawn
diff --git a/arch/arm/include/asm/glue-proc.h b/arch/arm/include/asm/glue-proc.h index 74a8b84f3cb1..74be7c22035a 100644 --- a/arch/arm/include/asm/glue-proc.h +++ b/arch/arm/include/asm/glue-proc.h @@ -221,15 +221,6 @@ # endif #endif -#ifdef CONFIG_CPU_V7 -# ifdef CPU_NAME -# undef MULTI_CPU -# define MULTI_CPU -# else -# define CPU_NAME cpu_v7 -# endif -#endif - #ifdef CONFIG_CPU_V7M # ifdef CPU_NAME # undef MULTI_CPU @@ -248,6 +239,15 @@ # endif #endif +#ifdef CONFIG_CPU_V7 +/* + * Cortex-A9 needs a different suspend/resume function, so we need + * multiple CPU support for ARMv7 anyway. + */ +# undef MULTI_CPU +# define MULTI_CPU +#endif + #ifndef MULTI_CPU #define cpu_proc_init __glue(CPU_NAME,_proc_init) #define cpu_proc_fin __glue(CPU_NAME,_proc_fin) diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S index 3db2c2f04a30..e26bfaa419ee 100644 --- a/arch/arm/mm/proc-v7.S +++ b/arch/arm/mm/proc-v7.S @@ -152,6 +152,34 @@ ENTRY(cpu_v7_do_resume) ENDPROC(cpu_v7_do_resume) #endif +/* + * Cortex-A9 processor functions + */ + globl_equ cpu_ca9mp_proc_init, cpu_v7_proc_init + globl_equ cpu_ca9mp_proc_fin, cpu_v7_proc_fin + globl_equ cpu_ca9mp_reset, cpu_v7_reset + globl_equ cpu_ca9mp_do_idle, cpu_v7_do_idle + globl_equ cpu_ca9mp_dcache_clean_area, cpu_v7_dcache_clean_area + globl_equ cpu_ca9mp_switch_mm, cpu_v7_switch_mm + globl_equ cpu_ca9mp_set_pte_ext, cpu_v7_set_pte_ext +.globl cpu_ca9mp_suspend_size +.equ cpu_ca9mp_suspend_size, cpu_v7_suspend_size + 1 * 4 +#ifdef CONFIG_ARM_CPU_SUSPEND +ENTRY(cpu_ca9mp_do_suspend) + stmfd sp!, {r4} + mrc p15, 0, r4, c15, c0, 1 @ Diagnostic register + stmia r0!, {r4} + ldmfd sp!, {r4} + b cpu_v7_do_suspend +ENDPROC(cpu_ca9mp_do_suspend) + +ENTRY(cpu_ca9mp_do_resume) + ldmia r0!, {r4} + mcr p15, 0, r4, c15, c0, 1 @ Diagnostic register + b cpu_v7_do_resume +ENDPROC(cpu_ca9mp_do_resume) +#endif + #ifdef CONFIG_CPU_PJ4B globl_equ cpu_pj4b_switch_mm, cpu_v7_switch_mm globl_equ cpu_pj4b_set_pte_ext, cpu_v7_set_pte_ext @@ -418,6 +446,7 @@ __v7_setup_stack: @ define struct processor (see <asm/proc-fns.h> and proc-macros.S) define_processor_functions v7, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 + define_processor_functions ca9mp, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #ifdef CONFIG_CPU_PJ4B define_processor_functions pj4b, dabort=v7_early_abort, pabort=v7_pabort, suspend=1 #endif @@ -470,7 +499,7 @@ __v7_ca5mp_proc_info: __v7_ca9mp_proc_info: .long 0x410fc090 .long 0xff0ffff0 - __v7_proc __v7_ca9mp_setup + __v7_proc __v7_ca9mp_setup, proc_fns = ca9mp_processor_functions .size __v7_ca9mp_proc_info, . - __v7_ca9mp_proc_info #endif /* CONFIG_ARM_LPAE */
The CP15 diagnostic register holds ARM errata bits on Cortex-A9, so it needs to be saved/restored on suspend/resume. Otherwise, the effectiveness of errata workaround gets lost together with diagnostic register bit across suspend/resume cycle. The patch adds a couple of Cortex-A9 specific suspend and resume functions to handle the diagnostic register across suspend/resume cycle. Signed-off-by: Shawn Guo <shawn.guo@freescale.com> --- Changes since v1: - Add Cortex-A9 specific processor function for suspend and resume to handle diagnostic register across suspend/resume cycle arch/arm/include/asm/glue-proc.h | 18 +++++++++--------- arch/arm/mm/proc-v7.S | 31 ++++++++++++++++++++++++++++++- 2 files changed, 39 insertions(+), 10 deletions(-)