
arm: omap: ratelimit omap_l3_smx error log spam

Message ID CAMQu2gz9HE0xEoW7gu257rpLWdCUpt7N6GQNQzWVhFb-A+69hg@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Santosh Shilimkar Aug. 27, 2012, 10:12 p.m. UTC
On Mon, Aug 27, 2012 at 3:02 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> Hi,
>
> On Mon, Aug 27, 2012 at 02:35:57PM -0700, Shilimkar, Santosh wrote:
>> > -       pr_err("%s seen by %s %s at address %x\n",
>> > +       pr_err_ratelimited("%s seen by %s %s at address %x\n",
>> >                         omap3_l3_code_string(code),
>> >                         omap3_l3_initiator_string(initid),
>> >                         multi ? "Multiple Errors" : "", address);
>> > -       WARN_ON(1);
>> > +       WARN_ON_ONCE(1);
>> >
>> The issue needs to be fixed instead of WARN_ON_ONCE() and then
>> just moving ahead. Interconnect in bad states is really bad state and you
>> won't have reliable operations post that on SOC.
>
> How printing megabytes of identical stack traces helps anything?
>
It just says, repeatedly and loudly: fix the issue :-)
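
For readers following the quoted hunk, the behavioural difference between the two logging styles can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct rl_state`, `rl_allow()` and `warn_once()` are hypothetical names made up for illustration, and the code is not the kernel's implementation of `pr_err_ratelimited()` or `WARN_ON_ONCE()`. It shows a fixed window that lets a small burst through and counts the rest as missed, plus a one-shot flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a windowed rate limiter. */
struct rl_state {
	uint64_t interval;    /* window length, in arbitrary ticks */
	unsigned int burst;   /* messages allowed per window */
	uint64_t begin;       /* start of the current window */
	unsigned int printed; /* messages let through in this window */
	unsigned int missed;  /* messages suppressed in this window */
};

/* Returns true if a message may be printed at time 'now'. */
static bool rl_allow(struct rl_state *rl, uint64_t now)
{
	assert(rl != NULL);

	/* Open a new window once the current one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}

/* Returns true only on the first call for a given flag. */
static bool warn_once(bool *warned)
{
	assert(warned != NULL);

	if (*warned)
		return false;
	*warned = true;
	return true;
}
```

With such a scheme the first few reports still reach the log, so the error stays visible, while a storm of identical reports is reduced to a suppressed-message count.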

> This has been there always (since the L3 driver was added) on every boot
> with N950/N9 (which BTW are HS devices, not sure if that has anything
> to do with it). There is no apparent effect on device functionality,
> at least nothing unusual has been observed...
>
I assumed this was a secure device when I saw that the SRAM memset is
causing the issue.

> Is there any documentation how to interpret and debug this error report?
>
The issue could be that a memset is attempted on the secure portion of
SRAM, or that the CPU speculatively accessed a secure SRAM region adjacent
to the public SRAM. Either access would raise this error.

If you just bypass the SRAM init [1], does the issue go away?

Regards
Santosh

[1]
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Aaro Koskinen Aug. 27, 2012, 11:26 p.m. UTC | #1
On Mon, Aug 27, 2012 at 03:12:07PM -0700, Shilimkar, Santosh wrote:
> On Mon, Aug 27, 2012 at 3:02 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> > Hi,
> >
> > On Mon, Aug 27, 2012 at 02:35:57PM -0700, Shilimkar, Santosh wrote:
> >> > -       pr_err("%s seen by %s %s at address %x\n",
> >> > +       pr_err_ratelimited("%s seen by %s %s at address %x\n",
> >> >                         omap3_l3_code_string(code),
> >> >                         omap3_l3_initiator_string(initid),
> >> >                         multi ? "Multiple Errors" : "", address);
> >> > -       WARN_ON(1);
> >> > +       WARN_ON_ONCE(1);
> >> >
> >> The issue needs to be fixed instead of WARN_ON_ONCE() and then
> >> just moving ahead. Interconnect in bad states is really bad state and you
> >> won't have reliable operations post that on SOC.
> >
> > How printing megabytes of identical stack traces helps anything?
> >
> It just says repeatedly and  loudly... Fix the issue :-)
> 
> > This has been there always (since the L3 driver was added) on every boot
> > with N950/N9 (which BTW are HS devices, not sure if that has anything
> > to do with it). There is no apparent effect on device functionality,
> > at least nothing unusual has been observed...
> >
> I assumed this is secure device when I saw the SRAM memset is causing the
> issue.
> 
> > Is there any documentation how to interpret and debug this error report?
> >
> The issue could be, there is memset tried on Secure portion of SRAM or
> CPU speculatively accessed adjacent SRAM region of public SRAM which
> is secure and hence the error.

Thanks, that makes sense.

> If you just bypass the SRAM init [1], does the issue go away ?

I tried bypassing the whole SRAM init, but the device does not seem to
boot up at all.

If I comment out the memset alone, then it boots and the issue is gone:

+#if 0
	memset_io(omap_sram_base + SRAM_BOOTLOADER_SZ, 0,
		  omap_sram_size - SRAM_BOOTLOADER_SZ);
+#endif

A.
Santosh Shilimkar Aug. 28, 2012, 12:17 a.m. UTC | #2
On Mon, Aug 27, 2012 at 4:26 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> On Mon, Aug 27, 2012 at 03:12:07PM -0700, Shilimkar, Santosh wrote:
>> On Mon, Aug 27, 2012 at 3:02 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
>> > Hi,
>> >
>> > On Mon, Aug 27, 2012 at 02:35:57PM -0700, Shilimkar, Santosh wrote:
>> >> > -       pr_err("%s seen by %s %s at address %x\n",
>> >> > +       pr_err_ratelimited("%s seen by %s %s at address %x\n",
>> >> >                         omap3_l3_code_string(code),
>> >> >                         omap3_l3_initiator_string(initid),
>> >> >                         multi ? "Multiple Errors" : "", address);
>> >> > -       WARN_ON(1);
>> >> > +       WARN_ON_ONCE(1);
>> >> >
>> >> The issue needs to be fixed instead of WARN_ON_ONCE() and then
>> >> just moving ahead. Interconnect in bad states is really bad state and you
>> >> won't have reliable operations post that on SOC.
>> >
>> > How printing megabytes of identical stack traces helps anything?
>> >
>> It just says repeatedly and  loudly... Fix the issue :-)
>>
>> > This has been there always (since the L3 driver was added) on every boot
>> > with N950/N9 (which BTW are HS devices, not sure if that has anything
>> > to do with it). There is no apparent effect on device functionality,
>> > at least nothing unusual has been observed...
>> >
>> I assumed this is secure device when I saw the SRAM memset is causing the
>> issue.
>>
>> > Is there any documentation how to interpret and debug this error report?
>> >
>> The issue could be, there is memset tried on Secure portion of SRAM or
>> CPU speculatively accessed adjacent SRAM region of public SRAM which
>> is secure and hence the error.
>
> Thanks, that makes sense.
>
>> If you just bypass the SRAM init [1], does the issue go away ?
>
> I tried bypassing the whole SRAM init, but the device does not seem to
> boot up at all.
>
> If I comment out the memset alone, then it boots and the issue is gone:
>
> +#if 0
>         memset_io(omap_sram_base + SRAM_BOOTLOADER_SZ, 0,
>                   omap_sram_size - SRAM_BOOTLOADER_SZ);
> +#endif
>
Good. So the issue is indeed a direct or indirect access to the secure SRAM.
Since the secure side can dynamically resize the secure RAM, it is even
harder to fix this issue properly. One easier way to deal with it is to map
only the SRAM that is needed and leave the rest for security.

For now, can you check whether reducing the size of the SRAM in init helps
you avoid the issue? Sorry, it might take a few iterations to find a
working SRAM size.

Regards
Santosh
Aaro Koskinen Aug. 28, 2012, 12:20 p.m. UTC | #3
Hi,

On Mon, Aug 27, 2012 at 05:17:30PM -0700, Shilimkar, Santosh wrote:
> On Mon, Aug 27, 2012 at 4:26 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> > I tried bypassing the whole SRAM init, but the device does not seem to
> > boot up at all.
> >
> > If I comment out the memset alone, then it boots and the issue is gone:
> >
> > +#if 0
> >         memset_io(omap_sram_base + SRAM_BOOTLOADER_SZ, 0,
> >                   omap_sram_size - SRAM_BOOTLOADER_SZ);
> > +#endif
> >
> Good. So the issue is indeed direct or indirect access to the secure SRAM.
> As security can dynamically resize the secure RAM size it is even harder
> to fix this issue properly. One easier way to deal with the issue is map only
> needed SRAM and leave rest for security.
> 
> For now, Can you check if reducing the size of the SRAM in init is helping you
> to get way with the issue. Sorry it might need few iterations for you to get
> a working SRAM size.

The problem is triggered by writing to the beginning of the SRAM area,
not to the end. I need to skip the first 16k (0x4000) to get rid of
the errors. Maybe the base address calculation is wrong? This could
also explain why it's still possible to use the device - it seems the
allocator starts from the end, and moves towards the base...
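
The experiment described above can be sketched in userspace C as follows. This is a simulation under assumptions, not the kernel change: `sram_clear_from()` and `SECURE_SKIP` are hypothetical names, the SRAM is modelled as an ordinary byte array, and plain `memset()` stands in for `memset_io()`. Only the 16k (0x4000) offset comes from the test result reported here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 16k: the offset that had to be skipped in the experiment. */
#define SECURE_SKIP 0x4000u

/*
 * Zero a simulated SRAM window, leaving the first 'skip' bytes untouched.
 * Returns the number of bytes that were cleared.
 */
static size_t sram_clear_from(uint8_t *base, size_t size, size_t skip)
{
	assert(base != NULL);

	/* Nothing left to clear if the skipped part covers the window. */
	if (skip >= size)
		return 0;

	memset(base + skip, 0, size - skip);
	return size - skip;
}
```

Calling `sram_clear_from(buf, len, SECURE_SKIP)` on a buffer that models the mapped SRAM leaves the first 16k as it was and clears only the remainder.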

A.
Santosh Shilimkar Aug. 28, 2012, 2:19 p.m. UTC | #4
On Tue, Aug 28, 2012 at 5:20 AM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> Hi,
>
> On Mon, Aug 27, 2012 at 05:17:30PM -0700, Shilimkar, Santosh wrote:
>> On Mon, Aug 27, 2012 at 4:26 PM, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
>> > I tried bypassing the whole SRAM init, but the device does not seem to
>> > boot up at all.
>> >
>> > If I comment out the memset alone, then it boots and the issue is gone:
>> >
>> > +#if 0
>> >         memset_io(omap_sram_base + SRAM_BOOTLOADER_SZ, 0,
>> >                   omap_sram_size - SRAM_BOOTLOADER_SZ);
>> > +#endif
>> >
>> Good. So the issue is indeed direct or indirect access to the secure SRAM.
>> As security can dynamically resize the secure RAM size it is even harder
>> to fix this issue properly. One easier way to deal with the issue is map only
>> needed SRAM and leave rest for security.
>>
>> For now, Can you check if reducing the size of the SRAM in init is helping you
>> to get way with the issue. Sorry it might need few iterations for you to get
>> a working SRAM size.
>
> The problem is triggered by writing to the beginning of the SRAM area,
> not to the end. I need to skip the first 16k (0x4000) to get rid of
> the errors. Maybe the base address calculation is wrong? This could
> also explain why it's still possible to use the device - it seems the
> allocator starts from the end, and moves towards the base...
>
Or the PPA has resized the secure area to cover that 16K. As you have seen
the issue on one OMAP3 device, it makes sense to take that 16K out of the
public SRAM map.

Can you send a patch with the fixed base address for PUB SRAM?
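
To illustrate what taking 16K out of the public map amounts to, here is a minimal userspace sketch. `struct sram_region` and `sram_carve_front()` are hypothetical names used only for illustration, and any base address or size passed in is an example value, not something established in this thread. The idea is simply to move the start of the public window up by the reserved amount and shrink its size by the same amount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a public SRAM window. */
struct sram_region {
	uint32_t start; /* physical start address */
	uint32_t size;  /* length in bytes */
};

/*
 * Reserve 'reserved' bytes at the front of the region for the secure side
 * and return what is left for public use. If more is reserved than the
 * region holds, the result is an empty region at the original end.
 */
static struct sram_region sram_carve_front(struct sram_region r,
					   uint32_t reserved)
{
	/* The region must not wrap around the 32-bit address space. */
	assert(r.size <= UINT32_MAX - r.start);

	if (reserved > r.size)
		reserved = r.size;

	r.start += reserved;
	r.size -= reserved;
	return r;
}
```

For example, a 64K window with 16K reserved at the front becomes a 48K window starting 0x4000 bytes higher.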

Regards
Santosh

Patch

diff --git a/arch/arm/plat-omap/sram.c b/arch/arm/plat-omap/sram.c
index 766181c..305e6de 100644
--- a/arch/arm/plat-omap/sram.c
+++ b/arch/arm/plat-omap/sram.c
@@ -384,6 +384,7 @@  static inline int am33xx_sram_init(void)

 int __init omap_sram_init(void)
 {
+#if 0
        omap_detect_sram();
        omap_map_sram();

@@ -397,6 +398,6 @@  int __init omap_sram_init(void)
                am33xx_sram_init();
        else if (cpu_is_omap34xx())
                omap34xx_sram_init();
-
+#endif
        return 0;
 }