3.18.1->3.19-rc2: In-band Error seen by MPU

Message ID 20150106020122.GA24980@saruman (mailing list archive)
State New, archived

Commit Message

Felipe Balbi Jan. 6, 2015, 2:01 a.m. UTC
Hi,

On Tue, Jan 06, 2015 at 01:16:21AM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Mon, Jan 05, 2015 at 09:43:13AM -0600, Felipe Balbi wrote:
> > On Sat, Jan 03, 2015 at 02:16:22PM +0200, Aaro Koskinen wrote:
> > > > > > > > > >>>When updating (custom DM3730 board) from 3.18.1 to 3.19-rc2
> > > > > > > > >>>I see a "In-band ERROR" warning which wasn't present in 3.18.1.
> > > > > > > > >>>Could it be that I missed some DT updates?
> > > > > > > > >>
> > > > > > > > >>>[    0.366882] In-band Error seen by MPU  at address 0
> > > > > > > > >>>[    0.366912] ------------[ cut here ]------------
> > > > > > > > >>>[    0.366943] WARNING: CPU: 0 PID: 1 at drivers/bus/omap_l3_smx.c:166 omap3_l3_app_irq+0x100/0x134()
> > > > > > > > >>
> > > > > > > > >>This appears also on N900/N950/N9...
> > > > > > > > >
> > > > > > > > >Do you have CONFIG_PREEMPT enabled? It seems there's some
> > > > > > > > >regression related to CONFIG_PREEMPT that started happening
> > > > > > > > >with the merge window?
> > > > > > > > 
> > > > > > > > Indeed, when I disable CONFIG_PREEMPT the warning is gone.
> > > > > > > 
> > > > > > > Yeah, disabling CONFIG_PREEMPT helps here too. Is there some e-mail
> > > > > > > thread / patch set for this already; or should we try to bisect this?
> > > > > > 
> > > > > > I'm not aware of other threads. I noticed it with the
> > > > > > "OMAP 4430 SDP: rather sick with recent kernels" thread, but
> > > > > > never got anywhere with it.
> > > > > > 
> > > > > > Yeah it seems it's somewhere between v3.18 and v3.19-rc1, but
> > > > > > that too should be verified. Sounds like running git bisect on
> > > > > > this one is needed.
> > > > > 
> > > > > I tried to bisect this on N950, and it resulted in:
> > > > > 
> > > > > aa25729cfd9709156661bea0f9293deb7729f57a is the first bad commit
> > > > > commit aa25729cfd9709156661bea0f9293deb7729f57a
> > > > > Author: Tony Lindgren <tony@atomide.com>
> > > > > Date:   Wed Nov 5 09:21:23 2014 -0800
> > > > > 
> > > > >     ARM: OMAP3: Fix errors for omap_l3_smx when booted with device tree
> > > > > 
> > > > > But when I tried to revert this from 3.19-rc2, my board won't boot at
> > > > > all...
> > > > 
> > > > Hmm OK that commit just fixed the omap_l3_smx so we now see
> > > > warnings about the unclocked register access.
> > > > 
> > > > It seems that probably the CONFIG_PREEMPT issue has been lurking
> > > > around for longer but we have not seen any errors because
> > > > omap_l3_smx just recently started exposing them.
> > > > 
> > > > Does v3.18 + commit aa25729cfd9 manually applied also produce
> > > > the CONFIG_PREEMPT errors?
> > > 
> > > Yes it does, so I made another bisection between 3.17 and 3.18
> > > using the above patch to trigger the issue, and I got:
> > > 
> > > 55601c9f24670ba926ebdd4d712ac3b177232330 is the first bad commit
> > > commit 55601c9f24670ba926ebdd4d712ac3b177232330
> > > Author: Felipe Balbi <balbi@ti.com>
> > > Date:   Mon Sep 8 17:54:58 2014 -0700
> > > 
> > >     arm: omap: intc: switch over to linear irq domain
> > 
> > Just booted AM335x with CONFIG_PREEMPT and haven't seen any problem.
> > Perhaps this is something related to another OMAP3-only driver ? Perhaps
> > HSI/SSI ?
> 
> I did some debugging and it seems the "In-band Error"
> occurs when omap_system_dma_probe() is being run, specifically when
> the interrupt is enabled. I believe the "DMA" interrupt it's trying
> to set up is completely wrong:
> 
>  28:          0      GPIO   2  DMA
> 
> GPIO 2?! Where is that coming from?

heh, the Linux IRQ number used probably ended up mapping to another irq
domain. Can you add this debugging patch and report dmesg?


Note that I need one log post commit and another log pre commit. If any
of the IRQ numbers change, it means that irq_domain_add_linear() ended
up changing IRQ start and we would need some trick to grab the correct
IRQ number again.

cheers

Comments

Aaro Koskinen Jan. 6, 2015, 12:38 p.m. UTC | #1
Hi,
On Mon, Jan 05, 2015 at 08:01:22PM -0600, Felipe Balbi wrote:
> On Tue, Jan 06, 2015 at 01:16:21AM +0200, Aaro Koskinen wrote:
> > I did some debugging and it seems the "In-band Error"
> > occurs when omap_system_dma_probe() is being run, specifically when
> > the interrupt is enabled. I believe the "DMA" interrupt it's trying
> > to set up is completely wrong:
> > 
> >  28:          0      GPIO   2  DMA
> > 
> > GPIO 2?! Where is that coming from?
> 
> heh, the Linux IRQ number used probably ended up mapping to another irq
> domain. Can you add this debugging patch and report dmesg?

Post-commit:

[    0.208251] omap_dma_system omap_dma_system.0: legacy DMA IRQ 28
[    0.216125] omap-dma-engine 48056000.dma-controller: dmaengine IRQ 22

 22:          5      INTC  13  omap-dma-engine
 28:          0      GPIO   2  DMA

Pre-commit:

[    0.208557] omap_dma_system omap_dma_system.0: legacy DMA IRQ 28
[    0.216461] omap-dma-engine 48056000.dma-controller: dmaengine IRQ 29

 28:          0      INTC  12  DMA
 29:          5      INTC  13  omap-dma-engine

> Note that I need one log post commit and another log pre commit. If any
> of the IRQ numbers change, it means that irq_domain_add_linear() ended
> up changing IRQ start and we would need some trick to grab the correct
> IRQ number again.

So it looks like a static OMAP_INTC_START offset cannot be used anymore,
but hwmod data is full of these?

mach-omap2/omap_hwmod_2xxx_3xxx_ipblock_data.c: { .name = "0", .irq = 12 + OMAP_INTC_START, }, /* INT_24XX_SDMA_IRQ0 */

A.
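
The mismatch in the two logs can be illustrated with a small userspace
model. This is only a sketch, not kernel code: every toy_* name is
hypothetical, the value 16 for the static offset is inferred from the
pre-commit log (28 = 12 + 16), and a single counter stands in for the
kernel's real Linux IRQ allocation. It shows why a number computed as
"hwirq + fixed offset" stops matching once a domain hands out Linux IRQ
numbers on demand.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_INTC_START 16   /* assumption: inferred from the log, 28 = 12 + 16 */
#define TOY_MAX_HWIRQ  96   /* arbitrary sizes for the model */
#define TOY_MAX_VIRQ   128

/* One interrupt controller; virq_of[hwirq] == 0 means "not mapped yet". */
struct toy_domain {
	const char *name;
	int virq_of[TOY_MAX_HWIRQ];
};

/* The Linux IRQ number space shared by all toy domains. */
struct toy_irq_space {
	int next_virq;
	const struct toy_domain *owner[TOY_MAX_VIRQ];
	int hwirq[TOY_MAX_VIRQ];
};

void toy_space_init(struct toy_irq_space *s)
{
	memset(s, 0, sizeof(*s));
	s->next_virq = TOY_INTC_START;
}

/* What a static table entry computes: a fixed offset from the hardware IRQ. */
int toy_static_irq(int hwirq)
{
	return hwirq + TOY_INTC_START;
}

/* Toy on-demand mapping: hand out the next free Linux IRQ on first request. */
int toy_create_mapping(struct toy_irq_space *s, struct toy_domain *d, int hwirq)
{
	assert(hwirq >= 0 && hwirq < TOY_MAX_HWIRQ);
	if (d->virq_of[hwirq] == 0) {
		assert(s->next_virq < TOY_MAX_VIRQ);
		d->virq_of[hwirq] = s->next_virq;
		s->owner[s->next_virq] = d;
		s->hwirq[s->next_virq] = hwirq;
		s->next_virq++;
	}
	return d->virq_of[hwirq];
}

/* Print one row, loosely in the style of /proc/interrupts. */
void toy_describe(const struct toy_irq_space *s, int virq)
{
	assert(virq >= 0 && virq < TOY_MAX_VIRQ);
	if (s->owner[virq])
		printf("%3d: %6s %3d\n", virq, s->owner[virq]->name, s->hwirq[virq]);
	else
		printf("%3d: (unmapped)\n", virq);
}
```

In this model, mapping GPIO 2 first and INTC 12 second yields Linux IRQs
16 and 17, while toy_static_irq(12) still returns 28, a number that is
either unmapped or owned by whichever interrupt happened to be mapped
into that slot. The real allocation order on the N950 is not known from
this thread, so the model does not try to reproduce the exact numbers in
the logs.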

Patch

diff --git a/arch/arm/plat-omap/dma.c b/arch/arm/plat-omap/dma.c
index 24770e5..b3f6dcd 100644
--- a/arch/arm/plat-omap/dma.c
+++ b/arch/arm/plat-omap/dma.c
@@ -1380,6 +1380,8 @@  static int omap_system_dma_probe(struct platform_device *pdev)
 	if (dma_omap2plus() && !(d->dev_caps & DMA_ENGINE_HANDLE_IRQ)) {
 		strcpy(irq_name, "0");
 		dma_irq = platform_get_irq_byname(pdev, irq_name);
+		dev_info(&pdev->dev, "legacy DMA IRQ %d\n", dma_irq);
+
 		if (dma_irq < 0) {
 			dev_err(&pdev->dev, "failed: request IRQ %d", dma_irq);
 			ret = dma_irq;
diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c
index c0016a6..98fe2d2 100644
--- a/drivers/dma/omap-dma.c
+++ b/drivers/dma/omap-dma.c
@@ -1155,6 +1155,8 @@  static int omap_dma_probe(struct platform_device *pdev)
 	}
 
 	irq = platform_get_irq(pdev, 1);
+
+	dev_info(&pdev->dev, "dmaengine IRQ %d\n", irq);
 	if (irq <= 0) {
 		dev_info(&pdev->dev, "failed to get L1 IRQ: %d\n", irq);
 		od->legacy = true;