diff mbox

tty: omap-serial: fix boot hang by converting to use a threaded IRQ handler (was Re: [PATCH] irq: always set IRQF_ONESHOT if no primary handler is specified)

Message ID alpine.DEB.2.00.1108222306300.28226@utopia.booyaka.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul Walmsley Aug. 23, 2011, 5:10 a.m. UTC
Convert the omap-serial hardirq handler to a threaded IRQ handler. Without 
this patch, OMAP boards which use the on-board OMAP UARTs and the 
omap-serial driver will not boot to userspace after commit 
f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set IRQF_ONESHOT if 
no primary handler is specified").  Enabling CONFIG_DEBUG_SHIRQ reveals 
'IRQ handler type mismatch' errors:

IRQ handler type mismatch for IRQ 74
current handler: serial idle
[<c001b124>] (unwind_backtrace+0x0/0xf0) from [<c009c900>] (__setup_irq+0x42c/0x464)
[<c009c900>] (__setup_irq+0x42c/0x464) from [<c009ca08>] (request_threaded_irq+0xd0/0x148)
[<c009ca08>] (request_threaded_irq+0xd0/0x148) from [<c029d398>] (serial_omap_startup+0x30/0x2dc)
[<c029d398>] (serial_omap_startup+0x30/0x2dc) from [<c0295a38>] (uart_startup+0x5c/0x1ac)

(etc.)

It turns out that the omap-serial code used one threaded IRQ
handler[1][2] and one non-threaded IRQ handler[3] that shared the same
IRQ.  During the 3.1-rc series, a patch was merged[4] that caused
IRQF_ONESHOT to be set on the threaded handler, but the non-threaded
handler did not have IRQF_ONESHOT set.  Since interrupt handlers
that share the same IRQ must also share the presence or absence of
IRQF_ONESHOT[5], this new commit caused a mismatch that prevented the
non-threaded IRQ from registering.

Fix by converting the non-threaded IRQ handler in omap-serial.c to a
threaded IRQ handler.  Ideally we would not have to make this type of
change during the -rc series, but the commit that caused this behavior
was itself merged between v3.1-rc2 and v3.1-rc3.  In the long term, it
would be good to get rid of the shared IRQ handler hack in
arch/arm/mach-omap2/serial.c.

This change has been boot-tested on OMAP3530 BeagleBoard and
OMAP4430ES2 PandaBoard, and static suspend has been lightly tested on
the BeagleBoard.

Pantelis Antoniou <pantelis.antoniou@gmail.com> originally reported
the boot failure.

1. arch/arm/mach-omap2/serial.c line 550, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/mach-omap2/serial.c;h=466fc722fa0f39f03b8d93cf84e4dae4f57fd029;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l550

2. arch/arm/mach-omap2/serial.c line 563, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/mach-omap2/serial.c;h=466fc722fa0f39f03b8d93cf84e4dae4f57fd029;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l563

3. drivers/tty/serial/omap-serial.c line 464, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/tty/serial/omap-serial.c;h=c37df8d0fa2819261dffccc5bc4d0180b9531f49;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l464

4. Commit f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set
   IRQF_ONESHOT if no primary handler is specified")

5. Gleixner, Thomas. _[patch 2/5] genirq: Allow shared oneshot
   interrupts_.  http://lkml.org/lkml/2011/2/23/511

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com>
Cc: Govindraj.R <govindraj.raja@ti.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/tty/serial/omap-serial.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Comments

Sebastian Andrzej Siewior Aug. 23, 2011, 8:14 a.m. UTC | #1
* Paul Walmsley | 2011-08-22 23:10:21 [-0600]:

>IRQ handler type mismatch for IRQ 74

>It turns out that the omap-serial code used one threaded IRQ
>handler[1][2] and one non-threaded IRQ handler[3] that shared the same
>IRQ.  During the 3.1-rc series, a patch was merged[4] that caused
>IRQF_ONESHOT to be set on the threaded handler, but the non-threaded
>handler did not have IRQF_ONESHOT set.  Since interrupt handlers

Is the type of IRQ74 edge or line/eoi?
Since you registered the serial handler as threaded and the primary
handler did _not_ disable the irq source (due to the absence of the
IRQF_ONESHOT flag or a custom primary handler which does this by
changing a bit in the hardware) it is more or less luck that you did not
hang forever on the first serial interrupt. Well, it does not happen
with edge typed interrupt lines.

>that share the same IRQ must also share the presence or absence of
>IRQF_ONESHOT[5], this new commit caused a mismatch that prevented the
>non-threaded IRQ from registering.

In theory it should be okay if the handler without ONESHOT is either
primary only or primary + threaded and the primary disables the irq
source.

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Cox Aug. 23, 2011, 8:57 a.m. UTC | #2
On Mon, 22 Aug 2011 23:10:21 -0600 (MDT)
Paul Walmsley <paul@pwsan.com> wrote:

> 
> Convert the omap-serial hardirq handler to a threaded IRQ handler. Without 
> this patch, OMAP boards which use the on-board OMAP UARTs and the 
> omap-serial driver will not boot to userspace after commit 
> f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set IRQF_ONESHOT if 
> no primary handler is specified").  Enabling CONFIG_DEBUG_SHIRQ reveals 
> 'IRQ handler type mismatch' errors:

There are multiple other drivers reporting all these problems - the
faulty irq change should be reverted at this point not the drivers
fiddled with.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi Aug. 23, 2011, 9:12 a.m. UTC | #3
Hi,

On Mon, Aug 22, 2011 at 11:10:21PM -0600, Paul Walmsley wrote:
> 
> Convert the omap-serial hardirq handler to a threaded IRQ handler. Without 
> this patch, OMAP boards which use the on-board OMAP UARTs and the 
> omap-serial driver will not boot to userspace after commit 
> f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set IRQF_ONESHOT if 
> no primary handler is specified").  Enabling CONFIG_DEBUG_SHIRQ reveals 
> 'IRQ handler type mismatch' errors:
> 
> IRQ handler type mismatch for IRQ 74
> current handler: serial idle
> [<c001b124>] (unwind_backtrace+0x0/0xf0) from [<c009c900>] (__setup_irq+0x42c/0x464)
> [<c009c900>] (__setup_irq+0x42c/0x464) from [<c009ca08>] (request_threaded_irq+0xd0/0x148)
> [<c009ca08>] (request_threaded_irq+0xd0/0x148) from [<c029d398>] (serial_omap_startup+0x30/0x2dc)
> [<c029d398>] (serial_omap_startup+0x30/0x2dc) from [<c0295a38>] (uart_startup+0x5c/0x1ac)
> 
> (etc.)
> 
> It turns out that the omap-serial code used one threaded IRQ
> handler[1][2] and one non-threaded IRQ handler[3] that shared the same
> IRQ.  During the 3.1-rc series, a patch was merged[4] that caused
> IRQF_ONESHOT to be set on the threaded handler, but the non-threaded
> handler did not have IRQF_ONESHOT set.  Since interrupt handlers
> that share the same IRQ must also share the presence or absence of
> IRQF_ONESHOT[5], this new commit caused a mismatch that prevented the
> non-threaded IRQ from registering.
> 
> Fix by converting the non-threaded IRQ handler in omap-serial.c to a
> threaded IRQ handler.  Ideally we would not have to make this type of
> change during the -rc series, but the commit that caused this behavior
> was itself merged between v3.1-rc2 and v3.1-rc3.  In the long term, it
> would be good to get rid of the shared IRQ handler hack in
> arch/arm/mach-omap2/serial.c.
> 
> This change has been boot-tested on OMAP3530 BeagleBoard and
> OMAP4430ES2 PandaBoard, and static suspend has been lightly tested on
> the BeagleBoard.
> 
> Pantelis Antoniou <pantelis.antoniou@gmail.com> originally reported
> the boot failure.
> 
> 1. arch/arm/mach-omap2/serial.c line 550, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/mach-omap2/serial.c;h=466fc722fa0f39f03b8d93cf84e4dae4f57fd029;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l550
> 
> 2. arch/arm/mach-omap2/serial.c line 563, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/mach-omap2/serial.c;h=466fc722fa0f39f03b8d93cf84e4dae4f57fd029;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l563
> 
> 3. drivers/tty/serial/omap-serial.c line 464, as of Linux v3.1-rc3.  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/tty/serial/omap-serial.c;h=c37df8d0fa2819261dffccc5bc4d0180b9531f49;hb=fcb8ce5cfe30ca9ca5c9a79cdfe26d1993e65e0c#l464
> 
> 4. Commit f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set
>    IRQF_ONESHOT if no primary handler is specified")
> 
> 5. Gleixner, Thomas. _[patch 2/5] genirq: Allow shared oneshot
>    interrupts_.  http://lkml.org/lkml/2011/2/23/511
> 
> Signed-off-by: Paul Walmsley <paul@pwsan.com>
> Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com>
> Cc: Govindraj.R <govindraj.raja@ti.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  drivers/tty/serial/omap-serial.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> index c37df8d..e83a769 100644
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -461,8 +461,8 @@ static int serial_omap_startup(struct uart_port *port)
>  	/*
>  	 * Allocate the IRQ
>  	 */
> -	retval = request_irq(up->port.irq, serial_omap_irq, up->port.irqflags,
> -				up->name, up);
> +	retval = request_threaded_irq(up->port.irq, NULL, serial_omap_irq,
> +				      up->port.irqflags, up->name, up);

if you're not running on a slow bus, you should use top half to check if
IRQs are really from this device.
Linus Torvalds Aug. 23, 2011, 4:13 p.m. UTC | #4
On Tue, Aug 23, 2011 at 1:57 AM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> On Mon, 22 Aug 2011 23:10:21 -0600 (MDT)
> Paul Walmsley <paul@pwsan.com> wrote:
>
>>
>> Convert the omap-serial hardirq handler to a threaded IRQ handler. Without
>> this patch, OMAP boards which use the on-board OMAP UARTs and the
>> omap-serial driver will not boot to userspace after commit
>> f3637a5f2e2eb391ff5757bc83fb5de8f9726464 ("irq: Always set IRQF_ONESHOT if
>> no primary handler is specified").  Enabling CONFIG_DEBUG_SHIRQ reveals
>> 'IRQ handler type mismatch' errors:
>
> There are multiple other drivers reporting all these problems - the
> faulty irq change should be reverted at this point not the drivers
> fiddled with.

Agreed. It's too late to try to fix random drivers.

The real issue seems to be that clue:

    "Enabling CONFIG_DEBUG_SHIRQ reveals 'IRQ handler type mismatch' errors"

iow, we have a shared irq, and forcing the IRQF_ONESHOT bit is
resulting in those shared interrupts now having different values of
IRQF_ONESHOT, so this test triggers:

                /*
                 * Can't share interrupts unless both agree to and are
                 * the same type (level, edge, polarity). So both flag
                 * fields must have IRQF_SHARED set and the bits which
                 * set the trigger type must match. Also all must
                 * agree on ONESHOT.
                 */
                if (!((old->flags & new->flags) & IRQF_SHARED) ||
                    ((old->flags ^ new->flags) & IRQF_TRIGGER_MASK) ||
                    ((old->flags ^ new->flags) & IRQF_ONESHOT)) {
                        old_name = old->name;
                        goto mismatch;
                }

and the irq isn't installed at all (setup_irq returns with EBUSY).

So the commit that caused this problem was simply bogus. The commit log says

                       "Since it is required for those users and
    there is no difference for others it makes sense to add this flag
    unconditionally."

but the "there is no difference for others" seems to be total crap,
exactly because it results in this IRQF mismatch.

So I think that commit should just be reverted. Thomas?

                           Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Aug. 23, 2011, 4:46 p.m. UTC | #5
Linus,

On Tue, 23 Aug 2011, Linus Torvalds wrote:
> but the "there is no difference for others" seems to be total crap,
> exactly because it results in this IRQF mismatch.
> 
> So I think that commit should just be reverted. Thomas?

Yes. I missed that detail when I applied that patch. Reverting it
seems to be the only sensible solution.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley Aug. 23, 2011, 5:21 p.m. UTC | #6
On Tue, 23 Aug 2011, Felipe Balbi wrote:

> if you're not running on a slow bus, you should use top half to check if
> IRQs are really from this device.

I don't think this applies in this case, since the IRQ isn't shared 
between different hardware devices - it's shared for the purposes of a 
software hack.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
index c37df8d..e83a769 100644
--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -461,8 +461,8 @@  static int serial_omap_startup(struct uart_port *port)
 	/*
 	 * Allocate the IRQ
 	 */
-	retval = request_irq(up->port.irq, serial_omap_irq, up->port.irqflags,
-				up->name, up);
+	retval = request_threaded_irq(up->port.irq, NULL, serial_omap_irq,
+				      up->port.irqflags, up->name, up);
 	if (retval)
 		return retval;