diff mbox

gpio-omap: Edge interrupts stall

Message ID 20170323124355.npcptv255roleku3@lenoch (mailing list archive)
State New, archived
Headers show

Commit Message

Ladislav Michl March 23, 2017, 12:43 p.m. UTC
Hi Tony & Grygorii,

On Wed, Mar 22, 2017 at 10:17:14AM -0700, Tony Lindgren wrote:
> * Ladislav Michl <ladis@linux-mips.org> [170320 17:19]:
> > On Mon, Mar 20, 2017 at 04:16:33PM -0500, Grygorii Strashko wrote:
> > > 
> > > 
> > > On 03/20/2017 06:21 AM, Ladislav Michl wrote:
> > > > On Thu, Mar 16, 2017 at 12:17:45PM -0700, Tony Lindgren wrote:
> > > >> Hmm maybe we need to flush posted writes when re-enabling the GPIO interrupts?
> > > >>
> > > >> Below is an untested patch that might help if that's the case.
> > > > 
> > > > Unfortunately that's not the case. Even writing set&clear version of
> > > > interrupt demux handler did not make it any better. And idea from
> > > > ancient patch I initially sent is not easily extensible when we
> > > > need to trigger interrupt on both edges. So far I'm clueless...
> > > 
> > > So, just to be sure - can you reproduce it with LKML?
> > 
> > Do you mean mainline kernel? I'm on 4.11-rc3 now...
> > 
> > > As per code, the possible problem could be with double acking of edge irqs
> > > (theoretically):
> > > - omap_gpio_irq_handler
> > >   - "isr" = read irq status
> > >   - omap_clear_gpio_irqbank(bank, isr_saved & ~level_mask); --- clear edge status, so new irq can be accepted
> > >   - loop while "isr"
> > > 	generic_handle_irq()
> > > 	 - handle_edge_irq()
> > > 	    - desc->irq_data.chip->irq_ack(&desc->irq_data); 
> > > 		- omap_gpio_ack_irq()
> > > it might be that at this moment edge IRQ was triggered again and it will be cleared.
> > > 
> > > just as an experiment, could you try to update omap_gpio_ack_irq()
> > > as below:
> > > 
> > > if (irq type is not edge)
> > > 	omap_clear_gpio_irqstatus(bank, offset);
> > 
> > Rewritten as:
> > 
> > diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
> > index efc85a279d54..9381763e1fec 100644
> > --- a/drivers/gpio/gpio-omap.c
> > +++ b/drivers/gpio/gpio-omap.c
> > @@ -806,7 +806,8 @@ static void omap_gpio_ack_irq(struct irq_data *d)
> >  	struct gpio_bank *bank = omap_irq_data_get_bank(d);
> >  	unsigned offset = d->hwirq;
> >  
> > -	omap_clear_gpio_irqstatus(bank, offset);
> > +	if (bank->level_mask & BIT(offset))
> > +		omap_clear_gpio_irqstatus(bank, offset);
> >  }
> >  
> >  static void omap_gpio_mask_irq(struct irq_data *d)
> > 
> > And that did the trick. So far I tried IR decoder and decoding sometimes
> > fails, but I cannot say anything certain until I check with scope (which
> > I do tomorrow)
> 
> Another good code review catch by Grygorii! :)

So, to add more info to my previous emails...
For testing above patch I setup tftp server and nfs root and that did the
trick. As both (tested) edge triggered GPIO and ethernet interrupt line
are connected to the same GPIO bank it seemed as an improvement. So I
changed to initramfs for tests. See bellow...

> Can you guys please add a comment there to the code when posting the
> proper patch? Something like:
> 
> /*
>  * Edge GPIOs are already cleared during handling, clearing
>  * them here again will cause lost interrupts.
>  */
> 
> And also add a proper Fixes tag so this gets propagated to the
> stable kernels. It really seems we've had this for a long time.

Above probably still applies, but it is a bit hard to trigger.
I'll setup a test for that and provide reports.

Edge triggered interrupts are lost no matter how fast is its source.
It happens even at 10Hz and patch bellow fixes issue. Of course,
it is just a hack showing where problem is and it will need a bit
more debugging.

Thank you,
	ladis


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Tony Lindgren March 23, 2017, 2:49 p.m. UTC | #1
* Ladislav Michl <ladis@linux-mips.org> [170323 05:46]:
> Edge triggered interrupts are lost no matter how fast is its source.
> It happens even at 10Hz and patch bellow fixes issue. Of course,
> it is just a hack showing where problem is and it will need a bit
> more debugging.

OK disabling pm_runtime fixes it.. Care to describe which GPIO bank
you're testing with?

For GPIO bank1, context is never lost so wake-ups work without
the wakeirqs handled by pinctrl-single.c. But you should not hit
this unless you've configured the system to hit off mode during
idle by setting /sys/kernel/debug/pm_debug/enable_off_mode and
configuring UART autosuspend.

So it seems the runtime_suspend/resume edge configuration for
wake-up events causes it?

If nothing else helps, I guess one option would be to implement
pm_runtime_use_autosuspend() with the timeout configured to
100ms or something similar. But that seems like a workaround and
won't fix the real issue.

Regards,

Tony

> diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
> index efc85a279d54..f6398ab7b75a 100644
> --- a/drivers/gpio/gpio-omap.c
> +++ b/drivers/gpio/gpio-omap.c
> @@ -1273,6 +1273,7 @@ static int omap_gpio_runtime_suspend(struct device *dev)
>  	unsigned long flags;
>  	u32 wake_low, wake_hi;
>  
> +	return -EINVAL;
>  	raw_spin_lock_irqsave(&bank->lock, flags);
>  
>  	/*
> @@ -1341,6 +1342,7 @@ static int omap_gpio_runtime_resume(struct device *dev)
>  	unsigned long flags;
>  	int c;
>  
> +	return -EINVAL;
>  	raw_spin_lock_irqsave(&bank->lock, flags);
>  
>  	/*
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grygorii Strashko March 24, 2017, 5:33 p.m. UTC | #2
On 03/23/2017 09:49 AM, Tony Lindgren wrote:
> * Ladislav Michl <ladis@linux-mips.org> [170323 05:46]:
>> Edge triggered interrupts are lost no matter how fast is its source.
>> It happens even at 10Hz and patch bellow fixes issue. Of course,
>> it is just a hack showing where problem is and it will need a bit
>> more debugging.
>
> OK disabling pm_runtime fixes it.. Care to describe which GPIO bank
> you're testing with?
>
> For GPIO bank1, context is never lost so wake-ups work without
> the wakeirqs handled by pinctrl-single.c. But you should not hit
> this unless you've configured the system to hit off mode during
> idle by setting /sys/kernel/debug/pm_debug/enable_off_mode and
> configuring UART autosuspend.
>
> So it seems the runtime_suspend/resume edge configuration for
> wake-up events causes it?
>
> If nothing else helps, I guess one option would be to implement
> pm_runtime_use_autosuspend() with the timeout configured to
> 100ms or something similar. But that seems like a workaround and
> won't fix the real issue.

This is really strange as PM runtime will be executed only when
first gpio is requested - all other time gpio bank should stay active.

Only one possibility is deep CPUIdle or Suspend states, but in this case
some QoS setup may require to block these states while IR/App is
running.

>
> Regards,
>
> Tony
>
>> diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
>> index efc85a279d54..f6398ab7b75a 100644
>> --- a/drivers/gpio/gpio-omap.c
>> +++ b/drivers/gpio/gpio-omap.c
>> @@ -1273,6 +1273,7 @@ static int omap_gpio_runtime_suspend(struct device *dev)
>>  	unsigned long flags;
>>  	u32 wake_low, wake_hi;
>>
>> +	return -EINVAL;
>>  	raw_spin_lock_irqsave(&bank->lock, flags);
>>
>>  	/*
>> @@ -1341,6 +1342,7 @@ static int omap_gpio_runtime_resume(struct device *dev)
>>  	unsigned long flags;
>>  	int c;
>>
>> +	return -EINVAL;
>>  	raw_spin_lock_irqsave(&bank->lock, flags);
>>
>>  	/*
Tony Lindgren March 24, 2017, 5:42 p.m. UTC | #3
* Grygorii Strashko <grygorii.strashko@ti.com> [170324 10:35]:
> 
> 
> On 03/23/2017 09:49 AM, Tony Lindgren wrote:
> > * Ladislav Michl <ladis@linux-mips.org> [170323 05:46]:
> > > Edge triggered interrupts are lost no matter how fast is its source.
> > > It happens even at 10Hz and patch bellow fixes issue. Of course,
> > > it is just a hack showing where problem is and it will need a bit
> > > more debugging.
> > 
> > OK disabling pm_runtime fixes it.. Care to describe which GPIO bank
> > you're testing with?
> > 
> > For GPIO bank1, context is never lost so wake-ups work without
> > the wakeirqs handled by pinctrl-single.c. But you should not hit
> > this unless you've configured the system to hit off mode during
> > idle by setting /sys/kernel/debug/pm_debug/enable_off_mode and
> > configuring UART autosuspend.
> > 
> > So it seems the runtime_suspend/resume edge configuration for
> > wake-up events causes it?
> > 
> > If nothing else helps, I guess one option would be to implement
> > pm_runtime_use_autosuspend() with the timeout configured to
> > 100ms or something similar. But that seems like a workaround and
> > won't fix the real issue.
> 
> This is really strange as PM runtime will be executed only when
> first gpio is requested - all other time gpio bank should stay active.
> 
> Only one possibility is deep CPUIdle or Suspend states, but in this case
> some QoS setup may require to block these states while IR/App is
> running.

I think this is triggered when omap3 PM in omap_sram_idle() calls
omap2_gpio_prepare_for_idle(). If that's the case, we could fix
it with a PM QoS configuration similar to how the audio glitches
were fixed in commit 9834ffd1ecc3 ("ASoC: omap-mcbsp: Add PM QoS
support for McBSP to prevent glitches").

No idea what should set the latency requirement though. Probably
the IR driver when active?

That probably still won't help with fact that we can lose events
though. Ideally things would just work even without the PM QoS.

Regards,

Tony

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grygorii Strashko April 3, 2017, 8:20 p.m. UTC | #4
hi Ladislav,

On 03/24/2017 12:33 PM, Grygorii Strashko wrote:
>
>
> On 03/23/2017 09:49 AM, Tony Lindgren wrote:
>> * Ladislav Michl <ladis@linux-mips.org> [170323 05:46]:
>>> Edge triggered interrupts are lost no matter how fast is its source.
>>> It happens even at 10Hz and patch bellow fixes issue. Of course,
>>> it is just a hack showing where problem is and it will need a bit
>>> more debugging.
>>
>> OK disabling pm_runtime fixes it.. Care to describe which GPIO bank
>> you're testing with?
>>
>> For GPIO bank1, context is never lost so wake-ups work without
>> the wakeirqs handled by pinctrl-single.c. But you should not hit
>> this unless you've configured the system to hit off mode during
>> idle by setting /sys/kernel/debug/pm_debug/enable_off_mode and
>> configuring UART autosuspend.
>>
>> So it seems the runtime_suspend/resume edge configuration for
>> wake-up events causes it?
>>
>> If nothing else helps, I guess one option would be to implement
>> pm_runtime_use_autosuspend() with the timeout configured to
>> 100ms or something similar. But that seems like a workaround and
>> won't fix the real issue.
>
> This is really strange as PM runtime will be executed only when
> first gpio is requested - all other time gpio bank should stay active.
>
> Only one possibility is deep CPUIdle or Suspend states, but in this case
> some QoS setup may require to block these states while IR/App is
> running.
>


Do you have any news here?
a) will it fix your issue if you just disable cpuidle and auto-suspend?
b) will it fix your issue if you disable cpuidle and auto-suspend,  and
apply fix in omap_gpio_ack_irq()?
...
Ladislav Michl Sept. 15, 2017, 9:44 p.m. UTC | #5
Hi Grygorii,

On Mon, Apr 03, 2017 at 03:20:47PM -0500, Grygorii Strashko wrote:
> hi Ladislav,
> 
> On 03/24/2017 12:33 PM, Grygorii Strashko wrote:
> > 
> > 
> > On 03/23/2017 09:49 AM, Tony Lindgren wrote:
> > > * Ladislav Michl <ladis@linux-mips.org> [170323 05:46]:
> > > > Edge triggered interrupts are lost no matter how fast is its source.
> > > > It happens even at 10Hz and patch bellow fixes issue. Of course,
> > > > it is just a hack showing where problem is and it will need a bit
> > > > more debugging.
> > > 
> > > OK disabling pm_runtime fixes it.. Care to describe which GPIO bank
> > > you're testing with?
> > > 
> > > For GPIO bank1, context is never lost so wake-ups work without
> > > the wakeirqs handled by pinctrl-single.c. But you should not hit
> > > this unless you've configured the system to hit off mode during
> > > idle by setting /sys/kernel/debug/pm_debug/enable_off_mode and
> > > configuring UART autosuspend.
> > > 
> > > So it seems the runtime_suspend/resume edge configuration for
> > > wake-up events causes it?
> > > 
> > > If nothing else helps, I guess one option would be to implement
> > > pm_runtime_use_autosuspend() with the timeout configured to
> > > 100ms or something similar. But that seems like a workaround and
> > > won't fix the real issue.
> > 
> > This is really strange as PM runtime will be executed only when
> > first gpio is requested - all other time gpio bank should stay active.
> > 
> > Only one possibility is deep CPUIdle or Suspend states, but in this case
> > some QoS setup may require to block these states while IR/App is
> > running.
> > 
> 
> Do you have any news here?
> a) will it fix your issue if you just disable cpuidle and auto-suspend?
> b) will it fix your issue if you disable cpuidle and auto-suspend,  and
> apply fix in omap_gpio_ack_irq()?
> ...

a) almost fixes issue, rarely interrupt is lost.
b) fixes issue (most likely, I'd need to build test equipment to detect
   interrupt misses and run test over weekend)

And as omap_gpio_ack_irq fix is obviously correct, I'm sending it as proper
patch. According to Tony, the rest could be optimized somehow :-)

	ladis
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index efc85a279d54..f6398ab7b75a 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -1273,6 +1273,7 @@  static int omap_gpio_runtime_suspend(struct device *dev)
 	unsigned long flags;
 	u32 wake_low, wake_hi;
 
+	return -EINVAL;
 	raw_spin_lock_irqsave(&bank->lock, flags);
 
 	/*
@@ -1341,6 +1342,7 @@  static int omap_gpio_runtime_resume(struct device *dev)
 	unsigned long flags;
 	int c;
 
+	return -EINVAL;
 	raw_spin_lock_irqsave(&bank->lock, flags);
 
 	/*