diff mbox

[RFC] irq: Rework IRQF_NO_SUSPENDED

Message ID 1663327.PISiM9sMHC@vostro.rjw.lan (mailing list archive)
State RFC, archived
Headers show

Commit Message

Rafael J. Wysocki July 25, 2014, 10:25 p.m. UTC
On Friday, July 25, 2014 11:00:12 PM Thomas Gleixner wrote:
> On Fri, 25 Jul 2014, Rafael J. Wysocki wrote:
> > On Friday, July 25, 2014 03:25:41 PM Peter Zijlstra wrote:
> > > OK, so Rafael said there's devices that keep on raising their interrupt
> > > until they get attention. Ideally this won't happen because the device
> > > is suspended etc.. But I'm sure there's some broken piece of hardware
> > > out there that'll make it go boom.
> > 
> > So here's an idea.
> > 
> > What about returning IRQ_NONE rather than IRQ_HANDLED for "suspended"
> > interrupts (after all, that's what a sane driver would do for a
> > suspended device I suppose)?
> > 
> > If the line is really shared and the interrupt is taken care of by
> > the other guy sharing the line, we'll be all fine.
> > 
> > If that is not the case, on the other hand, and something's really
> > broken, we'll end up disabling the interrupt and marking it as
> > IRQS_SPURIOUS_DISABLED (if I understand things correctly).
> 
> We should not wait 100k unhandled interrupts in that case. We know
> already at the first unhandled interrupt that the shit hit the fan.

The first one may be a bus glitch or some such.  Also I guess we still need to
allow the legitimate "no suspend" guy to handle his interrupts until it gets
too worse.

Also does it really hurt to rely on the generic mechanism here?  We regard
it as fine at all other times after all.

> I'll have a deeper look how we can sanitize the whole wake/no_suspend
> logic vs. shared interrupts.

Cool, thanks!

> Need to look at the usage sites first.

There will be more of them, like this:

https://patchwork.kernel.org/patch/4618531/

Essentially, all wakeup interrupts will need at least one no_suspend irqaction
going forward.

Below is my take on this (untested) in case it is useful for anything.

It is targeted at the problematic case (that is, a shared interrupt with at least
one irqaction that has IRQF_NO_SUSPEND set and at least one that doesn't) only and
is not supposed to change behavior in the other cases (the do_irqaction thing
shamelessly stolen from the Peter's patch).  It drops the IRQD_WAKEUP_STATE check,
because that has the same problem with shared interrupts as no_suspend.

Rafael


---
 kernel/irq/handle.c |   21 ++++++++++++++++++---
 kernel/irq/manage.c |   30 +++++++++++++++++++++++++-----
 2 files changed, 43 insertions(+), 8 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rafael J. Wysocki July 25, 2014, 11:07 p.m. UTC | #1
On Saturday, July 26, 2014 12:25:29 AM Rafael J. Wysocki wrote:
> On Friday, July 25, 2014 11:00:12 PM Thomas Gleixner wrote:
> > On Fri, 25 Jul 2014, Rafael J. Wysocki wrote:
> > > On Friday, July 25, 2014 03:25:41 PM Peter Zijlstra wrote:
> > > > OK, so Rafael said there's devices that keep on raising their interrupt
> > > > until they get attention. Ideally this won't happen because the device
> > > > is suspended etc.. But I'm sure there's some broken piece of hardware
> > > > out there that'll make it go boom.
> > > 
> > > So here's an idea.
> > > 
> > > What about returning IRQ_NONE rather than IRQ_HANDLED for "suspended"
> > > interrupts (after all, that's what a sane driver would do for a
> > > suspended device I suppose)?
> > > 
> > > If the line is really shared and the interrupt is taken care of by
> > > the other guy sharing the line, we'll be all fine.
> > > 
> > > If that is not the case, on the other hand, and something's really
> > > broken, we'll end up disabling the interrupt and marking it as
> > > IRQS_SPURIOUS_DISABLED (if I understand things correctly).
> > 
> > We should not wait 100k unhandled interrupts in that case. We know
> > already at the first unhandled interrupt that the shit hit the fan.
> 
> The first one may be a bus glitch or some such.  Also I guess we still need to
> allow the legitimate "no suspend" guy to handle his interrupts until it gets
> too worse.

s/worse/bad/ (ah, grammar).

> Also does it really hurt to rely on the generic mechanism here?  We regard
> it as fine at all other times after all.
> 
> > I'll have a deeper look how we can sanitize the whole wake/no_suspend
> > logic vs. shared interrupts.
> 
> Cool, thanks!
> 
> > Need to look at the usage sites first.
> 
> There will be more of them, like this:
> 
> https://patchwork.kernel.org/patch/4618531/
> 
> Essentially, all wakeup interrupts will need at least one no_suspend irqaction
> going forward.
> 
> Below is my take on this (untested) in case it is useful for anything.
> 
> It is targeted at the problematic case (that is, a shared interrupt with at least
> one irqaction that has IRQF_NO_SUSPEND set and at least one that doesn't) only and
> is not supposed to change behavior in the other cases (the do_irqaction thing
> shamelessly stolen from the Peter's patch).  It drops the IRQD_WAKEUP_STATE check,
> because that has the same problem with shared interrupts as no_suspend.

Self-correction ->

> ---
>  kernel/irq/handle.c |   21 ++++++++++++++++++---
>  kernel/irq/manage.c |   30 +++++++++++++++++++++++++-----
>  2 files changed, 43 insertions(+), 8 deletions(-)
> 
> Index: linux-pm/kernel/irq/manage.c
> ===================================================================
> --- linux-pm.orig/kernel/irq/manage.c
> +++ linux-pm/kernel/irq/manage.c

[cut]

> @@ -446,7 +459,15 @@ EXPORT_SYMBOL(disable_irq);
>  void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume)
>  {
>  	if (resume) {
> -		if (!(desc->istate & IRQS_SUSPENDED)) {
> +		if (desc->istate & IRQS_SUSPENDED) {
> +			desc->istate &= ~IRQS_SUSPENDED;
> +			if (desc->istate & IRQS_SPURIOUS_DISABLED) {
> +				pr_err("WARNING! Unhandled events during suspend for IRQ %d\n", irq);

-> This should be printed for desc->irqs_unhandled > 0 I suppose.  That will cover
the cases when we don't have to disable it too.  The value of desc->irqs_unhandled
can be included into the warning too.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 26, 2014, 11:49 a.m. UTC | #2
On Saturday, July 26, 2014 12:25:29 AM Rafael J. Wysocki wrote:
> On Friday, July 25, 2014 11:00:12 PM Thomas Gleixner wrote:
> > On Fri, 25 Jul 2014, Rafael J. Wysocki wrote:
> > > On Friday, July 25, 2014 03:25:41 PM Peter Zijlstra wrote:
> > > > OK, so Rafael said there's devices that keep on raising their interrupt
> > > > until they get attention. Ideally this won't happen because the device
> > > > is suspended etc.. But I'm sure there's some broken piece of hardware
> > > > out there that'll make it go boom.
> > > 
> > > So here's an idea.
> > > 
> > > What about returning IRQ_NONE rather than IRQ_HANDLED for "suspended"
> > > interrupts (after all, that's what a sane driver would do for a
> > > suspended device I suppose)?
> > > 
> > > If the line is really shared and the interrupt is taken care of by
> > > the other guy sharing the line, we'll be all fine.
> > > 
> > > If that is not the case, on the other hand, and something's really
> > > broken, we'll end up disabling the interrupt and marking it as
> > > IRQS_SPURIOUS_DISABLED (if I understand things correctly).
> > 
> > We should not wait 100k unhandled interrupts in that case. We know
> > already at the first unhandled interrupt that the shit hit the fan.
> 
> The first one may be a bus glitch or some such.  Also I guess we still need to
> allow the legitimate "no suspend" guy to handle his interrupts until it gets
> too worse.
> 
> Also does it really hurt to rely on the generic mechanism here?  We regard
> it as fine at all other times after all.
> 
> > I'll have a deeper look how we can sanitize the whole wake/no_suspend
> > logic vs. shared interrupts.

One more idea, on top of the prototype patch that I posted
(https://patchwork.kernel.org/patch/4625921/).

The problem with enable_irq_wake() is that it only takes one argument, so
if that's a shared interrupt, we can't really say which irqaction is supposed
to handle wakeup interrupts should they occur.

To address this we can introduce enable_device_irq_wake() that will take
an additional dev_id argument (that must be the one passed to request_irq() or
the operation will fail) that can be used to identify the irqaction for
handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
for the irqaction in question and doing the rest along the lines of
irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).

If we have that, the guys who need to set up device interrupts (ie. interrupts
normally used for signaling input events etc) for system wakeup will need to
use enable_device_irq_wake() and that should just work.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 26, 2014, 11:53 a.m. UTC | #3
On Saturday, July 26, 2014 01:49:17 PM Rafael J. Wysocki wrote:
> On Saturday, July 26, 2014 12:25:29 AM Rafael J. Wysocki wrote:
> > On Friday, July 25, 2014 11:00:12 PM Thomas Gleixner wrote:
> > > On Fri, 25 Jul 2014, Rafael J. Wysocki wrote:
> > > > On Friday, July 25, 2014 03:25:41 PM Peter Zijlstra wrote:
> > > > > OK, so Rafael said there's devices that keep on raising their interrupt
> > > > > until they get attention. Ideally this won't happen because the device
> > > > > is suspended etc.. But I'm sure there's some broken piece of hardware
> > > > > out there that'll make it go boom.
> > > > 
> > > > So here's an idea.
> > > > 
> > > > What about returning IRQ_NONE rather than IRQ_HANDLED for "suspended"
> > > > interrupts (after all, that's what a sane driver would do for a
> > > > suspended device I suppose)?
> > > > 
> > > > If the line is really shared and the interrupt is taken care of by
> > > > the other guy sharing the line, we'll be all fine.
> > > > 
> > > > If that is not the case, on the other hand, and something's really
> > > > broken, we'll end up disabling the interrupt and marking it as
> > > > IRQS_SPURIOUS_DISABLED (if I understand things correctly).
> > > 
> > > We should not wait 100k unhandled interrupts in that case. We know
> > > already at the first unhandled interrupt that the shit hit the fan.
> > 
> > The first one may be a bus glitch or some such.  Also I guess we still need to
> > allow the legitimate "no suspend" guy to handle his interrupts until it gets
> > too worse.
> > 
> > Also does it really hurt to rely on the generic mechanism here?  We regard
> > it as fine at all other times after all.
> > 
> > > I'll have a deeper look how we can sanitize the whole wake/no_suspend
> > > logic vs. shared interrupts.
> 
> One more idea, on top of the prototype patch that I posted
> (https://patchwork.kernel.org/patch/4625921/).
> 
> The problem with enable_irq_wake() is that it only takes one argument, so
> if that's a shared interrupt, we can't really say which irqaction is supposed
> to handle wakeup interrupts should they occur.
> 
> To address this we can introduce enable_device_irq_wake() that will take
> an additional dev_id argument (that must be the one passed to request_irq() or
> the operation will fail) that can be used to identify the irqaction for
> handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
> for the irqaction in question and doing the rest along the lines of
> irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
> IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).
> 
> If we have that, the guys who need to set up device interrupts (ie. interrupts
> normally used for signaling input events etc) for system wakeup will need to
> use enable_device_irq_wake() and that should just work.

That could be used to address the PCIe PME interrupt wakeup and gpio-keys driver
wakeup in the same way at least.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra July 28, 2014, 6:49 a.m. UTC | #4
On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> One more idea, on top of the prototype patch that I posted
> (https://patchwork.kernel.org/patch/4625921/).
> 
> The problem with enable_irq_wake() is that it only takes one argument, so
> if that's a shared interrupt, we can't really say which irqaction is supposed
> to handle wakeup interrupts should they occur.

Right.

> To address this we can introduce enable_device_irq_wake() that will take
> an additional dev_id argument (that must be the one passed to request_irq() or
> the operation will fail) that can be used to identify the irqaction for
> handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
> for the irqaction in question and doing the rest along the lines of
> irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
> IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).
> 
> If we have that, the guys who need to set up device interrupts (ie. interrupts
> normally used for signaling input events etc) for system wakeup will need to
> use enable_device_irq_wake() and that should just work.

So in the patch I posted I described why NO_SUSPEND is useful, I still
have to hear a solid reason for why we also need enable_irq_wake()? What
does it do that cannot be achieved with NO_SUSPEND?

I realize its dynamic, but that's crap, at device registration time it
pretty much already knows if its a wakeup source or not, right?
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner July 28, 2014, 12:33 p.m. UTC | #5
On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> > One more idea, on top of the prototype patch that I posted
> > (https://patchwork.kernel.org/patch/4625921/).
> > 
> > The problem with enable_irq_wake() is that it only takes one argument, so
> > if that's a shared interrupt, we can't really say which irqaction is supposed
> > to handle wakeup interrupts should they occur.
> 
> Right.

Historically no hardware manufacturer was so stupid to have wakeup
sources on shared interrupts. Just because x86 is brain damaged is no
reason to claim that stuff that worked fine for years is crap.

We never needed this until now, so we simply go and do what we've done
always in such situations. We look for a solution which copes with the
new hardware brain farts and keeps the existing stuff working. It's
that simple.

> > To address this we can introduce enable_device_irq_wake() that will take
> > an additional dev_id argument (that must be the one passed to request_irq() or
> > the operation will fail) that can be used to identify the irqaction for
> > handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
> > for the irqaction in question and doing the rest along the lines of
> > irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
> > IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).
> > 
> > If we have that, the guys who need to set up device interrupts (ie. interrupts
> > normally used for signaling input events etc) for system wakeup will need to
> > use enable_device_irq_wake() and that should just work.
> 
> So in the patch I posted I described why NO_SUSPEND is useful, I still
> have to hear a solid reason for why we also need enable_irq_wake()? What
> does it do that cannot be achieved with NO_SUSPEND?
> 
> I realize its dynamic, but that's crap, at device registration time it
> pretty much already knows if its a wakeup source or not, right?

It does, but that doesn't make it crap. There are situations where you
want certain wakeup sources enabled or disabled and you can't do that
with IRQF_NO_SUSPEND. And to support this, you need the wake_depth
counter as well. And that's what we always had.

I'd rather say, that the "I can wakeup the machine and will do so no
matter what flag" is more stupid than the wake mechanism.

It was added to support XEN wreckage and I wish we've never done that
in the first place.

So we are not going to make everything a single stupid flag and limit
the usability of existing code. We rather go and try to remove the
stupid flag before it becomes more wide spread.

And we cannot treat the wakeup thing the same way as the
IRQF_NO_SUSPEND flag, because there is hardware where the irq line
must be disabled at the normal (non suspend) interrupt controller, and
the wake mechanism tells the PM microcontroller to monitor the
interrupt line and kick the machine back to life.

So we need to very carefully look at all the existing cases instead of
yelling crap and inflicting x86 specific horror on everyone. I said on
friday, that I need to look at ALL use cases first and I meant it.

Thanks,

	tglx









--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra July 28, 2014, 1:04 p.m. UTC | #6
On Mon, Jul 28, 2014 at 02:33:41PM +0200, Thomas Gleixner wrote:
> > I realize its dynamic, but that's crap, at device registration time it
> > pretty much already knows if its a wakeup source or not, right?
> 
> It does, but that doesn't make it crap. There are situations where you
> want certain wakeup sources enabled or disabled and you can't do that
> with IRQF_NO_SUSPEND. And to support this, you need the wake_depth
> counter as well. And that's what we always had.

But if the driver is PM aware it can change its behaviour in runtime. We
don't need to disable the line in the PIC if its well behaved. This
force disable is more a bandaid to quiesce ignorant drivers than
anything else.

> I'd rather say, that the "I can wakeup the machine and will do so no
> matter what flag" is more stupid than the wake mechanism.

I never said such. I think the NO_SUSPEND name bad one that confuses
thing more than anything as it describes the implementation not the
semantics.

But as above, if a driver sets this it basically says I'm PM aware and
will behave right, no need to go force the PIC on me.

With that the driver can or can not, depending on runtime circumstances
wake or not etc.

> So we are not going to make everything a single stupid flag and limit
> the usability of existing code. We rather go and try to remove the
> stupid flag before it becomes more wide spread.

Sure, having two different means of PM for the irq layer is one too
many.

> And we cannot treat the wakeup thing the same way as the
> IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> must be disabled at the normal (non suspend) interrupt controller, and
> the wake mechanism tells the PM microcontroller to monitor the
> interrupt line and kick the machine back to life.

Ah, see that is useful information.. Yes such a thing would invalidate
the flag scheme.

> So we need to very carefully look at all the existing cases instead of
> yelling crap and inflicting x86 specific horror on everyone. I said on
> friday, that I need to look at ALL use cases first and I meant it.

Well, the reason I yelled crap is because its IRQ nr based and not
handler based. Yes, shared lines are a pain, but we have to deal with
them, so allowing 'new' interfaces in that do not deal with it is very
unfortunate at best.

What further didn't help is that it wasn't at all clear to me how it was
supposed to work.


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 28, 2014, 9:27 p.m. UTC | #7
On Monday, July 28, 2014 08:49:13 AM Peter Zijlstra wrote:
> On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> > One more idea, on top of the prototype patch that I posted
> > (https://patchwork.kernel.org/patch/4625921/).
> > 
> > The problem with enable_irq_wake() is that it only takes one argument, so
> > if that's a shared interrupt, we can't really say which irqaction is supposed
> > to handle wakeup interrupts should they occur.
> 
> Right.
> 
> > To address this we can introduce enable_device_irq_wake() that will take
> > an additional dev_id argument (that must be the one passed to request_irq() or
> > the operation will fail) that can be used to identify the irqaction for
> > handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
> > for the irqaction in question and doing the rest along the lines of
> > irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
> > IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).
> > 
> > If we have that, the guys who need to set up device interrupts (ie. interrupts
> > normally used for signaling input events etc) for system wakeup will need to
> > use enable_device_irq_wake() and that should just work.
> 
> So in the patch I posted I described why NO_SUSPEND is useful, I still
> have to hear a solid reason for why we also need enable_irq_wake()? What
> does it do that cannot be achieved with NO_SUSPEND?
> 
> I realize its dynamic, but that's crap, at device registration time it
> pretty much already knows if its a wakeup source or not, right?

It knows that it can be a wakeup source, but it doesn't know if it will be
use that way (user space may not want that, for example).

It still makes sense to use IRQF_NO_SUSPEND for it, but people were complaining
about having to do that in addition to using enable_irq_wake().  Quite
understandably, because usually you want both or at least "wakeup" should
imply "no suspend".

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 28, 2014, 9:53 p.m. UTC | #8
On Monday, July 28, 2014 02:33:41 PM Thomas Gleixner wrote:
> On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> > On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> > > One more idea, on top of the prototype patch that I posted
> > > (https://patchwork.kernel.org/patch/4625921/).
> > > 
> > > The problem with enable_irq_wake() is that it only takes one argument, so
> > > if that's a shared interrupt, we can't really say which irqaction is supposed
> > > to handle wakeup interrupts should they occur.
> > 
> > Right.
> 
> Historically no hardware manufacturer was so stupid to have wakeup
> sources on shared interrupts. Just because x86 is brain damaged is no
> reason to claim that stuff that worked fine for years is crap.

I don't honestly think that's x86 only.  The PCIe PME interrupt has always been
designed as shareable and that's not limited to x86.

Also enable_irq_wake() has the problem that on some platforms it makes
the interrupt wakeup-only, so it stops signaling input events after that.
AT91 seems to be one of these platforms (you may argue that this is a platform
bug, but still).  So if it is called in a driver's .suspend() callback,
it will create a "blind" period for potential wakeup events between
that point and when the platform is eventually turned off.

> We never needed this until now, so we simply go and do what we've done
> always in such situations. We look for a solution which copes with the
> new hardware brain farts and keeps the existing stuff working. It's
> that simple.
> 
> > > To address this we can introduce enable_device_irq_wake() that will take
> > > an additional dev_id argument (that must be the one passed to request_irq() or
> > > the operation will fail) that can be used to identify the irqaction for
> > > handling the wakeup interrupts.  It can work by setting IRQF_NO_SUSPEND
> > > for the irqaction in question and doing the rest along the lines of
> > > irq_set_irq_wake(irq, 1).  disable_device_irq_wake() will then clear
> > > IRQF_NO_SUSPEND (it also has to be passed the dev_id argument).
> > > 
> > > If we have that, the guys who need to set up device interrupts (ie. interrupts
> > > normally used for signaling input events etc) for system wakeup will need to
> > > use enable_device_irq_wake() and that should just work.
> > 
> > So in the patch I posted I described why NO_SUSPEND is useful, I still
> > have to hear a solid reason for why we also need enable_irq_wake()? What
> > does it do that cannot be achieved with NO_SUSPEND?
> > 
> > I realize its dynamic, but that's crap, at device registration time it
> > pretty much already knows if its a wakeup source or not, right?
> 
> It does, but that doesn't make it crap. There are situations where you
> want certain wakeup sources enabled or disabled and you can't do that
> with IRQF_NO_SUSPEND. And to support this, you need the wake_depth
> counter as well. And that's what we always had.
> 
> I'd rather say, that the "I can wakeup the machine and will do so no
> matter what flag" is more stupid than the wake mechanism.
> 
> It was added to support XEN wreckage and I wish we've never done that
> in the first place.

We needed to do that for timers to start with.  We also need it for ACPI
and pretty much everything that needs to react to events during system
suspend/resume too.  I would argue for limiting its use to things that
need their interrupts to be always enabled, though.

> So we are not going to make everything a single stupid flag and limit
> the usability of existing code. We rather go and try to remove the
> stupid flag before it becomes more wide spread.
> 
> And we cannot treat the wakeup thing the same way as the
> IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> must be disabled at the normal (non suspend) interrupt controller, and
> the wake mechanism tells the PM microcontroller to monitor the
> interrupt line and kick the machine back to life.
> 
> So we need to very carefully look at all the existing cases instead of
> yelling crap and inflicting x86 specific horror on everyone. I said on
> friday, that I need to look at ALL use cases first and I meant it.

Regardless of the use case, I don't think it is necessary to manipulate
the interrupt controller settings before the syscore_suspend stage, because
if an interrupt happens earlier, we need to handle it pretty much in a normal
way, unless it has been suspended.

So I'd argue for not using anything like enable_irq_wake() that goes all
the way to the hardware in drivers.  Instead, we could allow drivers to
mark interrupts as "set this up for system wakeup" and really do the setup
right before putting the platform into the final "suspended" state.  And that
is totally independend of the IRQF_NO_SUSPEND thing.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 28, 2014, 11:01 p.m. UTC | #9
On Monday, July 28, 2014 11:53:15 PM Rafael J. Wysocki wrote:
> On Monday, July 28, 2014 02:33:41 PM Thomas Gleixner wrote:
> > On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> > > On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:

[cut]

> > So we are not going to make everything a single stupid flag and limit
> > the usability of existing code. We rather go and try to remove the
> > stupid flag before it becomes more wide spread.
> > 
> > And we cannot treat the wakeup thing the same way as the
> > IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> > must be disabled at the normal (non suspend) interrupt controller, and
> > the wake mechanism tells the PM microcontroller to monitor the
> > interrupt line and kick the machine back to life.
> > 
> > So we need to very carefully look at all the existing cases instead of
> > yelling crap and inflicting x86 specific horror on everyone. I said on
> > friday, that I need to look at ALL use cases first and I meant it.
> 
> Regardless of the use case, I don't think it is necessary to manipulate
> the interrupt controller settings before the syscore_suspend stage, because
> if an interrupt happens earlier, we need to handle it pretty much in a normal
> way, unless it has been suspended.
> 
> So I'd argue for not using anything like enable_irq_wake() that goes all
> the way to the hardware in drivers.  Instead, we could allow drivers to
> mark interrupts as "set this up for system wakeup" and really do the setup
> right before putting the platform into the final "suspended" state.  And that
> is totally independend of the IRQF_NO_SUSPEND thing.

In addition to that we need the interrupt handler of the driver that requested
the irq to be set up for system wakeup to be invoked after suspend_device_irqs()
in case there are interrupts that should abort the suspend transition or we
can lose a wakeup event.  So whatever interface we decide to use it has to
affect suspend/resume_device_irqs() pretty much in the same way as the
IRQF_NO_SUSPEND flag.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner July 29, 2014, 12:46 p.m. UTC | #10
On Tue, 29 Jul 2014, Rafael J. Wysocki wrote:

> On Monday, July 28, 2014 11:53:15 PM Rafael J. Wysocki wrote:
> > On Monday, July 28, 2014 02:33:41 PM Thomas Gleixner wrote:
> > > On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> > > > On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> 
> [cut]
> 
> > > So we are not going to make everything a single stupid flag and limit
> > > the usability of existing code. We rather go and try to remove the
> > > stupid flag before it becomes more wide spread.
> > > 
> > > And we cannot treat the wakeup thing the same way as the
> > > IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> > > must be disabled at the normal (non suspend) interrupt controller, and
> > > the wake mechanism tells the PM microcontroller to monitor the
> > > interrupt line and kick the machine back to life.
> > > 
> > > So we need to very carefully look at all the existing cases instead of
> > > yelling crap and inflicting x86 specific horror on everyone. I said on
> > > friday, that I need to look at ALL use cases first and I meant it.
> > 
> > Regardless of the use case, I don't think it is necessary to manipulate
> > the interrupt controller settings before the syscore_suspend stage, because
> > if an interrupt happens earlier, we need to handle it pretty much in a normal
> > way, unless it has been suspended.
> > 
> > So I'd argue for not using anything like enable_irq_wake() that goes all
> > the way to the hardware in drivers.  Instead, we could allow drivers to
> > mark interrupts as "set this up for system wakeup" and really do the setup
> > right before putting the platform into the final "suspended" state.  And that
> > is totally independend of the IRQF_NO_SUSPEND thing.
> 
> In addition to that we need the interrupt handler of the driver that requested
> the irq to be set up for system wakeup to be invoked after suspend_device_irqs()
> in case there are interrupts that should abort the suspend transition or we
> can lose a wakeup event.  So whatever interface we decide to use it has to
> affect suspend/resume_device_irqs() pretty much in the same way as the
> IRQF_NO_SUSPEND flag.

Right, that's a different issue. We probably want that even for the
existing irq_wake() users.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 29, 2014, 1:33 p.m. UTC | #11
On Tuesday, July 29, 2014 02:46:41 PM Thomas Gleixner wrote:
> On Tue, 29 Jul 2014, Rafael J. Wysocki wrote:
> 
> > On Monday, July 28, 2014 11:53:15 PM Rafael J. Wysocki wrote:
> > > On Monday, July 28, 2014 02:33:41 PM Thomas Gleixner wrote:
> > > > On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> > > > > On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> > 
> > [cut]
> > 
> > > > So we are not going to make everything a single stupid flag and limit
> > > > the usability of existing code. We rather go and try to remove the
> > > > stupid flag before it becomes more wide spread.
> > > > 
> > > > And we cannot treat the wakeup thing the same way as the
> > > > IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> > > > must be disabled at the normal (non suspend) interrupt controller, and
> > > > the wake mechanism tells the PM microcontroller to monitor the
> > > > interrupt line and kick the machine back to life.
> > > > 
> > > > So we need to very carefully look at all the existing cases instead of
> > > > yelling crap and inflicting x86 specific horror on everyone. I said on
> > > > friday, that I need to look at ALL use cases first and I meant it.
> > > 
> > > Regardless of the use case, I don't think it is necessary to manipulate
> > > the interrupt controller settings before the syscore_suspend stage, because
> > > if an interrupt happens earlier, we need to handle it pretty much in a normal
> > > way, unless it has been suspended.
> > > 
> > > So I'd argue for not using anything like enable_irq_wake() that goes all
> > > the way to the hardware in drivers.  Instead, we could allow drivers to
> > > mark interrupts as "set this up for system wakeup" and really do the setup
> > > right before putting the platform into the final "suspended" state.  And that
> > > is totally independend of the IRQF_NO_SUSPEND thing.
> > 
> > In addition to that we need the interrupt handler of the driver that requested
> > the irq to be set up for system wakeup to be invoked after suspend_device_irqs()
> > in case there are interrupts that should abort the suspend transition or we
> > can lose a wakeup event.  So whatever interface we decide to use it has to
> > affect suspend/resume_device_irqs() pretty much in the same way as the
> > IRQF_NO_SUSPEND flag.
> 
> Right, that's a different issue. We probably want that even for the
> existing irq_wake() users.

I agree.

There's one more thing to consider here.  Going forward we'll want to avoid
touching runtime-suspended devices during system suspend.  Then, system wakeup
devices will need to mark their IRQs for system wakeup at the runtime suspend
time and I'm not sure if that's the right time for calling enable_irq_wake().

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki July 30, 2014, 9:46 p.m. UTC | #12
On Tuesday, July 29, 2014 03:33:23 PM Rafael J. Wysocki wrote:
> On Tuesday, July 29, 2014 02:46:41 PM Thomas Gleixner wrote:
> > On Tue, 29 Jul 2014, Rafael J. Wysocki wrote:
> > 
> > > On Monday, July 28, 2014 11:53:15 PM Rafael J. Wysocki wrote:
> > > > On Monday, July 28, 2014 02:33:41 PM Thomas Gleixner wrote:
> > > > > On Mon, 28 Jul 2014, Peter Zijlstra wrote:
> > > > > > On Sat, Jul 26, 2014 at 01:49:17PM +0200, Rafael J. Wysocki wrote:
> > > 
> > > [cut]
> > > 
> > > > > So we are not going to make everything a single stupid flag and limit
> > > > > the usability of existing code. We rather go and try to remove the
> > > > > stupid flag before it becomes more wide spread.
> > > > > 
> > > > > And we cannot treat the wakeup thing the same way as the
> > > > > IRQF_NO_SUSPEND flag, because there is hardware where the irq line
> > > > > must be disabled at the normal (non suspend) interrupt controller, and
> > > > > the wake mechanism tells the PM microcontroller to monitor the
> > > > > interrupt line and kick the machine back to life.
> > > > > 
> > > > > So we need to very carefully look at all the existing cases instead of
> > > > > yelling crap and inflicting x86 specific horror on everyone. I said on
> > > > > friday, that I need to look at ALL use cases first and I meant it.
> > > > 
> > > > Regardless of the use case, I don't think it is necessary to manipulate
> > > > the interrupt controller settings before the syscore_suspend stage, because
> > > > if an interrupt happens earlier, we need to handle it pretty much in a normal
> > > > way, unless it has been suspended.
> > > > 
> > > > So I'd argue for not using anything like enable_irq_wake() that goes all
> > > > the way to the hardware in drivers.  Instead, we could allow drivers to
> > > > mark interrupts as "set this up for system wakeup" and really do the setup
> > > > right before putting the platform into the final "suspended" state.  And that
> > > > is totally independend of the IRQF_NO_SUSPEND thing.
> > > 
> > > In addition to that we need the interrupt handler of the driver that requested
> > > the irq to be set up for system wakeup to be invoked after suspend_device_irqs()
> > > in case there are interrupts that should abort the suspend transition or we
> > > can lose a wakeup event.  So whatever interface we decide to use it has to
> > > affect suspend/resume_device_irqs() pretty much in the same way as the
> > > IRQF_NO_SUSPEND flag.
> > 
> > Right, that's a different issue. We probably want that even for the
> > existing irq_wake() users.
> 
> I agree.
> 
> There's one more thing to consider here.  Going forward we'll want to avoid
> touching runtime-suspended devices during system suspend.  Then, system wakeup
> devices will need to mark their IRQs for system wakeup at the runtime suspend
> time and I'm not sure if that's the right time for calling enable_irq_wake().

Taking all of the above into consideration I prepared a prototype that will
follow.  Patch [1/3] is the actual prototype of the core changes, patch [2/3]
uses that to implement suspend-to-idle wakeup for PME and patch [3/3] illustrates
how an existing user of enable_irq_wake() can be modified to use the new stuff.

All is on top of https://patchwork.kernel.org/patch/4643871/ which should apply
on top of -tip (if I'm not mistaken).

I've tested patches [1-2/3] with PME on my MSI Wind.

Comments welcome.

Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-pm/kernel/irq/manage.c
===================================================================
--- linux-pm.orig/kernel/irq/manage.c
+++ linux-pm/kernel/irq/manage.c
@@ -385,10 +385,23 @@  setup_affinity(unsigned int irq, struct
 void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend)
 {
 	if (suspend) {
-		if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND)
-		    || irqd_has_set(&desc->irq_data, IRQD_WAKEUP_STATE))
+		struct irqaction *action = desc->action;
+		unsigned int no_suspend, flags;
+
+		if (!action)
+			return;
+		no_suspend = IRQF_NO_SUSPEND;
+		flags = 0;
+		do {
+			no_suspend &= action->flags;
+			flags |= action->flags;
+			action = action->next;
+		} while (action);
+		if (no_suspend)
 			return;
 		desc->istate |= IRQS_SUSPENDED;
+		if (flags & IRQF_NO_SUSPEND)
+			return;
 	}
 
 	if (!desc->depth++)
@@ -446,7 +459,15 @@  EXPORT_SYMBOL(disable_irq);
 void __enable_irq(struct irq_desc *desc, unsigned int irq, bool resume)
 {
 	if (resume) {
-		if (!(desc->istate & IRQS_SUSPENDED)) {
+		if (desc->istate & IRQS_SUSPENDED) {
+			desc->istate &= ~IRQS_SUSPENDED;
+			if (desc->istate & IRQS_SPURIOUS_DISABLED) {
+				pr_err("WARNING! Unhandled events during suspend for IRQ %d\n", irq);
+				desc->istate &= ~IRQS_SPURIOUS_DISABLED;
+			} else if (desc->depth == 0) {
+				return;
+			}
+		} else {
 			if (!desc->action)
 				return;
 			if (!(desc->action->flags & IRQF_FORCE_RESUME))
@@ -454,7 +475,6 @@  void __enable_irq(struct irq_desc *desc,
 			/* Pretend that it got disabled ! */
 			desc->depth++;
 		}
-		desc->istate &= ~IRQS_SUSPENDED;
 	}
 
 	switch (desc->depth) {
@@ -1079,7 +1099,7 @@  __setup_irq(unsigned int irq, struct irq
 		 */
 
 #define IRQF_MISMATCH \
-	(IRQF_TRIGGER_MASK | IRQF_ONESHOT | IRQF_NO_SUSPEND)
+	(IRQF_TRIGGER_MASK | IRQF_ONESHOT)
 
 		if (!((old->flags & new->flags) & IRQF_SHARED) ||
 		    ((old->flags ^ new->flags) & IRQF_MISMATCH))
Index: linux-pm/kernel/irq/handle.c
===================================================================
--- linux-pm.orig/kernel/irq/handle.c
+++ linux-pm/kernel/irq/handle.c
@@ -131,6 +131,23 @@  void __irq_wake_thread(struct irq_desc *
 }
 
 irqreturn_t
+do_irqaction(struct irq_desc *desc, struct irqaction *action,
+	     unsigned int irq, void *dev_id)
+{
+	irqreturn_t ret;
+
+	if (unlikely((desc->istate & IRQS_SUSPENDED) &&
+		     !(action->flags & IRQF_NO_SUSPEND)))
+		return IRQ_NONE;
+
+	trace_irq_handler_entry(irq, action);
+	ret = action->handler(irq, dev_id);
+	trace_irq_handler_exit(irq, action, ret);
+
+	return ret;
+}
+
+irqreturn_t
 handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
 {
 	irqreturn_t retval = IRQ_NONE;
@@ -139,9 +156,7 @@  handle_irq_event_percpu(struct irq_desc
 	do {
 		irqreturn_t res;
 
-		trace_irq_handler_entry(irq, action);
-		res = action->handler(irq, action->dev_id);
-		trace_irq_handler_exit(irq, action, res);
+		res = do_irqaction(desc, action, irq, action->dev_id);
 
 		if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pF enabled interrupts\n",
 			      irq, action->handler))