diff mbox

[RFC,v6,3/3] arm: fix a migrating irq bug when hotplug cpu

Message ID 1443087135-17044-4-git-send-email-yangyingliang@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Yang Yingliang Sept. 24, 2015, 9:32 a.m. UTC
When cpu is disabled, all irqs will be migratged to another cpu.
In some cases, a new affinity is different, the old affinity need
to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
the old affinity can not be updated. Fix it by using irq_do_set_affinity.

And migrating interrupts is a core code matter, so use the generic
function irq_migrate_all_off_this_cpu() to migrate interrupts in
kernel/irq/migration.c.

Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 arch/arm/Kconfig           |  1 +
 arch/arm/include/asm/irq.h |  1 -
 arch/arm/kernel/irq.c      | 62 ----------------------------------------------
 arch/arm/kernel/smp.c      |  2 +-
 4 files changed, 2 insertions(+), 64 deletions(-)

Comments

Russell King - ARM Linux Oct. 13, 2015, 7:28 p.m. UTC | #1
On Thu, Sep 24, 2015 at 05:32:15PM +0800, Yang Yingliang wrote:
> When cpu is disabled, all irqs will be migratged to another cpu.
> In some cases, a new affinity is different, the old affinity need
> to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
> the old affinity can not be updated. Fix it by using irq_do_set_affinity.
> 
> And migrating interrupts is a core code matter, so use the generic
> function irq_migrate_all_off_this_cpu() to migrate interrupts in
> kernel/irq/migration.c.

Please can you put this patch in the patch system so I don't forget it
(it's taken tglx prompting and quite a long time to find this in my
mailbox already.)

See http:/www.arm.linux.org.uk/developer/patches/

Thanks.
Geert Uytterhoeven Oct. 21, 2015, 11:47 a.m. UTC | #2
On Thu, Sep 24, 2015 at 11:32 AM, Yang Yingliang
<yangyingliang@huawei.com> wrote:
> When cpu is disabled, all irqs will be migratged to another cpu.
> In some cases, a new affinity is different, the old affinity need
> to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
> the old affinity can not be updated. Fix it by using irq_do_set_affinity.
>
> And migrating interrupts is a core code matter, so use the generic
> function irq_migrate_all_off_this_cpu() to migrate interrupts in
> kernel/irq/migration.c.
>
> Cc: Jiang Liu <jiang.liu@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
> Cc: Hanjun Guo <hanjun.guo@linaro.org>
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
>  arch/arm/Kconfig           |  1 +
>  arch/arm/include/asm/irq.h |  1 -
>  arch/arm/kernel/irq.c      | 62 ----------------------------------------------
>  arch/arm/kernel/smp.c      |  2 +-
>  4 files changed, 2 insertions(+), 64 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 72ad724..bffba78 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1492,6 +1492,7 @@ config NR_CPUS
>  config HOTPLUG_CPU
>         bool "Support for hot-pluggable CPUs"
>         depends on SMP
> +       select GENERIC_IRQ_MIGRATION

This causes the following warnings during s2ram on r8a7791/koelsch
(dual-core CA15):

 Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done.
 PM: Suspending system (mem)
 Suspending console(s) (use no_console_suspend to debug)
 PM: suspend of devices complete after 56.533 msecs
 PM: late suspend of devices complete after 21.229 msecs
 PM: noirq suspend of devices complete after 21.093 msecs
 Disabling non-boot CPUs ...
+IRQ0: unable to set affinity
+IRQ1: unable to set affinity
+IRQ2: unable to set affinity
+IRQ3: unable to set affinity
+IRQ4: unable to set affinity
+IRQ5: unable to set affinity
+IRQ6: unable to set affinity
+IRQ7: unable to set affinity
+IRQ8: unable to set affinity
+IRQ9: unable to set affinity
 CPU1: shutdown
 Enabling non-boot CPUs ...
 CPU1 is up
 PM: noirq resume of devices complete after 13.605 msecs
 PM: early resume of devices complete after 15.177 msecs

/proc/interrupts doesn't have IRQ[0-9].

Printing the irqchip's name and removing the rate limiting reveals:

chip none IRQ0: unable to set affinity
chip none IRQ1: unable to set affinity
chip none IRQ2: unable to set affinity
chip none IRQ3: unable to set affinity
chip none IRQ4: unable to set affinity
chip none IRQ5: unable to set affinity
chip none IRQ6: unable to set affinity
chip none IRQ7: unable to set affinity
chip none IRQ8: unable to set affinity
chip none IRQ9: unable to set affinity
chip none IRQ10: unable to set affinity
chip none IRQ11: unable to set affinity
chip none IRQ12: unable to set affinity
chip none IRQ13: unable to set affinity
chip none IRQ14: unable to set affinity
chip none IRQ15: unable to set affinity
chip e6050000.gpio IRQ128: unable to set affinity
chip e6050000.gpio IRQ129: unable to set affinity
chip e6050000.gpio IRQ130: unable to set affinity
chip e6050000.gpio IRQ131: unable to set affinity
chip e6050000.gpio IRQ132: unable to set affinity
chip e6050000.gpio IRQ133: unable to set affinity
chip e6050000.gpio IRQ134: unable to set affinity
chip e6050000.gpio IRQ135: unable to set affinity
chip e6050000.gpio IRQ136: unable to set affinity
chip e6050000.gpio IRQ137: unable to set affinity
chip e6050000.gpio IRQ138: unable to set affinity
chip e6050000.gpio IRQ139: unable to set affinity
chip e6050000.gpio IRQ140: unable to set affinity
chip e6050000.gpio IRQ141: unable to set affinity
chip e6050000.gpio IRQ142: unable to set affinity
chip e6050000.gpio IRQ143: unable to set affinity
chip e6050000.gpio IRQ144: unable to set affinity
chip e6050000.gpio IRQ145: unable to set affinity
chip e6050000.gpio IRQ146: unable to set affinity
chip e6050000.gpio IRQ147: unable to set affinity
chip e6050000.gpio IRQ148: unable to set affinity
chip e6050000.gpio IRQ149: unable to set affinity
chip e6050000.gpio IRQ150: unable to set affinity
chip e6050000.gpio IRQ151: unable to set affinity
chip e6050000.gpio IRQ152: unable to set affinity
chip e6050000.gpio IRQ153: unable to set affinity
chip e6050000.gpio IRQ154: unable to set affinity
chip e6050000.gpio IRQ155: unable to set affinity
chip e6050000.gpio IRQ156: unable to set affinity
chip e6050000.gpio IRQ157: unable to set affinity
chip e6050000.gpio IRQ158: unable to set affinity
chip e6050000.gpio IRQ159: unable to set affinity
chip e6051000.gpio IRQ160: unable to set affinity
chip e6051000.gpio IRQ161: unable to set affinity
chip e6051000.gpio IRQ162: unable to set affinity
chip e6051000.gpio IRQ163: unable to set affinity
chip e6051000.gpio IRQ164: unable to set affinity
chip e6051000.gpio IRQ165: unable to set affinity
chip e6051000.gpio IRQ166: unable to set affinity
chip e6051000.gpio IRQ167: unable to set affinity
chip e6051000.gpio IRQ168: unable to set affinity
chip e6051000.gpio IRQ169: unable to set affinity
chip e6051000.gpio IRQ170: unable to set affinity
chip e6051000.gpio IRQ171: unable to set affinity
chip e6051000.gpio IRQ172: unable to set affinity
chip e6051000.gpio IRQ173: unable to set affinity
chip e6051000.gpio IRQ174: unable to set affinity
chip e6051000.gpio IRQ175: unable to set affinity
chip e6051000.gpio IRQ176: unable to set affinity
chip e6051000.gpio IRQ177: unable to set affinity
chip e6051000.gpio IRQ178: unable to set affinity
chip e6051000.gpio IRQ179: unable to set affinity
chip e6051000.gpio IRQ180: unable to set affinity
chip e6051000.gpio IRQ181: unable to set affinity
chip e6051000.gpio IRQ182: unable to set affinity
chip e6051000.gpio IRQ183: unable to set affinity
chip e6051000.gpio IRQ184: unable to set affinity
chip e6051000.gpio IRQ185: unable to set affinity
chip e6051000.gpio IRQ186: unable to set affinity
chip e6051000.gpio IRQ187: unable to set affinity
chip e6051000.gpio IRQ188: unable to set affinity
chip e6051000.gpio IRQ189: unable to set affinity
chip e6051000.gpio IRQ190: unable to set affinity
chip e6051000.gpio IRQ191: unable to set affinity
chip e6052000.gpio IRQ192: unable to set affinity
chip e6052000.gpio IRQ193: unable to set affinity
chip e6052000.gpio IRQ194: unable to set affinity
chip e6052000.gpio IRQ195: unable to set affinity
chip e6052000.gpio IRQ196: unable to set affinity
chip e6052000.gpio IRQ197: unable to set affinity
chip e6052000.gpio IRQ198: unable to set affinity
chip e6052000.gpio IRQ199: unable to set affinity
chip e6052000.gpio IRQ200: unable to set affinity
chip e6052000.gpio IRQ201: unable to set affinity
chip e6052000.gpio IRQ202: unable to set affinity
chip e6052000.gpio IRQ203: unable to set affinity
chip e6052000.gpio IRQ204: unable to set affinity
chip e6052000.gpio IRQ205: unable to set affinity
chip e6052000.gpio IRQ206: unable to set affinity
chip e6052000.gpio IRQ207: unable to set affinity
chip e6052000.gpio IRQ208: unable to set affinity
chip e6052000.gpio IRQ209: unable to set affinity
chip e6052000.gpio IRQ210: unable to set affinity
chip e6052000.gpio IRQ211: unable to set affinity
chip e6052000.gpio IRQ212: unable to set affinity
chip e6052000.gpio IRQ213: unable to set affinity
chip e6052000.gpio IRQ214: unable to set affinity
chip e6052000.gpio IRQ215: unable to set affinity
chip e6052000.gpio IRQ216: unable to set affinity
chip e6052000.gpio IRQ217: unable to set affinity
chip e6052000.gpio IRQ218: unable to set affinity
chip e6052000.gpio IRQ219: unable to set affinity
chip e6052000.gpio IRQ220: unable to set affinity
chip e6052000.gpio IRQ221: unable to set affinity
chip e6052000.gpio IRQ222: unable to set affinity
chip e6052000.gpio IRQ223: unable to set affinity
chip e6053000.gpio IRQ224: unable to set affinity
chip e6053000.gpio IRQ225: unable to set affinity
chip e6053000.gpio IRQ226: unable to set affinity
chip e6053000.gpio IRQ227: unable to set affinity
chip e6053000.gpio IRQ228: unable to set affinity
chip e6053000.gpio IRQ229: unable to set affinity
chip e6053000.gpio IRQ230: unable to set affinity
chip e6053000.gpio IRQ231: unable to set affinity
chip e6053000.gpio IRQ232: unable to set affinity
chip e6053000.gpio IRQ233: unable to set affinity
chip e6053000.gpio IRQ234: unable to set affinity
chip e6053000.gpio IRQ235: unable to set affinity
chip e6053000.gpio IRQ236: unable to set affinity
chip e6053000.gpio IRQ237: unable to set affinity
chip e6053000.gpio IRQ238: unable to set affinity
chip e6053000.gpio IRQ239: unable to set affinity
chip e6053000.gpio IRQ240: unable to set affinity
chip e6053000.gpio IRQ241: unable to set affinity
chip e6053000.gpio IRQ242: unable to set affinity
chip e6053000.gpio IRQ243: unable to set affinity
chip e6053000.gpio IRQ244: unable to set affinity
chip e6053000.gpio IRQ245: unable to set affinity
chip e6053000.gpio IRQ246: unable to set affinity
chip e6053000.gpio IRQ247: unable to set affinity
chip e6053000.gpio IRQ248: unable to set affinity
chip e6053000.gpio IRQ249: unable to set affinity
chip e6053000.gpio IRQ250: unable to set affinity
chip e6053000.gpio IRQ251: unable to set affinity
chip e6053000.gpio IRQ252: unable to set affinity
chip e6053000.gpio IRQ253: unable to set affinity
chip e6053000.gpio IRQ254: unable to set affinity
chip e6053000.gpio IRQ255: unable to set affinity
chip e6054000.gpio IRQ256: unable to set affinity
chip e6054000.gpio IRQ257: unable to set affinity
chip e6054000.gpio IRQ258: unable to set affinity
chip e6054000.gpio IRQ259: unable to set affinity
chip e6054000.gpio IRQ260: unable to set affinity
chip e6054000.gpio IRQ261: unable to set affinity
chip e6054000.gpio IRQ262: unable to set affinity
chip e6054000.gpio IRQ263: unable to set affinity
chip e6054000.gpio IRQ264: unable to set affinity
chip e6054000.gpio IRQ265: unable to set affinity
chip e6054000.gpio IRQ266: unable to set affinity
chip e6054000.gpio IRQ267: unable to set affinity
chip e6054000.gpio IRQ268: unable to set affinity
chip e6054000.gpio IRQ269: unable to set affinity
chip e6054000.gpio IRQ270: unable to set affinity
chip e6054000.gpio IRQ271: unable to set affinity
chip e6054000.gpio IRQ272: unable to set affinity
chip e6054000.gpio IRQ273: unable to set affinity
chip e6054000.gpio IRQ274: unable to set affinity
chip e6054000.gpio IRQ275: unable to set affinity
chip e6054000.gpio IRQ276: unable to set affinity
chip e6054000.gpio IRQ277: unable to set affinity
chip e6054000.gpio IRQ278: unable to set affinity
chip e6054000.gpio IRQ279: unable to set affinity
chip e6054000.gpio IRQ280: unable to set affinity
chip e6054000.gpio IRQ281: unable to set affinity
chip e6054000.gpio IRQ282: unable to set affinity
chip e6054000.gpio IRQ283: unable to set affinity
chip e6054000.gpio IRQ284: unable to set affinity
chip e6054000.gpio IRQ285: unable to set affinity
chip e6054000.gpio IRQ286: unable to set affinity
chip e6054000.gpio IRQ287: unable to set affinity
chip e6055000.gpio IRQ288: unable to set affinity
chip e6055000.gpio IRQ289: unable to set affinity
chip e6055000.gpio IRQ290: unable to set affinity
chip e6055000.gpio IRQ291: unable to set affinity
chip e6055000.gpio IRQ292: unable to set affinity
chip e6055000.gpio IRQ293: unable to set affinity
chip e6055000.gpio IRQ294: unable to set affinity
chip e6055000.gpio IRQ295: unable to set affinity
chip e6055000.gpio IRQ296: unable to set affinity
chip e6055000.gpio IRQ297: unable to set affinity
chip e6055000.gpio IRQ298: unable to set affinity
chip e6055000.gpio IRQ299: unable to set affinity
chip e6055000.gpio IRQ300: unable to set affinity
chip e6055000.gpio IRQ301: unable to set affinity
chip e6055000.gpio IRQ302: unable to set affinity
chip e6055000.gpio IRQ303: unable to set affinity
chip e6055000.gpio IRQ304: unable to set affinity
chip e6055000.gpio IRQ305: unable to set affinity
chip e6055000.gpio IRQ306: unable to set affinity
chip e6055000.gpio IRQ307: unable to set affinity
chip e6055000.gpio IRQ308: unable to set affinity
chip e6055000.gpio IRQ309: unable to set affinity
chip e6055000.gpio IRQ310: unable to set affinity
chip e6055000.gpio IRQ311: unable to set affinity
chip e6055000.gpio IRQ312: unable to set affinity
chip e6055000.gpio IRQ313: unable to set affinity
chip e6055000.gpio IRQ314: unable to set affinity
chip e6055000.gpio IRQ315: unable to set affinity
chip e6055000.gpio IRQ316: unable to set affinity
chip e6055000.gpio IRQ317: unable to set affinity
chip e6055000.gpio IRQ318: unable to set affinity
chip e6055000.gpio IRQ319: unable to set affinity
chip e6055400.gpio IRQ320: unable to set affinity
chip e6055400.gpio IRQ321: unable to set affinity
chip e6055400.gpio IRQ322: unable to set affinity
chip e6055400.gpio IRQ323: unable to set affinity
chip e6055400.gpio IRQ324: unable to set affinity
chip e6055400.gpio IRQ325: unable to set affinity
chip e6055400.gpio IRQ326: unable to set affinity
chip e6055400.gpio IRQ327: unable to set affinity
chip e6055400.gpio IRQ328: unable to set affinity
chip e6055400.gpio IRQ329: unable to set affinity
chip e6055400.gpio IRQ330: unable to set affinity
chip e6055400.gpio IRQ331: unable to set affinity
chip e6055400.gpio IRQ332: unable to set affinity
chip e6055400.gpio IRQ333: unable to set affinity
chip e6055400.gpio IRQ334: unable to set affinity
chip e6055400.gpio IRQ335: unable to set affinity
chip e6055400.gpio IRQ336: unable to set affinity
chip e6055400.gpio IRQ337: unable to set affinity
chip e6055400.gpio IRQ338: unable to set affinity
chip e6055400.gpio IRQ339: unable to set affinity
chip e6055400.gpio IRQ340: unable to set affinity
chip e6055400.gpio IRQ341: unable to set affinity
chip e6055400.gpio IRQ342: unable to set affinity
chip e6055400.gpio IRQ343: unable to set affinity
chip e6055400.gpio IRQ344: unable to set affinity
chip e6055400.gpio IRQ345: unable to set affinity
chip e6055400.gpio IRQ346: unable to set affinity
chip e6055400.gpio IRQ347: unable to set affinity
chip e6055400.gpio IRQ348: unable to set affinity
chip e6055400.gpio IRQ349: unable to set affinity
chip e6055400.gpio IRQ350: unable to set affinity
chip e6055400.gpio IRQ351: unable to set affinity
chip e6055800.gpio IRQ352: unable to set affinity
chip e6055800.gpio IRQ353: unable to set affinity
chip e6055800.gpio IRQ354: unable to set affinity
chip e6055800.gpio IRQ355: unable to set affinity
chip e6055800.gpio IRQ356: unable to set affinity
chip e6055800.gpio IRQ357: unable to set affinity
chip e6055800.gpio IRQ358: unable to set affinity
chip e6055800.gpio IRQ359: unable to set affinity
chip e6055800.gpio IRQ360: unable to set affinity
chip e6055800.gpio IRQ361: unable to set affinity
chip e6055800.gpio IRQ362: unable to set affinity
chip e6055800.gpio IRQ363: unable to set affinity
chip e6055800.gpio IRQ364: unable to set affinity
chip e6055800.gpio IRQ365: unable to set affinity
chip e6055800.gpio IRQ366: unable to set affinity
chip e6055800.gpio IRQ367: unable to set affinity
chip e6055800.gpio IRQ368: unable to set affinity
chip e6055800.gpio IRQ369: unable to set affinity
chip e6055800.gpio IRQ370: unable to set affinity
chip e6055800.gpio IRQ371: unable to set affinity
chip e6055800.gpio IRQ372: unable to set affinity
chip e6055800.gpio IRQ373: unable to set affinity
chip e6055800.gpio IRQ374: unable to set affinity
chip e6055800.gpio IRQ375: unable to set affinity
chip e6055800.gpio IRQ376: unable to set affinity
chip e6055800.gpio IRQ377: unable to set affinity
chip e61c0000.interrupt-controller IRQ378: unable to set affinity
chip da9063-irq IRQ379: unable to set affinity
chip da9063-irq IRQ380: unable to set affinity
chip da9063-irq IRQ381: unable to set affinity
chip da9063-irq IRQ382: unable to set affinity
chip da9063-irq IRQ383: unable to set affinity
chip da9063-irq IRQ384: unable to set affinity
chip da9063-irq IRQ385: unable to set affinity
chip da9063-irq IRQ386: unable to set affinity
chip da9063-irq IRQ387: unable to set affinity
chip da9063-irq IRQ388: unable to set affinity
chip da9063-irq IRQ389: unable to set affinity
chip da9063-irq IRQ390: unable to set affinity
chip da9063-irq IRQ391: unable to set affinity
chip da9063-irq IRQ392: unable to set affinity
chip da9063-irq IRQ393: unable to set affinity
chip da9063-irq IRQ394: unable to set affinity
chip da9063-irq IRQ395: unable to set affinity
chip da9063-irq IRQ396: unable to set affinity
chip da9063-irq IRQ397: unable to set affinity
chip da9063-irq IRQ398: unable to set affinity
chip da9063-irq IRQ399: unable to set affinity
chip da9063-irq IRQ400: unable to set affinity
chip da9063-irq IRQ401: unable to set affinity
chip da9063-irq IRQ402: unable to set affinity
chip da9063-irq IRQ403: unable to set affinity
chip da9063-irq IRQ404: unable to set affinity
chip da9063-irq IRQ405: unable to set affinity
chip da9063-irq IRQ406: unable to set affinity
chip da9063-irq IRQ407: unable to set affinity
chip e61c0000.interrupt-controller IRQ408: unable to set affinity

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Russell King - ARM Linux Oct. 21, 2015, 8:29 p.m. UTC | #3
On Wed, Oct 21, 2015 at 01:47:49PM +0200, Geert Uytterhoeven wrote:
> On Thu, Sep 24, 2015 at 11:32 AM, Yang Yingliang
> <yangyingliang@huawei.com> wrote:
> > When cpu is disabled, all irqs will be migratged to another cpu.
> > In some cases, a new affinity is different, the old affinity need
> > to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
> > the old affinity can not be updated. Fix it by using irq_do_set_affinity.
> >
> > And migrating interrupts is a core code matter, so use the generic
> > function irq_migrate_all_off_this_cpu() to migrate interrupts in
> > kernel/irq/migration.c.
> >
> > Cc: Jiang Liu <jiang.liu@linux.intel.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Marc Zyngier <marc.zyngier@arm.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
> > Cc: Hanjun Guo <hanjun.guo@linaro.org>
> > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> > ---
> >  arch/arm/Kconfig           |  1 +
> >  arch/arm/include/asm/irq.h |  1 -
> >  arch/arm/kernel/irq.c      | 62 ----------------------------------------------
> >  arch/arm/kernel/smp.c      |  2 +-
> >  4 files changed, 2 insertions(+), 64 deletions(-)
> >
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 72ad724..bffba78 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -1492,6 +1492,7 @@ config NR_CPUS
> >  config HOTPLUG_CPU
> >         bool "Support for hot-pluggable CPUs"
> >         depends on SMP
> > +       select GENERIC_IRQ_MIGRATION
> 
> This causes the following warnings during s2ram on r8a7791/koelsch
> (dual-core CA15):

Thanks for the report.  I'll see what tonight's boot run says for my
platforms.  Hopefully, the author of these changes can help debug
this.
Russell King - ARM Linux Oct. 22, 2015, 9:26 a.m. UTC | #4
On Wed, Oct 21, 2015 at 09:29:08PM +0100, Russell King - ARM Linux wrote:
> On Wed, Oct 21, 2015 at 01:47:49PM +0200, Geert Uytterhoeven wrote:
> > On Thu, Sep 24, 2015 at 11:32 AM, Yang Yingliang
> > <yangyingliang@huawei.com> wrote:
> > > When cpu is disabled, all irqs will be migratged to another cpu.
> > > In some cases, a new affinity is different, the old affinity need
> > > to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
> > > the old affinity can not be updated. Fix it by using irq_do_set_affinity.
> > >
> > > And migrating interrupts is a core code matter, so use the generic
> > > function irq_migrate_all_off_this_cpu() to migrate interrupts in
> > > kernel/irq/migration.c.
> > >
> > > Cc: Jiang Liu <jiang.liu@linux.intel.com>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Marc Zyngier <marc.zyngier@arm.com>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Will Deacon <will.deacon@arm.com>
> > > Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
> > > Cc: Hanjun Guo <hanjun.guo@linaro.org>
> > > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> > > ---
> > >  arch/arm/Kconfig           |  1 +
> > >  arch/arm/include/asm/irq.h |  1 -
> > >  arch/arm/kernel/irq.c      | 62 ----------------------------------------------
> > >  arch/arm/kernel/smp.c      |  2 +-
> > >  4 files changed, 2 insertions(+), 64 deletions(-)
> > >
> > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > > index 72ad724..bffba78 100644
> > > --- a/arch/arm/Kconfig
> > > +++ b/arch/arm/Kconfig
> > > @@ -1492,6 +1492,7 @@ config NR_CPUS
> > >  config HOTPLUG_CPU
> > >         bool "Support for hot-pluggable CPUs"
> > >         depends on SMP
> > > +       select GENERIC_IRQ_MIGRATION
> > 
> > This causes the following warnings during s2ram on r8a7791/koelsch
> > (dual-core CA15):
> 
> Thanks for the report.  I'll see what tonight's boot run says for my
> platforms.  Hopefully, the author of these changes can help debug
> this.

What's happened is that:

-	c = irq_data_get_irq_chip(d);
-	if (!c->irq_set_affinity)
-		pr_debug("IRQ%u: unable to set affinity\n", d->irq);

has become:

+	c = irq_data_get_irq_chip(d);
+	if (!c->irq_set_affinity) {
+		pr_warn_ratelimited("IRQ%u: unable to set affinity\n", d->irq);

which makes things more noisy.

This is a change that was not described in the commit message for the
patch Thomas merged.

So, I think the right thing to do is to drop the patch set and revert
back to what we knew was sane, and get the submitter to do the job
properly: cleanly move the code from one location to another with _no_
changes what so ever, convert ARM and ARM64 to use it, and _then_ to
modify the resulting code.

From what I can see, both ARM and ARM64 implementations here are
identical.

I'm certainly dropping this from the ARM tree.
Geert Uytterhoeven Oct. 22, 2015, 9:46 a.m. UTC | #5
Hi Russell,

On Thu, Oct 22, 2015 at 11:26 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Oct 21, 2015 at 09:29:08PM +0100, Russell King - ARM Linux wrote:
>> On Wed, Oct 21, 2015 at 01:47:49PM +0200, Geert Uytterhoeven wrote:
>> > On Thu, Sep 24, 2015 at 11:32 AM, Yang Yingliang
>> > <yangyingliang@huawei.com> wrote:
>> > > When cpu is disabled, all irqs will be migratged to another cpu.
>> > > In some cases, a new affinity is different, the old affinity need
>> > > to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
>> > > the old affinity can not be updated. Fix it by using irq_do_set_affinity.
>> > >
>> > > And migrating interrupts is a core code matter, so use the generic
>> > > function irq_migrate_all_off_this_cpu() to migrate interrupts in
>> > > kernel/irq/migration.c.
>> > >
>> > > Cc: Jiang Liu <jiang.liu@linux.intel.com>
>> > > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > > Cc: Marc Zyngier <marc.zyngier@arm.com>
>> > > Cc: Mark Rutland <mark.rutland@arm.com>
>> > > Cc: Will Deacon <will.deacon@arm.com>
>> > > Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
>> > > Cc: Hanjun Guo <hanjun.guo@linaro.org>
>> > > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>> > > ---
>> > >  arch/arm/Kconfig           |  1 +
>> > >  arch/arm/include/asm/irq.h |  1 -
>> > >  arch/arm/kernel/irq.c      | 62 ----------------------------------------------
>> > >  arch/arm/kernel/smp.c      |  2 +-
>> > >  4 files changed, 2 insertions(+), 64 deletions(-)
>> > >
>> > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> > > index 72ad724..bffba78 100644
>> > > --- a/arch/arm/Kconfig
>> > > +++ b/arch/arm/Kconfig
>> > > @@ -1492,6 +1492,7 @@ config NR_CPUS
>> > >  config HOTPLUG_CPU
>> > >         bool "Support for hot-pluggable CPUs"
>> > >         depends on SMP
>> > > +       select GENERIC_IRQ_MIGRATION
>> >
>> > This causes the following warnings during s2ram on r8a7791/koelsch
>> > (dual-core CA15):
>>
>> Thanks for the report.  I'll see what tonight's boot run says for my
>> platforms.  Hopefully, the author of these changes can help debug
>> this.
>
> What's happened is that:
>
> -       c = irq_data_get_irq_chip(d);
> -       if (!c->irq_set_affinity)
> -               pr_debug("IRQ%u: unable to set affinity\n", d->irq);
>
> has become:
>
> +       c = irq_data_get_irq_chip(d);
> +       if (!c->irq_set_affinity) {
> +               pr_warn_ratelimited("IRQ%u: unable to set affinity\n", d->irq);
>
> which makes things more noisy.
>
> This is a change that was not described in the commit message for the
> patch Thomas merged.
>
> So, I think the right thing to do is to drop the patch set and revert
> back to what we knew was sane, and get the submitter to do the job
> properly: cleanly move the code from one location to another with _no_
> changes what so ever, convert ARM and ARM64 to use it, and _then_ to
> modify the resulting code.
>
> From what I can see, both ARM and ARM64 implementations here are
> identical.
>
> I'm certainly dropping this from the ARM tree.

Thanks for your analysis!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Yang Yingliang Oct. 22, 2015, 10:56 a.m. UTC | #6
On 2015/10/22 17:26, Russell King - ARM Linux wrote:
> On Wed, Oct 21, 2015 at 09:29:08PM +0100, Russell King - ARM Linux wrote:
>> On Wed, Oct 21, 2015 at 01:47:49PM +0200, Geert Uytterhoeven wrote:
>>> On Thu, Sep 24, 2015 at 11:32 AM, Yang Yingliang
>>> <yangyingliang@huawei.com> wrote:
>>>> When cpu is disabled, all irqs will be migratged to another cpu.
>>>> In some cases, a new affinity is different, the old affinity need
>>>> to be updated and if irq_set_affinity's return value is IRQ_SET_MASK_OK_DONE,
>>>> the old affinity can not be updated. Fix it by using irq_do_set_affinity.
>>>>
>>>> And migrating interrupts is a core code matter, so use the generic
>>>> function irq_migrate_all_off_this_cpu() to migrate interrupts in
>>>> kernel/irq/migration.c.
>>>>
>>>> Cc: Jiang Liu <jiang.liu@linux.intel.com>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
>>>> Cc: Hanjun Guo <hanjun.guo@linaro.org>
>>>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
>>>> ---
>>>>   arch/arm/Kconfig           |  1 +
>>>>   arch/arm/include/asm/irq.h |  1 -
>>>>   arch/arm/kernel/irq.c      | 62 ----------------------------------------------
>>>>   arch/arm/kernel/smp.c      |  2 +-
>>>>   4 files changed, 2 insertions(+), 64 deletions(-)
>>>>
>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>>>> index 72ad724..bffba78 100644
>>>> --- a/arch/arm/Kconfig
>>>> +++ b/arch/arm/Kconfig
>>>> @@ -1492,6 +1492,7 @@ config NR_CPUS
>>>>   config HOTPLUG_CPU
>>>>          bool "Support for hot-pluggable CPUs"
>>>>          depends on SMP
>>>> +       select GENERIC_IRQ_MIGRATION
>>>
>>> This causes the following warnings during s2ram on r8a7791/koelsch
>>> (dual-core CA15):
>>
>> Thanks for the report.  I'll see what tonight's boot run says for my
>> platforms.  Hopefully, the author of these changes can help debug
>> this.
>
> What's happened is that:
>
> -	c = irq_data_get_irq_chip(d);
> -	if (!c->irq_set_affinity)
> -		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
>
> has become:
>
> +	c = irq_data_get_irq_chip(d);
> +	if (!c->irq_set_affinity) {
> +		pr_warn_ratelimited("IRQ%u: unable to set affinity\n", d->irq);
>
> which makes things more noisy.
>
> This is a change that was not described in the commit message for the
> patch Thomas merged.

I described it in v2 cover letter and kept the change history in v6
cover letter. There is no comment on the change when patch the was
reviewing in v2, so I thought it's ok and I kept the change in the
next versions.

Need I send a patch to the Thomas branch to revert the change ?

Thanks,
Yang
Russell King - ARM Linux Oct. 22, 2015, 11:13 a.m. UTC | #7
On Thu, Oct 22, 2015 at 06:56:29PM +0800, Yang Yingliang wrote:
> I described it in v2 cover letter and kept the change history in v6
> cover letter. There is no comment on the change when patch the was
> reviewing in v2, so I thought it's ok and I kept the change in the
> next versions.

Cover letters don't always get read, neither do changelogs.

However, there's a principle here: never mix moving code around with
changes to that code.  Always move code with as few changes as possible
in one patch, and then make changes in a subsequent patch.

The "few changes as possible" means that if you need to make changes
for it to end up building in its new location, such as removing a
'static' or adding an 'EXPORT_SYMBOL' then those are fine, but the
main body of the code should remain identical, even down to style.

Any changes (such as, in this case, replacing pr_debug with pr_warn)
should be done as a distinctly separate patch so that such changes
are immediately obvious to reviewers.

> Need I send a patch to the Thomas branch to revert the change ?

I think wait for Thomas and Catalin to reply.  Your patch series is
currently merged into two different trees (Thomas' and Catalin's
trees) and what action is needed depends on how they want to handle
it.

The solutions are:
* A patch to restore the pr_debug() which Thomas applies, and Catalin
  and myself then pull Thomas' tree again, which potentially creates
  a messier history.

* Catalin drops the ARM64 change and Thomas' tree from the ARM64 tree,
  Thomas drops the original commit, and we start again doing it
  correctly.

Which is up to Catalin and Thomas.

I've dropped it from my tree as an easy way to fix the regression
on ARM for the time being, pending the outcome of deciding how to
fix this.
Thomas Gleixner Oct. 22, 2015, 12:43 p.m. UTC | #8
On Thu, 22 Oct 2015, Russell King - ARM Linux wrote:
> On Thu, Oct 22, 2015 at 06:56:29PM +0800, Yang Yingliang wrote:
> > I described it in v2 cover letter and kept the change history in v6
> > cover letter. There is no comment on the change when patch the was
> > reviewing in v2, so I thought it's ok and I kept the change in the
> > next versions.
> 
> Cover letters don't always get read, neither do changelogs.
> 
> However, there's a principle here: never mix moving code around with
> changes to that code.  Always move code with as few changes as possible
> in one patch, and then make changes in a subsequent patch.
> 
> The "few changes as possible" means that if you need to make changes
> for it to end up building in its new location, such as removing a
> 'static' or adding an 'EXPORT_SYMBOL' then those are fine, but the
> main body of the code should remain identical, even down to style.
> 
> Any changes (such as, in this case, replacing pr_debug with pr_warn)
> should be done as a distinctly separate patch so that such changes
> are immediately obvious to reviewers.
> 
> > Need I send a patch to the Thomas branch to revert the change ?
> 
> I think wait for Thomas and Catalin to reply.  Your patch series is
> currently merged into two different trees (Thomas' and Catalin's
> trees) and what action is needed depends on how they want to handle
> it.
> 
> The solutions are:
> * A patch to restore the pr_debug() which Thomas applies, and Catalin
>   and myself then pull Thomas' tree again, which potentially creates
>   a messier history.
> 
> * Catalin drops the ARM64 change and Thomas' tree from the ARM64 tree,
>   Thomas drops the original commit, and we start again doing it
>   correctly.
> 
> Which is up to Catalin and Thomas.

I'd have to do a revert as it's in the middle of other changes. So I
prefer to do an incremental fix.

I committed the change into the irq/for-arm branch and pushed it out.

Thanks,

	tglx
Catalin Marinas Oct. 22, 2015, 4:34 p.m. UTC | #9
On Thu, Oct 22, 2015 at 02:43:34PM +0200, Thomas Gleixner wrote:
> On Thu, 22 Oct 2015, Russell King - ARM Linux wrote:
> > The solutions are:
> > * A patch to restore the pr_debug() which Thomas applies, and Catalin
> >   and myself then pull Thomas' tree again, which potentially creates
> >   a messier history.
> > 
> > * Catalin drops the ARM64 change and Thomas' tree from the ARM64 tree,
> >   Thomas drops the original commit, and we start again doing it
> >   correctly.
> > 
> > Which is up to Catalin and Thomas.
> 
> I'd have to do a revert as it's in the middle of other changes. So I
> prefer to do an incremental fix.
> 
> I committed the change into the irq/for-arm branch and pushed it out.

I pulled the branch again into arm64 for-next/core. Thanks.
Yang Yingliang Dec. 15, 2015, 6:59 a.m. UTC | #10
Hi, Russell

On 2015/10/22 19:13, Russell King - ARM Linux wrote:
> On Thu, Oct 22, 2015 at 06:56:29PM +0800, Yang Yingliang wrote:
>> I described it in v2 cover letter and kept the change history in v6
>> cover letter. There is no comment on the change when patch the was
>> reviewing in v2, so I thought it's ok and I kept the change in the
>> next versions.
>
> Cover letters don't always get read, neither do changelogs.
>
> However, there's a principle here: never mix moving code around with
> changes to that code.  Always move code with as few changes as possible
> in one patch, and then make changes in a subsequent patch.
>
> The "few changes as possible" means that if you need to make changes
> for it to end up building in its new location, such as removing a
> 'static' or adding an 'EXPORT_SYMBOL' then those are fine, but the
> main body of the code should remain identical, even down to style.
>
> Any changes (such as, in this case, replacing pr_debug with pr_warn)
> should be done as a distinctly separate patch so that such changes
> are immediately obvious to reviewers.
>
>> Need I send a patch to the Thomas branch to revert the change ?
>
> I think wait for Thomas and Catalin to reply.  Your patch series is
> currently merged into two different trees (Thomas' and Catalin's
> trees) and what action is needed depends on how they want to handle
> it.
>
> The solutions are:
> * A patch to restore the pr_debug() which Thomas applies, and Catalin
>    and myself then pull Thomas' tree again, which potentially creates
>    a messier history.
>
> * Catalin drops the ARM64 change and Thomas' tree from the ARM64 tree,
>    Thomas drops the original commit, and we start again doing it
>    correctly.
>
> Which is up to Catalin and Thomas.
>
> I've dropped it from my tree as an easy way to fix the regression
> on ARM for the time being, pending the outcome of deciding how to
> fix this.
>

The regression had been fixed.
Do I need to put the patch in the patch system again ?

Regards,
Yang
diff mbox

Patch

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 72ad724..bffba78 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1492,6 +1492,7 @@  config NR_CPUS
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
+	select GENERIC_IRQ_MIGRATION
 	help
 	  Say Y here to experiment with turning CPUs off and on.  CPUs
 	  can be controlled through /sys/devices/system/cpu.
diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index be1d07d..882bf98 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -24,7 +24,6 @@ 
 #ifndef __ASSEMBLY__
 struct irqaction;
 struct pt_regs;
-extern void migrate_irqs(void);
 
 extern void asm_do_IRQ(unsigned int, struct pt_regs *);
 void handle_IRQ(unsigned int, struct pt_regs *);
diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
index 2766183..85299b7 100644
--- a/arch/arm/kernel/irq.c
+++ b/arch/arm/kernel/irq.c
@@ -31,7 +31,6 @@ 
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/seq_file.h>
-#include <linux/ratelimit.h>
 #include <linux/errno.h>
 #include <linux/list.h>
 #include <linux/kallsyms.h>
@@ -116,64 +115,3 @@  int __init arch_probe_nr_irqs(void)
 	return nr_irqs;
 }
 #endif
-
-#ifdef CONFIG_HOTPLUG_CPU
-static bool migrate_one_irq(struct irq_desc *desc)
-{
-	struct irq_data *d = irq_desc_get_irq_data(desc);
-	const struct cpumask *affinity = irq_data_get_affinity_mask(d);
-	struct irq_chip *c;
-	bool ret = false;
-
-	/*
-	 * If this is a per-CPU interrupt, or the affinity does not
-	 * include this CPU, then we have nothing to do.
-	 */
-	if (irqd_is_per_cpu(d) || !cpumask_test_cpu(smp_processor_id(), affinity))
-		return false;
-
-	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
-		affinity = cpu_online_mask;
-		ret = true;
-	}
-
-	c = irq_data_get_irq_chip(d);
-	if (!c->irq_set_affinity)
-		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
-	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
-		cpumask_copy(irq_data_get_affinity_mask(d), affinity);
-
-	return ret;
-}
-
-/*
- * The current CPU has been marked offline.  Migrate IRQs off this CPU.
- * If the affinity settings do not allow other CPUs, force them onto any
- * available CPU.
- *
- * Note: we must iterate over all IRQs, whether they have an attached
- * action structure or not, as we need to get chained interrupts too.
- */
-void migrate_irqs(void)
-{
-	unsigned int i;
-	struct irq_desc *desc;
-	unsigned long flags;
-
-	local_irq_save(flags);
-
-	for_each_irq_desc(i, desc) {
-		bool affinity_broken;
-
-		raw_spin_lock(&desc->lock);
-		affinity_broken = migrate_one_irq(desc);
-		raw_spin_unlock(&desc->lock);
-
-		if (affinity_broken)
-			pr_warn_ratelimited("IRQ%u no longer affine to CPU%u\n",
-				i, smp_processor_id());
-	}
-
-	local_irq_restore(flags);
-}
-#endif /* CONFIG_HOTPLUG_CPU */
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 48185a7..fc26384 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -218,7 +218,7 @@  int __cpu_disable(void)
 	/*
 	 * OK - migrate IRQs away from this CPU
 	 */
-	migrate_irqs();
+	irq_migrate_all_off_this_cpu();
 
 	/*
 	 * Flush user cache and TLB mappings, and then remove this CPU