Message ID | alpine.DEB.2.10.1408251136380.3323@nanos (mailing list archive) |
---|---|
State | New, archived |
Hi Thomas,

Thanks for your feedback.

On 25/08/14 10:55, Thomas Gleixner wrote:
> On Thu, 17 Jul 2014, Sudeep Holla wrote:
>>>> Any suggestions on this ? Since commit 01f8fa4f01d8 and ffde1de64012 are
>>>> now in
>>>> stable releases, CPU0 hotplug is broken there now.
>>>
>>> Maybe we should ask Thomas, as he's (a) the maintainer of the irqchip
>>> stuff, and (b) the author of the patch causing the breakage.
>>>
>>> From what I can see looking at the x86 code, the work-around in
>>> ffde1de64012 is wrong.
>>>
>>
>> Can provide your thoughts on how to solve this issue ?
>
> ffde1de64012 is not about offlining a cpu, it's about onlining where
> we need to make sure that we assign the affinity to a not yet online
> marked cpu.
>
>> Is it expected from all the irqchip implementation to use force flag in
>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>
> No, it's only relevant for the cases where we need to route irqs to
> not yet online cpus.
>

Ok. IIUC Russell's main concern was if irqchip implementation uses force
flag differently, then we can't change the core code to false. Also
x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c

Russell, any comments on this or are you fine with changing to false.

Regards,
Sudeep

> Now the wreckage of offlining was definitely not intended and I wonder
> why set_affinity() is called there with force = true. This was
> introduced in commit 1dbfa187dad. I acked it back then, but I have no
> idea why, because the force argument did not have any effect at that
> time.
>
> Changing it to false should solve the issue.
>
> Thanks,
>
> 	tglx
>
> diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
> index 2c4257604513..5c4d38e32a51 100644
> --- a/arch/arm/kernel/irq.c
> +++ b/arch/arm/kernel/irq.c
> @@ -175,7 +175,7 @@ static bool migrate_one_irq(struct irq_desc *desc)
>  	c = irq_data_get_irq_chip(d);
>  	if (!c->irq_set_affinity)
>  		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
> -	else if (c->irq_set_affinity(d, affinity, true) == IRQ_SET_MASK_OK && ret)
> +	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
>  		cpumask_copy(d->affinity, affinity);
>
>  	return ret;
>
On Tue, 26 Aug 2014, Sudeep Holla wrote:
> > > Can provide your thoughts on how to solve this issue ?
> >
> > ffde1de64012 is not about offlining a cpu, it's about onlining where
> > we need to make sure that we assign the affinity to a not yet online
> > marked cpu.
> >
> > > Is it expected from all the irqchip implementation to use force flag in
> > > irq_set_affinity to ignore cpu_online_mask similar to GIC ?
> >
> > No, it's only relevant for the cases where we need to route irqs to
> > not yet online cpus.
> >
>
> Ok. IIUC Russell's main concern was if irqchip implementation uses force
> flag differently, then we can't change the core code to false. Also
> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c

Which is pointless as none of the x86 irq chip implementations
actually honours the force argument.

In fact until the point where I implemented it in the GIC driver,
nothing ever used that argument. So the GIC conversion actually added
semantics to the argument. Any driver which will make use of it, has
to follow that now. I'll add documentation to the core code for it ...

Thanks,

	tglx
On 28/08/14 10:32, Thomas Gleixner wrote:
> On Tue, 26 Aug 2014, Sudeep Holla wrote:
>>>> Can provide your thoughts on how to solve this issue ?
>>>
>>> ffde1de64012 is not about offlining a cpu, it's about onlining where
>>> we need to make sure that we assign the affinity to a not yet online
>>> marked cpu.
>>>
>>>> Is it expected from all the irqchip implementation to use force flag in
>>>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>>>
>>> No, it's only relevant for the cases where we need to route irqs to
>>> not yet online cpus.
>>>
>>
>> Ok. IIUC Russell's main concern was if irqchip implementation uses force
>> flag differently, then we can't change the core code to false. Also
>> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c
>
> Which is pointless as none of the x86 irq chip implementations
> actually honours the force argument.
>
> In fact until the point where I implemented it in the GIC driver,
> nothing ever used that argument. So the GIC conversion actually added
> semantics to the argument. Any driver which will make use of it, has
> to follow that now. I'll add documentation to the core code for it ...
>

Thanks Thomas for confirming.

Hi Russell,

Can I post the patch changing force to false in irq_set_affinity call
to fix the issue and cc stable ? It's broken in stable kernels(v3.10 and
v3.14)

Regards,
Sudeep
On Thu, Aug 28, 2014 at 11:12:15AM +0100, Sudeep Holla wrote:
>
>
> On 28/08/14 10:32, Thomas Gleixner wrote:
>> On Tue, 26 Aug 2014, Sudeep Holla wrote:
>>>>> Can provide your thoughts on how to solve this issue ?
>>>>
>>>> ffde1de64012 is not about offlining a cpu, it's about onlining where
>>>> we need to make sure that we assign the affinity to a not yet online
>>>> marked cpu.
>>>>
>>>>> Is it expected from all the irqchip implementation to use force flag in
>>>>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>>>>
>>>> No, it's only relevant for the cases where we need to route irqs to
>>>> not yet online cpus.
>>>>
>>>
>>> Ok. IIUC Russell's main concern was if irqchip implementation uses force
>>> flag differently, then we can't change the core code to false. Also
>>> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c
>>
>> Which is pointless as none of the x86 irq chip implementations
>> actually honours the force argument.
>>
>> In fact until the point where I implemented it in the GIC driver,
>> nothing ever used that argument. So the GIC conversion actually added
>> semantics to the argument. Any driver which will make use of it, has
>> to follow that now. I'll add documentation to the core code for it ...
>>
>
> Thanks Thomas for confirming.
>
> Hi Russell,
>
> Can I post the patch changing force to false in irq_set_affinity call
> to fix the issue and cc stable ? It's broken in stable kernels(v3.10 and
> v3.14)

I think it's up to Thomas to suggest what the correct solution is to
the problem he introduced, and it's not clear from Thomas' email you
quote above what he thinks would be the right solution.

It sounds like passing false there would be the right thing, but really
it needs a clear and unambiguous statement from Thomas.
diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
index 2c4257604513..5c4d38e32a51 100644
--- a/arch/arm/kernel/irq.c
+++ b/arch/arm/kernel/irq.c
@@ -175,7 +175,7 @@ static bool migrate_one_irq(struct irq_desc *desc)
 	c = irq_data_get_irq_chip(d);
 	if (!c->irq_set_affinity)
 		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
-	else if (c->irq_set_affinity(d, affinity, true) == IRQ_SET_MASK_OK && ret)
+	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
 		cpumask_copy(d->affinity, affinity);
 
 	return ret;