arm: use cpu_online_mask when using forced irq_set_affinity

Message ID alpine.DEB.2.10.1408251136380.3323@nanos (mailing list archive)
State New, archived
Commit Message

Thomas Gleixner Aug. 25, 2014, 9:55 a.m. UTC
On Thu, 17 Jul 2014, Sudeep Holla wrote:
> > > Any suggestions on this ? Since commit 01f8fa4f01d8 and ffde1de64012 are
> > > now in
> > > stable releases, CPU0 hotplug is broken there now.
> > 
> > Maybe we should ask Thomas, as he's (a) the maintainer of the irqchip
> > stuff, and (b) the author of the patch causing the breakage.
> > 
> >  From what I can see looking at the x86 code, the work-around in
> > ffde1de64012 is wrong.
> > 
> 
> Can provide your thoughts on how to solve this issue ?

ffde1de64012 is not about offlining a cpu, it's about onlining where
we need to make sure that we assign the affinity to a not yet online
marked cpu.
 
> Is it expected from all the irqchip implementation to use force flag in
> irq_set_affinity to ignore cpu_online_mask similar to GIC ?

No, it's only relevant for the cases where we need to route irqs to
not yet online cpus.

Now the wreckage of offlining was definitely not intended and I wonder
why set_affinity() is called there with force = true. This was
introduced in commit 1dbfa187dad. I acked it back then, but I have no
idea why, because the force argument did not have any effect at that
time.

Changing it to false should solve the issue.

Thanks,

	tglx

Comments

Sudeep Holla Aug. 26, 2014, 3:19 p.m. UTC | #1
Hi Thomas,

Thanks for your feedback.

On 25/08/14 10:55, Thomas Gleixner wrote:
> On Thu, 17 Jul 2014, Sudeep Holla wrote:
>>>> Any suggestions on this ? Since commit 01f8fa4f01d8 and ffde1de64012 are
>>>> now in
>>>> stable releases, CPU0 hotplug is broken there now.
>>>
>>> Maybe we should ask Thomas, as he's (a) the maintainer of the irqchip
>>> stuff, and (b) the author of the patch causing the breakage.
>>>
>>>   From what I can see looking at the x86 code, the work-around in
>>> ffde1de64012 is wrong.
>>>
>>
>> Can provide your thoughts on how to solve this issue ?
>
> ffde1de64012 is not about offlining a cpu, it's about onlining where
> we need to make sure that we assign the affinity to a not yet online
> marked cpu.
>
>> Is it expected from all the irqchip implementation to use force flag in
>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>
> No, it's only relevant for the cases where we need to route irqs to
> not yet online cpus.
>

Ok. IIUC Russell's main concern was if irqchip implementation uses force
flag differently, then we can't change the core code to false. Also
x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c

Russell, any comments on this, or are you fine with changing it to false?

Regards,
Sudeep

> Now the wreckage of offlining was definitely not intended and I wonder
> why set_affinity() is called there with force = true. This was
> introduced in commit 1dbfa187dad. I acked it back then, but I have no
> idea why, because the force argument did not have any effect at that
> time.
>
> Changing it to false should solve the issue.
>
> Thanks,
>
> 	tglx
>
> diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
> index 2c4257604513..5c4d38e32a51 100644
> --- a/arch/arm/kernel/irq.c
> +++ b/arch/arm/kernel/irq.c
> @@ -175,7 +175,7 @@ static bool migrate_one_irq(struct irq_desc *desc)
>   	c = irq_data_get_irq_chip(d);
>   	if (!c->irq_set_affinity)
>   		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
> -	else if (c->irq_set_affinity(d, affinity, true) == IRQ_SET_MASK_OK && ret)
> +	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
>   		cpumask_copy(d->affinity, affinity);
>
>   	return ret;
>

Thomas Gleixner Aug. 28, 2014, 9:32 a.m. UTC | #2
On Tue, 26 Aug 2014, Sudeep Holla wrote:
> > > Can provide your thoughts on how to solve this issue ?
> > 
> > ffde1de64012 is not about offlining a cpu, it's about onlining where
> > we need to make sure that we assign the affinity to a not yet online
> > marked cpu.
> > 
> > > Is it expected from all the irqchip implementation to use force flag in
> > > irq_set_affinity to ignore cpu_online_mask similar to GIC ?
> > 
> > No, it's only relevant for the cases where we need to route irqs to
> > not yet online cpus.
> > 
> 
> Ok. IIUC Russell's main concern was if irqchip implementation uses force
> flag differently, then we can't change the core code to false. Also
> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c

Which is pointless as none of the x86 irq chip implementations
actually honours the force argument.

In fact until the point where I implemented it in the GIC driver,
nothing ever used that argument. So the GIC conversion actually added
semantics to the argument. Any driver which will make use of it, has
to follow that now. I'll add documentation to the core code for it ...

Thanks,

	tglx

Sudeep Holla Aug. 28, 2014, 10:12 a.m. UTC | #3
On 28/08/14 10:32, Thomas Gleixner wrote:
> On Tue, 26 Aug 2014, Sudeep Holla wrote:
>>>> Can provide your thoughts on how to solve this issue ?
>>>
>>> ffde1de64012 is not about offlining a cpu, it's about onlining where
>>> we need to make sure that we assign the affinity to a not yet online
>>> marked cpu.
>>>
>>>> Is it expected from all the irqchip implementation to use force flag in
>>>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>>>
>>> No, it's only relevant for the cases where we need to route irqs to
>>> not yet online cpus.
>>>
>>
>> Ok. IIUC Russell's main concern was if irqchip implementation uses force
>> flag differently, then we can't change the core code to false. Also
>> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c
>
> Which is pointless as none of the x86 irq chip implementations
> actually honours the force argument.
>
> In fact until the point where I implemented it in the GIC driver,
> nothing ever used that argument. So the GIC conversion actually added
> semantics to the argument. Any driver which will make use of it, has
> to follow that now. I'll add documentation to the core code for it ...
>

Thanks Thomas for confirming.

Hi Russell,

Can I post the patch changing force to false in irq_set_affinity call
to fix the issue and cc stable ? It's broken in stable kernels(v3.10 and
v3.14)

Regards,
Sudeep

Russell King - ARM Linux Aug. 28, 2014, 10:16 a.m. UTC | #4
On Thu, Aug 28, 2014 at 11:12:15AM +0100, Sudeep Holla wrote:
>
>
> On 28/08/14 10:32, Thomas Gleixner wrote:
>> On Tue, 26 Aug 2014, Sudeep Holla wrote:
>>>>> Can provide your thoughts on how to solve this issue ?
>>>>
>>>> ffde1de64012 is not about offlining a cpu, it's about onlining where
>>>> we need to make sure that we assign the affinity to a not yet online
>>>> marked cpu.
>>>>
>>>>> Is it expected from all the irqchip implementation to use force flag in
>>>>> irq_set_affinity to ignore cpu_online_mask similar to GIC ?
>>>>
>>>> No, it's only relevant for the cases where we need to route irqs to
>>>> not yet online cpus.
>>>>
>>>
>>> Ok. IIUC Russell's main concern was if irqchip implementation uses force
>>> flag differently, then we can't change the core code to false. Also
>>> x86 core code also uses forced irq_set_affinity in arch/x86/kernel/irq.c
>>
>> Which is pointless as none of the x86 irq chip implementations
>> actually honours the force argument.
>>
>> In fact until the point where I implemented it in the GIC driver,
>> nothing ever used that argument. So the GIC conversion actually added
>> semantics to the argument. Any driver which will make use of it, has
>> to follow that now. I'll add documentation to the core code for it ...
>>
>
> Thanks Thomas for confirming.
>
> Hi Russell,
>
> Can I post the patch changing force to false in irq_set_affinity call
> to fix the issue and cc stable ? It's broken in stable kernels(v3.10 and
> v3.14)

I think it's up to Thomas to suggest what the correct solution is to
the problem he introduced, and it's not clear from Thomas' email you
quote above what he thinks would be the right solution.  It sounds
like passing false there would be the right thing, but really it needs
a clear and unambiguous statement from Thomas.
Patch

diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c
index 2c4257604513..5c4d38e32a51 100644
--- a/arch/arm/kernel/irq.c
+++ b/arch/arm/kernel/irq.c
@@ -175,7 +175,7 @@  static bool migrate_one_irq(struct irq_desc *desc)
 	c = irq_data_get_irq_chip(d);
 	if (!c->irq_set_affinity)
 		pr_debug("IRQ%u: unable to set affinity\n", d->irq);
-	else if (c->irq_set_affinity(d, affinity, true) == IRQ_SET_MASK_OK && ret)
+	else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret)
 		cpumask_copy(d->affinity, affinity);
 
 	return ret;
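
Note on the force argument: the thread says the GIC driver gave "force" its meaning, namely that a forced call may route an irq to a CPU that is not yet marked online. The effect on the hotplug path can be sketched with a small user-space C model. This is only an illustration under stated assumptions, not the kernel code: cpumask_t here is a plain bit mask, and NR_CPUS, first_cpu() and pick_target_cpu() are hypothetical names chosen for the sketch. It assumes a forced call takes the first CPU in the requested mask regardless of online state, while a non-forced call first restricts the mask to online CPUs.

```c
#define NR_CPUS 8

/* Hypothetical stand-in for the kernel cpumask: one bit per CPU. */
typedef unsigned int cpumask_t;

/* Return the lowest CPU number set in mask, or NR_CPUS if mask is empty. */
static int first_cpu(cpumask_t mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Model of the target selection discussed in the thread (assumed behaviour):
 *   force != 0: use the requested mask as-is, so an offline CPU can be picked.
 *   force == 0: only consider CPUs that are both requested and online.
 * Returns the chosen CPU, or -1 if no CPU qualifies.
 */
static int pick_target_cpu(cpumask_t requested, cpumask_t online, int force)
{
	int cpu = force ? first_cpu(requested) : first_cpu(requested & online);

	return cpu < NR_CPUS ? cpu : -1;
}
```

In this model, take an irq with affinity 0x0f (CPUs 0-3) while CPU0 is being taken offline, so the online mask is 0x0e. pick_target_cpu(0x0f, 0x0e, 1) returns 0, the CPU that is going away, whereas pick_target_cpu(0x0f, 0x0e, 0) returns 1. That is consistent with the patch above passing false from migrate_one_irq().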