diff mbox

[for,4.7,1/4] xen: sched: avoid spuriously re-enabling IRQs in csched2_switch_sched()

Message ID 146231199550.25631.15153219462074625034.stgit@Solace.fritz.box (mailing list archive)
State New, archived
Headers show

Commit Message

Dario Faggioli May 3, 2016, 9:46 p.m. UTC
In fact, interrupts are already disabled when calling
the hook from schedule_cpu_switch(), and hence using
anything different than just spin_lock() is wrong (and
ASSERT()-s in debug builds) or unnecessary.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/common/sched_credit2.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jan Beulich May 4, 2016, 8:48 a.m. UTC | #1
>>> On 03.05.16 at 23:46, <dario.faggioli@citrix.com> wrote:
> In fact, interrupts are already disabled when calling
> the hook from schedule_cpu_switch(), and hence using
> anything different than just spin_lock() is wrong (and
> ASSERT()-s in debug builds) or unnecessary.

Well, this is a little too broad a statement: spin_lock_irqsave()
would be quite fine in this situation.

> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Actual code change
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli May 4, 2016, 9:08 a.m. UTC | #2
On Wed, 2016-05-04 at 02:48 -0600, Jan Beulich wrote:
> > > > On 03.05.16 at 23:46, <dario.faggioli@citrix.com> wrote:
> > In fact, interrupts are already disabled when calling
> > the hook from schedule_cpu_switch(), and hence using
> > anything different than just spin_lock() is wrong (and
> > ASSERT()-s in debug builds) or unnecessary.
> Well, this is a little too broad a statement: spin_lock_irqsave()
> would be quite fine in this situation.
> 
That would be covered by the "or unnecessary" part.

As in "spin_lock_irq() is wrong and spin_lock_irqsave() is
unnecessary".
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Actual code change
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
Thanks and Regards,
Dario
George Dunlap May 4, 2016, 3:11 p.m. UTC | #3
On 03/05/16 22:46, Dario Faggioli wrote:
> In fact, interrupts are already disabled when calling
> the hook from schedule_cpu_switch(), and hence using
> anything different than just spin_lock() is wrong (and
> ASSERT()-s in debug builds) or unnecessary.
> 
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Good catch.  But would it be better to either 1) add an assert that irqs
are disabled, or 2) use spin_lock_irqsave(), just to make sure the two
bits of code won't silently go out of sync?

Re Jan's comment -- you meant to say that spin_lock_irq() is wrong and
spin_lock_irqsave() is unnecessary?  That's probably a bit more
inference than we want changeset messages to have, ideally. :-)

I wouldn't require a re-spin just for that, but if you are going to
respin it, I would suggest just dropping the "and hence...".

Thanks,
 -George

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/common/sched_credit2.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 46b9279..50c1b9d 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2232,7 +2232,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>       * And owning exactly that one (the lock of the old scheduler of this
>       * cpu) is what is necessary to prevent races.
>       */
> -    spin_lock_irq(&prv->lock);
> +    spin_lock(&prv->lock);
>  
>      idle_vcpu[cpu]->sched_priv = vdata;
>  
> @@ -2257,7 +2257,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
>      smp_mb();
>      per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
>  
> -    spin_unlock_irq(&prv->lock);
> +    spin_unlock(&prv->lock);
>  }
>  
>  static void
>
Dario Faggioli May 4, 2016, 3:58 p.m. UTC | #4
On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
> On 03/05/16 22:46, Dario Faggioli wrote:
> > 
> > In fact, interrupts are already disabled when calling
> > the hook from schedule_cpu_switch(), and hence using
> > anything different than just spin_lock() is wrong (and
> > ASSERT()-s in debug builds) or unnecessary.
> > 
> > Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Good catch.  
>
Well, much rather my bad introducing this! :-/

> But would it be better to either 1) add an assert that irqs
> are disabled, or 2) use spin_lock_irqsave(), just to make sure the
> two
> bits of code won't silently go out of sync?
> 
There's an ASSERT() already, in spin_lock_irq() (asking for IRQs to be
enabled), which is in fact the one that triggers.

I introduced the bug because of an oversight when applying the last
review comments to a previous patch series, and did not spot it because
--and this is the true mistake, IMO-- I tested the final result only
with debug=n. This is why I think an ASSERT() is after all not that
useful here... In fact, if this were tested with debug=y, what's
already in place would have been enough.

There are quite a few other similar cases all around scheduling code.
Some of them have comments, none has ASSERT()-s, and I think that is
fine.

I also don't like using spin_lock_irqsave() when it is not necessary,
even if this is not an hot path. In fact, apart from being slower, I
find it more confusing than helpful, especially when looking at the
code and trying to reason about whether we can be called with IRQ
enabled or not.

So, in summary, I'd be fine with adding a comment, and I'm ok
respinning for this. This patch is pretty independent from the rest of
the series anyway, so I can easily respin just this one.

Thoughts?

> Re Jan's comment -- you meant to say that spin_lock_irq() is wrong
> and
> spin_lock_irqsave() is unnecessary?  That's probably a bit more
> inference than we want changeset messages to have, ideally. :-)
> 
Yes, that's what I meant, as I replied myself to Jan.

> I wouldn't require a re-spin just for that, but if you are going to
> respin it, I would suggest just dropping the "and hence...".
> 
Ok, if I have to respin, I'll do this. If not, I'm fine with whoever
commits this has to drop that sentence.

Thanks and Regards,
Dario
George Dunlap May 4, 2016, 5:05 p.m. UTC | #5
On 04/05/16 16:58, Dario Faggioli wrote:
> On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
>> On 03/05/16 22:46, Dario Faggioli wrote:
>>>
>>> In fact, interrupts are already disabled when calling
>>> the hook from schedule_cpu_switch(), and hence using
>>> anything different than just spin_lock() is wrong (and
>>> ASSERT()-s in debug builds) or unnecessary.
>>>
>>> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
>> Good catch.  
>>
> Well, much rather my bad introducing this! :-/
> 
>> But would it be better to either 1) add an assert that irqs
>> are disabled, or 2) use spin_lock_irqsave(), just to make sure the
>> two
>> bits of code won't silently go out of sync?
>>
> There's an ASSERT() already, in spin_lock_irq() (asking for IRQs to be
> enabled), which is in fact the one that triggers.
> 
> I introduced the bug because of an oversight when applying the last
> review comments to a previous patch series, and did not spot it because
> --and this is the true mistake, IMO-- I tested the final result only
> with debug=n. This is why I think an ASSERT() is after all not that
> useful here... In fact, if this were tested with debug=y, what's
> already in place would have been enough.
> 
> There are quite a few other similar cases all around scheduling code.
> Some of them have comments, none has ASSERT()-s, and I think that is
> fine.

I'm a bit confused by these few paragraphs, and it makes me wonder if
maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done in
this patch (replace spin_lock_irq() with spin_lock()), but also add to
it an assert to make sure that if irqs ever *stop* being disabled when
calling this, we'll find out".

Is that what you understood me to mean, or did you think I meant,
"Instead of changing to spin_lock(), [do something else]"?

> I also don't like using spin_lock_irqsave() when it is not necessary,
> even if this is not an hot path. In fact, apart from being slower, I
> find it more confusing than helpful, especially when looking at the
> code and trying to reason about whether we can be called with IRQ
> enabled or not.

I agree with this; I mainly included the suggestion as a potential
alternative to making the code robust.

I've checked in patches 2 and 3, btw, so no need to re-send them. :-)

 -George
Dario Faggioli May 4, 2016, 5:21 p.m. UTC | #6
On Wed, 2016-05-04 at 18:05 +0100, George Dunlap wrote:
> On 04/05/16 16:58, Dario Faggioli wrote:
> > On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
> > There are quite a few other similar cases all around scheduling
> > code.
> > Some of them have comments, none has ASSERT()-s, and I think that
> > is
> > fine.
> I'm a bit confused by these few paragraphs, and it makes me wonder if
> maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done
> in
> this patch (replace spin_lock_irq() with spin_lock()), but also add
> to
> it an assert to make sure that if irqs ever *stop* being disabled
> when
> calling this, we'll find out".
> 
No, you were clear. I wasn't, sorry. :-)

> Is that what you understood me to mean, or did you think I meant,
> "Instead of changing to spin_lock(), [do something else]"?
> 
I understood what you meant. What I meant was this:

instead of another ASSERT, I would prefer to add a comment, like we are
doing in other similar places already:

static void
csched2_deinit_pdata(...)
{
    ...
    /* No need to save IRQs here, they're already disabled */
    spin_lock(&rqd->lock);
   ...
}

And I'm now adding this:

After all, I'm fine with an ASSERT() too, but then I think we should
add one to the same effect to csched_switch_sched() too.

> > I also don't like using spin_lock_irqsave() when it is not
> > necessary,
> > even if this is not an hot path. In fact, apart from being slower,
> > I
> > find it more confusing than helpful, especially when looking at the
> > code and trying to reason about whether we can be called with IRQ
> > enabled or not.
> I agree with this; I mainly included the suggestion as a potential
> alternative to making the code robust.
> 
> I've checked in patches 2 and 3, btw, so no need to re-send them. :-)
> 
Great, thanks. :-)

Dario
George Dunlap May 4, 2016, 5:34 p.m. UTC | #7
On 04/05/16 18:21, Dario Faggioli wrote:
> On Wed, 2016-05-04 at 18:05 +0100, George Dunlap wrote:
>> On 04/05/16 16:58, Dario Faggioli wrote:
>>> On Wed, 2016-05-04 at 16:11 +0100, George Dunlap wrote:
>>> There are quite a few other similar cases all around scheduling
>>> code.
>>> Some of them have comments, none has ASSERT()-s, and I think that
>>> is
>>> fine.
>> I'm a bit confused by these few paragraphs, and it makes me wonder if
>> maybe I wasn't very clear.  By #1, I meant, "Do exactly what is done
>> in
>> this patch (replace spin_lock_irq() with spin_lock()), but also add
>> to
>> it an assert to make sure that if irqs ever *stop* being disabled
>> when
>> calling this, we'll find out".
>>
> No, you were clear. I wasn't, sorry. :-)
> 
>> Is that what you understood me to mean, or did you think I meant,
>> "Instead of changing to spin_lock(), [do something else]"?
>>
> I understood what you meant. What I meant was this:
> 
> instead of another ASSERT, I would prefer to add a comment, like we are
> doing in other similar places already:
> 
> static void
> csched2_deinit_pdata(...)
> {
>     ...
>     /* No need to save IRQs here, they're already disabled */
>     spin_lock(&rqd->lock);
>    ...
> }
> 
> And I'm now adding this:
> 
> After all, I'm fine with an ASSERT() too, but then I think we should
> add one to the same effect to csched_switch_sched() too.

Well an ASSERT() is sort of like a comment, in that if you see
ASSERT(irqs_disabled()), you know there's no need to save irqs because
they should already disabled.  But it has the advantage that osstest
will be able to "read" it once we get some proper cpupool tests for
osstest. :-)

If we weren't in the feature freeze, I'd definitely favor adding an
ASSERT to credit1.  As it is, I think either way (adding now or waiting
until the 4.8 development window) should be fine.

 -George
diff mbox

Patch

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 46b9279..50c1b9d 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2232,7 +2232,7 @@  csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * And owning exactly that one (the lock of the old scheduler of this
      * cpu) is what is necessary to prevent races.
      */
-    spin_lock_irq(&prv->lock);
+    spin_lock(&prv->lock);
 
     idle_vcpu[cpu]->sched_priv = vdata;
 
@@ -2257,7 +2257,7 @@  csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     smp_mb();
     per_cpu(schedule_data, cpu).schedule_lock = &prv->rqd[rqi].lock;
 
-    spin_unlock_irq(&prv->lock);
+    spin_unlock(&prv->lock);
 }
 
 static void