diff mbox

[01/19] xen: sched: leave CPUs doing tasklet work alone.

Message ID 146620508199.29766.7439697154591443704.stgit@Solace.fritz.box (mailing list archive)
State New, archived
Headers show

Commit Message

Dario Faggioli June 17, 2016, 11:11 p.m. UTC
In both Credit1 and Credit2, stop considering a pCPU idle,
if the reason why the idle vCPU is being selected, is to
do tasklet work.

Not doing so means that the tickling and load balancing
logic, seeing the pCPU as idle, considers it a candidate
for picking up vCPUs. But the pCPU won't actually pick
up or schedule any vCPU, which would then remain in the
runqueue, which is bas, especially if there were other,
truly idle pCPUs, that could execute it.

The only drawback is that we can't assume that a pCPU is
in always marked as idle when being removed from an
instance of the Credit2 scheduler (csched2_deinit_pdata).
In fact, if we are in stop-machine (i.e., during suspend
or shutdown), the pCPUs are running the stopmachine_tasklet
and hence are actually marked as busy. On the other hand,
when removing a pCPU from a Credit2 pool, it will indeed
be idle. The only thing we can do, therefore, is to
remove the BUG_ON() check.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 xen/common/sched_credit.c  |   12 ++++++------
 xen/common/sched_credit2.c |   14 ++++++++++----
 2 files changed, 16 insertions(+), 10 deletions(-)

Comments

Jan Beulich June 20, 2016, 7:48 a.m. UTC | #1
>>> On 18.06.16 at 01:11, <dario.faggioli@citrix.com> wrote:
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -1819,24 +1819,24 @@ csched_schedule(
>      else
>          snext = csched_load_balance(prv, cpu, snext, &ret.migrated);
>  
> + out:
>      /*
>       * Update idlers mask if necessary. When we're idling, other CPUs
>       * will tickle us when they get extra work.
>       */
> -    if ( snext->pri == CSCHED_PRI_IDLE )
> +    if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
>      {
> -        if ( !cpumask_test_cpu(cpu, prv->idlers) )
> -            cpumask_set_cpu(cpu, prv->idlers);
> +        if ( cpumask_test_cpu(cpu, prv->idlers) )
> +            cpumask_clear_cpu(cpu, prv->idlers);
>      }
> -    else if ( cpumask_test_cpu(cpu, prv->idlers) )
> +    else if ( !cpumask_test_cpu(cpu, prv->idlers) )
>      {
> -        cpumask_clear_cpu(cpu, prv->idlers);
> +        cpumask_set_cpu(cpu, prv->idlers);
>      }

Is there a reason for this extra code churn? It would seem to me
that the change could be just the "out" label movement and
adjustment to the first if:

   if ( !tasklet_work_scheduled && snext->pri == CSCHED_PRI_IDLE )

Am I overlooking something?

Jan
Anshul Makkar June 21, 2016, 4:17 p.m. UTC | #2
On 18/06/16 00:11, Dario Faggioli wrote:
> In both Credit1 and Credit2, stop considering a pCPU idle,
> if the reason why the idle vCPU is being selected, is to
> do tasklet work.
>
> Not doing so means that the tickling and load balancing
> logic, seeing the pCPU as idle, considers it a candidate
> for picking up vCPUs. But the pCPU won't actually pick
> up or schedule any vCPU, which would then remain in the
> runqueue, which is bas, especially if there were other,
> truly idle pCPUs, that could execute it.
>
> The only drawback is that we can't assume that a pCPU is
> in always marked as idle when being removed from an
> instance of the Credit2 scheduler (csched2_deinit_pdata).
> In fact, if we are in stop-machine (i.e., during suspend
> or shutdown), the pCPUs are running the stopmachine_tasklet
> and hence are actually marked as busy. On the other hand,
> when removing a pCPU from a Credit2 pool, it will indeed
> be idle. The only thing we can do, therefore, is to
> remove the BUG_ON() check.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>

Reviewed-by: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>   xen/common/sched_credit.c  |   12 ++++++------
>   xen/common/sched_credit2.c |   14 ++++++++++----
>   2 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index a38a63d..a6645a2 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -1819,24 +1819,24 @@ csched_schedule(
>       else
>           snext = csched_load_balance(prv, cpu, snext, &ret.migrated);
>
> + out:
>       /*
>        * Update idlers mask if necessary. When we're idling, other CPUs
>        * will tickle us when they get extra work.
>        */
> -    if ( snext->pri == CSCHED_PRI_IDLE )
> +    if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
>       {
> -        if ( !cpumask_test_cpu(cpu, prv->idlers) )
> -            cpumask_set_cpu(cpu, prv->idlers);
> +        if ( cpumask_test_cpu(cpu, prv->idlers) )
> +            cpumask_clear_cpu(cpu, prv->idlers);
>       }
> -    else if ( cpumask_test_cpu(cpu, prv->idlers) )
> +    else if ( !cpumask_test_cpu(cpu, prv->idlers) )
>       {
> -        cpumask_clear_cpu(cpu, prv->idlers);
> +        cpumask_set_cpu(cpu, prv->idlers);
>       }
>
>       if ( !is_idle_vcpu(snext->vcpu) )
>           snext->start_time += now;
>
> -out:
>       /*
>        * Return task to run next...
>        */
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 1933ff1..cf8455c 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1910,8 +1910,16 @@ csched2_schedule(
>       }
>       else
>       {
> -        /* Update the idle mask if necessary */
> -        if ( !cpumask_test_cpu(cpu, &rqd->idle) )
> +        /*
> +         * Update the idle mask if necessary. Note that, if we're scheduling
> +         * idle in order to carry on some tasklet work, we want to play busy!
> +         */
> +        if ( tasklet_work_scheduled )
> +        {
> +            if ( cpumask_test_cpu(cpu, &rqd->idle) )
> +                cpumask_clear_cpu(cpu, &rqd->idle);
> +        }
> +        else if ( !cpumask_test_cpu(cpu, &rqd->idle) )
>               cpumask_set_cpu(cpu, &rqd->idle);
>           /* Make sure avgload gets updated periodically even
>            * if there's no activity */
> @@ -2291,8 +2299,6 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
>       /* No need to save IRQs here, they're already disabled */
>       spin_lock(&rqd->lock);
>
> -    BUG_ON(!cpumask_test_cpu(cpu, &rqd->idle));
> -
>       printk("Removing cpu %d from runqueue %d\n", cpu, rqi);
>
>       cpumask_clear_cpu(cpu, &rqd->idle);
>
George Dunlap July 6, 2016, 3:41 p.m. UTC | #3
On Sat, Jun 18, 2016 at 12:11 AM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> In both Credit1 and Credit2, stop considering a pCPU idle,
> if the reason why the idle vCPU is being selected, is to
> do tasklet work.
>
> Not doing so means that the tickling and load balancing
> logic, seeing the pCPU as idle, considers it a candidate
> for picking up vCPUs. But the pCPU won't actually pick
> up or schedule any vCPU, which would then remain in the
> runqueue, which is bas, especially if there were other,

*bad

> truly idle pCPUs, that could execute it.
>
> The only drawback is that we can't assume that a pCPU is
> in always marked as idle when being removed from an
> instance of the Credit2 scheduler (csched2_deinit_pdata).
> In fact, if we are in stop-machine (i.e., during suspend
> or shutdown), the pCPUs are running the stopmachine_tasklet
> and hence are actually marked as busy. On the other hand,
> when removing a pCPU from a Credit2 pool, it will indeed
> be idle. The only thing we can do, therefore, is to
> remove the BUG_ON() check.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>  xen/common/sched_credit.c  |   12 ++++++------
>  xen/common/sched_credit2.c |   14 ++++++++++----
>  2 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index a38a63d..a6645a2 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -1819,24 +1819,24 @@ csched_schedule(
>      else
>          snext = csched_load_balance(prv, cpu, snext, &ret.migrated);
>
> + out:

Sorry if I'm being a bit dense, but why is this moving up here?  As
far as I can tell the only time the 'out' label will be used, the
'idler' status of the cpu cannot change.

At very least moving it up here introduces a bug, since now...

>      /*
>       * Update idlers mask if necessary. When we're idling, other CPUs
>       * will tickle us when they get extra work.
>       */
> -    if ( snext->pri == CSCHED_PRI_IDLE )
> +    if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
>      {
> -        if ( !cpumask_test_cpu(cpu, prv->idlers) )
> -            cpumask_set_cpu(cpu, prv->idlers);
> +        if ( cpumask_test_cpu(cpu, prv->idlers) )
> +            cpumask_clear_cpu(cpu, prv->idlers);
>      }
> -    else if ( cpumask_test_cpu(cpu, prv->idlers) )
> +    else if ( !cpumask_test_cpu(cpu, prv->idlers) )
>      {
> -        cpumask_clear_cpu(cpu, prv->idlers);
> +        cpumask_set_cpu(cpu, prv->idlers);
>      }
>
>      if ( !is_idle_vcpu(snext->vcpu) )
>          snext->start_time += now;

...this will happen twice in the case (once in the if() clause, once
here).  (Although arguably the one in the if() clause should go away
and the out: label should be moved above this line anyway).

Other than that, looks good.

 -George
Dario Faggioli July 7, 2016, 10:11 a.m. UTC | #4
On Mon, 2016-06-20 at 01:48 -0600, Jan Beulich wrote:
> > > > On 18.06.16 at 01:11, <dario.faggioli@citrix.com> wrote:
> > --- a/xen/common/sched_credit.c
> > +++ b/xen/common/sched_credit.c
> > @@ -1819,24 +1819,24 @@ csched_schedule(
> >      else
> >          snext = csched_load_balance(prv, cpu, snext,
> > &ret.migrated);
> >  
> > + out:
> >      /*
> >       * Update idlers mask if necessary. When we're idling, other
> > CPUs
> >       * will tickle us when they get extra work.
> >       */
> > -    if ( snext->pri == CSCHED_PRI_IDLE )
> > +    if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
> >      {
> > -        if ( !cpumask_test_cpu(cpu, prv->idlers) )
> > -            cpumask_set_cpu(cpu, prv->idlers);
> > +        if ( cpumask_test_cpu(cpu, prv->idlers) )
> > +            cpumask_clear_cpu(cpu, prv->idlers);
> >      }
> > -    else if ( cpumask_test_cpu(cpu, prv->idlers) )
> > +    else if ( !cpumask_test_cpu(cpu, prv->idlers) )
> >      {
> > -        cpumask_clear_cpu(cpu, prv->idlers);
> > +        cpumask_set_cpu(cpu, prv->idlers);
> >      }
> Is there a reason for this extra code churn? It would seem to me
> that the change could be just the "out" label movement and
> adjustment to the first if:
> 
>    if ( !tasklet_work_scheduled && snext->pri == CSCHED_PRI_IDLE )
> 
> Am I overlooking something?
> 
No, you are not. It indeed can be done as you suggest, and it's better,
so I'll go for it.

Thanks and regards,
Dario
Dario Faggioli July 7, 2016, 10:25 a.m. UTC | #5
On Wed, 2016-07-06 at 16:41 +0100, George Dunlap wrote:
> On Sat, Jun 18, 2016 at 12:11 AM, Dario Faggioli

> > diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> > index a38a63d..a6645a2 100644
> > --- a/xen/common/sched_credit.c
> > +++ b/xen/common/sched_credit.c
> > @@ -1819,24 +1819,24 @@ csched_schedule(
> >      else
> >          snext = csched_load_balance(prv, cpu, snext,
> > &ret.migrated);
> > 
> > + out:
> Sorry if I'm being a bit dense, but why is this moving up here?  As
> far as I can tell the only time the 'out' label will be used, the
> 'idler' status of the cpu cannot change.
> 
Mmm... I think you're right. If we go to out:, we are running someone
(so we are not idle), and we will continue to do so (so we should not
be marked as idle).

I seem to recall having seen something because of which out needed
bubbling up, but I don't any longer right now, so I'll leave it alone.

Thanks and Regards,
Dario
diff mbox

Patch

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index a38a63d..a6645a2 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1819,24 +1819,24 @@  csched_schedule(
     else
         snext = csched_load_balance(prv, cpu, snext, &ret.migrated);
 
+ out:
     /*
      * Update idlers mask if necessary. When we're idling, other CPUs
      * will tickle us when they get extra work.
      */
-    if ( snext->pri == CSCHED_PRI_IDLE )
+    if ( tasklet_work_scheduled || snext->pri != CSCHED_PRI_IDLE )
     {
-        if ( !cpumask_test_cpu(cpu, prv->idlers) )
-            cpumask_set_cpu(cpu, prv->idlers);
+        if ( cpumask_test_cpu(cpu, prv->idlers) )
+            cpumask_clear_cpu(cpu, prv->idlers);
     }
-    else if ( cpumask_test_cpu(cpu, prv->idlers) )
+    else if ( !cpumask_test_cpu(cpu, prv->idlers) )
     {
-        cpumask_clear_cpu(cpu, prv->idlers);
+        cpumask_set_cpu(cpu, prv->idlers);
     }
 
     if ( !is_idle_vcpu(snext->vcpu) )
         snext->start_time += now;
 
-out:
     /*
      * Return task to run next...
      */
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 1933ff1..cf8455c 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1910,8 +1910,16 @@  csched2_schedule(
     }
     else
     {
-        /* Update the idle mask if necessary */
-        if ( !cpumask_test_cpu(cpu, &rqd->idle) )
+        /*
+         * Update the idle mask if necessary. Note that, if we're scheduling
+         * idle in order to carry on some tasklet work, we want to play busy!
+         */
+        if ( tasklet_work_scheduled )
+        {
+            if ( cpumask_test_cpu(cpu, &rqd->idle) )
+                cpumask_clear_cpu(cpu, &rqd->idle);
+        }
+        else if ( !cpumask_test_cpu(cpu, &rqd->idle) )
             cpumask_set_cpu(cpu, &rqd->idle);
         /* Make sure avgload gets updated periodically even
          * if there's no activity */
@@ -2291,8 +2299,6 @@  csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
     /* No need to save IRQs here, they're already disabled */
     spin_lock(&rqd->lock);
 
-    BUG_ON(!cpumask_test_cpu(cpu, &rqd->idle));
-
     printk("Removing cpu %d from runqueue %d\n", cpu, rqi);
 
     cpumask_clear_cpu(cpu, &rqd->idle);