diff mbox series

[v2,28/48] xen/sched: move struct task_slice into struct sched_unit

Message ID 20190809145833.1020-29-jgross@suse.com (mailing list archive)
State Superseded
Series xen: add core scheduling support

Commit Message

Jürgen Groß Aug. 9, 2019, 2:58 p.m. UTC
In order to prepare for multiple vcpus per schedule unit move struct
task_slice in schedule() from the local stack into struct sched_unit
of the currently running unit. To make access easier for the single
schedulers add the pointer of the currently running unit as a parameter
of do_schedule().

While at it switch the tasklet_work_scheduled parameter of
do_schedule() from bool_t to bool.

As struct task_slice is only ever modified with the local schedule
lock held it is safe to directly set the different units in struct
sched_unit instead of using an on-stack copy for returning the data.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/sched_arinc653.c | 20 +++++++-------------
 xen/common/sched_credit.c   | 25 +++++++++++--------------
 xen/common/sched_credit2.c  | 21 +++++++++------------
 xen/common/sched_null.c     | 29 ++++++++++++++---------------
 xen/common/sched_rt.c       | 22 +++++++++++-----------
 xen/common/schedule.c       | 32 ++++++++++++++------------------
 xen/include/xen/sched-if.h  | 11 +++--------
 xen/include/xen/sched.h     |  6 ++++++
 8 files changed, 75 insertions(+), 91 deletions(-)

Comments

Jan Beulich Sept. 10, 2019, 3:18 p.m. UTC | #1
On 09.08.2019 16:58, Juergen Gross wrote:
> In order to prepare for multiple vcpus per schedule unit move struct
> task_slice in schedule() from the local stack into struct sched_unit
> of the currently running unit.

The change looks mechanical enough to be probably fine, but what's
the connection between the item currently being on schedule()'s stack
and there being multiple vCPU-s? Is this because it'll be established
just once, but used multiple times (by different parties)? In which
case, since the "slaves" will have to wait for the "master" to make
the scheduling decision, there'll need to be communication anyway
between all involved parties.

Jan

Dario Faggioli Sept. 12, 2019, 8:13 a.m. UTC | #2
On Fri, 2019-08-09 at 16:58 +0200, Juergen Gross wrote:
> In order to prepare for multiple vcpus per schedule unit move struct
> task_slice in schedule() from the local stack into struct sched_unit
> of the currently running unit. To make access easier for the single
> schedulers add the pointer of the currently running unit as a
> parameter
> of do_schedule().
> 
> While at it switch the tasklet_work_scheduled parameter of
> do_schedule() from bool_t to bool.
> 
> As struct task_slice is only ever modified with the local schedule
> lock held it is safe to directly set the different units in struct
> sched_unit instead of using an on-stack copy for returning the data.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> 
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index e4d0dd4b65..d2fc89d983 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1751,9 +1749,7 @@ static void schedule(void)
>          TRACE_4D(TRC_SCHED_SWITCH_INFCONT,
>                   next->domain->domain_id, next->unit_id,
>                   now - prev->state_entry_time,
> -                 next_slice.time);
> -        trace_continue_running(next->vcpu_list);
> -        return continue_running(prev->vcpu_list);
> +                 prev->next_time);
>      }
>  
Mmm... I'm sorry, but I'm not sure I understand what is going on here.

Do you mind explaining why we're not calling continue_running() any
longer (and why this happens in this patch)?

Thanks and Regards

Jürgen Groß Sept. 12, 2019, 8:21 a.m. UTC | #3
On 12.09.19 10:13, Dario Faggioli wrote:
> On Fri, 2019-08-09 at 16:58 +0200, Juergen Gross wrote:
>> In order to prepare for multiple vcpus per schedule unit move struct
>> task_slice in schedule() from the local stack into struct sched_unit
>> of the currently running unit. To make access easier for the single
>> schedulers add the pointer of the currently running unit as a
>> parameter
>> of do_schedule().
>>
>> While at it switch the tasklet_work_scheduled parameter of
>> do_schedule() from bool_t to bool.
>>
>> As struct task_slice is only ever modified with the local schedule
>> lock held it is safe to directly set the different units in struct
>> sched_unit instead of using an on-stack copy for returning the data.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index e4d0dd4b65..d2fc89d983 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -1751,9 +1749,7 @@ static void schedule(void)
>>           TRACE_4D(TRC_SCHED_SWITCH_INFCONT,
>>                    next->domain->domain_id, next->unit_id,
>>                    now - prev->state_entry_time,
>> -                 next_slice.time);
>> -        trace_continue_running(next->vcpu_list);
>> -        return continue_running(prev->vcpu_list);
>> +                 prev->next_time);
>>       }
>>   
> Mmm... I'm sorry, but I'm not sure I understand what is going on here.
> 
> Do you mind explaining why we're not calling continue_running() any
> longer (and why this happens in this patch)?

Good catch. The related coding gets added in patch 29 again. Seems as
if two patches got mixed up.


Juergen
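
For readers following the exchange above, here is a minimal sketch of the
control-flow difference Dario is asking about. It is not Xen code: struct
toy_unit, toy_context_switches and toy_schedule_tail() are hypothetical
stand-ins, and the early_return flag only models whether the
"return continue_running(...)" line is still present in the prev == next
branch. As the hunk reads, once that return is gone execution falls through
to the code that follows the branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a schedule unit; not the Xen structure. */
struct toy_unit {
    int unit_id;
};

/* Counts how often the (modelled) context switch path is reached. */
static int toy_context_switches;

/*
 * Models the tail of schedule(): if the scheduler picked the unit that is
 * already running, the original code returned early. With early_return set
 * to false the function falls through, as in the quoted hunk.
 */
static void toy_schedule_tail(struct toy_unit *prev, struct toy_unit *next,
                              bool early_return)
{
    assert(prev != NULL && next != NULL);

    if ( prev == next )
    {
        /* Tracing of the "continue running" case would happen here. */
        if ( early_return )
            return; /* stands in for "return continue_running(...)" */
    }

    /* Stands in for everything after the branch (context switch path). */
    toy_context_switches++;
}
```

With early_return true and prev == next the counter stays untouched; with
early_return false the same call reaches the context switch path, which is
the behaviour the review comment flags.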

Jürgen Groß Sept. 13, 2019, 12:56 p.m. UTC | #4
On 10.09.19 17:18, Jan Beulich wrote:
> On 09.08.2019 16:58, Juergen Gross wrote:
>> In order to prepare for multiple vcpus per schedule unit move struct
>> task_slice in schedule() from the local stack into struct sched_unit
>> of the currently running unit.
> 
> The change looks mechanical enough to be probably fine, but what's
> the connection between the item currently being on schedule()'s stack
> and there being multiple vCPU-s? Is this because it'll be established
> just once, but used multiple times (by different parties)? In which
> case, since the "slaves" will have to wait for the "master" to make
> the scheduling decision, there'll need to be communication anyway
> between all involved parties.

Synchronization between the involved parties is done via struct
sched_unit (see patch 29). There is no need to add another data
structure for explicit communication, as on all cpus involved the same
unit is active, so its address is already known.

And this is mandatory, as only when all cpus have joined the last one
will do the schedule() call and then release the other cpus for doing
the context switch. Propagating another pointer on the local stack
would be hard as splitting up schedule() as done in patch 29 would no
longer be possible resulting in a rather hard to understand gigantic
function.


Juergen
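
To make the interface change discussed above concrete, here is a small,
self-contained sketch contrasting the two calling conventions. It is an
illustration under assumptions, not the Xen implementation: struct sched_unit
is reduced to the fields this patch adds (migrated, next_task, next_time),
s_time_t is a plain typedef, and toy_schedule_old(), toy_schedule_new() and
toy_read_decision() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t; /* stand-in for Xen's s_time_t */

/* Reduced stand-in: only the fields added by this patch plus an id. */
struct sched_unit {
    int                unit_id;
    bool               migrated;  /* set on the unit chosen to run next */
    struct sched_unit *next_task; /* decision, stored in the running unit */
    s_time_t           next_time; /* slice length, negative means no limit */
};

/* Old convention: the decision is returned by value to the caller's stack. */
struct task_slice {
    struct sched_unit *task;
    s_time_t           time;
    bool               migrated;
};

static struct task_slice toy_schedule_old(struct sched_unit *candidate,
                                          s_time_t slice)
{
    struct task_slice ret = {
        .task = candidate, .time = slice, .migrated = false,
    };

    return ret;
}

/* New convention: the decision is written into the currently running unit. */
static void toy_schedule_new(struct sched_unit *prev,
                             struct sched_unit *candidate, s_time_t slice)
{
    assert(prev != NULL && candidate != NULL);

    prev->next_time = slice;
    prev->next_task = candidate;
    candidate->migrated = false;
}

/* Any party that knows the running unit can read the decision back. */
static struct sched_unit *toy_read_decision(const struct sched_unit *prev)
{
    return prev->next_task;
}
```

The point of the second form is the one made in the reply: the result no
longer lives on one caller's stack, so code that only has the address of the
running unit can pick it up without an extra pointer being passed around.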

Patch

diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index e48f2b2eb9..34efcc07c9 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -497,18 +497,14 @@  a653sched_unit_wake(const struct scheduler *ops, struct sched_unit *unit)
  *
  * @param ops       Pointer to this instance of the scheduler structure
  * @param now       Current time
- *
- * @return          Address of the UNIT structure scheduled to be run next
- *                  Amount of time to execute the returned UNIT
- *                  Flag for whether the UNIT was migrated
  */
-static struct task_slice
+static void
 a653sched_do_schedule(
     const struct scheduler *ops,
+    struct sched_unit *prev,
     s_time_t now,
-    bool_t tasklet_work_scheduled)
+    bool tasklet_work_scheduled)
 {
-    struct task_slice ret;                      /* hold the chosen domain */
     struct sched_unit *new_task = NULL;
     static unsigned int sched_index = 0;
     static s_time_t next_switch_time;
@@ -586,13 +582,11 @@  a653sched_do_schedule(
      * Return the amount of time the next domain has to run and the address
      * of the selected task's UNIT structure.
      */
-    ret.time = next_switch_time - now;
-    ret.task = new_task;
-    ret.migrated = 0;
-
-    BUG_ON(ret.time <= 0);
+    prev->next_time = next_switch_time - now;
+    prev->next_task = new_task;
+    new_task->migrated = false;
 
-    return ret;
+    BUG_ON(prev->next_time <= 0);
 }
 
 /**
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 87cb62c632..f1675fd52e 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1675,7 +1675,7 @@  csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step)
 
 static struct csched_unit *
 csched_load_balance(struct csched_private *prv, int cpu,
-    struct csched_unit *snext, bool_t *stolen)
+    struct csched_unit *snext, bool *stolen)
 {
     struct cpupool *c = per_cpu(cpupool, cpu);
     struct csched_unit *speer;
@@ -1791,7 +1791,7 @@  csched_load_balance(struct csched_private *prv, int cpu,
                 /* As soon as one unit is found, balancing ends */
                 if ( speer != NULL )
                 {
-                    *stolen = 1;
+                    *stolen = true;
                     /*
                      * Next time we'll look for work to steal on this node, we
                      * will start from the next pCPU, with respect to this one,
@@ -1821,19 +1821,18 @@  csched_load_balance(struct csched_private *prv, int cpu,
  * This function is in the critical path. It is designed to be simple and
  * fast for the common case.
  */
-static struct task_slice
-csched_schedule(
-    const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
+static void csched_schedule(
+    const struct scheduler *ops, struct sched_unit *unit, s_time_t now,
+    bool tasklet_work_scheduled)
 {
     const unsigned int cpu = smp_processor_id();
     const unsigned int sched_cpu = sched_get_resource_cpu(cpu);
     struct list_head * const runq = RUNQ(sched_cpu);
-    struct sched_unit *unit = current->sched_unit;
     struct csched_unit * const scurr = CSCHED_UNIT(unit);
     struct csched_private *prv = CSCHED_PRIV(ops);
     struct csched_unit *snext;
-    struct task_slice ret;
     s_time_t runtime, tslice;
+    bool migrated = false;
 
     SCHED_STAT_CRANK(schedule);
     CSCHED_UNIT_CHECK(unit);
@@ -1924,7 +1923,6 @@  csched_schedule(
                         (unsigned char *)&d);
         }
 
-        ret.migrated = 0;
         goto out;
     }
     tslice = prv->tslice;
@@ -1942,7 +1940,6 @@  csched_schedule(
     }
 
     snext = __runq_elem(runq->next);
-    ret.migrated = 0;
 
     /* Tasklet work (which runs in idle UNIT context) overrides all else. */
     if ( tasklet_work_scheduled )
@@ -1968,7 +1965,7 @@  csched_schedule(
     if ( snext->pri > CSCHED_PRI_TS_OVER )
         __runq_remove(snext);
     else
-        snext = csched_load_balance(prv, sched_cpu, snext, &ret.migrated);
+        snext = csched_load_balance(prv, sched_cpu, snext, &migrated);
 
     /*
      * Update idlers mask if necessary. When we're idling, other CPUs
@@ -1991,12 +1988,12 @@  out:
     /*
      * Return task to run next...
      */
-    ret.time = (is_idle_unit(snext->unit) ?
+    unit->next_time = (is_idle_unit(snext->unit) ?
                 -1 : tslice);
-    ret.task = snext->unit;
+    unit->next_task = snext->unit;
+    snext->unit->migrated = migrated;
 
-    CSCHED_UNIT_CHECK(ret.task);
-    return ret;
+    CSCHED_UNIT_CHECK(unit->next_task);
 }
 
 static void
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 548b87af8b..98ef48d6f4 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -3444,19 +3444,18 @@  runq_candidate(struct csched2_runqueue_data *rqd,
  * This function is in the critical path. It is designed to be simple and
  * fast for the common case.
  */
-static struct task_slice
-csched2_schedule(
-    const struct scheduler *ops, s_time_t now, bool tasklet_work_scheduled)
+static void csched2_schedule(
+    const struct scheduler *ops, struct sched_unit *currunit, s_time_t now,
+    bool tasklet_work_scheduled)
 {
     const unsigned int cpu = smp_processor_id();
     const unsigned int sched_cpu = sched_get_resource_cpu(cpu);
     struct csched2_runqueue_data *rqd;
-    struct sched_unit *currunit = current->sched_unit;
     struct csched2_unit * const scurr = csched2_unit(currunit);
     struct csched2_unit *snext = NULL;
     unsigned int skipped_units = 0;
-    struct task_slice ret;
     bool tickled;
+    bool migrated = false;
 
     SCHED_STAT_CRANK(schedule);
     CSCHED2_UNIT_CHECK(currunit);
@@ -3541,8 +3540,6 @@  csched2_schedule(
          && unit_runnable(currunit) )
         __set_bit(__CSFLAG_delayed_runq_add, &scurr->flags);
 
-    ret.migrated = 0;
-
     /* Accounting for non-idle tasks */
     if ( !is_idle_unit(snext->unit) )
     {
@@ -3592,7 +3589,7 @@  csched2_schedule(
             snext->credit += CSCHED2_MIGRATE_COMPENSATION;
             sched_set_res(snext->unit, get_sched_res(sched_cpu));
             SCHED_STAT_CRANK(migrated);
-            ret.migrated = 1;
+            migrated = true;
         }
     }
     else
@@ -3623,11 +3620,11 @@  csched2_schedule(
     /*
      * Return task to run next...
      */
-    ret.time = csched2_runtime(ops, sched_cpu, snext, now);
-    ret.task = snext->unit;
+    currunit->next_time = csched2_runtime(ops, sched_cpu, snext, now);
+    currunit->next_task = snext->unit;
+    snext->unit->migrated = migrated;
 
-    CSCHED2_UNIT_CHECK(ret.task);
-    return ret;
+    CSCHED2_UNIT_CHECK(currunit->next_task);
 }
 
 static void
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 56ef078c5a..397edcbc83 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -779,16 +779,14 @@  static inline void null_unit_check(struct sched_unit *unit)
  *  - the unit assigned to the pCPU, if there's one and it can run;
  *  - the idle unit, otherwise.
  */
-static struct task_slice null_schedule(const struct scheduler *ops,
-                                       s_time_t now,
-                                       bool_t tasklet_work_scheduled)
+static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
+                          s_time_t now, bool tasklet_work_scheduled)
 {
     unsigned int bs;
     const unsigned int cpu = smp_processor_id();
     const unsigned int sched_cpu = sched_get_resource_cpu(cpu);
     struct null_private *prv = null_priv(ops);
     struct null_unit *wvc;
-    struct task_slice ret;
 
     SCHED_STAT_CRANK(schedule);
     NULL_UNIT_CHECK(current->sched_unit);
@@ -816,19 +814,18 @@  static struct task_slice null_schedule(const struct scheduler *ops,
     if ( tasklet_work_scheduled )
     {
         trace_var(TRC_SNULL_TASKLET, 1, 0, NULL);
-        ret.task = sched_idle_unit(sched_cpu);
+        prev->next_task = sched_idle_unit(sched_cpu);
     }
     else
-        ret.task = per_cpu(npc, sched_cpu).unit;
-    ret.migrated = 0;
-    ret.time = -1;
+        prev->next_task = per_cpu(npc, sched_cpu).unit;
+    prev->next_time = -1;
 
     /*
      * We may be new in the cpupool, or just coming back online. In which
      * case, there may be units in the waitqueue that we can assign to us
      * and run.
      */
-    if ( unlikely(ret.task == NULL) )
+    if ( unlikely(prev->next_task == NULL) )
     {
         spin_lock(&prv->waitq_lock);
 
@@ -854,7 +851,7 @@  static struct task_slice null_schedule(const struct scheduler *ops,
                 {
                     unit_assign(prv, wvc->unit, sched_cpu);
                     list_del_init(&wvc->waitq_elem);
-                    ret.task = wvc->unit;
+                    prev->next_task = wvc->unit;
                     goto unlock;
                 }
             }
@@ -862,15 +859,17 @@  static struct task_slice null_schedule(const struct scheduler *ops,
  unlock:
         spin_unlock(&prv->waitq_lock);
 
-        if ( ret.task == NULL && !cpumask_test_cpu(cpu, &prv->cpus_free) )
+        if ( prev->next_task == NULL &&
+             !cpumask_test_cpu(cpu, &prv->cpus_free) )
             cpumask_set_cpu(cpu, &prv->cpus_free);
     }
 
-    if ( unlikely(ret.task == NULL || !unit_runnable(ret.task)) )
-        ret.task = sched_idle_unit(sched_cpu);
+    if ( unlikely(prev->next_task == NULL || !unit_runnable(prev->next_task)) )
+        prev->next_task = sched_idle_unit(sched_cpu);
 
-    NULL_UNIT_CHECK(ret.task);
-    return ret;
+    NULL_UNIT_CHECK(prev->next_task);
+
+    prev->next_task->migrated = false;
 }
 
 static inline void dump_unit(struct null_private *prv, struct null_unit *nvc)
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 7b9d25f138..fcbfa528f4 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1054,16 +1054,16 @@  runq_pick(const struct scheduler *ops, const cpumask_t *mask)
  * schedule function for rt scheduler.
  * The lock is already grabbed in schedule.c, no need to lock here
  */
-static struct task_slice
-rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled)
+static void
+rt_schedule(const struct scheduler *ops, struct sched_unit *currunit,
+            s_time_t now, bool tasklet_work_scheduled)
 {
     const unsigned int cpu = smp_processor_id();
     const unsigned int sched_cpu = sched_get_resource_cpu(cpu);
     struct rt_private *prv = rt_priv(ops);
-    struct rt_unit *const scurr = rt_unit(current->sched_unit);
+    struct rt_unit *const scurr = rt_unit(currunit);
     struct rt_unit *snext = NULL;
-    struct task_slice ret = { .migrated = 0 };
-    struct sched_unit *currunit = current->sched_unit;
+    bool migrated = false;
 
     /* TRACE */
     {
@@ -1111,7 +1111,7 @@  rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
         __set_bit(__RTDS_delayed_runq_add, &scurr->flags);
 
     snext->last_start = now;
-    ret.time =  -1; /* if an idle unit is picked */
+    currunit->next_time =  -1; /* if an idle unit is picked */
     if ( !is_idle_unit(snext->unit) )
     {
         if ( snext != scurr )
@@ -1122,13 +1122,13 @@  rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
         if ( sched_unit_cpu(snext->unit) != sched_cpu )
         {
             sched_set_res(snext->unit, get_sched_res(sched_cpu));
-            ret.migrated = 1;
+            migrated = true;
         }
-        ret.time = snext->cur_budget; /* invoke the scheduler next time */
+        /* Invoke the scheduler next time. */
+        currunit->next_time = snext->cur_budget;
     }
-    ret.task = snext->unit;
-
-    return ret;
+    currunit->next_task = snext->unit;
+    snext->unit->migrated = migrated;
 }
 
 /*
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index e4d0dd4b65..d2fc89d983 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -105,15 +105,14 @@  sched_idle_free_vdata(const struct scheduler *ops, void *priv)
 {
 }
 
-static struct task_slice sched_idle_schedule(
-    const struct scheduler *ops, s_time_t now,
+static void sched_idle_schedule(
+    const struct scheduler *ops, struct sched_unit *unit, s_time_t now,
     bool tasklet_work_scheduled)
 {
     const unsigned int cpu = smp_processor_id();
-    struct task_slice ret = { .time = -1 };
 
-    ret.task = sched_idle_unit(sched_get_resource_cpu(cpu));
-    return ret;
+    unit->next_time = -1;
+    unit->next_task = sched_idle_unit(sched_get_resource_cpu(cpu));
 }
 
 static struct scheduler sched_idle_ops = {
@@ -1698,10 +1697,9 @@  static void schedule(void)
     s_time_t              now;
     struct scheduler     *sched;
     unsigned long        *tasklet_work = &this_cpu(tasklet_work_to_do);
-    bool_t                tasklet_work_scheduled = 0;
+    bool                  tasklet_work_scheduled = false;
     struct sched_resource *sd;
     spinlock_t           *lock;
-    struct task_slice     next_slice;
     int cpu = smp_processor_id();
 
     ASSERT_NOT_IN_ATOMIC();
@@ -1717,12 +1715,12 @@  static void schedule(void)
         set_bit(_TASKLET_scheduled, tasklet_work);
         /* fallthrough */
     case TASKLET_enqueued|TASKLET_scheduled:
-        tasklet_work_scheduled = 1;
+        tasklet_work_scheduled = true;
         break;
     case TASKLET_scheduled:
         clear_bit(_TASKLET_scheduled, tasklet_work);
     case 0:
-        /*tasklet_work_scheduled = 0;*/
+        /*tasklet_work_scheduled = false;*/
         break;
     default:
         BUG();
@@ -1736,14 +1734,14 @@  static void schedule(void)
 
     /* get policy-specific decision on scheduling... */
     sched = this_cpu(scheduler);
-    next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
+    sched->do_schedule(sched, prev, now, tasklet_work_scheduled);
 
-    next = next_slice.task;
+    next = prev->next_task;
 
     sd->curr = next;
 
-    if ( next_slice.time >= 0 ) /* -ve means no limit */
-        set_timer(&sd->s_timer, now + next_slice.time);
+    if ( prev->next_time >= 0 ) /* -ve means no limit */
+        set_timer(&sd->s_timer, now + prev->next_time);
 
     if ( unlikely(prev == next) )
     {
@@ -1751,9 +1749,7 @@  static void schedule(void)
         TRACE_4D(TRC_SCHED_SWITCH_INFCONT,
                  next->domain->domain_id, next->unit_id,
                  now - prev->state_entry_time,
-                 next_slice.time);
-        trace_continue_running(next->vcpu_list);
-        return continue_running(prev->vcpu_list);
+                 prev->next_time);
     }
 
     TRACE_3D(TRC_SCHED_SWITCH_INFPREV,
@@ -1763,7 +1759,7 @@  static void schedule(void)
              next->domain->domain_id, next->unit_id,
              (next->vcpu_list->runstate.state == RUNSTATE_runnable) ?
              (now - next->state_entry_time) : 0,
-             next_slice.time);
+             prev->next_time);
 
     ASSERT(prev->vcpu_list->runstate.state == RUNSTATE_running);
 
@@ -1792,7 +1788,7 @@  static void schedule(void)
 
     stop_timer(&prev->vcpu_list->periodic_timer);
 
-    if ( next_slice.migrated )
+    if ( next->migrated )
         vcpu_move_irqs(next->vcpu_list);
 
     vcpu_periodic_timer_work(next->vcpu_list);
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 1a3981e78a..ba91c3a680 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -195,12 +195,6 @@  static inline spinlock_t *pcpu_schedule_trylock(unsigned int cpu)
     return NULL;
 }
 
-struct task_slice {
-    struct sched_unit *task;
-    s_time_t           time;
-    bool_t             migrated;
-};
-
 struct scheduler {
     char *name;             /* full name for this scheduler      */
     char *opt_name;         /* option name for this scheduler    */
@@ -243,8 +237,9 @@  struct scheduler {
     void         (*context_saved)  (const struct scheduler *,
                                     struct sched_unit *);
 
-    struct task_slice (*do_schedule) (const struct scheduler *, s_time_t,
-                                      bool_t tasklet_work_scheduled);
+    void         (*do_schedule)    (const struct scheduler *,
+                                    struct sched_unit *, s_time_t,
+                                    bool tasklet_work_scheduled);
 
     struct sched_resource * (*pick_resource) (const struct scheduler *,
                                               struct sched_unit *);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 7f84b823cb..9a17962132 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -281,12 +281,18 @@  struct sched_unit {
     bool                   is_running;
     /* Does soft affinity actually play a role (given hard affinity)? */
     bool                   soft_aff_effective;
+    /* Item has been migrated to other cpu(s). */
+    bool                   migrated;
     /* Bitmask of CPUs on which this VCPU may run. */
     cpumask_var_t          cpu_hard_affinity;
     /* Used to save affinity during temporary pinning. */
     cpumask_var_t          cpu_hard_affinity_saved;
     /* Bitmask of CPUs on which this VCPU prefers to run. */
     cpumask_var_t          cpu_soft_affinity;
+
+    /* Next unit to run. */
+    struct sched_unit      *next_task;
+    s_time_t                next_time;
 };
 
 #define for_each_sched_unit(d, e)                                         \