[RFC,08/49] xen/sched: use new sched_item instead of vcpu in scheduler interfaces

Message ID 20190329150934.17694-9-jgross@suse.com (mailing list archive)
State Superseded
Series xen: add core scheduling support

Commit Message

Jürgen Groß March 29, 2019, 3:08 p.m. UTC
In order to prepare for core- and socket-scheduling, use a new struct
sched_item instead of struct vcpu in the interfaces of the different
schedulers.

Rename the per-scheduler functions insert_vcpu and remove_vcpu to
insert_item and remove_item to reflect the changed parameter. In the
schedulers, rename the local functions that were switched to
sched_item accordingly.

For now the new struct contains only a vcpu pointer and is allocated
on the stack. This will be changed later.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/sched_arinc653.c | 30 +++++++++++++++---------
 xen/common/sched_credit.c   | 41 +++++++++++++++++++-------------
 xen/common/sched_credit2.c  | 57 +++++++++++++++++++++++++++------------------
 xen/common/sched_null.c     | 39 ++++++++++++++++++++-----------
 xen/common/sched_rt.c       | 33 +++++++++++++++-----------
 xen/common/schedule.c       | 53 ++++++++++++++++++++++++++++-------------
 xen/include/xen/sched-if.h  | 40 ++++++++++++++++++++-----------
 7 files changed, 187 insertions(+), 106 deletions(-)
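
The transitional pattern is the same at every call site touched by the
patch: wrap the vcpu in an on-stack sched_item and hand that to the
scheduler hook, e.g. (excerpt from the schedule.c hunk below):

    struct sched_item item = { .vcpu = v };   /* on-stack wrapper, for now */

    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, &item,
                             d->sched_priv);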

Comments

Andrew Cooper March 29, 2019, 6:42 p.m. UTC | #1
On 29/03/2019 15:08, Juergen Gross wrote:
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 6b5d454630..d1a958143a 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -256,6 +256,7 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
>  int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>  {
>      struct domain *d = v->domain;
> +    struct sched_item item = { .vcpu = v };
>  
>      v->processor = processor;
>  
> @@ -267,7 +268,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>      init_timer(&v->poll_timer, poll_timer_fn,
>                 v, v->processor);
>  
> -    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, v,
> +    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, &item,
>                       d->sched_priv);

I realise this is perhaps an over-the-top request, but can we see about
doing more here?

SCHED_OP() is a thoroughly objectionable piece of obfuscation, which
breaks cscope/ctags and also results in especially poor code generation.

Given that we are changing the interface anyway and touching all
codepaths, would you mind also adding static inline wrappers like I
started with 340edc3?

TBH, I'm even happy to give this a go and give you back the
resulting tree, if you'd prefer.

~Andrew
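
For reference, a wrapper in the style of 340edc3 might look roughly like
the following sketch (illustrative only, not code from this series; it
keeps the NULL-hook check that SCHED_OP() performs):

    static inline void sched_insert_item(const struct scheduler *s,
                                         struct sched_item *item)
    {
        /* Call the hook directly, but only if the scheduler provides one. */
        if ( s->insert_item )
            s->insert_item(s, item);
    }

Call sites would then read sched_insert_item(sched, item) instead of
SCHED_OP(sched, insert_item, item), which cscope/ctags can follow.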
Jürgen Groß March 30, 2019, 10:24 a.m. UTC | #2
On 29/03/2019 19:42, Andrew Cooper wrote:
> On 29/03/2019 15:08, Juergen Gross wrote:
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index 6b5d454630..d1a958143a 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -256,6 +256,7 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
>>  int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>>  {
>>      struct domain *d = v->domain;
>> +    struct sched_item item = { .vcpu = v };
>>  
>>      v->processor = processor;
>>  
>> @@ -267,7 +268,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>>      init_timer(&v->poll_timer, poll_timer_fn,
>>                 v, v->processor);
>>  
>> -    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, v,
>> +    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, &item,
>>                       d->sched_priv);
> 
> I realise this is perhaps an over-the-top request, but can we see about
> doing more here?
> 
> SCHED_OP() is a thoroughly objectionable piece of obfuscation, which
> breaks cscope/ctags and also results in especially poor code generation.
> 
> Given that we are changing the interface anyway and touching all
> codepaths, would you mind also adding static inline wrappers like I
> started with 340edc3?

Okay, I'll do that.

> TBH, I'm even happy to give this a go and give you back the
> resulting tree, if you'd prefer.

I think it's easier to do it myself, as I'm touching nearly all of
the call sites anyway.


Juergen
Jürgen Groß April 1, 2019, 6:06 a.m. UTC | #3
On 30/03/2019 11:24, Juergen Gross wrote:
> On 29/03/2019 19:42, Andrew Cooper wrote:
>> On 29/03/2019 15:08, Juergen Gross wrote:
>>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>>> index 6b5d454630..d1a958143a 100644
>>> --- a/xen/common/schedule.c
>>> +++ b/xen/common/schedule.c
>>> @@ -256,6 +256,7 @@ static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
>>>  int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>>>  {
>>>      struct domain *d = v->domain;
>>> +    struct sched_item item = { .vcpu = v };
>>>  
>>>      v->processor = processor;
>>>  
>>> @@ -267,7 +268,7 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
>>>      init_timer(&v->poll_timer, poll_timer_fn,
>>>                 v, v->processor);
>>>  
>>> -    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, v,
>>> +    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, &item,
>>>                       d->sched_priv);
>>
>> I realise this is perhaps an over-the-top request, but can we see about
>> doing more here?
>>
>> SCHED_OP() is a thoroughly objectionable piece of obfuscation, which
>> breaks cscope/ctags and also results in especially poor code generation.
>>
>> Given that we are changing the interface anyway and touching all
>> codepaths, would you mind also adding static inline wrappers like I
>> started with 340edc3?
> 
> Okay, I'll do that.
> 
>> TBH, I'm even happy to give this a go and give you back the
>> resulting tree, if you'd prefer.
> 
> I think it's easier to do it myself, as I'm touching nearly all of
> the call sites anyway.

And another thought I had: with RETPOLINE, indirect jumps are even more
expensive. Would it be a good idea to remove the function pointers from
struct scheduler and generate the inline wrappers at build time? The
wrappers could then call the related specific scheduler function based
on the scheduler ID using a chain of if ... else if ... statements. They
would prefer the default scheduler over the others and test only for
configured schedulers. Scheduler registration could be done the same way,
removing the need for an extra link section.


Juergen
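
A build-time-generated wrapper of that shape might look like the
following sketch (the SCHED_ID_* constants and the exported
per-scheduler functions are hypothetical):

    /* Dispatch on a scheduler ID instead of a function pointer,
     * testing the (configured) default scheduler first. */
    static inline void sched_wake(const struct scheduler *s,
                                  struct sched_item *item)
    {
        if ( s->sched_id == SCHED_ID_CREDIT2 )      /* default scheduler */
            csched2_item_wake(s, item);
        else if ( s->sched_id == SCHED_ID_CREDIT )
            csched_item_wake(s, item);
        else if ( s->sched_id == SCHED_ID_RTDS )
            rt_item_wake(s, item);
        /* only schedulers enabled in the build get a branch */
    }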
Dario Faggioli April 1, 2019, 7:05 a.m. UTC | #4
On Mon, 2019-04-01 at 08:06 +0200, Juergen Gross wrote:
> On 30/03/2019 11:24, Juergen Gross wrote:
> > I think it's easier to do it myself, as I'm touching nearly all of
> > the call sites anyway.
> 
> And another thought I had: with RETPOLINE, indirect jumps are even more
> expensive. Would it be a good idea to remove the function pointers from
> struct scheduler and generate the inline wrappers at build time?
>
Yep, I was thinking about doing something like that already,
independently from this feature/series.

At least something that special-cases the configured default scheduler,
and lets its hooks be called without indirect jumps (i.e., similarly to
what's being done in Linux, in quite a few places, these days).

> The wrappers could then call the related specific scheduler function based
> on the scheduler ID using a chain of if ... else if ... statements.
>
I guess we'd have to see how the final code will look, but I like the
idea, and I think it's well worth a try.

Regards,
Dario
Andrew Cooper April 1, 2019, 8:19 a.m. UTC | #5
On 01/04/2019 08:05, Dario Faggioli wrote:
> On Mon, 2019-04-01 at 08:06 +0200, Juergen Gross wrote:
>> On 30/03/2019 11:24, Juergen Gross wrote:
>>> I think it's easier to do it myself, as I'm touching nearly all of
>>> the call sites anyway.
>> And another thought I had: with RETPOLINE, indirect jumps are even more
>> expensive. Would it be a good idea to remove the function pointers from
>> struct scheduler and generate the inline wrappers at build time?
>>
> Yep, I was thinking about doing something like that already,
> independently from this feature/series.
>
> At least something that special-cases the configured default scheduler,
> and lets its hooks be called without indirect jumps (i.e., similarly to
> what's being done in Linux, in quite a few places, these days).
>
>> The wrappers could then call the related specific scheduler function based
>> on the scheduler ID using a chain of if ... else if ... statements.
>>
> I guess we'd have to see how the final code will look, but I like the
> idea, and I think it's well worth a try.

Jan has a series in progress which does some manual devirtualisation
across Xen.

The scheduler is harder though - we've got the default scheduler, which
is overwhelmingly likely to be the target of the call, but that's not
always guaranteed.

Normally, the result is put together with PGO rather than manually,
because the effects are quite subtle.

The base case which might be good enough for Xen is:

if ( sched == default )
    sched_foo();
else
    sched->foo();

which for the common case of the default cpupool only, or multiple
groups with the same scheduler, will always take the direct path rather
than the indirect path.

Beyond that, the best length of the if/else chain can only reasonably be
determined with profiling.  It depends on the relative frequencies of
each call, and blindly doing an if/else chain to the end of the
scheduler list will probably give worse performance, if you're using the
final scheduler, than using a retpoline would.  Furthermore, on future
fixed hardware, using indirect calls will become the quicker option again.

I think it's useful to consider potential optimisations, but I'd advise
against trying to merge everything into this series.

~Andrew
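
Fleshed out, that base case amounts to something like the following
sketch (assuming the default scheduler instance and its hook were made
callable directly; the names follow this patch but are illustrative):

    /* Devirtualise only the build-time default scheduler; everything
     * else keeps the indirect (retpoline-protected) call. */
    static inline int sched_pick_cpu(const struct scheduler *s,
                                     struct sched_item *item)
    {
        if ( s == &sched_credit2_def )            /* direct call */
            return csched2_cpu_pick(s, item);
        return s->pick_cpu(s, item);              /* indirect call */
    }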
Jürgen Groß April 1, 2019, 8:49 a.m. UTC | #6
On 01/04/2019 10:19, Andrew Cooper wrote:
> On 01/04/2019 08:05, Dario Faggioli wrote:
>> On Mon, 2019-04-01 at 08:06 +0200, Juergen Gross wrote:
>>> On 30/03/2019 11:24, Juergen Gross wrote:
>>>> I think it's easier to do it myself, as I'm touching nearly all of
>>>> the call sites anyway.
>>> And another thought I had: with RETPOLINE, indirect jumps are even more
>>> expensive. Would it be a good idea to remove the function pointers from
>>> struct scheduler and generate the inline wrappers at build time?
>>>
>> Yep, I was thinking about doing something like that already,
>> independently from this feature/series.
>>
>> At least something that special-cases the configured default scheduler,
>> and lets its hooks be called without indirect jumps (i.e., similarly to
>> what's being done in Linux, in quite a few places, these days).
>>
>>> The wrappers could then call the related specific scheduler function based
>>> on the scheduler ID using a chain of if ... else if ... statements.
>>>
>> I guess we'd have to see how the final code will look, but I like the
>> idea, and I think it's well worth a try.
> 
> Jan has a series in progress which does some manual devirtualisation
> across Xen.
> 
> The scheduler is harder though - we've got the default scheduler, which
> is overwhelmingly likely to be the target of the call, but that's not
> always guaranteed.
> 
> Normally, the result is put together with PGO rather than manually,
> because the effects are quite subtle.
> 
> The base case which might be good enough for Xen is:
> 
> if ( sched == default )
>     sched_foo();
> else
>     sched->foo();
> 
> which for the common case of the default cpupool only, or multiple
> groups with the same scheduler, will always take the direct path rather
> than the indirect path.
> 
> Beyond that, the best length of the if/else chain can only reasonably be
> determined with profiling.  It depends on the relative frequencies of
> each call, and blindly doing an if/else chain to the end of the
> scheduler list will probably give worse performance, if you're using the
> final scheduler, than using a retpoline would.  Furthermore, on future
> fixed hardware, using indirect calls will become the quicker option again.
> 
> I think it's useful to consider potential optimisations, but I'd advise
> against trying to merge everything into this series.

Fine with me.


Juergen
Dario Faggioli April 1, 2019, 3:15 p.m. UTC | #7
On Mon, 2019-04-01 at 09:19 +0100, Andrew Cooper wrote:
> On 01/04/2019 08:05, Dario Faggioli wrote:
> > On Mon, 2019-04-01 at 08:06 +0200, Juergen Gross wrote:
> > > The wrappers could then call the related specific scheduler function
> > > based on the scheduler ID using a chain of if ... else if ... statements.
> > > 
> > I guess we'd have to see how the final code will look, but I like the
> > idea, and I think it's well worth a try.
> 
> Normally, the result is put together with PGO rather than manually,
> because the effects are quite subtle.
> 
> The base case which might be good enough for Xen is:
> 
> if ( sched == default )
>     sched_foo();
> else
>     sched->foo();
> 
Yep, and this was exactly what I had in mind, before a full 'if..else'
chain was mentioned here. And if that's as far as it's sane to get, I'm
fine with that.

> which for the common case of the default cpupool only, or multiple
> groups with the same scheduler, will always take the direct path rather
> than the indirect path.
> 
Yeah, and as far as I've been seeing, using the default scheduler and
pretty much ignoring cpupools is common enough (and I'm not saying
that's too great a thing! :-/)

> Beyond that, the best length of the if/else chain can only reasonably be
> determined with profiling.  It depends on the relative frequencies of
> each call, and blindly doing an if/else chain to the end of the
> scheduler list will probably give worse performance, if you're using the
> final scheduler, than using a retpoline would.
>
Yeah, makes sense.

And anyway...

> I think it's useful to consider potential optimisations, but I'd advise
> against trying to merge everything into this series.
> 
...yes, let's keep this for later.

Regards,
Dario

Patch

diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index a4c6d00b81..fffe23113e 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -376,13 +376,16 @@  a653sched_deinit(struct scheduler *ops)
  * This function allocates scheduler-specific data for a VCPU
  *
  * @param ops       Pointer to this instance of the scheduler structure
+ * @param item      Pointer to struct sched_item
  *
  * @return          Pointer to the allocated data
  */
 static void *
-a653sched_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
+a653sched_alloc_vdata(const struct scheduler *ops, struct sched_item *item,
+                      void *dd)
 {
     a653sched_priv_t *sched_priv = SCHED_PRIV(ops);
+    struct vcpu *vc = item->vcpu;
     arinc653_vcpu_t *svc;
     unsigned int entry;
     unsigned long flags;
@@ -458,11 +461,13 @@  a653sched_free_vdata(const struct scheduler *ops, void *priv)
  * Xen scheduler callback function to sleep a VCPU
  *
  * @param ops       Pointer to this instance of the scheduler structure
- * @param vc        Pointer to the VCPU structure for the current domain
+ * @param item      Pointer to struct sched_item
  */
 static void
-a653sched_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
+a653sched_item_sleep(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
+
     if ( AVCPU(vc) != NULL )
         AVCPU(vc)->awake = 0;
 
@@ -478,11 +483,13 @@  a653sched_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
  * Xen scheduler callback function to wake up a VCPU
  *
  * @param ops       Pointer to this instance of the scheduler structure
- * @param vc        Pointer to the VCPU structure for the current domain
+ * @param item      Pointer to struct sched_item
  */
 static void
-a653sched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
+a653sched_item_wake(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
+
     if ( AVCPU(vc) != NULL )
         AVCPU(vc)->awake = 1;
 
@@ -597,13 +604,14 @@  a653sched_do_schedule(
  * Xen scheduler callback function to select a CPU for the VCPU to run on
  *
  * @param ops       Pointer to this instance of the scheduler structure
- * @param v         Pointer to the VCPU structure for the current domain
+ * @param item      Pointer to struct sched_item
  *
  * @return          Number of selected physical CPU
  */
 static int
-a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc)
+a653sched_pick_cpu(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     cpumask_t *online;
     unsigned int cpu;
 
@@ -712,11 +720,11 @@  static const struct scheduler sched_arinc653_def = {
     .free_vdata     = a653sched_free_vdata,
     .alloc_vdata    = a653sched_alloc_vdata,
 
-    .insert_vcpu    = NULL,
-    .remove_vcpu    = NULL,
+    .insert_item    = NULL,
+    .remove_item    = NULL,
 
-    .sleep          = a653sched_vcpu_sleep,
-    .wake           = a653sched_vcpu_wake,
+    .sleep          = a653sched_item_sleep,
+    .wake           = a653sched_item_wake,
     .yield          = NULL,
     .context_saved  = NULL,
 
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 3abe20def8..3735486b4c 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -868,15 +868,16 @@  _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
 }
 
 static int
-csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+csched_cpu_pick(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu *svc = CSCHED_VCPU(vc);
 
     /*
      * We have been called by vcpu_migrate() (in schedule.c), as part
      * of the process of seeing if vc can be migrated to another pcpu.
      * We make a note about this in svc->flags so that later, in
-     * csched_vcpu_wake() (still called from vcpu_migrate()) we won't
+     * csched_item_wake() (still called from vcpu_migrate()) we won't
      * get boosted, which we don't deserve as we are "only" migrating.
      */
     set_bit(CSCHED_FLAG_VCPU_MIGRATING, &svc->flags);
@@ -1004,8 +1005,10 @@  csched_vcpu_acct(struct csched_private *prv, unsigned int cpu)
 }
 
 static void *
-csched_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
+csched_alloc_vdata(const struct scheduler *ops, struct sched_item *item,
+                   void *dd)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu *svc;
 
     /* Allocate per-VCPU info */
@@ -1025,8 +1028,9 @@  csched_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
 }
 
 static void
-csched_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
+csched_item_insert(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu *svc = vc->sched_priv;
     spinlock_t *lock;
 
@@ -1035,7 +1039,7 @@  csched_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
     /* csched_cpu_pick() looks in vc->processor's runq, so we need the lock. */
     lock = vcpu_schedule_lock_irq(vc);
 
-    vc->processor = csched_cpu_pick(ops, vc);
+    vc->processor = csched_cpu_pick(ops, item);
 
     spin_unlock_irq(lock);
 
@@ -1060,9 +1064,10 @@  csched_free_vdata(const struct scheduler *ops, void *priv)
 }
 
 static void
-csched_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
+csched_item_remove(const struct scheduler *ops, struct sched_item *item)
 {
     struct csched_private *prv = CSCHED_PRIV(ops);
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
     struct csched_dom * const sdom = svc->sdom;
 
@@ -1087,8 +1092,9 @@  csched_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
 }
 
 static void
-csched_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
+csched_item_sleep(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
     unsigned int cpu = vc->processor;
 
@@ -1111,8 +1117,9 @@  csched_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
 }
 
 static void
-csched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
+csched_item_wake(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
     bool_t migrating;
 
@@ -1172,8 +1179,9 @@  csched_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
 }
 
 static void
-csched_vcpu_yield(const struct scheduler *ops, struct vcpu *vc)
+csched_item_yield(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched_vcpu * const svc = CSCHED_VCPU(vc);
 
     /* Let the scheduler know that this vcpu is trying to yield */
@@ -1226,9 +1234,10 @@  csched_dom_cntl(
 }
 
 static void
-csched_aff_cntl(const struct scheduler *ops, struct vcpu *v,
+csched_aff_cntl(const struct scheduler *ops, struct sched_item *item,
                 const cpumask_t *hard, const cpumask_t *soft)
 {
+    struct vcpu *v = item->vcpu;
     struct csched_vcpu *svc = CSCHED_VCPU(v);
 
     if ( !hard )
@@ -1756,7 +1765,7 @@  csched_load_balance(struct csched_private *prv, int cpu,
                  * - if we race with inc_nr_runnable(), we skip a pCPU that may
                  *   have runnable vCPUs in its runqueue, but that's not a
                  *   problem because:
-                 *   + if racing with csched_vcpu_insert() or csched_vcpu_wake(),
+                 *   + if racing with csched_item_insert() or csched_item_wake(),
                  *     __runq_tickle() will be called afterwords, so the vCPU
                  *     won't get stuck in the runqueue for too long;
                  *   + if racing with csched_runq_steal(), it may be that a
@@ -2268,12 +2277,12 @@  static const struct scheduler sched_credit_def = {
 
     .global_init    = csched_global_init,
 
-    .insert_vcpu    = csched_vcpu_insert,
-    .remove_vcpu    = csched_vcpu_remove,
+    .insert_item    = csched_item_insert,
+    .remove_item    = csched_item_remove,
 
-    .sleep          = csched_vcpu_sleep,
-    .wake           = csched_vcpu_wake,
-    .yield          = csched_vcpu_yield,
+    .sleep          = csched_item_sleep,
+    .wake           = csched_item_wake,
+    .yield          = csched_item_yield,
 
     .adjust         = csched_dom_cntl,
     .adjust_affinity= csched_aff_cntl,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 6958b265fc..f44286c2a5 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -273,7 +273,7 @@ 
  * CSFLAG_delayed_runq_add: Do we need to add this to the runqueue once it'd done
  * being context switched out?
  * + Set when scheduling out in csched2_schedule() if prev is runnable
- * + Set in csched2_vcpu_wake if it finds CSFLAG_scheduled set
+ * + Set in csched2_item_wake if it finds CSFLAG_scheduled set
  * + Read in csched2_context_saved().  If set, it adds prev to the runqueue and
  *   clears the bit.
  */
@@ -623,14 +623,14 @@  static inline bool has_cap(const struct csched2_vcpu *svc)
  * This logic is entirely implemented in runq_tickle(), and that is enough.
  * In fact, in this scheduler, placement of a vcpu on one of the pcpus of a
  * runq, _always_ happens by means of tickling:
- *  - when a vcpu wakes up, it calls csched2_vcpu_wake(), which calls
+ *  - when a vcpu wakes up, it calls csched2_item_wake(), which calls
  *    runq_tickle();
  *  - when a migration is initiated in schedule.c, we call csched2_cpu_pick(),
- *    csched2_vcpu_migrate() (which calls migrate()) and csched2_vcpu_wake().
+ *    csched2_item_migrate() (which calls migrate()) and csched2_item_wake().
  *    csched2_cpu_pick() looks for the least loaded runq and return just any
- *    of its processors. Then, csched2_vcpu_migrate() just moves the vcpu to
+ *    of its processors. Then, csched2_item_migrate() just moves the vcpu to
  *    the chosen runq, and it is again runq_tickle(), called by
- *    csched2_vcpu_wake() that actually decides what pcpu to use within the
+ *    csched2_item_wake() that actually decides what pcpu to use within the
  *    chosen runq;
  *  - when a migration is initiated in sched_credit2.c, by calling  migrate()
  *    directly, that again temporarily use a random pcpu from the new runq,
@@ -2026,8 +2026,10 @@  csched2_vcpu_check(struct vcpu *vc)
 #endif
 
 static void *
-csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
+csched2_alloc_vdata(const struct scheduler *ops, struct sched_item *item,
+                    void *dd)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu *svc;
 
     /* Allocate per-VCPU info */
@@ -2069,8 +2071,9 @@  csched2_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
 }
 
 static void
-csched2_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
+csched2_item_sleep(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu * const svc = csched2_vcpu(vc);
 
     ASSERT(!is_idle_vcpu(vc));
@@ -2091,8 +2094,9 @@  csched2_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
 }
 
 static void
-csched2_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
+csched2_item_wake(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu * const svc = csched2_vcpu(vc);
     unsigned int cpu = vc->processor;
     s_time_t now;
@@ -2146,16 +2150,18 @@  out:
 }
 
 static void
-csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v)
+csched2_item_yield(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
     struct csched2_vcpu * const svc = csched2_vcpu(v);
 
     __set_bit(__CSFLAG_vcpu_yield, &svc->flags);
 }
 
 static void
-csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
+csched2_context_saved(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu * const svc = csched2_vcpu(vc);
     spinlock_t *lock = vcpu_schedule_lock_irq(vc);
     s_time_t now = NOW();
@@ -2196,9 +2202,10 @@  csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
 
 #define MAX_LOAD (STIME_MAX)
 static int
-csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+csched2_cpu_pick(const struct scheduler *ops, struct sched_item *item)
 {
     struct csched2_private *prv = csched2_priv(ops);
+    struct vcpu *vc = item->vcpu;
     int i, min_rqi = -1, min_s_rqi = -1;
     unsigned int new_cpu, cpu = vc->processor;
     struct csched2_vcpu *svc = csched2_vcpu(vc);
@@ -2733,9 +2740,10 @@  retry:
 }
 
 static void
-csched2_vcpu_migrate(
-    const struct scheduler *ops, struct vcpu *vc, unsigned int new_cpu)
+csched2_item_migrate(
+    const struct scheduler *ops, struct sched_item *item, unsigned int new_cpu)
 {
+    struct vcpu *vc = item->vcpu;
     struct domain *d = vc->domain;
     struct csched2_vcpu * const svc = csched2_vcpu(vc);
     struct csched2_runqueue_data *trqd;
@@ -2996,9 +3004,10 @@  csched2_dom_cntl(
 }
 
 static void
-csched2_aff_cntl(const struct scheduler *ops, struct vcpu *v,
+csched2_aff_cntl(const struct scheduler *ops, struct sched_item *item,
                  const cpumask_t *hard, const cpumask_t *soft)
 {
+    struct vcpu *v = item->vcpu;
     struct csched2_vcpu *svc = csched2_vcpu(v);
 
     if ( !hard )
@@ -3096,8 +3105,9 @@  csched2_free_domdata(const struct scheduler *ops, void *data)
 }
 
 static void
-csched2_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
+csched2_item_insert(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu *svc = vc->sched_priv;
     struct csched2_dom * const sdom = svc->sdom;
     spinlock_t *lock;
@@ -3108,7 +3118,7 @@  csched2_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
     /* csched2_cpu_pick() expects the pcpu lock to be held */
     lock = vcpu_schedule_lock_irq(vc);
 
-    vc->processor = csched2_cpu_pick(ops, vc);
+    vc->processor = csched2_cpu_pick(ops, item);
 
     spin_unlock_irq(lock);
 
@@ -3135,8 +3145,9 @@  csched2_free_vdata(const struct scheduler *ops, void *priv)
 }
 
 static void
-csched2_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
+csched2_item_remove(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct csched2_vcpu * const svc = csched2_vcpu(vc);
     spinlock_t *lock;
 
@@ -4084,19 +4095,19 @@  static const struct scheduler sched_credit2_def = {
 
     .global_init    = csched2_global_init,
 
-    .insert_vcpu    = csched2_vcpu_insert,
-    .remove_vcpu    = csched2_vcpu_remove,
+    .insert_item    = csched2_item_insert,
+    .remove_item    = csched2_item_remove,
 
-    .sleep          = csched2_vcpu_sleep,
-    .wake           = csched2_vcpu_wake,
-    .yield          = csched2_vcpu_yield,
+    .sleep          = csched2_item_sleep,
+    .wake           = csched2_item_wake,
+    .yield          = csched2_item_yield,
 
     .adjust         = csched2_dom_cntl,
     .adjust_affinity= csched2_aff_cntl,
     .adjust_global  = csched2_sys_cntl,
 
     .pick_cpu       = csched2_cpu_pick,
-    .migrate        = csched2_vcpu_migrate,
+    .migrate        = csched2_item_migrate,
     .do_schedule    = csched2_schedule,
     .context_saved  = csched2_context_saved,
 
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index a59dbb2692..7b508f35a4 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -194,8 +194,9 @@  static void null_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 }
 
 static void *null_alloc_vdata(const struct scheduler *ops,
-                              struct vcpu *v, void *dd)
+                              struct sched_item *item, void *dd)
 {
+    struct vcpu *v = item->vcpu;
     struct null_vcpu *nvc;
 
     nvc = xzalloc(struct null_vcpu);
@@ -413,8 +414,10 @@  static void null_switch_sched(struct scheduler *new_ops, unsigned int cpu,
     sd->schedule_lock = &sd->_lock;
 }
 
-static void null_vcpu_insert(const struct scheduler *ops, struct vcpu *v)
+static void null_item_insert(const struct scheduler *ops,
+                             struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
     struct null_private *prv = null_priv(ops);
     struct null_vcpu *nvc = null_vcpu(v);
     unsigned int cpu;
@@ -505,8 +508,10 @@  static void _vcpu_remove(struct null_private *prv, struct vcpu *v)
     spin_unlock(&prv->waitq_lock);
 }
 
-static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
+static void null_item_remove(const struct scheduler *ops,
+                             struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
     struct null_private *prv = null_priv(ops);
     struct null_vcpu *nvc = null_vcpu(v);
     spinlock_t *lock;
@@ -536,8 +541,11 @@  static void null_vcpu_remove(const struct scheduler *ops, struct vcpu *v)
     SCHED_STAT_CRANK(vcpu_remove);
 }
 
-static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
+static void null_item_wake(const struct scheduler *ops,
+                           struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
+
     ASSERT(!is_idle_vcpu(v));
 
     if ( unlikely(curr_on_cpu(v->processor) == v) )
@@ -562,8 +570,11 @@  static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
     cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ);
 }
 
-static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
+static void null_item_sleep(const struct scheduler *ops,
+                            struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
+
     ASSERT(!is_idle_vcpu(v));
 
     /* If v is not assigned to a pCPU, or is not running, no need to bother */
@@ -573,15 +584,17 @@  static void null_vcpu_sleep(const struct scheduler *ops, struct vcpu *v)
     SCHED_STAT_CRANK(vcpu_sleep);
 }
 
-static int null_cpu_pick(const struct scheduler *ops, struct vcpu *v)
+static int null_cpu_pick(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *v = item->vcpu;
     ASSERT(!is_idle_vcpu(v));
     return pick_cpu(null_priv(ops), v);
 }
 
-static void null_vcpu_migrate(const struct scheduler *ops, struct vcpu *v,
-                              unsigned int new_cpu)
+static void null_item_migrate(const struct scheduler *ops,
+                              struct sched_item *item, unsigned int new_cpu)
 {
+    struct vcpu *v = item->vcpu;
     struct null_private *prv = null_priv(ops);
     struct null_vcpu *nvc = null_vcpu(v);
 
@@ -888,13 +901,13 @@  const struct scheduler sched_null_def = {
     .alloc_domdata  = null_alloc_domdata,
     .free_domdata   = null_free_domdata,
 
-    .insert_vcpu    = null_vcpu_insert,
-    .remove_vcpu    = null_vcpu_remove,
+    .insert_item    = null_item_insert,
+    .remove_item    = null_item_remove,
 
-    .wake           = null_vcpu_wake,
-    .sleep          = null_vcpu_sleep,
+    .wake           = null_item_wake,
+    .sleep          = null_item_sleep,
     .pick_cpu       = null_cpu_pick,
-    .migrate        = null_vcpu_migrate,
+    .migrate        = null_item_migrate,
     .do_schedule    = null_schedule,
 
     .dump_cpu_state = null_dump_pcpu,
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index f1b81f0373..ab8fa02306 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -136,7 +136,7 @@ 
  * RTDS_delayed_runq_add: Do we need to add this to the RunQ/DepletedQ
  * once it's done being context switching out?
  * + Set when scheduling out in rt_schedule() if prev is runable
- * + Set in rt_vcpu_wake if it finds RTDS_scheduled set
+ * + Set in rt_item_wake if it finds RTDS_scheduled set
  * + Read in rt_context_saved(). If set, it adds prev to the Runqueue/DepletedQ
  *   and clears the bit.
  */
@@ -637,8 +637,9 @@  replq_reinsert(const struct scheduler *ops, struct rt_vcpu *svc)
  * and available cpus
  */
 static int
-rt_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
+rt_cpu_pick(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     cpumask_t cpus;
     cpumask_t *online;
     int cpu;
@@ -846,8 +847,9 @@  rt_free_domdata(const struct scheduler *ops, void *data)
 }
 
 static void *
-rt_alloc_vdata(const struct scheduler *ops, struct vcpu *vc, void *dd)
+rt_alloc_vdata(const struct scheduler *ops, struct sched_item *item, void *dd)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu *svc;
 
     /* Allocate per-VCPU info */
@@ -889,8 +891,9 @@  rt_free_vdata(const struct scheduler *ops, void *priv)
  * dest. cpupool.
  */
 static void
-rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
+rt_item_insert(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu *svc = rt_vcpu(vc);
     s_time_t now;
     spinlock_t *lock;
@@ -898,7 +901,7 @@  rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
     BUG_ON( is_idle_vcpu(vc) );
 
     /* This is safe because vc isn't yet being scheduled */
-    vc->processor = rt_cpu_pick(ops, vc);
+    vc->processor = rt_cpu_pick(ops, item);
 
     lock = vcpu_schedule_lock_irq(vc);
 
@@ -922,8 +925,9 @@  rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
  * Remove rt_vcpu svc from the old scheduler in source cpupool.
  */
 static void
-rt_vcpu_remove(const struct scheduler *ops, struct vcpu *vc)
+rt_item_remove(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu * const svc = rt_vcpu(vc);
     struct rt_dom * const sdom = svc->sdom;
     spinlock_t *lock;
@@ -1142,8 +1146,9 @@  rt_schedule(const struct scheduler *ops, s_time_t now, bool_t tasklet_work_sched
  * The lock is already grabbed in schedule.c, no need to lock here
  */
 static void
-rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
+rt_item_sleep(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu * const svc = rt_vcpu(vc);
 
     BUG_ON( is_idle_vcpu(vc) );
@@ -1257,8 +1262,9 @@  runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
  * TODO: what if these two vcpus belongs to the same domain?
  */
 static void
-rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
+rt_item_wake(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu * const svc = rt_vcpu(vc);
     s_time_t now;
     bool_t missed;
@@ -1327,8 +1333,9 @@  rt_vcpu_wake(const struct scheduler *ops, struct vcpu *vc)
  * and then pick the highest priority vcpu from runq to run
  */
 static void
-rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
+rt_context_saved(const struct scheduler *ops, struct sched_item *item)
 {
+    struct vcpu *vc = item->vcpu;
     struct rt_vcpu *svc = rt_vcpu(vc);
     spinlock_t *lock = vcpu_schedule_lock_irq(vc);
 
@@ -1557,15 +1564,15 @@  static const struct scheduler sched_rtds_def = {
     .free_domdata   = rt_free_domdata,
     .alloc_vdata    = rt_alloc_vdata,
     .free_vdata     = rt_free_vdata,
-    .insert_vcpu    = rt_vcpu_insert,
-    .remove_vcpu    = rt_vcpu_remove,
+    .insert_item    = rt_item_insert,
+    .remove_item    = rt_item_remove,
 
     .adjust         = rt_dom_cntl,
 
     .pick_cpu       = rt_cpu_pick,
     .do_schedule    = rt_schedule,
-    .sleep          = rt_vcpu_sleep,
-    .wake           = rt_vcpu_wake,
+    .sleep          = rt_item_sleep,
+    .wake           = rt_item_wake,
     .context_saved  = rt_context_saved,
 };
 
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 6b5d454630..d1a958143a 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -256,6 +256,7 @@  static void sched_spin_unlock_double(spinlock_t *lock1, spinlock_t *lock2,
 int sched_init_vcpu(struct vcpu *v, unsigned int processor)
 {
     struct domain *d = v->domain;
+    struct sched_item item = { .vcpu = v };
 
     v->processor = processor;
 
@@ -267,7 +268,7 @@  int sched_init_vcpu(struct vcpu *v, unsigned int processor)
     init_timer(&v->poll_timer, poll_timer_fn,
                v, v->processor);
 
-    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, v,
+    v->sched_priv = SCHED_OP(dom_scheduler(d), alloc_vdata, &item,
                      d->sched_priv);
     if ( v->sched_priv == NULL )
         return 1;
@@ -289,7 +290,7 @@  int sched_init_vcpu(struct vcpu *v, unsigned int processor)
     }
     else
     {
-        SCHED_OP(dom_scheduler(d), insert_vcpu, v);
+        SCHED_OP(dom_scheduler(d), insert_item, &item);
     }
 
     return 0;
@@ -310,6 +311,7 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
     void *vcpudata;
     struct scheduler *old_ops;
     void *old_domdata;
+    struct sched_item item;
 
     for_each_vcpu ( d, v )
     {
@@ -330,7 +332,8 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
 
     for_each_vcpu ( d, v )
     {
-        vcpu_priv[v->vcpu_id] = SCHED_OP(c->sched, alloc_vdata, v, domdata);
+        item.vcpu = v;
+        vcpu_priv[v->vcpu_id] = SCHED_OP(c->sched, alloc_vdata, &item, domdata);
         if ( vcpu_priv[v->vcpu_id] == NULL )
         {
             for_each_vcpu ( d, v )
@@ -348,7 +351,8 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
 
     for_each_vcpu ( d, v )
     {
-        SCHED_OP(old_ops, remove_vcpu, v);
+        item.vcpu = v;
+        SCHED_OP(old_ops, remove_item, &item);
     }
 
     d->cpupool = c;
@@ -359,6 +363,7 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
     {
         spinlock_t *lock;
 
+        item.vcpu = v;
         vcpudata = v->sched_priv;
 
         migrate_timer(&v->periodic_timer, new_p);
@@ -383,7 +388,7 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
 
         new_p = cpumask_cycle(new_p, c->cpu_valid);
 
-        SCHED_OP(c->sched, insert_vcpu, v);
+        SCHED_OP(c->sched, insert_item, &item);
 
         SCHED_OP(old_ops, free_vdata, vcpudata);
     }
@@ -401,12 +406,14 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
 
 void sched_destroy_vcpu(struct vcpu *v)
 {
+    struct sched_item item = { .vcpu = v };
+
     kill_timer(&v->periodic_timer);
     kill_timer(&v->singleshot_timer);
     kill_timer(&v->poll_timer);
     if ( test_and_clear_bool(v->is_urgent) )
         atomic_dec(&per_cpu(schedule_data, v->processor).urgent_count);
-    SCHED_OP(vcpu_scheduler(v), remove_vcpu, v);
+    SCHED_OP(vcpu_scheduler(v), remove_item, &item);
     SCHED_OP(vcpu_scheduler(v), free_vdata, v->sched_priv);
 }
 
@@ -451,6 +458,8 @@  void sched_destroy_domain(struct domain *d)
 
 void vcpu_sleep_nosync_locked(struct vcpu *v)
 {
+    struct sched_item item = { .vcpu = v };
+
     ASSERT(spin_is_locked(per_cpu(schedule_data,v->processor).schedule_lock));
 
     if ( likely(!vcpu_runnable(v)) )
@@ -458,7 +467,7 @@  void vcpu_sleep_nosync_locked(struct vcpu *v)
         if ( v->runstate.state == RUNSTATE_runnable )
             vcpu_runstate_change(v, RUNSTATE_offline, NOW());
 
-        SCHED_OP(vcpu_scheduler(v), sleep, v);
+        SCHED_OP(vcpu_scheduler(v), sleep, &item);
     }
 }
 
@@ -490,6 +499,7 @@  void vcpu_wake(struct vcpu *v)
 {
     unsigned long flags;
     spinlock_t *lock;
+    struct sched_item item = { .vcpu = v };
 
     TRACE_2D(TRC_SCHED_WAKE, v->domain->domain_id, v->vcpu_id);
 
@@ -499,7 +509,7 @@  void vcpu_wake(struct vcpu *v)
     {
         if ( v->runstate.state >= RUNSTATE_blocked )
             vcpu_runstate_change(v, RUNSTATE_runnable, NOW());
-        SCHED_OP(vcpu_scheduler(v), wake, v);
+        SCHED_OP(vcpu_scheduler(v), wake, &item);
     }
     else if ( !(v->pause_flags & VPF_blocked) )
     {
@@ -538,6 +548,7 @@  void vcpu_unblock(struct vcpu *v)
 static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
 {
     unsigned int old_cpu = v->processor;
+    struct sched_item item = { .vcpu = v };
 
     /*
      * Transfer urgency status to new CPU before switching CPUs, as
@@ -555,7 +566,7 @@  static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
      * pointer cant' change while the current lock is held.
      */
     if ( vcpu_scheduler(v)->migrate )
-        SCHED_OP(vcpu_scheduler(v), migrate, v, new_cpu);
+        SCHED_OP(vcpu_scheduler(v), migrate, &item, new_cpu);
     else
         v->processor = new_cpu;
 }
@@ -599,6 +610,7 @@  static void vcpu_migrate_finish(struct vcpu *v)
     unsigned int old_cpu, new_cpu;
     spinlock_t *old_lock, *new_lock;
     bool_t pick_called = 0;
+    struct sched_item item = { .vcpu = v };
 
     /*
      * If the vcpu is currently running, this will be handled by
@@ -635,7 +647,7 @@  static void vcpu_migrate_finish(struct vcpu *v)
                 break;
 
             /* Select a new CPU. */
-            new_cpu = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
+            new_cpu = SCHED_OP(vcpu_scheduler(v), pick_cpu, &item);
             if ( (new_lock == per_cpu(schedule_data, new_cpu).schedule_lock) &&
                  cpumask_test_cpu(new_cpu, v->domain->cpupool->cpu_valid) )
                 break;
@@ -705,6 +717,7 @@  void restore_vcpu_affinity(struct domain *d)
     {
         spinlock_t *lock;
         unsigned int old_cpu = v->processor;
+        struct sched_item item = { .vcpu = v };
 
         ASSERT(!vcpu_runnable(v));
 
@@ -740,7 +753,7 @@  void restore_vcpu_affinity(struct domain *d)
         v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
 
         lock = vcpu_schedule_lock_irq(v);
-        v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
+        v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, &item);
         spin_unlock_irq(lock);
 
         if ( old_cpu != v->processor )
@@ -858,7 +871,9 @@  static int cpu_disable_scheduler_check(unsigned int cpu)
 void sched_set_affinity(
     struct vcpu *v, const cpumask_t *hard, const cpumask_t *soft)
 {
-    SCHED_OP(dom_scheduler(v->domain), adjust_affinity, v, hard, soft);
+    struct sched_item item = { .vcpu = v };
+
+    SCHED_OP(dom_scheduler(v->domain), adjust_affinity, &item, hard, soft);
 
     if ( hard )
         cpumask_copy(v->cpu_hard_affinity, hard);
@@ -1034,9 +1049,10 @@  static long do_poll(struct sched_poll *sched_poll)
 long vcpu_yield(void)
 {
     struct vcpu * v=current;
+    struct sched_item item = { .vcpu = v };
     spinlock_t *lock = vcpu_schedule_lock_irq(v);
 
-    SCHED_OP(vcpu_scheduler(v), yield, v);
+    SCHED_OP(vcpu_scheduler(v), yield, &item);
     vcpu_schedule_unlock_irq(lock, v);
 
     SCHED_STAT_CRANK(vcpu_yield);
@@ -1531,6 +1547,8 @@  static void schedule(void)
 
 void context_saved(struct vcpu *prev)
 {
+    struct sched_item item = { .vcpu = prev };
+
     /* Clear running flag /after/ writing context to memory. */
     smp_wmb();
 
@@ -1539,7 +1557,7 @@  void context_saved(struct vcpu *prev)
     /* Check for migration request /after/ clearing running flag. */
     smp_mb();
 
-    SCHED_OP(vcpu_scheduler(prev), context_saved, prev);
+    SCHED_OP(vcpu_scheduler(prev), context_saved, &item);
 
     vcpu_migrate_finish(prev);
 }
@@ -1595,6 +1613,7 @@  static int cpu_schedule_up(unsigned int cpu)
     else
     {
         struct vcpu *idle = idle_vcpu[cpu];
+        struct sched_item item = { .vcpu = idle };
 
         /*
          * During (ACPI?) suspend the idle vCPU for this pCPU is not freed,
@@ -1608,7 +1627,7 @@  static int cpu_schedule_up(unsigned int cpu)
          */
         ASSERT(idle->sched_priv == NULL);
 
-        idle->sched_priv = SCHED_OP(&ops, alloc_vdata, idle,
+        idle->sched_priv = SCHED_OP(&ops, alloc_vdata, &item,
                                     idle->domain->sched_priv);
         if ( idle->sched_priv == NULL )
             return -ENOMEM;
@@ -1801,6 +1820,7 @@  void __init scheduler_init(void)
 int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
 {
     struct vcpu *idle;
+    struct sched_item item;
     void *ppriv, *ppriv_old, *vpriv, *vpriv_old;
     struct scheduler *old_ops = per_cpu(scheduler, cpu);
     struct scheduler *new_ops = (c == NULL) ? &ops : c->sched;
@@ -1836,10 +1856,11 @@  int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
      *    sched_priv field of the per-vCPU info of the idle domain.
      */
     idle = idle_vcpu[cpu];
+    item.vcpu = idle;
     ppriv = SCHED_OP(new_ops, alloc_pdata, cpu);
     if ( IS_ERR(ppriv) )
         return PTR_ERR(ppriv);
-    vpriv = SCHED_OP(new_ops, alloc_vdata, idle, idle->domain->sched_priv);
+    vpriv = SCHED_OP(new_ops, alloc_vdata, &item, idle->domain->sched_priv);
     if ( vpriv == NULL )
     {
         SCHED_OP(new_ops, free_pdata, ppriv, cpu);
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index 92bc7a0365..a9916f35b8 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -48,6 +48,10 @@  DECLARE_PER_CPU(struct schedule_data, schedule_data);
 DECLARE_PER_CPU(struct scheduler *, scheduler);
 DECLARE_PER_CPU(struct cpupool *, cpupool);
 
+struct sched_item {
+    struct vcpu           *vcpu;
+};
+
 /*
  * Scratch space, for avoiding having too many cpumask_t on the stack.
  * Within each scheduler, when using the scratch mask of one pCPU:
@@ -141,8 +145,8 @@  struct scheduler {
     void         (*deinit)         (struct scheduler *);
 
     void         (*free_vdata)     (const struct scheduler *, void *);
-    void *       (*alloc_vdata)    (const struct scheduler *, struct vcpu *,
-                                    void *);
+    void *       (*alloc_vdata)    (const struct scheduler *,
+                                    struct sched_item *, void *);
     void         (*free_pdata)     (const struct scheduler *, void *, int);
     void *       (*alloc_pdata)    (const struct scheduler *, int);
     void         (*init_pdata)     (const struct scheduler *, void *, int);
@@ -156,24 +160,32 @@  struct scheduler {
     void         (*switch_sched)   (struct scheduler *, unsigned int,
                                     void *, void *);
 
-    /* Activate / deactivate vcpus in a cpu pool */
-    void         (*insert_vcpu)    (const struct scheduler *, struct vcpu *);
-    void         (*remove_vcpu)    (const struct scheduler *, struct vcpu *);
-
-    void         (*sleep)          (const struct scheduler *, struct vcpu *);
-    void         (*wake)           (const struct scheduler *, struct vcpu *);
-    void         (*yield)          (const struct scheduler *, struct vcpu *);
-    void         (*context_saved)  (const struct scheduler *, struct vcpu *);
+    /* Activate / deactivate items in a cpu pool */
+    void         (*insert_item)    (const struct scheduler *,
+                                    struct sched_item *);
+    void         (*remove_item)    (const struct scheduler *,
+                                    struct sched_item *);
+
+    void         (*sleep)          (const struct scheduler *,
+                                    struct sched_item *);
+    void         (*wake)           (const struct scheduler *,
+                                    struct sched_item *);
+    void         (*yield)          (const struct scheduler *,
+                                    struct sched_item *);
+    void         (*context_saved)  (const struct scheduler *,
+                                    struct sched_item *);
 
     struct task_slice (*do_schedule) (const struct scheduler *, s_time_t,
                                       bool_t tasklet_work_scheduled);
 
-    int          (*pick_cpu)       (const struct scheduler *, struct vcpu *);
-    void         (*migrate)        (const struct scheduler *, struct vcpu *,
-                                    unsigned int);
+    int          (*pick_cpu)       (const struct scheduler *,
+                                    struct sched_item *);
+    void         (*migrate)        (const struct scheduler *,
+                                    struct sched_item *, unsigned int);
     int          (*adjust)         (const struct scheduler *, struct domain *,
                                     struct xen_domctl_scheduler_op *);
-    void         (*adjust_affinity)(const struct scheduler *, struct vcpu *,
+    void         (*adjust_affinity)(const struct scheduler *,
+                                    struct sched_item *,
                                     const struct cpumask *,
                                     const struct cpumask *);
     int          (*adjust_global)  (const struct scheduler *,