diff mbox series

[RFC] xen/sched: Optimise when only one scheduler is compiled in

Message ID 20220303004015.17688-1-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show
Series [RFC] xen/sched: Optimise when only one scheduler is compiled in | expand

Commit Message

Andrew Cooper March 3, 2022, 12:40 a.m. UTC
When only one scheduler is compiled in, function pointers can be optimised to
direct calls, and the hooks hardened against controlflow hijacking.

RFC for several reasons.

1) There's an almost beautiful way of not introducing MAYBE_SCHED() and hiding
   the magic in REGISTER_SCHEDULER(), except it falls over
   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no comment or
   resolution at all.

2) A different alternative which almost works is to remove the indirection in
   .data.schedulers, but the singleton scheduler object can't be both there
   and in .init.rodata.cf_clobber.

3) I can't think of a way of build time check to enforce that new schedulers
   get added to the preprocessor magic.

And the blocker:
4) This isn't compatible with how sched_idle_ops get used for granularity > 1.

Suggestions very welcome.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Juergen Gross <jgross@suse.com>
CC: Dario Faggioli <dfaggioli@suse.com>
---
 xen/common/sched/arinc653.c |  2 +-
 xen/common/sched/core.c     |  4 +-
 xen/common/sched/credit.c   |  2 +-
 xen/common/sched/credit2.c  |  2 +-
 xen/common/sched/null.c     |  2 +-
 xen/common/sched/private.h  | 91 ++++++++++++++++++++++++++++++++-------------
 xen/common/sched/rt.c       |  2 +-
 7 files changed, 72 insertions(+), 33 deletions(-)

Comments

Jan Beulich March 3, 2022, 8:24 a.m. UTC | #1
On 03.03.2022 01:40, Andrew Cooper wrote:
> When only one scheduler is compiled in, function pointers can be optimised to
> direct calls, and the hooks hardened against controlflow hijacking.
> 
> RFC for several reasons.
> 
> 1) There's an almost beautiful way of not introducing MAYBE_SCHED() and hiding
>    the magic in REGISTER_SCHEDULER(), except it falls over
>    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no comment or
>    resolution at all.
> 
> 2) A different alternative which almost works is to remove the indirection in
>    .data.schedulers, but the singleton scheduler object can't be both there
>    and in .init.rodata.cf_clobber.

Couldn't we name the section differently when there's just one member,
placing that section inside __initdata_cf_clobber_{start,end} in the
linker script?

> 3) I can't think of a way of build time check to enforce that new schedulers
>    get added to the preprocessor magic.

An assertion in the linker script, checking that .data.schedulers has a
single entry when the sched_ops symbol exists? This may involve a
PROVIDE(sched_ops = 0) as there doesn't look to be a way to probe for
symbol defined-ness in expressions.

> And the blocker:
> 4) This isn't compatible with how sched_idle_ops get used for granularity > 1.

Special case it just like we special case plt_tsc in x86/time.c?

> --- a/xen/common/sched/private.h
> +++ b/xen/common/sched/private.h
> @@ -271,6 +271,33 @@ static inline spinlock_t *pcpu_schedule_trylock(unsigned int cpu)
>      return NULL;
>  }
>  
> +#if 1 ==                                                                \
> +    defined(CONFIG_SCHED_CREDIT) + defined(CONFIG_SCHED_CREDIT2) +      \
> +    defined(CONFIG_SCHED_RTDS) + defined(CONFIG_SCHED_ARINC653) +       \
> +    defined(CONFIG_SCHED_NULL)
> +
> +extern const struct scheduler sched_ops;
> +#define MAYBE_SCHED(x) __initdata_cf_clobber sched_ops

__initconst_cf_clobber, seeing that all use sites also use const?

> @@ -333,39 +360,48 @@ struct scheduler {
>      void         (*dump_cpu_state) (const struct scheduler *, int);
>  };
>  
> +static inline int sched_global_init(const struct scheduler *s)
> +{
> +    if ( s->global_init )
> +        return sched_call(s, global_init);
> +    return 0;
> +}

Is it really a good idea to expose this here when it's supposed to be
used from core.c only, and even there in just a single place?

Jan
Jürgen Groß March 3, 2022, 8:33 a.m. UTC | #2
On 03.03.22 01:40, Andrew Cooper wrote:
> When only one scheduler is compiled in, function pointers can be optimised to
> direct calls, and the hooks hardened against controlflow hijacking.
> 
> RFC for several reasons.
> 
> 1) There's an almost beautiful way of not introducing MAYBE_SCHED() and hiding
>     the magic in REGISTER_SCHEDULER(), except it falls over
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no comment or
>     resolution at all.
> 
> 2) A different alternative which almost works is to remove the indirection in
>     .data.schedulers, but the singleton scheduler object can't be both there
>     and in .init.rodata.cf_clobber.
> 
> 3) I can't think of a way of build time check to enforce that new schedulers
>     get added to the preprocessor magic.
> 
> And the blocker:
> 4) This isn't compatible with how sched_idle_ops get used for granularity > 1.
> 
> Suggestions very welcome.

Did you consider to generate the needed code dynamically instead?

I guess this could even be extended to avoid function pointers
completely using the same technique as in my hypercall series.

In order to avoid the need for a central table the per-scheduler
hooks could use standard names (as most of them do already).

I think I could come up with a patch in a few hours if you like
that approach.


Juergen
Jürgen Groß March 3, 2022, 10:06 a.m. UTC | #3
On 03.03.22 09:33, Juergen Gross wrote:
> On 03.03.22 01:40, Andrew Cooper wrote:
>> When only one scheduler is compiled in, function pointers can be 
>> optimised to
>> direct calls, and the hooks hardened against controlflow hijacking.
>>
>> RFC for several reasons.
>>
>> 1) There's an almost beautiful way of not introducing MAYBE_SCHED() 
>> and hiding
>>     the magic in REGISTER_SCHEDULER(), except it falls over
>>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no 
>> comment or
>>     resolution at all.
>>
>> 2) A different alternative which almost works is to remove the 
>> indirection in
>>     .data.schedulers, but the singleton scheduler object can't be both 
>> there
>>     and in .init.rodata.cf_clobber.
>>
>> 3) I can't think of a way of build time check to enforce that new 
>> schedulers
>>     get added to the preprocessor magic.
>>
>> And the blocker:
>> 4) This isn't compatible with how sched_idle_ops get used for 
>> granularity > 1.
>>
>> Suggestions very welcome.
> 
> Did you consider to generate the needed code dynamically instead?
> 
> I guess this could even be extended to avoid function pointers
> completely using the same technique as in my hypercall series.
> 
> In order to avoid the need for a central table the per-scheduler
> hooks could use standard names (as most of them do already).
> 
> I think I could come up with a patch in a few hours if you like
> that approach.

BTW, in theory this approach could be generalized for other function
vectors in the hypervisor, too (maybe even all?).


Juergen
Jürgen Groß March 3, 2022, 10:14 a.m. UTC | #4
On 03.03.22 09:33, Juergen Gross wrote:
> On 03.03.22 01:40, Andrew Cooper wrote:
>> When only one scheduler is compiled in, function pointers can be 
>> optimised to
>> direct calls, and the hooks hardened against controlflow hijacking.
>>
>> RFC for several reasons.
>>
>> 1) There's an almost beautiful way of not introducing MAYBE_SCHED() 
>> and hiding
>>     the magic in REGISTER_SCHEDULER(), except it falls over
>>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no 
>> comment or
>>     resolution at all.
>>
>> 2) A different alternative which almost works is to remove the 
>> indirection in
>>     .data.schedulers, but the singleton scheduler object can't be both 
>> there
>>     and in .init.rodata.cf_clobber.
>>
>> 3) I can't think of a way of build time check to enforce that new 
>> schedulers
>>     get added to the preprocessor magic.
>>
>> And the blocker:
>> 4) This isn't compatible with how sched_idle_ops get used for 
>> granularity > 1.
>>
>> Suggestions very welcome.
> 
> Did you consider to generate the needed code dynamically instead?
> 
> I guess this could even be extended to avoid function pointers
> completely using the same technique as in my hypercall series.
> 
> In order to avoid the need for a central table the per-scheduler
> hooks could use standard names (as most of them do already).
> 
> I think I could come up with a patch in a few hours if you like
> that approach.

BTW, in theory this approach could be generalized for other function
vectors in the hypervisor, too (maybe even all?).


Juergen
Jürgen Groß March 4, 2022, 5:21 a.m. UTC | #5
On 03.03.22 01:40, Andrew Cooper wrote:
> When only one scheduler is compiled in, function pointers can be optimised to
> direct calls, and the hooks hardened against controlflow hijacking.
> 
> RFC for several reasons.
> 
> 1) There's an almost beautiful way of not introducing MAYBE_SCHED() and hiding
>     the magic in REGISTER_SCHEDULER(), except it falls over
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91765 which has no comment or
>     resolution at all.
> 
> 2) A different alternative which almost works is to remove the indirection in
>     .data.schedulers, but the singleton scheduler object can't be both there
>     and in .init.rodata.cf_clobber.
> 
> 3) I can't think of a way of build time check to enforce that new schedulers
>     get added to the preprocessor magic.
> 
> And the blocker:
> 4) This isn't compatible with how sched_idle_ops get used for granularity > 1.

Correct. Either you have core scheduling or you can optimize the
indirect calls to direct ones.

> Suggestions very welcome.

Dynamic code generation for replacing the function vector with scheduler
id based if/switch statements.

Something like the attached patch for generating the code (won't build
right now as the related scheduler code adaptions are missing, but the
header is created).

The code generation script could share quite some code with my hypercall
series, and expansion for other function vectors should be rather easy.


Juergen
diff mbox series

Patch

diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index a82c0d7314a1..73738b007e7d 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -694,7 +694,7 @@  a653sched_adjust_global(const struct scheduler *ops,
  * callback functions.
  * The symbol must be visible to the rest of Xen at link time.
  */
-static const struct scheduler sched_arinc653_def = {
+const struct scheduler MAYBE_SCHED(sched_arinc653_def) = {
     .name           = "ARINC 653 Scheduler",
     .opt_name       = "arinc653",
     .sched_id       = XEN_SCHEDULER_ARINC653,
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 19ab67818106..020a5741ca31 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2263,7 +2263,7 @@  static struct sched_unit *do_schedule(struct sched_unit *prev, s_time_t now,
     struct sched_unit *next;
 
     /* get policy-specific decision on scheduling... */
-    sched->do_schedule(sched, prev, now, sched_tasklet_check(cpu));
+    sched_vcall(sched, do_schedule, sched, prev, now, sched_tasklet_check(cpu));
 
     next = prev->next_task;
 
@@ -2975,7 +2975,7 @@  void __init scheduler_init(void)
 
 #undef sched_test_func
 
-        if ( schedulers[i]->global_init && schedulers[i]->global_init() < 0 )
+        if ( sched_global_init(schedulers[i]) < 0 )
         {
             printk("scheduler %s failed initialization, dropped\n",
                    schedulers[i]->opt_name);
diff --git a/xen/common/sched/credit.c b/xen/common/sched/credit.c
index 4d3bd8cba6fc..8b85e9617fc0 100644
--- a/xen/common/sched/credit.c
+++ b/xen/common/sched/credit.c
@@ -2230,7 +2230,7 @@  csched_deinit(struct scheduler *ops)
     }
 }
 
-static const struct scheduler sched_credit_def = {
+const struct scheduler MAYBE_SCHED(sched_credit_def) = {
     .name           = "SMP Credit Scheduler",
     .opt_name       = "credit",
     .sched_id       = XEN_SCHEDULER_CREDIT,
diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c
index 0e3f89e5378e..fda3812d7ac1 100644
--- a/xen/common/sched/credit2.c
+++ b/xen/common/sched/credit2.c
@@ -4199,7 +4199,7 @@  csched2_deinit(struct scheduler *ops)
     xfree(prv);
 }
 
-static const struct scheduler sched_credit2_def = {
+const struct scheduler MAYBE_SCHED(sched_credit2_def) = {
     .name           = "SMP Credit Scheduler rev2",
     .opt_name       = "credit2",
     .sched_id       = XEN_SCHEDULER_CREDIT2,
diff --git a/xen/common/sched/null.c b/xen/common/sched/null.c
index 65a0a6c5312d..907a8ae1ca50 100644
--- a/xen/common/sched/null.c
+++ b/xen/common/sched/null.c
@@ -1025,7 +1025,7 @@  static void cf_check null_dump(const struct scheduler *ops)
     spin_unlock_irqrestore(&prv->lock, flags);
 }
 
-static const struct scheduler sched_null_def = {
+const struct scheduler MAYBE_SCHED(sched_null_def) = {
     .name           = "null Scheduler",
     .opt_name       = "null",
     .sched_id       = XEN_SCHEDULER_NULL,
diff --git a/xen/common/sched/private.h b/xen/common/sched/private.h
index a870320146ef..f3ba0101ecc7 100644
--- a/xen/common/sched/private.h
+++ b/xen/common/sched/private.h
@@ -271,6 +271,33 @@  static inline spinlock_t *pcpu_schedule_trylock(unsigned int cpu)
     return NULL;
 }
 
+#if 1 ==                                                                \
+    defined(CONFIG_SCHED_CREDIT) + defined(CONFIG_SCHED_CREDIT2) +      \
+    defined(CONFIG_SCHED_RTDS) + defined(CONFIG_SCHED_ARINC653) +       \
+    defined(CONFIG_SCHED_NULL)
+
+extern const struct scheduler sched_ops;
+#define MAYBE_SCHED(x) __initdata_cf_clobber sched_ops
+#define REGISTER_SCHEDULER(x) static const struct scheduler *x##_entry \
+  __used_section(".data.schedulers") = &sched_ops;
+
+#define sched_call(s, fn, ...) \
+    alternative_call(sched_ops.fn, ##__VA_ARGS__)
+
+#define sched_vcall(s, fn, ...) \
+    alternative_vcall(sched_ops.fn, ##__VA_ARGS__)
+
+#else
+
+#define MAYBE_SCHED(x) static x
+#define REGISTER_SCHEDULER(x) static const struct scheduler *x##_entry \
+  __used_section(".data.schedulers") = &x;
+
+#define sched_call(s, fn, ...)  (s)->fn(__VA_ARGS__)
+#define sched_vcall(s, fn, ...) (s)->fn(__VA_ARGS__)
+
+#endif
+
 struct scheduler {
     const char *name;       /* full name for this scheduler      */
     const char *opt_name;   /* option name for this scheduler    */
@@ -333,39 +360,48 @@  struct scheduler {
     void         (*dump_cpu_state) (const struct scheduler *, int);
 };
 
+static inline int sched_global_init(const struct scheduler *s)
+{
+    if ( s->global_init )
+        return sched_call(s, global_init);
+    return 0;
+}
+
 static inline int sched_init(struct scheduler *s)
 {
-    return s->init(s);
+    return sched_call(s, init, s);
 }
 
 static inline void sched_deinit(struct scheduler *s)
 {
-    s->deinit(s);
+    sched_vcall(s, deinit, s);
 }
 
 static inline spinlock_t *sched_switch_sched(struct scheduler *s,
                                              unsigned int cpu,
                                              void *pdata, void *vdata)
 {
-    return s->switch_sched(s, cpu, pdata, vdata);
+    return sched_call(s, switch_sched, s, cpu, pdata, vdata);
 }
 
 static inline void sched_dump_settings(const struct scheduler *s)
 {
     if ( s->dump_settings )
-        s->dump_settings(s);
+        sched_vcall(s, dump_settings, s);
 }
 
 static inline void sched_dump_cpu_state(const struct scheduler *s, int cpu)
 {
     if ( s->dump_cpu_state )
-        s->dump_cpu_state(s, cpu);
+        sched_vcall(s, dump_cpu_state, s, cpu);
 }
 
 static inline void *sched_alloc_domdata(const struct scheduler *s,
                                         struct domain *d)
 {
-    return s->alloc_domdata ? s->alloc_domdata(s, d) : NULL;
+    if ( s->alloc_domdata )
+        return sched_call(s, alloc_domdata, s, d);
+    return NULL;
 }
 
 static inline void sched_free_domdata(const struct scheduler *s,
@@ -373,12 +409,14 @@  static inline void sched_free_domdata(const struct scheduler *s,
 {
     ASSERT(s->free_domdata || !data);
     if ( s->free_domdata )
-        s->free_domdata(s, data);
+        sched_vcall(s, free_domdata, s, data);
 }
 
 static inline void *sched_alloc_pdata(const struct scheduler *s, int cpu)
 {
-    return s->alloc_pdata ? s->alloc_pdata(s, cpu) : NULL;
+    if ( s->alloc_pdata )
+        return sched_call(s, alloc_pdata, s, cpu);
+    return NULL;
 }
 
 static inline void sched_free_pdata(const struct scheduler *s, void *data,
@@ -386,74 +424,74 @@  static inline void sched_free_pdata(const struct scheduler *s, void *data,
 {
     ASSERT(s->free_pdata || !data);
     if ( s->free_pdata )
-        s->free_pdata(s, data, cpu);
+        sched_vcall(s, free_pdata, s, data, cpu);
 }
 
 static inline void sched_deinit_pdata(const struct scheduler *s, void *data,
                                       int cpu)
 {
     if ( s->deinit_pdata )
-        s->deinit_pdata(s, data, cpu);
+        sched_vcall(s, deinit_pdata, s, data, cpu);
 }
 
 static inline void *sched_alloc_udata(const struct scheduler *s,
                                       struct sched_unit *unit, void *dom_data)
 {
-    return s->alloc_udata(s, unit, dom_data);
+    return sched_call(s, alloc_udata, s, unit, dom_data);
 }
 
 static inline void sched_free_udata(const struct scheduler *s, void *data)
 {
-    s->free_udata(s, data);
+    sched_vcall(s, free_udata, s, data);
 }
 
 static inline void sched_insert_unit(const struct scheduler *s,
                                      struct sched_unit *unit)
 {
     if ( s->insert_unit )
-        s->insert_unit(s, unit);
+        sched_vcall(s, insert_unit, s, unit);
 }
 
 static inline void sched_remove_unit(const struct scheduler *s,
                                      struct sched_unit *unit)
 {
     if ( s->remove_unit )
-        s->remove_unit(s, unit);
+        sched_vcall(s, remove_unit, s, unit);
 }
 
 static inline void sched_sleep(const struct scheduler *s,
                                struct sched_unit *unit)
 {
     if ( s->sleep )
-        s->sleep(s, unit);
+        sched_vcall(s, sleep, s, unit);
 }
 
 static inline void sched_wake(const struct scheduler *s,
                               struct sched_unit *unit)
 {
     if ( s->wake )
-        s->wake(s, unit);
+        sched_vcall(s, wake, s, unit);
 }
 
 static inline void sched_yield(const struct scheduler *s,
                                struct sched_unit *unit)
 {
     if ( s->yield )
-        s->yield(s, unit);
+        sched_vcall(s, yield, s, unit);
 }
 
 static inline void sched_context_saved(const struct scheduler *s,
                                        struct sched_unit *unit)
 {
     if ( s->context_saved )
-        s->context_saved(s, unit);
+        sched_vcall(s, context_saved, s, unit);
 }
 
 static inline void sched_migrate(const struct scheduler *s,
                                  struct sched_unit *unit, unsigned int cpu)
 {
     if ( s->migrate )
-        s->migrate(s, unit, cpu);
+        sched_vcall(s, migrate, s, unit, cpu);
     else
         sched_set_res(unit, get_sched_res(cpu));
 }
@@ -461,7 +499,7 @@  static inline void sched_migrate(const struct scheduler *s,
 static inline struct sched_resource *sched_pick_resource(
     const struct scheduler *s, const struct sched_unit *unit)
 {
-    return s->pick_resource(s, unit);
+    return sched_call(s, pick_resource, s, unit);
 }
 
 static inline void sched_adjust_affinity(const struct scheduler *s,
@@ -470,19 +508,23 @@  static inline void sched_adjust_affinity(const struct scheduler *s,
                                          const cpumask_t *soft)
 {
     if ( s->adjust_affinity )
-        s->adjust_affinity(s, unit, hard, soft);
+        sched_vcall(s, adjust_affinity, s, unit, hard, soft);
 }
 
 static inline int sched_adjust_dom(const struct scheduler *s, struct domain *d,
                                    struct xen_domctl_scheduler_op *op)
 {
-    return s->adjust ? s->adjust(s, d, op) : 0;
+    if ( s->adjust )
+        return sched_call(s, adjust, s, d, op);
+    return 0;
 }
 
 static inline int sched_adjust_cpupool(const struct scheduler *s,
                                        struct xen_sysctl_scheduler_op *op)
 {
-    return s->adjust_global ? s->adjust_global(s, op) : 0;
+    if ( s->adjust_global )
+        return sched_call(s, adjust_global, s, op);
+    return 0;
 }
 
 static inline void sched_unit_pause_nosync(const struct sched_unit *unit)
@@ -501,9 +543,6 @@  static inline void sched_unit_unpause(const struct sched_unit *unit)
         vcpu_unpause(v);
 }
 
-#define REGISTER_SCHEDULER(x) static const struct scheduler *x##_entry \
-  __used_section(".data.schedulers") = &x;
-
 struct cpupool
 {
     unsigned int     cpupool_id;
diff --git a/xen/common/sched/rt.c b/xen/common/sched/rt.c
index d6de25531b3c..9b42852b2de5 100644
--- a/xen/common/sched/rt.c
+++ b/xen/common/sched/rt.c
@@ -1529,7 +1529,7 @@  static void cf_check repl_timer_handler(void *data)
     spin_unlock_irq(&prv->lock);
 }
 
-static const struct scheduler sched_rtds_def = {
+const struct scheduler MAYBE_SCHED(sched_rtds_def) = {
     .name           = "SMP RTDS Scheduler",
     .opt_name       = "rtds",
     .sched_id       = XEN_SCHEDULER_RTDS,