
[PATCH-for-4.17] xen/sched: fix restore_vcpu_affinity() by removing it

Message ID 20221021065806.14316-1-jgross@suse.com (mailing list archive)
State Superseded

Commit Message

Jürgen Groß Oct. 21, 2022, 6:58 a.m. UTC
When the system is coming up after having been suspended,
restore_vcpu_affinity() is called for each domain in order to adjust
the vcpus' affinity settings in case a cpu didn't come back to life
again.

The way restore_vcpu_affinity() does that is wrong, because the
specific scheduler isn't informed about a possible migration of the
vcpu to another cpu. Additionally the migration often happens even if
all cpus are running again, as it is done without checking whether it
is really needed.

As cpupool management is already calling cpu_disable_scheduler() for
cpus not having come up again, and cpu_disable_scheduler() takes care
of any needed vcpu migration in the proper way, there is simply no
need for restore_vcpu_affinity().

So just remove restore_vcpu_affinity() completely.

Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/acpi/power.c |  3 --
 xen/common/sched/core.c   | 70 ---------------------------------------
 xen/include/xen/sched.h   |  1 -
 3 files changed, 74 deletions(-)

Comments

Jürgen Groß Oct. 21, 2022, 7:06 a.m. UTC | #1
On 21.10.22 08:58, Juergen Gross wrote:
> When the system is coming up after having been suspended,
> restore_vcpu_affinity() is called for each domain in order to adjust
> the vcpu's affinity settings in case a cpu didn't come to live again.
> 
> The way restore_vcpu_affinity() is doing that is wrong, because the
> specific scheduler isn't being informed about a possible migration of
> the vcpu to another cpu. Additionally the migration is often even
> happening if all cpus are running again, as it is done without check
> whether it is really needed.
> 
> As cpupool management is already calling cpu_disable_scheduler() for
> cpus not having come up again, and cpu_disable_scheduler() is taking
> care of eventually needed vcpu migration in the proper way, there is
> simply no need for restore_vcpu_affinity().
> 
> So just remove restore_vcpu_affinity() completely.
> 
> Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.")

This Fixes: tag is wrong. It should be:

Fixes: 8a04eaa8ea83 ("xen/sched: move some per-vcpu items to struct sched_unit")


Juergen
Dario Faggioli Oct. 21, 2022, 9:16 a.m. UTC | #2
On Fri, 2022-10-21 at 09:06 +0200, Juergen Gross wrote:
> On 21.10.22 08:58, Juergen Gross wrote:
> > When the system is coming up after having been suspended,
> > restore_vcpu_affinity() is called for each domain in order to
> > adjust
> > the vcpu's affinity settings in case a cpu didn't come to live
> > again.
> > 
> > The way restore_vcpu_affinity() is doing that is wrong, because the
> > specific scheduler isn't being informed about a possible migration
> > of
> > the vcpu to another cpu. Additionally the migration is often even
> > happening if all cpus are running again, as it is done without
> > check
> > whether it is really needed.
> > 
> > As cpupool management is already calling cpu_disable_scheduler()
> > for
> > cpus not having come up again, and cpu_disable_scheduler() is
> > taking
> > care of eventually needed vcpu migration in the proper way, there
> > is
> > simply no need for restore_vcpu_affinity().
> > 
> > So just remove restore_vcpu_affinity() completely.
> > 
> > Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.")
> 
> This Fixes: tag is wrong. It should be:
> 
> Fixes: 8a04eaa8ea83 ("xen/sched: move some per-vcpu items to struct
> sched_unit")
> 
Acked-by: Dario Faggioli <dfaggioli@suse.com>

Thanks and Regards
Henry Wang Oct. 21, 2022, 9:20 a.m. UTC | #3
Hi Juergen,

> -----Original Message-----
> From: Juergen Gross <jgross@suse.com>
> Subject: [PATCH-for-4.17] xen/sched: fix restore_vcpu_affinity() by removing
> it
> 
> When the system is coming up after having been suspended,
> restore_vcpu_affinity() is called for each domain in order to adjust
> the vcpu's affinity settings in case a cpu didn't come to live again.
> 
> The way restore_vcpu_affinity() is doing that is wrong, because the
> specific scheduler isn't being informed about a possible migration of
> the vcpu to another cpu. Additionally the migration is often even
> happening if all cpus are running again, as it is done without check
> whether it is really needed.
> 
> As cpupool management is already calling cpu_disable_scheduler() for
> cpus not having come up again, and cpu_disable_scheduler() is taking
> care of eventually needed vcpu migration in the proper way, there is
> simply no need for restore_vcpu_affinity().
> 
> So just remove restore_vcpu_affinity() completely.
> 
> Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.")
> Reported-by: Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry
Jan Beulich Oct. 21, 2022, 10:37 a.m. UTC | #4
On 21.10.2022 08:58, Juergen Gross wrote:
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1196,76 +1196,6 @@ static void sched_reset_affinity_broken(const struct sched_unit *unit)
>          v->affinity_broken = false;
>  }

My pre-push build test failed because the function above ...

> -void restore_vcpu_affinity(struct domain *d)
> -{
> -    unsigned int cpu = smp_processor_id();
> -    struct sched_unit *unit;
> -
> -    ASSERT(system_state == SYS_STATE_resume);
> -
> -    rcu_read_lock(&sched_res_rculock);
> -
> -    for_each_sched_unit ( d, unit )
> -    {
> -        spinlock_t *lock;
> -        unsigned int old_cpu = sched_unit_master(unit);
> -        struct sched_resource *res;
> -
> -        ASSERT(!unit_runnable(unit));
> -
> -        /*
> -         * Re-assign the initial processor as after resume we have no
> -         * guarantee the old processor has come back to life again.
> -         *
> -         * Therefore, here, before actually unpausing the domains, we should
> -         * set v->processor of each of their vCPUs to something that will
> -         * make sense for the scheduler of the cpupool in which they are in.
> -         */
> -        lock = unit_schedule_lock_irq(unit);
> -
> -        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                    cpupool_domain_master_cpumask(d));
> -        if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
> -        {
> -            if ( sched_check_affinity_broken(unit) )
> -            {
> -                sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
> -                sched_reset_affinity_broken(unit);

... has its only use removed here. It didn't seem appropriate for me to
go and silently remove that function as well.

Jan

> -                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                            cpupool_domain_master_cpumask(d));
> -            }
> -
> -            if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
> -            {
> -                /* Affinity settings of one vcpu are for the complete unit. */
> -                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
> -                       unit->vcpu_list);
> -                sched_set_affinity(unit, &cpumask_all, NULL);
> -                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                            cpupool_domain_master_cpumask(d));
> -            }
> -        }
> -
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> -        sched_set_res(unit, res);
> -
> -        spin_unlock_irq(lock);
> -
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> -        if ( old_cpu != sched_unit_master(unit) )
> -            sched_move_irqs(unit);
> -    }
> -
> -    rcu_read_unlock(&sched_res_rculock);
> -
> -    domain_update_node_affinity(d);
> -}
> -
>  /*
>   * This function is used by cpu_hotplug code via cpu notifier chain
>   * and from cpupools to switch schedulers on a cpu.
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 557b3229f6..072e4846aa 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1019,7 +1019,6 @@ void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
>  void sched_setup_dom0_vcpus(struct domain *d);
>  int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason);
>  int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
> -void restore_vcpu_affinity(struct domain *d);
>  int vcpu_affinity_domctl(struct domain *d, uint32_t cmd,
>                           struct xen_domctl_vcpuaffinity *vcpuaff);
>
Jürgen Groß Oct. 21, 2022, 10:43 a.m. UTC | #5
On 21.10.22 12:37, Jan Beulich wrote:
> On 21.10.2022 08:58, Juergen Gross wrote:
>> --- a/xen/common/sched/core.c
>> +++ b/xen/common/sched/core.c
>> @@ -1196,76 +1196,6 @@ static void sched_reset_affinity_broken(const struct sched_unit *unit)
>>           v->affinity_broken = false;
>>   }
> 
> My pre-push build test failed because the function above ...
> 
>> -void restore_vcpu_affinity(struct domain *d)
>> -{
>> -    unsigned int cpu = smp_processor_id();
>> -    struct sched_unit *unit;
>> -
>> -    ASSERT(system_state == SYS_STATE_resume);
>> -
>> -    rcu_read_lock(&sched_res_rculock);
>> -
>> -    for_each_sched_unit ( d, unit )
>> -    {
>> -        spinlock_t *lock;
>> -        unsigned int old_cpu = sched_unit_master(unit);
>> -        struct sched_resource *res;
>> -
>> -        ASSERT(!unit_runnable(unit));
>> -
>> -        /*
>> -         * Re-assign the initial processor as after resume we have no
>> -         * guarantee the old processor has come back to life again.
>> -         *
>> -         * Therefore, here, before actually unpausing the domains, we should
>> -         * set v->processor of each of their vCPUs to something that will
>> -         * make sense for the scheduler of the cpupool in which they are in.
>> -         */
>> -        lock = unit_schedule_lock_irq(unit);
>> -
>> -        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
>> -                    cpupool_domain_master_cpumask(d));
>> -        if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
>> -        {
>> -            if ( sched_check_affinity_broken(unit) )
>> -            {
>> -                sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
>> -                sched_reset_affinity_broken(unit);
> 
> ... has its only use removed here. It didn't seem appropriate for me to
> go and silently remove that function as well.

Weird, I didn't spot that.

I'll update the patch.


Juergen
Marek Marczykowski-Górecki Oct. 21, 2022, 10:47 a.m. UTC | #6
On Fri, Oct 21, 2022 at 08:58:06AM +0200, Juergen Gross wrote:
> When the system is coming up after having been suspended,
> restore_vcpu_affinity() is called for each domain in order to adjust
> the vcpu's affinity settings in case a cpu didn't come to live again.
> 
> The way restore_vcpu_affinity() is doing that is wrong, because the
> specific scheduler isn't being informed about a possible migration of
> the vcpu to another cpu. Additionally the migration is often even
> happening if all cpus are running again, as it is done without check
> whether it is really needed.
> 
> As cpupool management is already calling cpu_disable_scheduler() for
> cpus not having come up again, and cpu_disable_scheduler() is taking
> care of eventually needed vcpu migration in the proper way, there is
> simply no need for restore_vcpu_affinity().
> 
> So just remove restore_vcpu_affinity() completely.
> 
> Fixes: 8a5d50dd0b04 ("xen: sched: simplify ACPI S3 resume path.")
> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>

I can only test it on a different configuration right now, but I can
confirm that with this patch applied the system survives S3 (and it
did crash without it at least once on this config).

With the now-unused function (also noticed by Jan) dealt with, you can add my:

Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>


> ---
>  xen/arch/x86/acpi/power.c |  3 --
>  xen/common/sched/core.c   | 70 ---------------------------------------
>  xen/include/xen/sched.h   |  1 -
>  3 files changed, 74 deletions(-)
> 
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 1bb4d78392..b76f673acb 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -159,10 +159,7 @@ static void thaw_domains(void)
>  
>      rcu_read_lock(&domlist_read_lock);
>      for_each_domain ( d )
> -    {
> -        restore_vcpu_affinity(d);
>          domain_unpause(d);
> -    }
>      rcu_read_unlock(&domlist_read_lock);
>  }
>  
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 83455fbde1..358fa077e3 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1196,76 +1196,6 @@ static void sched_reset_affinity_broken(const struct sched_unit *unit)
>          v->affinity_broken = false;
>  }
>  
> -void restore_vcpu_affinity(struct domain *d)
> -{
> -    unsigned int cpu = smp_processor_id();
> -    struct sched_unit *unit;
> -
> -    ASSERT(system_state == SYS_STATE_resume);
> -
> -    rcu_read_lock(&sched_res_rculock);
> -
> -    for_each_sched_unit ( d, unit )
> -    {
> -        spinlock_t *lock;
> -        unsigned int old_cpu = sched_unit_master(unit);
> -        struct sched_resource *res;
> -
> -        ASSERT(!unit_runnable(unit));
> -
> -        /*
> -         * Re-assign the initial processor as after resume we have no
> -         * guarantee the old processor has come back to life again.
> -         *
> -         * Therefore, here, before actually unpausing the domains, we should
> -         * set v->processor of each of their vCPUs to something that will
> -         * make sense for the scheduler of the cpupool in which they are in.
> -         */
> -        lock = unit_schedule_lock_irq(unit);
> -
> -        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                    cpupool_domain_master_cpumask(d));
> -        if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
> -        {
> -            if ( sched_check_affinity_broken(unit) )
> -            {
> -                sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
> -                sched_reset_affinity_broken(unit);
> -                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                            cpupool_domain_master_cpumask(d));
> -            }
> -
> -            if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
> -            {
> -                /* Affinity settings of one vcpu are for the complete unit. */
> -                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
> -                       unit->vcpu_list);
> -                sched_set_affinity(unit, &cpumask_all, NULL);
> -                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
> -                            cpupool_domain_master_cpumask(d));
> -            }
> -        }
> -
> -        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
> -        sched_set_res(unit, res);
> -
> -        spin_unlock_irq(lock);
> -
> -        /* v->processor might have changed, so reacquire the lock. */
> -        lock = unit_schedule_lock_irq(unit);
> -        res = sched_pick_resource(unit_scheduler(unit), unit);
> -        sched_set_res(unit, res);
> -        spin_unlock_irq(lock);
> -
> -        if ( old_cpu != sched_unit_master(unit) )
> -            sched_move_irqs(unit);
> -    }
> -
> -    rcu_read_unlock(&sched_res_rculock);
> -
> -    domain_update_node_affinity(d);
> -}
> -
>  /*
>   * This function is used by cpu_hotplug code via cpu notifier chain
>   * and from cpupools to switch schedulers on a cpu.
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 557b3229f6..072e4846aa 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -1019,7 +1019,6 @@ void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
>  void sched_setup_dom0_vcpus(struct domain *d);
>  int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason);
>  int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
> -void restore_vcpu_affinity(struct domain *d);
>  int vcpu_affinity_domctl(struct domain *d, uint32_t cmd,
>                           struct xen_domctl_vcpuaffinity *vcpuaff);
>  
> -- 
> 2.35.3
>

Patch

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 1bb4d78392..b76f673acb 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -159,10 +159,7 @@  static void thaw_domains(void)
 
     rcu_read_lock(&domlist_read_lock);
     for_each_domain ( d )
-    {
-        restore_vcpu_affinity(d);
         domain_unpause(d);
-    }
     rcu_read_unlock(&domlist_read_lock);
 }
 
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 83455fbde1..358fa077e3 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1196,76 +1196,6 @@  static void sched_reset_affinity_broken(const struct sched_unit *unit)
         v->affinity_broken = false;
 }
 
-void restore_vcpu_affinity(struct domain *d)
-{
-    unsigned int cpu = smp_processor_id();
-    struct sched_unit *unit;
-
-    ASSERT(system_state == SYS_STATE_resume);
-
-    rcu_read_lock(&sched_res_rculock);
-
-    for_each_sched_unit ( d, unit )
-    {
-        spinlock_t *lock;
-        unsigned int old_cpu = sched_unit_master(unit);
-        struct sched_resource *res;
-
-        ASSERT(!unit_runnable(unit));
-
-        /*
-         * Re-assign the initial processor as after resume we have no
-         * guarantee the old processor has come back to life again.
-         *
-         * Therefore, here, before actually unpausing the domains, we should
-         * set v->processor of each of their vCPUs to something that will
-         * make sense for the scheduler of the cpupool in which they are in.
-         */
-        lock = unit_schedule_lock_irq(unit);
-
-        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
-                    cpupool_domain_master_cpumask(d));
-        if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
-        {
-            if ( sched_check_affinity_broken(unit) )
-            {
-                sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
-                sched_reset_affinity_broken(unit);
-                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
-                            cpupool_domain_master_cpumask(d));
-            }
-
-            if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
-            {
-                /* Affinity settings of one vcpu are for the complete unit. */
-                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
-                       unit->vcpu_list);
-                sched_set_affinity(unit, &cpumask_all, NULL);
-                cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
-                            cpupool_domain_master_cpumask(d));
-            }
-        }
-
-        res = get_sched_res(cpumask_any(cpumask_scratch_cpu(cpu)));
-        sched_set_res(unit, res);
-
-        spin_unlock_irq(lock);
-
-        /* v->processor might have changed, so reacquire the lock. */
-        lock = unit_schedule_lock_irq(unit);
-        res = sched_pick_resource(unit_scheduler(unit), unit);
-        sched_set_res(unit, res);
-        spin_unlock_irq(lock);
-
-        if ( old_cpu != sched_unit_master(unit) )
-            sched_move_irqs(unit);
-    }
-
-    rcu_read_unlock(&sched_res_rculock);
-
-    domain_update_node_affinity(d);
-}
-
 /*
  * This function is used by cpu_hotplug code via cpu notifier chain
  * and from cpupools to switch schedulers on a cpu.
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 557b3229f6..072e4846aa 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1019,7 +1019,6 @@  void vcpu_set_periodic_timer(struct vcpu *v, s_time_t value);
 void sched_setup_dom0_vcpus(struct domain *d);
 int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason);
 int vcpu_set_hard_affinity(struct vcpu *v, const cpumask_t *affinity);
-void restore_vcpu_affinity(struct domain *d);
 int vcpu_affinity_domctl(struct domain *d, uint32_t cmd,
                          struct xen_domctl_vcpuaffinity *vcpuaff);