diff mbox series

[v3] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug

Message ID 20221130012445.1863104-1-qiang1.zhang@intel.com (mailing list archive)
State Superseded
Headers show
Series [v3] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug | expand

Commit Message

Zqiang Nov. 30, 2022, 1:24 a.m. UTC
Currently, invoke rcu_tasks_rude_wait_gp() to wait one rude
RCU-tasks grace period, if __num_online_cpus == 1, will return
directly, indicates the end of the rude RCU-task grace period.
suppose the system has two cpus, consider the following scenario:

        CPU0                                   CPU1 (going offline)
                                          migration/1 task:
                                      cpu_stopper_thread
                                       -> take_cpu_down
                                          -> _cpu_disable
                                           (dec __num_online_cpus)
                                          ->cpuhp_invoke_callback
                                                preempt_disable
                                                access old_data0
           task1
 del old_data0                                  .....
 synchronize_rcu_tasks_rude()
 task1 schedule out
 ....
 task2 schedule in
 rcu_tasks_rude_wait_gp()
     ->__num_online_cpus == 1
       ->return
 ....
 task1 schedule in
 ->free old_data0
                                                preempt_enable

when CPU1 dec __num_online_cpus and __num_online_cpus is equal one,
the CPU1 has not finished offline, stop_machine task(migration/1)
still running on CPU1, maybe still accessing 'old_data0', but the
'old_data0' has freed on CPU0.

In order to prevent the above scenario from happening, this commit
remove check for __num_online_cpus == 0 and add handling of calling
synchronize_rcu_tasks_generic() during early boot(when the
rcu_scheduler_active variable is RCU_SCHEDULER_INACTIVE, the scheduler
not yet initialized and only one boot-CPU online).

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tasks.h | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Paul E. McKenney Nov. 30, 2022, 5:11 p.m. UTC | #1
On Wed, Nov 30, 2022 at 09:24:45AM +0800, Zqiang wrote:
> Currently, invoke rcu_tasks_rude_wait_gp() to wait one rude
> RCU-tasks grace period, if __num_online_cpus == 1, will return
> directly, indicates the end of the rude RCU-task grace period.
> suppose the system has two cpus, consider the following scenario:
> 
>         CPU0                                   CPU1 (going offline)
>                                           migration/1 task:
>                                       cpu_stopper_thread
>                                        -> take_cpu_down
>                                           -> _cpu_disable
>                                            (dec __num_online_cpus)
>                                           ->cpuhp_invoke_callback
>                                                 preempt_disable
>                                                 access old_data0
>            task1
>  del old_data0                                  .....
>  synchronize_rcu_tasks_rude()
>  task1 schedule out
>  ....
>  task2 schedule in
>  rcu_tasks_rude_wait_gp()
>      ->__num_online_cpus == 1
>        ->return
>  ....
>  task1 schedule in
>  ->free old_data0
>                                                 preempt_enable
> 
> when CPU1 dec __num_online_cpus and __num_online_cpus is equal one,
> the CPU1 has not finished offline, stop_machine task(migration/1)
> still running on CPU1, maybe still accessing 'old_data0', but the
> 'old_data0' has freed on CPU0.
> 
> In order to prevent the above scenario from happening, this commit
> remove check for __num_online_cpus == 0 and add handling of calling
> synchronize_rcu_tasks_generic() during early boot(when the
> rcu_scheduler_active variable is RCU_SCHEDULER_INACTIVE, the scheduler
> not yet initialized and only one boot-CPU online).
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
>  kernel/rcu/tasks.h | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 4dda8e6e5707..e4f7d08bde64 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -560,8 +560,9 @@ static int __noreturn rcu_tasks_kthread(void *arg)
>  static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
>  {
>  	/* Complain if the scheduler has not started.  */
> -	WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
> -			 "synchronize_rcu_tasks called too soon");
> +	if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
> +			"synchronize_rcu_tasks called too soon"))

Much better, thank you!

But as long as you are touching this line of code, could you please also
fix it to print rtp->name instead of "synchronize_rcu_tasks"?

							Thanx, Paul

> +		return;
>  
>  	// If the grace-period kthread is running, use it.
>  	if (READ_ONCE(rtp->kthread_ptr)) {
> @@ -1064,9 +1065,6 @@ static void rcu_tasks_be_rude(struct work_struct *work)
>  // Wait for one rude RCU-tasks grace period.
>  static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
>  {
> -	if (num_online_cpus() <= 1)
> -		return;	// Fastpath for only one CPU.
> -
>  	rtp->n_ipis += cpumask_weight(cpu_online_mask);
>  	schedule_on_each_cpu(rcu_tasks_be_rude);
>  }
> -- 
> 2.25.1
>
diff mbox series

Patch

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 4dda8e6e5707..e4f7d08bde64 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -560,8 +560,9 @@  static int __noreturn rcu_tasks_kthread(void *arg)
 static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
 {
 	/* Complain if the scheduler has not started.  */
-	WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
-			 "synchronize_rcu_tasks called too soon");
+	if (WARN_ONCE(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
+			"synchronize_rcu_tasks called too soon"))
+		return;
 
 	// If the grace-period kthread is running, use it.
 	if (READ_ONCE(rtp->kthread_ptr)) {
@@ -1064,9 +1065,6 @@  static void rcu_tasks_be_rude(struct work_struct *work)
 // Wait for one rude RCU-tasks grace period.
 static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 {
-	if (num_online_cpus() <= 1)
-		return;	// Fastpath for only one CPU.
-
 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }