
rcu-tasks: Allow RCU-Task trace stall warning dump late IPI CPU stacks

Message ID 20230221043219.2384044-1-qiang1.zhang@intel.com (mailing list archive)
State New, archived

Commit Message

Zqiang Feb. 21, 2023, 4:32 a.m. UTC
The task structure's ->trc_ipi_to_cpu field and the per-CPU
trc_ipi_to_cpu variable together record whether an IPI has completed.
If the per-CPU trc_ipi_to_cpu is true and the task structure's
->trc_ipi_to_cpu is non-negative, the IPI has not yet completed. If the
IPI remains unresponsive for a long time for some reason, it can cause
an RCU Tasks Trace stall. This commit therefore dumps the stacks of
CPUs with late IPIs to show the path that each such CPU is executing.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 In the real world, the IPI delay will be very small, so the
 probability of triggering dump_cpu_task() may be very low.
 If this is just noise, please ignore it.

 kernel/rcu/tasks.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
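
The decision the patched loop makes for each CPU can be sketched in
user space. This is a minimal model, not kernel code: ipi_outstanding[],
cpu_offline[], NR_MODEL_CPUS, classify_cpu() and show_stalled_ipi_model()
are hypothetical stand-ins for the per-CPU trc_ipi_to_cpu flag,
cpu_is_offline() and show_stalled_ipi_trace(), and the dump_cpu_task()
call is represented only by a printed message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_MODEL_CPUS 4	/* hypothetical CPU count for this model */

/* Stand-in for the per-CPU trc_ipi_to_cpu flag (true: IPI not yet completed). */
static bool ipi_outstanding[NR_MODEL_CPUS];
/* Stand-in for CPU hotplug state; the patch itself uses cpu_is_offline(). */
static bool cpu_offline[NR_MODEL_CPUS];

enum stall_action {
	ACT_NONE,		/* no IPI outstanding, nothing to report */
	ACT_REPORT_OFFLINE,	/* IPI outstanding, but the CPU is offline */
	ACT_DUMP_TASK,		/* IPI outstanding on an online CPU */
};

/* Decide what the stall warning does for one CPU. */
static enum stall_action classify_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	if (!ipi_outstanding[cpu])
		return ACT_NONE;
	return cpu_offline[cpu] ? ACT_REPORT_OFFLINE : ACT_DUMP_TASK;
}

/* Walk all modeled CPUs; return how many stack dumps would be requested. */
static int show_stalled_ipi_model(void)
{
	int cpu, dumps = 0;

	for (cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		switch (classify_cpu(cpu)) {
		case ACT_NONE:
			break;
		case ACT_REPORT_OFFLINE:
			printf("\tIPI outstanding to CPU %d\n", cpu);
			printf("offline CPU %d blocking gp\n", cpu);
			break;
		case ACT_DUMP_TASK:
			printf("\tIPI outstanding to CPU %d\n", cpu);
			/* The patch calls dump_cpu_task(cpu) at this point. */
			printf("(model) would dump the task on CPU %d\n", cpu);
			dumps++;
			break;
		}
	}
	return dumps;
}
```

To try it, set a few entries of ipi_outstanding[] and cpu_offline[] and
call show_stalled_ipi_model() from a small main(). Only CPUs that are
both flagged and online reach the dump branch, which is the behavior
the patch adds on top of the existing "IPI outstanding" message.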

Comments

Zhuo, Qiuxu Feb. 22, 2023, 5:44 a.m. UTC | #1
> From: Zqiang <qiang1.zhang@intel.com>
> Sent: Tuesday, February 21, 2023 12:32 PM
> To: paulmck@kernel.org; frederic@kernel.org; joel@joelfernandes.org
> Cc: rcu@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH] rcu-tasks: Allow RCU-Task trace stall warning dump late IPI
> CPU stacks
> 
> The task structure's ->trc_ipi_to_cpu field and the per-CPU
> trc_ipi_to_cpu variable together record whether an IPI has completed.
> If the per-CPU trc_ipi_to_cpu is true and the task structure's
> ->trc_ipi_to_cpu is non-negative, the IPI has not yet completed. If the
> IPI remains unresponsive for a long time for some reason, it can cause
> an RCU Tasks Trace stall. This commit therefore dumps the stacks of
> CPUs with late IPIs to show the path that each such CPU is executing.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
>  In the real world, the IPI delay will be very small, so the
>  probability of triggering dump_cpu_task() may be very low.
>  If this is just noise, please ignore it.
> 
>  kernel/rcu/tasks.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index baf7ec178155..85237fc1d0f0 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -1658,8 +1658,13 @@ static void show_stalled_ipi_trace(void)
>  	int cpu;
> 
>  	for_each_possible_cpu(cpu)
> -		if (per_cpu(trc_ipi_to_cpu, cpu))
> +		if (per_cpu(trc_ipi_to_cpu, cpu)) {
>  			pr_alert("\tIPI outstanding to CPU %d\n", cpu);
> +			if (cpu_is_offline(cpu))
> +				pr_alert("offline CPU %d blocking gp\n", cpu);
> +			else
> +				dump_cpu_task(cpu);

check_all_holdout_tasks_trace() -> show_stalled_task_trace() has already shown the states and traces of the tasks stalling the current RCU Tasks Trace grace period.
Perhaps we don't need to dump these tasks again here.

> +		}
>  }
> 
>  /* Do one scan of the holdout list. */
> --
> 2.25.1

Patch

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index baf7ec178155..85237fc1d0f0 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1658,8 +1658,13 @@  static void show_stalled_ipi_trace(void)
 	int cpu;
 
 	for_each_possible_cpu(cpu)
-		if (per_cpu(trc_ipi_to_cpu, cpu))
+		if (per_cpu(trc_ipi_to_cpu, cpu)) {
 			pr_alert("\tIPI outstanding to CPU %d\n", cpu);
+			if (cpu_is_offline(cpu))
+				pr_alert("offline CPU %d blocking gp\n", cpu);
+			else
+				dump_cpu_task(cpu);
+		}
 }
 
 /* Do one scan of the holdout list. */