[1/2] rcu: Delete a redundant check in rcu_check_gp_kthread_starvation()

Message ID 20230705073020.2030-2-thunder.leizhen@huawei.com (mailing list archive)
State Superseded
Series rcu: Don't dump the stalled CPU on where RCU GP kthread last ran twice

Commit Message

Leizhen (ThunderTown) July 5, 2023, 7:30 a.m. UTC
The above condition "if (gpk)" already ensures that gp_kthread is created,
so the local variable 'cpu' cannot be negative here.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/rcu/tree_stall.h | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

Comments

Paul E. McKenney July 10, 2023, 7:03 p.m. UTC | #1
On Wed, Jul 05, 2023 at 03:30:19PM +0800, Zhen Lei wrote:
> The above condition "if (gpk)" already ensures that gp_kthread is created,
> so the local variable 'cpu' cannot be negative here.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  kernel/rcu/tree_stall.h | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index b10b8349bb2a48b..dcfaa3d5db2cbc7 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -537,13 +537,11 @@ static void rcu_check_gp_kthread_starvation(void)
>  			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
>  			pr_err("RCU grace-period kthread stack dump:\n");
>  			sched_show_task(gpk);
> -			if (cpu >= 0) {

I am not quite this trusting of the relationship between the existence
of the grace-period kthread and its CPU number being in range.  Let's
please start with something like this:

			if (!WARN_ON_ONCE(cpu < 0)) {

Please note that this is not just me.  See for example the use of the
cpumask_check() function, albeit the opposite concern.
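
For illustration, the gating pattern suggested above can be modeled in
userspace C roughly as follows.  This is only a sketch under stated
assumptions: model_warn_on_once() and cpu_dump_allowed() are hypothetical
names, not kernel functions, and the real WARN_ON_ONCE() is a macro that
also dumps a backtrace.  The point it shows is that the warning evaluates
to its condition, so the caller can branch on it directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings actually printed, kept only for inspection. */
static int warn_count;

/*
 * Hypothetical userspace model of WARN_ON_ONCE(): complain the first
 * time cond is true, and always return cond so callers can test it.
 */
static bool model_warn_on_once(bool cond, bool *warned, const char *what)
{
	assert(warned != NULL);
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Models "if (!WARN_ON_ONCE(cpu < 0)) { ... }": returns true when the
 * CPU number is non-negative and the dump may proceed.
 */
static bool cpu_dump_allowed(int cpu)
{
	static bool warned;	/* one flag per call site, as in the kernel macro */

	return !model_warn_on_once(cpu < 0, &warned, "cpu < 0");
}
```

With this shape, a negative CPU number is reported once and the dump is
skipped every time, while valid CPU numbers take the normal path silently.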

> -				if (cpu_is_offline(cpu)) {
> -					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
> -				} else  {
> -					pr_err("Stack dump where RCU GP kthread last ran:\n");
> -					dump_cpu_task(cpu);
> -				}
> +			if (cpu_is_offline(cpu)) {
> +				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
> +			} else  {
> +				pr_err("Stack dump where RCU GP kthread last ran:\n");
> +				dump_cpu_task(cpu);
>  			}
>  			wake_up_process(gpk);
>  		}
> -- 
> 2.25.1
>
Leizhen (ThunderTown) July 11, 2023, 3:20 a.m. UTC | #2
On 2023/7/11 3:03, Paul E. McKenney wrote:
> On Wed, Jul 05, 2023 at 03:30:19PM +0800, Zhen Lei wrote:
>> The above condition "if (gpk)" already ensures that gp_kthread is created,
>> so the local variable 'cpu' cannot be negative here.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  kernel/rcu/tree_stall.h | 12 +++++-------
>>  1 file changed, 5 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
>> index b10b8349bb2a48b..dcfaa3d5db2cbc7 100644
>> --- a/kernel/rcu/tree_stall.h
>> +++ b/kernel/rcu/tree_stall.h
>> @@ -537,13 +537,11 @@ static void rcu_check_gp_kthread_starvation(void)
>>  			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
>>  			pr_err("RCU grace-period kthread stack dump:\n");
>>  			sched_show_task(gpk);
>> -			if (cpu >= 0) {
> 
> I am not quite this trusting of the relationship between the existence
> of the grace-period kthread and its CPU number being in range.  Let's
> please start with something like this:
> 
> 			if (!WARN_ON_ONCE(cpu < 0)) {
> 
> Please note that this is not just me.  See for example the use of the
> cpumask_check() function, albeit the opposite concern.

git grep -wn "\->cpu" kernel/ include/
kernel/kthread.c:583:   to_kthread(p)->cpu = cpu;				//kthread_create_on_cpu()
kernel/sched/sched.h:2024:      WRITE_ONCE(task_thread_info(p)->cpu, cpu);	//__set_task_cpu()
include/linux/sched.h:2250:     return READ_ONCE(task_thread_info(p)->cpu);	//task_cpu()

git grep -wn "\.cpu" kernel/ include/						// No task-related matches; results omitted.

Therefore, only one path, "set_task_cpu()-->__set_task_cpu()", can dynamically
change the value of task_cpu(p), and set_task_cpu() already checks that the new CPU is online:
set_task_cpu
	WARN_ON_ONCE(!cpu_online(new_cpu));
	__set_task_cpu(p, new_cpu);

In addition, task_struct has the member 'on_rq', so there is no reason to set the
member 'cpu' to an invalid value when a task leaves the run queue.
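
The argument above can be sketched as a small userspace C model.  Every
name here (model_task, model_set_task_cpu(), MODEL_NR_CPUS and so on) is
hypothetical and only stands in for the kernel code quoted above; it is
not the scheduler's implementation.  As in the two quoted lines, the check
warns but does not block the assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4			/* assumed CPU count for the model */

struct model_task {
	int cpu;			/* models task_thread_info(p)->cpu */
};

/* Assumed online map: CPU 2 is offline in this model. */
static bool model_cpu_online[MODEL_NR_CPUS] = { true, true, false, true };

/* Counts how often the online check fired, for inspection only. */
static int model_warnings;

static bool model_cpu_is_online(int cpu)
{
	return cpu >= 0 && cpu < MODEL_NR_CPUS && model_cpu_online[cpu];
}

/* The single writer of ->cpu in this model, mirroring set_task_cpu(). */
static void model_set_task_cpu(struct model_task *p, int new_cpu)
{
	assert(p != NULL);
	if (!model_cpu_is_online(new_cpu))
		model_warnings++;	/* stands in for WARN_ON_ONCE(!cpu_online(new_cpu)) */
	p->cpu = new_cpu;		/* stands in for __set_task_cpu(p, new_cpu) */
}

/* The single reader, mirroring task_cpu(). */
static int model_task_cpu(const struct model_task *p)
{
	assert(p != NULL);
	return p->cpu;
}
```

Because every update goes through the one checked writer, a negative value
read back from model_task_cpu() would already have triggered the warning at
the point where it was stored.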

Sorry, these two patches were posted too quickly.  For the past few days I have
regretted not including this explanation in the commit description.


> 
>> -				if (cpu_is_offline(cpu)) {
>> -					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
>> -				} else  {
>> -					pr_err("Stack dump where RCU GP kthread last ran:\n");
>> -					dump_cpu_task(cpu);
>> -				}
>> +			if (cpu_is_offline(cpu)) {
>> +				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
>> +			} else  {
>> +				pr_err("Stack dump where RCU GP kthread last ran:\n");
>> +				dump_cpu_task(cpu);
>>  			}
>>  			wake_up_process(gpk);
>>  		}
>> -- 
>> 2.25.1
>>
> .
>
Paul E. McKenney July 11, 2023, 4:48 p.m. UTC | #3
On Tue, Jul 11, 2023 at 11:20:07AM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2023/7/11 3:03, Paul E. McKenney wrote:
> > On Wed, Jul 05, 2023 at 03:30:19PM +0800, Zhen Lei wrote:
> >> The above condition "if (gpk)" already ensures that gp_kthread is created,
> >> so the local variable 'cpu' cannot be negative here.
> >>
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> ---
> >>  kernel/rcu/tree_stall.h | 12 +++++-------
> >>  1 file changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> >> index b10b8349bb2a48b..dcfaa3d5db2cbc7 100644
> >> --- a/kernel/rcu/tree_stall.h
> >> +++ b/kernel/rcu/tree_stall.h
> >> @@ -537,13 +537,11 @@ static void rcu_check_gp_kthread_starvation(void)
> >>  			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
> >>  			pr_err("RCU grace-period kthread stack dump:\n");
> >>  			sched_show_task(gpk);
> >> -			if (cpu >= 0) {
> > 
> > I am not quite this trusting of the relationship between the existence
> > of the grace-period kthread and its CPU number being in range.  Let's
> > please start with something like this:
> > 
> > 			if (!WARN_ON_ONCE(cpu < 0)) {
> > 
> > Please note that this is not just me.  See for example the use of the
> > cpumask_check() function, albeit the opposite concern.
> 
> git grep -wn "\->cpu" kernel/ include/
> kernel/kthread.c:583:   to_kthread(p)->cpu = cpu;				//kthread_create_on_cpu()
> kernel/sched/sched.h:2024:      WRITE_ONCE(task_thread_info(p)->cpu, cpu);	//__set_task_cpu()
> include/linux/sched.h:2250:     return READ_ONCE(task_thread_info(p)->cpu);	//task_cpu()
> 
> git grep -wn "\.cpu" kernel/ include/						// No task-related matches; results omitted.
> 
> Therefore, only one path, "set_task_cpu()-->__set_task_cpu()", can dynamically
> change the value of task_cpu(p), and set_task_cpu() already checks that the new CPU is online:
> set_task_cpu
> 	WARN_ON_ONCE(!cpu_online(new_cpu));
> 	__set_task_cpu(p, new_cpu);
> 
> In addition, task_struct has the member 'on_rq', so there is no reason to set the
> member 'cpu' to an invalid value when a task leaves the run queue.

Thank you for digging into this!  Given that, as you say, we can dispense
with the check.

> Sorry, these two patches were posted too quickly.  For the past few days I have
> regretted not including this explanation in the commit description.

Please do resend the patches with this explanation in the commit log.
And please don't worry about making the English pretty, as I can always
wordsmith.

							Thanx, Paul

> >> -				if (cpu_is_offline(cpu)) {
> >> -					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
> >> -				} else  {
> >> -					pr_err("Stack dump where RCU GP kthread last ran:\n");
> >> -					dump_cpu_task(cpu);
> >> -				}
> >> +			if (cpu_is_offline(cpu)) {
> >> +				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
> >> +			} else  {
> >> +				pr_err("Stack dump where RCU GP kthread last ran:\n");
> >> +				dump_cpu_task(cpu);
> >>  			}
> >>  			wake_up_process(gpk);
> >>  		}
> >> -- 
> >> 2.25.1
> >>
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei
Leizhen (ThunderTown) July 13, 2023, 2:03 a.m. UTC | #4
On 2023/7/12 0:48, Paul E. McKenney wrote:
> On Tue, Jul 11, 2023 at 11:20:07AM +0800, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2023/7/11 3:03, Paul E. McKenney wrote:
>>> On Wed, Jul 05, 2023 at 03:30:19PM +0800, Zhen Lei wrote:
>>>> The above condition "if (gpk)" already ensures that gp_kthread is created,
>>>> so the local variable 'cpu' cannot be negative here.
>>>>
>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>> ---
>>>>  kernel/rcu/tree_stall.h | 12 +++++-------
>>>>  1 file changed, 5 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
>>>> index b10b8349bb2a48b..dcfaa3d5db2cbc7 100644
>>>> --- a/kernel/rcu/tree_stall.h
>>>> +++ b/kernel/rcu/tree_stall.h
>>>> @@ -537,13 +537,11 @@ static void rcu_check_gp_kthread_starvation(void)
>>>>  			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
>>>>  			pr_err("RCU grace-period kthread stack dump:\n");
>>>>  			sched_show_task(gpk);
>>>> -			if (cpu >= 0) {
>>>
>>> I am not quite this trusting of the relationship between the existence
>>> of the grace-period kthread and its CPU number being in range.  Let's
>>> please start with something like this:
>>>
>>> 			if (!WARN_ON_ONCE(cpu < 0)) {
>>>
>>> Please note that this is not just me.  See for example the use of the
>>> cpumask_check() function, albeit the opposite concern.
>>
>> git grep -wn "\->cpu" kernel/ include/
>> kernel/kthread.c:583:   to_kthread(p)->cpu = cpu;				//kthread_create_on_cpu()
>> kernel/sched/sched.h:2024:      WRITE_ONCE(task_thread_info(p)->cpu, cpu);	//__set_task_cpu()
>> include/linux/sched.h:2250:     return READ_ONCE(task_thread_info(p)->cpu);	//task_cpu()
>>
>> git grep -wn "\.cpu" kernel/ include/						// No task-related matches; results omitted.
>>
>> Therefore, only one path, "set_task_cpu()-->__set_task_cpu()", can dynamically
>> change the value of task_cpu(p), and set_task_cpu() already checks that the new CPU is online:
>> set_task_cpu
>> 	WARN_ON_ONCE(!cpu_online(new_cpu));
>> 	__set_task_cpu(p, new_cpu);
>>
>> In addition, task_struct has the member 'on_rq', so there is no reason to set the
>> member 'cpu' to an invalid value when a task leaves the run queue.
> 
> Thank you for digging into this!  Given that, as you say, we can dispense
> with the check.
> 
>> Sorry, these two patches were posted too quickly.  For the past few days I have
>> regretted not including this explanation in the commit description.
> 
> Please do resend the patches with this explanation in the commit log.
> And please don't worry about making the English pretty, as I can always
> wordsmith.

OK, thank you very much.

> 
> 							Thanx, Paul
> 
>>>> -				if (cpu_is_offline(cpu)) {
>>>> -					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
>>>> -				} else  {
>>>> -					pr_err("Stack dump where RCU GP kthread last ran:\n");
>>>> -					dump_cpu_task(cpu);
>>>> -				}
>>>> +			if (cpu_is_offline(cpu)) {
>>>> +				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
>>>> +			} else  {
>>>> +				pr_err("Stack dump where RCU GP kthread last ran:\n");
>>>> +				dump_cpu_task(cpu);
>>>>  			}
>>>>  			wake_up_process(gpk);
>>>>  		}
>>>> -- 
>>>> 2.25.1
>>>>
>>> .
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
> .
>
Patch

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index b10b8349bb2a48b..dcfaa3d5db2cbc7 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -537,13 +537,11 @@  static void rcu_check_gp_kthread_starvation(void)
 			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
 			pr_err("RCU grace-period kthread stack dump:\n");
 			sched_show_task(gpk);
-			if (cpu >= 0) {
-				if (cpu_is_offline(cpu)) {
-					pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
-				} else  {
-					pr_err("Stack dump where RCU GP kthread last ran:\n");
-					dump_cpu_task(cpu);
-				}
+			if (cpu_is_offline(cpu)) {
+				pr_err("RCU GP kthread last ran on offline CPU %d.\n", cpu);
+			} else  {
+				pr_err("Stack dump where RCU GP kthread last ran:\n");
+				dump_cpu_task(cpu);
 			}
 			wake_up_process(gpk);
 		}