Message ID | 20220730102330.1255-2-thunder.leizhen@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | rcu: Display registers of self-detected stall as far as possible | expand |
On Sat, Jul 30, 2022 at 06:23:28PM +0800, Zhen Lei wrote: > The backtrace of cpu_curr(other_cpu) is unwinded based on the 'fp' saved > during its last switch-out. For the most part, it's out of date. So try > to use NMI to get the backtrace first, just like those functions in > "tree_stall.h" did. Such as rcu_dump_cpu_stacks(). > > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Much better, thank you! > --- > kernel/rcu/tree_exp.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h > index 0f70f62039a9090..21381697de23f0b 100644 > --- a/kernel/rcu/tree_exp.h > +++ b/kernel/rcu/tree_exp.h > @@ -665,7 +665,8 @@ static void synchronize_rcu_expedited_wait(void) > mask = leaf_node_cpu_bit(rnp, cpu); > if (!(READ_ONCE(rnp->expmask) & mask)) > continue; > - dump_cpu_task(cpu); > + if (!trigger_single_cpu_backtrace(cpu)) > + dump_cpu_task(cpu); But why not just leave this unchanged, rather than adding the call to trigger_single_cpu_backtrace() in this patch and then removing it in the next patch? Thanx, Paul > } > } > jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; > -- > 2.25.1 >
On 2022/8/2 7:14, Paul E. McKenney wrote: > On Sat, Jul 30, 2022 at 06:23:28PM +0800, Zhen Lei wrote: >> The backtrace of cpu_curr(other_cpu) is unwinded based on the 'fp' saved >> during its last switch-out. For the most part, it's out of date. So try >> to use NMI to get the backtrace first, just like those functions in >> "tree_stall.h" did. Such as rcu_dump_cpu_stacks(). >> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > > Much better, thank you! > >> --- >> kernel/rcu/tree_exp.h | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h >> index 0f70f62039a9090..21381697de23f0b 100644 >> --- a/kernel/rcu/tree_exp.h >> +++ b/kernel/rcu/tree_exp.h >> @@ -665,7 +665,8 @@ static void synchronize_rcu_expedited_wait(void) >> mask = leaf_node_cpu_bit(rnp, cpu); >> if (!(READ_ONCE(rnp->expmask) & mask)) >> continue; >> - dump_cpu_task(cpu); >> + if (!trigger_single_cpu_backtrace(cpu)) >> + dump_cpu_task(cpu); > > But why not just leave this unchanged, rather than adding the call to > trigger_single_cpu_backtrace() in this patch and then removing it in > the next patch? To make the patch clear and easy to describe. Otherwise, I need to give an additional description of it in the next patch, because I searched all dump_cpu_task(). This seems to make the next patch less simple. Some of the patch sets I've seen have been done step by step like this. But I can't find it now. On the other hand, this patch is a small fix. Earlier versions may only backport it, not the next cleanup patch. > > Thanx, Paul > >> } >> } >> jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; >> -- >> 2.25.1 >> > . >
On Tue, Aug 02, 2022 at 10:06:00AM +0800, Leizhen (ThunderTown) wrote: > > > On 2022/8/2 7:14, Paul E. McKenney wrote: > > On Sat, Jul 30, 2022 at 06:23:28PM +0800, Zhen Lei wrote: > >> The backtrace of cpu_curr(other_cpu) is unwinded based on the 'fp' saved > >> during its last switch-out. For the most part, it's out of date. So try > >> to use NMI to get the backtrace first, just like those functions in > >> "tree_stall.h" did. Such as rcu_dump_cpu_stacks(). > >> > >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> > > > > Much better, thank you! > > > >> --- > >> kernel/rcu/tree_exp.h | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h > >> index 0f70f62039a9090..21381697de23f0b 100644 > >> --- a/kernel/rcu/tree_exp.h > >> +++ b/kernel/rcu/tree_exp.h > >> @@ -665,7 +665,8 @@ static void synchronize_rcu_expedited_wait(void) > >> mask = leaf_node_cpu_bit(rnp, cpu); > >> if (!(READ_ONCE(rnp->expmask) & mask)) > >> continue; > >> - dump_cpu_task(cpu); > >> + if (!trigger_single_cpu_backtrace(cpu)) > >> + dump_cpu_task(cpu); > > > > But why not just leave this unchanged, rather than adding the call to > > trigger_single_cpu_backtrace() in this patch and then removing it in > > the next patch? > > To make the patch clear and easy to describe. Otherwise, I need to > give an additional description of it in the next patch, because I > searched all dump_cpu_task(). This seems to make the next patch > less simple. > > Some of the patch sets I've seen have been done step by step like > this. But I can't find it now. > > On the other hand, this patch is a small fix. Earlier versions may > only backport it, not the next cleanup patch. You do have the option of doing a Cc to stable to control the backporting, if that is a potential issue for you. On the commit log, just say that the one use case already avoided doing the trigger_single_cpu_backtrace(), and thus did not need to be updated. So please resend the series, but without the undo/redo. There would thus be two patches rather than three, but there are plenty of other things that need fixing anyway. Thanx, Paul > >> } > >> } > >> jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; > >> -- > >> 2.25.1 > >> > > . > > > > -- > Regards, > Zhen Lei
On 2022/8/4 8:07, Paul E. McKenney wrote: > On Tue, Aug 02, 2022 at 10:06:00AM +0800, Leizhen (ThunderTown) wrote: >> >> >> On 2022/8/2 7:14, Paul E. McKenney wrote: >>> On Sat, Jul 30, 2022 at 06:23:28PM +0800, Zhen Lei wrote: >>>> The backtrace of cpu_curr(other_cpu) is unwinded based on the 'fp' saved >>>> during its last switch-out. For the most part, it's out of date. So try >>>> to use NMI to get the backtrace first, just like those functions in >>>> "tree_stall.h" did. Such as rcu_dump_cpu_stacks(). >>>> >>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> >>> >>> Much better, thank you! >>> >>>> --- >>>> kernel/rcu/tree_exp.h | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h >>>> index 0f70f62039a9090..21381697de23f0b 100644 >>>> --- a/kernel/rcu/tree_exp.h >>>> +++ b/kernel/rcu/tree_exp.h >>>> @@ -665,7 +665,8 @@ static void synchronize_rcu_expedited_wait(void) >>>> mask = leaf_node_cpu_bit(rnp, cpu); >>>> if (!(READ_ONCE(rnp->expmask) & mask)) >>>> continue; >>>> - dump_cpu_task(cpu); >>>> + if (!trigger_single_cpu_backtrace(cpu)) >>>> + dump_cpu_task(cpu); >>> >>> But why not just leave this unchanged, rather than adding the call to >>> trigger_single_cpu_backtrace() in this patch and then removing it in >>> the next patch? >> >> To make the patch clear and easy to describe. Otherwise, I need to >> give an additional description of it in the next patch, because I >> searched all dump_cpu_task(). This seems to make the next patch >> less simple. >> >> Some of the patch sets I've seen have been done step by step like >> this. But I can't find it now. >> >> On the other hand, this patch is a small fix. Earlier versions may >> only backport it, not the next cleanup patch. > > You do have the option of doing a Cc to stable to control the backporting, > if that is a potential issue for you. > > On the commit log, just say that the one use case already avoided doing > the trigger_single_cpu_backtrace(), and thus did not need to be updated. > > So please resend the series, but without the undo/redo. There would > thus be two patches rather than three, but there are plenty of other > things that need fixing anyway. OK, thanks. > > Thanx, Paul > >>>> } >>>> } >>>> jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; >>>> -- >>>> 2.25.1 >>>> >>> . >>> >> >> -- >> Regards, >> Zhen Lei > . >
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 0f70f62039a9090..21381697de23f0b 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -665,7 +665,8 @@ static void synchronize_rcu_expedited_wait(void) mask = leaf_node_cpu_bit(rnp, cpu); if (!(READ_ONCE(rnp->expmask) & mask)) continue; - dump_cpu_task(cpu); + if (!trigger_single_cpu_backtrace(cpu)) + dump_cpu_task(cpu); } } jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
The backtrace of cpu_curr(other_cpu) is unwinded based on the 'fp' saved during its last switch-out. For the most part, it's out of date. So try to use NMI to get the backtrace first, just like those functions in "tree_stall.h" did. Such as rcu_dump_cpu_stacks(). Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> --- kernel/rcu/tree_exp.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)