[v6,0/2] rcu: Add RCU stall diagnosis information

Message ID	20221109093739.187-1-thunder.leizhen@huawei.com (mailing list archive)
Headers	show Return-Path: <rcu-owner@kernel.org> From: Zhen Lei <thunder.leizhen@huawei.com> To: "Paul E . McKenney" <paulmck@kernel.org>, Frederic Weisbecker <frederic@kernel.org>, Neeraj Upadhyay <quic_neeraju@quicinc.com>, "Josh Triplett" <josh@joshtriplett.org>, Steven Rostedt <rostedt@goodmis.org>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, Lai Jiangshan <jiangshanlai@gmail.com>, Joel Fernandes <joel@joelfernandes.org>, <rcu@vger.kernel.org>, <linux-kernel@vger.kernel.org> CC: Zhen Lei <thunder.leizhen@huawei.com>, Robert Elliott <elliott@hpe.com> Subject: [PATCH v6 0/2] rcu: Add RCU stall diagnosis information Date: Wed, 9 Nov 2022 17:37:36 +0800 Message-ID: <20221109093739.187-1-thunder.leizhen@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII Precedence: bulk
Series	rcu: Add RCU stall diagnosis information \| expand [v6,0/2] rcu: Add RCU stall diagnosis information [v6,1/2] rcu: Add RCU stall diagnosis information [v6,2/2] doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information

Leizhen (ThunderTown) Nov. 9, 2022, 9:37 a.m. UTC

v5 --> v6:
1. When there are more than two continuous RCU stallings, correctly handle the
   value of the second and subsequent sampling periods. Update comments and
   document.
   Thanks to Elliott, Robert for the test.
2. Change "rcu stall" to "RCU stall".

v4 --> v5:
1. Resolve a git am conflict. No code change.

v3 --> v4:
1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.

v2 --> v3:
1. Fix the return type of kstat_cpu_irqs_sum()
2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
   rcupdate.rcu_cpu_stall_deep_debug.
3. Add comments and normalize local variable name


v1 --> v2:
1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
   kcpustat_this_cpu cannot be used.
@@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
        if (r->gp_seq != rdp->gp_seq)
                return;

-       cpustat = kcpustat_this_cpu->cpustat;
+       cpustat = kcpustat_cpu(cpu).cpustat;
2. Move the start point of statistics from rcu_stall_kick_kthreads() to
   rcu_implicit_dynticks_qs(), removing the dependency on irq_work.

v1:
In some extreme cases, such as the I/O pressure test, the CPU usage may
be 100%, causing RCU stall. In this case, the printed information about
current is not useful. Displays the number and usage of hard interrupts,
soft interrupts, and context switches that are generated within half of
the CPU stall timeout, can help us make a general judgment. In other
cases, we can preliminarily determine whether an infinite loop occurs
when local_irq, local_bh or preempt is disabled.

Zhen Lei (2):
  rcu: Add RCU stall diagnosis information
  doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information

 Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  6 ++
 kernel/rcu/Kconfig.debug                      | 11 +++
 kernel/rcu/rcu.h                              |  1 +
 kernel/rcu/tree.c                             | 17 ++++
 kernel/rcu/tree.h                             | 19 ++++
 kernel/rcu/tree_stall.h                       | 29 ++++++
 kernel/rcu/update.c                           |  2 +
 8 files changed, 173 insertions(+)

Frederic Weisbecker Nov. 9, 2022, 3:26 p.m. UTC | #1

Hi Zhen Lei,

On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> v5 --> v6:
> 1. When there are more than two continuous RCU stallings, correctly handle the
>    value of the second and subsequent sampling periods. Update comments and
>    document.
>    Thanks to Elliott, Robert for the test.
> 2. Change "rcu stall" to "RCU stall".
> 
> v4 --> v5:
> 1. Resolve a git am conflict. No code change.
> 
> v3 --> v4:
> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> 
> v2 --> v3:
> 1. Fix the return type of kstat_cpu_irqs_sum()
> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
>    rcupdate.rcu_cpu_stall_deep_debug.
> 3. Add comments and normalize local variable name
> 
> 
> v1 --> v2:
> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
>    kcpustat_this_cpu cannot be used.
> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
>         if (r->gp_seq != rdp->gp_seq)
>                 return;
> 
> -       cpustat = kcpustat_this_cpu->cpustat;
> +       cpustat = kcpustat_cpu(cpu).cpustat;
> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> 
> v1:
> In some extreme cases, such as the I/O pressure test, the CPU usage may
> be 100%, causing RCU stall. In this case, the printed information about
> current is not useful. Displays the number and usage of hard interrupts,
> soft interrupts, and context switches that are generated within half of
> the CPU stall timeout, can help us make a general judgment. In other
> cases, we can preliminarily determine whether an infinite loop occurs
> when local_irq, local_bh or preempt is disabled.

That looks useful but I have to ask: what does it bring that the softlockup
and hardlockup watchdog can not already solve?

Thanks.

> 
> Zhen Lei (2):
>   rcu: Add RCU stall diagnosis information
>   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
> 
>  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
>  .../admin-guide/kernel-parameters.txt         |  6 ++
>  kernel/rcu/Kconfig.debug                      | 11 +++
>  kernel/rcu/rcu.h                              |  1 +
>  kernel/rcu/tree.c                             | 17 ++++
>  kernel/rcu/tree.h                             | 19 ++++
>  kernel/rcu/tree_stall.h                       | 29 ++++++
>  kernel/rcu/update.c                           |  2 +
>  8 files changed, 173 insertions(+)
> 
> -- 
> 2.25.1
>

Paul E. McKenney Nov. 9, 2022, 3:59 p.m. UTC | #2

On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
> Hi Zhen Lei,
> 
> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> > v5 --> v6:
> > 1. When there are more than two continuous RCU stallings, correctly handle the
> >    value of the second and subsequent sampling periods. Update comments and
> >    document.
> >    Thanks to Elliott, Robert for the test.
> > 2. Change "rcu stall" to "RCU stall".
> > 
> > v4 --> v5:
> > 1. Resolve a git am conflict. No code change.
> > 
> > v3 --> v4:
> > 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> > 
> > v2 --> v3:
> > 1. Fix the return type of kstat_cpu_irqs_sum()
> > 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
> >    rcupdate.rcu_cpu_stall_deep_debug.
> > 3. Add comments and normalize local variable name
> > 
> > 
> > v1 --> v2:
> > 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
> >    kcpustat_this_cpu cannot be used.
> > @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
> >         if (r->gp_seq != rdp->gp_seq)
> >                 return;
> > 
> > -       cpustat = kcpustat_this_cpu->cpustat;
> > +       cpustat = kcpustat_cpu(cpu).cpustat;
> > 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
> >    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> > 
> > v1:
> > In some extreme cases, such as the I/O pressure test, the CPU usage may
> > be 100%, causing RCU stall. In this case, the printed information about
> > current is not useful. Displays the number and usage of hard interrupts,
> > soft interrupts, and context switches that are generated within half of
> > the CPU stall timeout, can help us make a general judgment. In other
> > cases, we can preliminarily determine whether an infinite loop occurs
> > when local_irq, local_bh or preempt is disabled.
> 
> That looks useful but I have to ask: what does it bring that the softlockup
> and hardlockup watchdog can not already solve?

This is a good point.  One possible benefit is putting the needed information
in one spot, for example, in cases where the soft/hard lockup timeouts are
significantly different than the RCU CPU stall warning timeout.

Thoughts?

							Thanx, Paul

> Thanks.
> 
> > 
> > Zhen Lei (2):
> >   rcu: Add RCU stall diagnosis information
> >   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
> > 
> >  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
> >  .../admin-guide/kernel-parameters.txt         |  6 ++
> >  kernel/rcu/Kconfig.debug                      | 11 +++
> >  kernel/rcu/rcu.h                              |  1 +
> >  kernel/rcu/tree.c                             | 17 ++++
> >  kernel/rcu/tree.h                             | 19 ++++
> >  kernel/rcu/tree_stall.h                       | 29 ++++++
> >  kernel/rcu/update.c                           |  2 +
> >  8 files changed, 173 insertions(+)
> > 
> > -- 
> > 2.25.1
> >

Frederic Weisbecker Nov. 9, 2022, 5:03 p.m. UTC | #3

On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
> > Hi Zhen Lei,
> > 
> > On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> > > v5 --> v6:
> > > 1. When there are more than two continuous RCU stallings, correctly handle the
> > >    value of the second and subsequent sampling periods. Update comments and
> > >    document.
> > >    Thanks to Elliott, Robert for the test.
> > > 2. Change "rcu stall" to "RCU stall".
> > > 
> > > v4 --> v5:
> > > 1. Resolve a git am conflict. No code change.
> > > 
> > > v3 --> v4:
> > > 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> > > 
> > > v2 --> v3:
> > > 1. Fix the return type of kstat_cpu_irqs_sum()
> > > 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
> > >    rcupdate.rcu_cpu_stall_deep_debug.
> > > 3. Add comments and normalize local variable name
> > > 
> > > 
> > > v1 --> v2:
> > > 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
> > >    kcpustat_this_cpu cannot be used.
> > > @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
> > >         if (r->gp_seq != rdp->gp_seq)
> > >                 return;
> > > 
> > > -       cpustat = kcpustat_this_cpu->cpustat;
> > > +       cpustat = kcpustat_cpu(cpu).cpustat;
> > > 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
> > >    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> > > 
> > > v1:
> > > In some extreme cases, such as the I/O pressure test, the CPU usage may
> > > be 100%, causing RCU stall. In this case, the printed information about
> > > current is not useful. Displays the number and usage of hard interrupts,
> > > soft interrupts, and context switches that are generated within half of
> > > the CPU stall timeout, can help us make a general judgment. In other
> > > cases, we can preliminarily determine whether an infinite loop occurs
> > > when local_irq, local_bh or preempt is disabled.
> > 
> > That looks useful but I have to ask: what does it bring that the softlockup
> > and hardlockup watchdog can not already solve?
> 
> This is a good point.  One possible benefit is putting the needed information
> in one spot, for example, in cases where the soft/hard lockup timeouts are
> significantly different than the RCU CPU stall warning timeout.

Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
unless all CPUs are caught into a hardlockup, in which case only the hardlockup
detector has a chance.

Anyway I would say that in this case just lower the delay for the lockup
detectors to consider the situation is a lockup?

Thanks.


> 
> Thoughts?
> 
> 							Thanx, Paul
> 
> > Thanks.
> > 
> > > 
> > > Zhen Lei (2):
> > >   rcu: Add RCU stall diagnosis information
> > >   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
> > > 
> > >  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
> > >  .../admin-guide/kernel-parameters.txt         |  6 ++
> > >  kernel/rcu/Kconfig.debug                      | 11 +++
> > >  kernel/rcu/rcu.h                              |  1 +
> > >  kernel/rcu/tree.c                             | 17 ++++
> > >  kernel/rcu/tree.h                             | 19 ++++
> > >  kernel/rcu/tree_stall.h                       | 29 ++++++
> > >  kernel/rcu/update.c                           |  2 +
> > >  8 files changed, 173 insertions(+)
> > > 
> > > -- 
> > > 2.25.1
> > >

Paul E. McKenney Nov. 9, 2022, 5:22 p.m. UTC | #4

On Wed, Nov 09, 2022 at 06:03:17PM +0100, Frederic Weisbecker wrote:
> On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
> > > Hi Zhen Lei,
> > > 
> > > On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> > > > v5 --> v6:
> > > > 1. When there are more than two continuous RCU stallings, correctly handle the
> > > >    value of the second and subsequent sampling periods. Update comments and
> > > >    document.
> > > >    Thanks to Elliott, Robert for the test.
> > > > 2. Change "rcu stall" to "RCU stall".
> > > > 
> > > > v4 --> v5:
> > > > 1. Resolve a git am conflict. No code change.
> > > > 
> > > > v3 --> v4:
> > > > 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> > > > 
> > > > v2 --> v3:
> > > > 1. Fix the return type of kstat_cpu_irqs_sum()
> > > > 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
> > > >    rcupdate.rcu_cpu_stall_deep_debug.
> > > > 3. Add comments and normalize local variable name
> > > > 
> > > > 
> > > > v1 --> v2:
> > > > 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
> > > >    kcpustat_this_cpu cannot be used.
> > > > @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
> > > >         if (r->gp_seq != rdp->gp_seq)
> > > >                 return;
> > > > 
> > > > -       cpustat = kcpustat_this_cpu->cpustat;
> > > > +       cpustat = kcpustat_cpu(cpu).cpustat;
> > > > 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
> > > >    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> > > > 
> > > > v1:
> > > > In some extreme cases, such as the I/O pressure test, the CPU usage may
> > > > be 100%, causing RCU stall. In this case, the printed information about
> > > > current is not useful. Displays the number and usage of hard interrupts,
> > > > soft interrupts, and context switches that are generated within half of
> > > > the CPU stall timeout, can help us make a general judgment. In other
> > > > cases, we can preliminarily determine whether an infinite loop occurs
> > > > when local_irq, local_bh or preempt is disabled.
> > > 
> > > That looks useful but I have to ask: what does it bring that the softlockup
> > > and hardlockup watchdog can not already solve?
> > 
> > This is a good point.  One possible benefit is putting the needed information
> > in one spot, for example, in cases where the soft/hard lockup timeouts are
> > significantly different than the RCU CPU stall warning timeout.
> 
> Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
> unless all CPUs are caught into a hardlockup, in which case only the hardlockup
> detector has a chance.
> 
> Anyway I would say that in this case just lower the delay for the lockup
> detectors to consider the situation is a lockup?

Try it both ways and see how it works?  The rcutorture module parameters
stall_cpu and stall_cpu_irqsoff are easy ways to generate these sorts
of scenarios.

Actually, that does remind me of something.  Back when I was chasing
that interrupt storm, would this patch have helped me?  In that case, the
half-way point would have been reached while all online CPUs were spinning
with interrupts disabled and the incoming CPU was getting hammered with
continual scheduling-clock interrupts.  So I suspect that the answer is
"no" because the incoming CPU was not blocking the grace period.

Instead of being snapshot halfway to the RCU CPU stall warning, should
the values be snapshot when the CPU notices the beginning or end of an
RCU grace period and when a CPU goes offline?

But that would not suffice, because detailed information would not have
been dumped for the incoming CPU.

However, the lack of context switches and interrupts on the rest of the
CPUs would likely have been a big cluebat, so there is that.  It might
be better to rework the warning at the beginning of rcu_sched_clock_irq()
to complain if more than (say) 10 scheduling-clock interrupts occur on
a given CPU during a single jiffy.

Independent of Zhen Lei patch.

Thoughts?

							Thanx, Paul

> Thanks.
> 
> 
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > > Thanks.
> > > 
> > > > 
> > > > Zhen Lei (2):
> > > >   rcu: Add RCU stall diagnosis information
> > > >   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
> > > > 
> > > >  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
> > > >  .../admin-guide/kernel-parameters.txt         |  6 ++
> > > >  kernel/rcu/Kconfig.debug                      | 11 +++
> > > >  kernel/rcu/rcu.h                              |  1 +
> > > >  kernel/rcu/tree.c                             | 17 ++++
> > > >  kernel/rcu/tree.h                             | 19 ++++
> > > >  kernel/rcu/tree_stall.h                       | 29 ++++++
> > > >  kernel/rcu/update.c                           |  2 +
> > > >  8 files changed, 173 insertions(+)
> > > > 
> > > > -- 
> > > > 2.25.1
> > > >

Leizhen (ThunderTown) Nov. 10, 2022, 2:27 a.m. UTC | #5

On 2022/11/10 1:22, Paul E. McKenney wrote:
> On Wed, Nov 09, 2022 at 06:03:17PM +0100, Frederic Weisbecker wrote:
>> On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
>>> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
>>>> Hi Zhen Lei,
>>>>
>>>> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
>>>>> v5 --> v6:
>>>>> 1. When there are more than two continuous RCU stallings, correctly handle the
>>>>>    value of the second and subsequent sampling periods. Update comments and
>>>>>    document.
>>>>>    Thanks to Elliott, Robert for the test.
>>>>> 2. Change "rcu stall" to "RCU stall".
>>>>>
>>>>> v4 --> v5:
>>>>> 1. Resolve a git am conflict. No code change.
>>>>>
>>>>> v3 --> v4:
>>>>> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
>>>>>
>>>>> v2 --> v3:
>>>>> 1. Fix the return type of kstat_cpu_irqs_sum()
>>>>> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
>>>>>    rcupdate.rcu_cpu_stall_deep_debug.
>>>>> 3. Add comments and normalize local variable name
>>>>>
>>>>>
>>>>> v1 --> v2:
>>>>> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
>>>>>    kcpustat_this_cpu cannot be used.
>>>>> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
>>>>>         if (r->gp_seq != rdp->gp_seq)
>>>>>                 return;
>>>>>
>>>>> -       cpustat = kcpustat_this_cpu->cpustat;
>>>>> +       cpustat = kcpustat_cpu(cpu).cpustat;
>>>>> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
>>>>>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
>>>>>
>>>>> v1:
>>>>> In some extreme cases, such as the I/O pressure test, the CPU usage may
>>>>> be 100%, causing RCU stall. In this case, the printed information about
>>>>> current is not useful. Displays the number and usage of hard interrupts,
>>>>> soft interrupts, and context switches that are generated within half of
>>>>> the CPU stall timeout, can help us make a general judgment. In other
>>>>> cases, we can preliminarily determine whether an infinite loop occurs
>>>>> when local_irq, local_bh or preempt is disabled.
>>>>
>>>> That looks useful but I have to ask: what does it bring that the softlockup
>>>> and hardlockup watchdog can not already solve?
>>>
>>> This is a good point.  One possible benefit is putting the needed information
>>> in one spot, for example, in cases where the soft/hard lockup timeouts are
>>> significantly different than the RCU CPU stall warning timeout.
>>
>> Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
>> unless all CPUs are caught into a hardlockup, in which case only the hardlockup
>> detector has a chance.
>>
>> Anyway I would say that in this case just lower the delay for the lockup
>> detectors to consider the situation is a lockup?
> 
> Try it both ways and see how it works?  The rcutorture module parameters
> stall_cpu and stall_cpu_irqsoff are easy ways to generate these sorts
> of scenarios.
> 
> Actually, that does remind me of something.  Back when I was chasing
> that interrupt storm, would this patch have helped me?  In that case, the

Yes, this patch series originally addressed an RCU stall issue caused by an
interruption storm. The serial port driver written by another project team
failed to write the register in a specific condition. As a result, interrupts
were repeatedly reported.

> half-way point would have been reached while all online CPUs were spinning
> with interrupts disabled and the incoming CPU was getting hammered with
> continual scheduling-clock interrupts.  So I suspect that the answer is
> "no" because the incoming CPU was not blocking the grace period.
> 
> Instead of being snapshot halfway to the RCU CPU stall warning, should
> the values be snapshot when the CPU notices the beginning or end of an
> RCU grace period and when a CPU goes offline?

This won't work. Those normal counts that occurred before the failure
have an impact on our analysis. For example, some software interrupts
may have been generated before local_bh_disable() is called.

> 
> But that would not suffice, because detailed information would not have
> been dumped for the incoming CPU.
> 
> However, the lack of context switches and interrupts on the rest of the
> CPUs would likely have been a big cluebat, so there is that.  It might
> be better to rework the warning at the beginning of rcu_sched_clock_irq()
> to complain if more than (say) 10 scheduling-clock interrupts occur on
> a given CPU during a single jiffy.
> 
> Independent of Zhen Lei patch.
> 
> Thoughts?
> 
> 							Thanx, Paul
> 
>> Thanks.
>>
>>
>>>
>>> Thoughts?
>>>
>>> 							Thanx, Paul
>>>
>>>> Thanks.
>>>>
>>>>>
>>>>> Zhen Lei (2):
>>>>>   rcu: Add RCU stall diagnosis information
>>>>>   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
>>>>>
>>>>>  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
>>>>>  .../admin-guide/kernel-parameters.txt         |  6 ++
>>>>>  kernel/rcu/Kconfig.debug                      | 11 +++
>>>>>  kernel/rcu/rcu.h                              |  1 +
>>>>>  kernel/rcu/tree.c                             | 17 ++++
>>>>>  kernel/rcu/tree.h                             | 19 ++++
>>>>>  kernel/rcu/tree_stall.h                       | 29 ++++++
>>>>>  kernel/rcu/update.c                           |  2 +
>>>>>  8 files changed, 173 insertions(+)
>>>>>
>>>>> -- 
>>>>> 2.25.1
>>>>>
> .
>

Leizhen (ThunderTown) Nov. 10, 2022, 7:29 a.m. UTC | #6

On 2022/11/10 1:03, Frederic Weisbecker wrote:
> On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
>> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
>>> Hi Zhen Lei,
>>>
>>> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
>>>> v5 --> v6:
>>>> 1. When there are more than two continuous RCU stallings, correctly handle the
>>>>    value of the second and subsequent sampling periods. Update comments and
>>>>    document.
>>>>    Thanks to Elliott, Robert for the test.
>>>> 2. Change "rcu stall" to "RCU stall".
>>>>
>>>> v4 --> v5:
>>>> 1. Resolve a git am conflict. No code change.
>>>>
>>>> v3 --> v4:
>>>> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
>>>>
>>>> v2 --> v3:
>>>> 1. Fix the return type of kstat_cpu_irqs_sum()
>>>> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
>>>>    rcupdate.rcu_cpu_stall_deep_debug.
>>>> 3. Add comments and normalize local variable name
>>>>
>>>>
>>>> v1 --> v2:
>>>> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
>>>>    kcpustat_this_cpu cannot be used.
>>>> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
>>>>         if (r->gp_seq != rdp->gp_seq)
>>>>                 return;
>>>>
>>>> -       cpustat = kcpustat_this_cpu->cpustat;
>>>> +       cpustat = kcpustat_cpu(cpu).cpustat;
>>>> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
>>>>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
>>>>
>>>> v1:
>>>> In some extreme cases, such as the I/O pressure test, the CPU usage may
>>>> be 100%, causing RCU stall. In this case, the printed information about
>>>> current is not useful. Displays the number and usage of hard interrupts,
>>>> soft interrupts, and context switches that are generated within half of
>>>> the CPU stall timeout, can help us make a general judgment. In other
>>>> cases, we can preliminarily determine whether an infinite loop occurs
>>>> when local_irq, local_bh or preempt is disabled.
>>>
>>> That looks useful but I have to ask: what does it bring that the softlockup
>>> and hardlockup watchdog can not already solve?
>>
>> This is a good point.  One possible benefit is putting the needed information
>> in one spot, for example, in cases where the soft/hard lockup timeouts are
>> significantly different than the RCU CPU stall warning timeout.
> 
> Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
> unless all CPUs are caught into a hardlockup, in which case only the hardlockup
> detector has a chance.

But not all ARCHs support hardlockup, such as s390. Maybe arm64.

config HARDLOCKUP_DETECTOR
        bool "Detect Hard Lockups"
        depends on DEBUG_KERNEL && !S390
        depends on HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_ARCH

> 
> Anyway I would say that in this case just lower the delay for the lockup
> detectors to consider the situation is a lockup?

In most architectures, CONFIG_SOFTLOCKUP_DETECTOR is not set by default.
Otherwise 20 is less than 21.

Softlockups are bugs that cause the kernel to loop in kernel
mode for more than 20 seconds, without giving other tasks a
chance to run.

config RCU_CPU_STALL_TIMEOUT
	default 21


In short, hardlockup and softlockup are completely uncontrollable to RCU stall.

> 
> Thanks.
> 
> 
>>
>> Thoughts?
>>
>> 							Thanx, Paul
>>
>>> Thanks.
>>>
>>>>
>>>> Zhen Lei (2):
>>>>   rcu: Add RCU stall diagnosis information
>>>>   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
>>>>
>>>>  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
>>>>  .../admin-guide/kernel-parameters.txt         |  6 ++
>>>>  kernel/rcu/Kconfig.debug                      | 11 +++
>>>>  kernel/rcu/rcu.h                              |  1 +
>>>>  kernel/rcu/tree.c                             | 17 ++++
>>>>  kernel/rcu/tree.h                             | 19 ++++
>>>>  kernel/rcu/tree_stall.h                       | 29 ++++++
>>>>  kernel/rcu/update.c                           |  2 +
>>>>  8 files changed, 173 insertions(+)
>>>>
>>>> -- 
>>>> 2.25.1
>>>>
> .
>

Frederic Weisbecker Nov. 10, 2022, 11:35 a.m. UTC | #7

On Thu, Nov 10, 2022 at 03:29:04PM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/11/10 1:03, Frederic Weisbecker wrote:
> > On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
> >> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
> >>> Hi Zhen Lei,
> >>>
> >>> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> >>>> v5 --> v6:
> >>>> 1. When there are more than two continuous RCU stallings, correctly handle the
> >>>>    value of the second and subsequent sampling periods. Update comments and
> >>>>    document.
> >>>>    Thanks to Elliott, Robert for the test.
> >>>> 2. Change "rcu stall" to "RCU stall".
> >>>>
> >>>> v4 --> v5:
> >>>> 1. Resolve a git am conflict. No code change.
> >>>>
> >>>> v3 --> v4:
> >>>> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> >>>>
> >>>> v2 --> v3:
> >>>> 1. Fix the return type of kstat_cpu_irqs_sum()
> >>>> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
> >>>>    rcupdate.rcu_cpu_stall_deep_debug.
> >>>> 3. Add comments and normalize local variable name
> >>>>
> >>>>
> >>>> v1 --> v2:
> >>>> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
> >>>>    kcpustat_this_cpu cannot be used.
> >>>> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
> >>>>         if (r->gp_seq != rdp->gp_seq)
> >>>>                 return;
> >>>>
> >>>> -       cpustat = kcpustat_this_cpu->cpustat;
> >>>> +       cpustat = kcpustat_cpu(cpu).cpustat;
> >>>> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
> >>>>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> >>>>
> >>>> v1:
> >>>> In some extreme cases, such as the I/O pressure test, the CPU usage may
> >>>> be 100%, causing RCU stall. In this case, the printed information about
> >>>> current is not useful. Displays the number and usage of hard interrupts,
> >>>> soft interrupts, and context switches that are generated within half of
> >>>> the CPU stall timeout, can help us make a general judgment. In other
> >>>> cases, we can preliminarily determine whether an infinite loop occurs
> >>>> when local_irq, local_bh or preempt is disabled.
> >>>
> >>> That looks useful but I have to ask: what does it bring that the softlockup
> >>> and hardlockup watchdog can not already solve?
> >>
> >> This is a good point.  One possible benefit is putting the needed information
> >> in one spot, for example, in cases where the soft/hard lockup timeouts are
> >> significantly different than the RCU CPU stall warning timeout.
> > 
> > Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
> > unless all CPUs are caught into a hardlockup, in which case only the hardlockup
> > detector has a chance.
> 
> But not all ARCHs support hardlockup, such as s390. Maybe arm64.
> 
> config HARDLOCKUP_DETECTOR
>         bool "Detect Hard Lockups"
>         depends on DEBUG_KERNEL && !S390
>         depends on HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_ARCH

Ah fair point indeed.

Thanks!

Paul E. McKenney Nov. 12, 2022, 6:59 p.m. UTC | #8

On Thu, Nov 10, 2022 at 10:27:44AM +0800, Leizhen (ThunderTown) wrote:
> On 2022/11/10 1:22, Paul E. McKenney wrote:
> > On Wed, Nov 09, 2022 at 06:03:17PM +0100, Frederic Weisbecker wrote:
> >> On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
> >>> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
> >>>> Hi Zhen Lei,
> >>>>
> >>>> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
> >>>>> v5 --> v6:
> >>>>> 1. When there are more than two continuous RCU stallings, correctly handle the
> >>>>>    value of the second and subsequent sampling periods. Update comments and
> >>>>>    document.
> >>>>>    Thanks to Elliott, Robert for the test.
> >>>>> 2. Change "rcu stall" to "RCU stall".
> >>>>>
> >>>>> v4 --> v5:
> >>>>> 1. Resolve a git am conflict. No code change.
> >>>>>
> >>>>> v3 --> v4:
> >>>>> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
> >>>>>
> >>>>> v2 --> v3:
> >>>>> 1. Fix the return type of kstat_cpu_irqs_sum()
> >>>>> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
> >>>>>    rcupdate.rcu_cpu_stall_deep_debug.
> >>>>> 3. Add comments and normalize local variable name
> >>>>>
> >>>>>
> >>>>> v1 --> v2:
> >>>>> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
> >>>>>    kcpustat_this_cpu cannot be used.
> >>>>> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
> >>>>>         if (r->gp_seq != rdp->gp_seq)
> >>>>>                 return;
> >>>>>
> >>>>> -       cpustat = kcpustat_this_cpu->cpustat;
> >>>>> +       cpustat = kcpustat_cpu(cpu).cpustat;
> >>>>> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
> >>>>>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
> >>>>>
> >>>>> v1:
> >>>>> In some extreme cases, such as the I/O pressure test, the CPU usage may
> >>>>> be 100%, causing RCU stall. In this case, the printed information about
> >>>>> current is not useful. Displays the number and usage of hard interrupts,
> >>>>> soft interrupts, and context switches that are generated within half of
> >>>>> the CPU stall timeout, can help us make a general judgment. In other
> >>>>> cases, we can preliminarily determine whether an infinite loop occurs
> >>>>> when local_irq, local_bh or preempt is disabled.
> >>>>
> >>>> That looks useful but I have to ask: what does it bring that the softlockup
> >>>> and hardlockup watchdog can not already solve?
> >>>
> >>> This is a good point.  One possible benefit is putting the needed information
> >>> in one spot, for example, in cases where the soft/hard lockup timeouts are
> >>> significantly different than the RCU CPU stall warning timeout.
> >>
> >> Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
> >> unless all CPUs are caught into a hardlockup, in which case only the hardlockup
> >> detector has a chance.
> >>
> >> Anyway I would say that in this case just lower the delay for the lockup
> >> detectors to consider the situation is a lockup?
> > 
> > Try it both ways and see how it works?  The rcutorture module parameters
> > stall_cpu and stall_cpu_irqsoff are easy ways to generate these sorts
> > of scenarios.
> > 
> > Actually, that does remind me of something.  Back when I was chasing
> > that interrupt storm, would this patch have helped me?  In that case, the
> 
> Yes, this patch series originally addressed an RCU stall issue caused by an
> interruption storm. The serial port driver written by another project team
> failed to write the register in a specific condition. As a result, interrupts
> were repeatedly reported.

Very good!

> > half-way point would have been reached while all online CPUs were spinning
> > with interrupts disabled and the incoming CPU was getting hammered with
> > continual scheduling-clock interrupts.  So I suspect that the answer is
> > "no" because the incoming CPU was not blocking the grace period.
> > 
> > Instead of being snapshot halfway to the RCU CPU stall warning, should
> > the values be snapshot when the CPU notices the beginning or end of an
> > RCU grace period and when a CPU goes offline?
> 
> This won't work. Those normal counts that occurred before the failure
> have an impact on our analysis. For example, some software interrupts
> may have been generated before local_bh_disable() is called.

Fair enough, and thank you for considering this option.  But please be
prepared to adjust (somehow or another) as needed to accommodate other
failure scenarios as they arise.

							Thanx, Paul

> > But that would not suffice, because detailed information would not have
> > been dumped for the incoming CPU.
> > 
> > However, the lack of context switches and interrupts on the rest of the
> > CPUs would likely have been a big cluebat, so there is that.  It might
> > be better to rework the warning at the beginning of rcu_sched_clock_irq()
> > to complain if more than (say) 10 scheduling-clock interrupts occur on
> > a given CPU during a single jiffy.
> > 
> > Independent of Zhen Lei patch.
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> >> Thanks.
> >>
> >>
> >>>
> >>> Thoughts?
> >>>
> >>> 							Thanx, Paul
> >>>
> >>>> Thanks.
> >>>>
> >>>>>
> >>>>> Zhen Lei (2):
> >>>>>   rcu: Add RCU stall diagnosis information
> >>>>>   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
> >>>>>
> >>>>>  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
> >>>>>  .../admin-guide/kernel-parameters.txt         |  6 ++
> >>>>>  kernel/rcu/Kconfig.debug                      | 11 +++
> >>>>>  kernel/rcu/rcu.h                              |  1 +
> >>>>>  kernel/rcu/tree.c                             | 17 ++++
> >>>>>  kernel/rcu/tree.h                             | 19 ++++
> >>>>>  kernel/rcu/tree_stall.h                       | 29 ++++++
> >>>>>  kernel/rcu/update.c                           |  2 +
> >>>>>  8 files changed, 173 insertions(+)
> >>>>>
> >>>>> -- 
> >>>>> 2.25.1
> >>>>>
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei

Leizhen (ThunderTown) Nov. 15, 2022, 9:11 a.m. UTC | #9

On 2022/11/13 2:59, Paul E. McKenney wrote:
> On Thu, Nov 10, 2022 at 10:27:44AM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/11/10 1:22, Paul E. McKenney wrote:
>>> On Wed, Nov 09, 2022 at 06:03:17PM +0100, Frederic Weisbecker wrote:
>>>> On Wed, Nov 09, 2022 at 07:59:01AM -0800, Paul E. McKenney wrote:
>>>>> On Wed, Nov 09, 2022 at 04:26:21PM +0100, Frederic Weisbecker wrote:
>>>>>> Hi Zhen Lei,
>>>>>>
>>>>>> On Wed, Nov 09, 2022 at 05:37:36PM +0800, Zhen Lei wrote:
>>>>>>> v5 --> v6:
>>>>>>> 1. When there are more than two continuous RCU stallings, correctly handle the
>>>>>>>    value of the second and subsequent sampling periods. Update comments and
>>>>>>>    document.
>>>>>>>    Thanks to Elliott, Robert for the test.
>>>>>>> 2. Change "rcu stall" to "RCU stall".
>>>>>>>
>>>>>>> v4 --> v5:
>>>>>>> 1. Resolve a git am conflict. No code change.
>>>>>>>
>>>>>>> v3 --> v4:
>>>>>>> 1. Rename rcu_cpu_stall_deep_debug to rcu_cpu_stall_cputime.
>>>>>>>
>>>>>>> v2 --> v3:
>>>>>>> 1. Fix the return type of kstat_cpu_irqs_sum()
>>>>>>> 2. Add Kconfig option CONFIG_RCU_CPU_STALL_DEEP_DEBUG and boot parameter
>>>>>>>    rcupdate.rcu_cpu_stall_deep_debug.
>>>>>>> 3. Add comments and normalize local variable name
>>>>>>>
>>>>>>>
>>>>>>> v1 --> v2:
>>>>>>> 1. Fixed a bug in the code. If the rcu stall is detected by another CPU,
>>>>>>>    kcpustat_this_cpu cannot be used.
>>>>>>> @@ -451,7 +451,7 @@ static void print_cpu_stat_info(int cpu)
>>>>>>>         if (r->gp_seq != rdp->gp_seq)
>>>>>>>                 return;
>>>>>>>
>>>>>>> -       cpustat = kcpustat_this_cpu->cpustat;
>>>>>>> +       cpustat = kcpustat_cpu(cpu).cpustat;
>>>>>>> 2. Move the start point of statistics from rcu_stall_kick_kthreads() to
>>>>>>>    rcu_implicit_dynticks_qs(), removing the dependency on irq_work.
>>>>>>>
>>>>>>> v1:
>>>>>>> In some extreme cases, such as the I/O pressure test, the CPU usage may
>>>>>>> be 100%, causing RCU stall. In this case, the printed information about
>>>>>>> current is not useful. Displays the number and usage of hard interrupts,
>>>>>>> soft interrupts, and context switches that are generated within half of
>>>>>>> the CPU stall timeout, can help us make a general judgment. In other
>>>>>>> cases, we can preliminarily determine whether an infinite loop occurs
>>>>>>> when local_irq, local_bh or preempt is disabled.
>>>>>>
>>>>>> That looks useful but I have to ask: what does it bring that the softlockup
>>>>>> and hardlockup watchdog can not already solve?
>>>>>
>>>>> This is a good point.  One possible benefit is putting the needed information
>>>>> in one spot, for example, in cases where the soft/hard lockup timeouts are
>>>>> significantly different than the RCU CPU stall warning timeout.
>>>>
>>>> Arguably, the hardlockup/softlockup detectors usually trigger after RCU stall,
>>>> unless all CPUs are caught into a hardlockup, in which case only the hardlockup
>>>> detector has a chance.
>>>>
>>>> Anyway I would say that in this case just lower the delay for the lockup
>>>> detectors to consider the situation is a lockup?
>>>
>>> Try it both ways and see how it works?  The rcutorture module parameters
>>> stall_cpu and stall_cpu_irqsoff are easy ways to generate these sorts
>>> of scenarios.
>>>
>>> Actually, that does remind me of something.  Back when I was chasing
>>> that interrupt storm, would this patch have helped me?  In that case, the
>>
>> Yes, this patch series originally addressed an RCU stall issue caused by an
>> interruption storm. The serial port driver written by another project team
>> failed to write the register in a specific condition. As a result, interrupts
>> were repeatedly reported.
> 
> Very good!
> 
>>> half-way point would have been reached while all online CPUs were spinning
>>> with interrupts disabled and the incoming CPU was getting hammered with
>>> continual scheduling-clock interrupts.  So I suspect that the answer is
>>> "no" because the incoming CPU was not blocking the grace period.
>>>
>>> Instead of being snapshot halfway to the RCU CPU stall warning, should
>>> the values be snapshot when the CPU notices the beginning or end of an
>>> RCU grace period and when a CPU goes offline?
>>
>> This won't work. Those normal counts that occurred before the failure
>> have an impact on our analysis. For example, some software interrupts
>> may have been generated before local_bh_disable() is called.
> 
> Fair enough, and thank you for considering this option.  But please be
> prepared to adjust (somehow or another) as needed to accommodate other
> failure scenarios as they arise.

Except the document has an warning of "make htmldocs". I can't think of
anything to improve on the v7 at the moment.

Change the type of softirqs[NR_SOFTIRQS] from "unsigned int" to
"unsigned long", I will post a separate patch in future. Because
more people may join the discussion.

How about I post v8 tomorrow?

> 
> 							Thanx, Paul
> 
>>> But that would not suffice, because detailed information would not have
>>> been dumped for the incoming CPU.
>>>
>>> However, the lack of context switches and interrupts on the rest of the
>>> CPUs would likely have been a big cluebat, so there is that.  It might
>>> be better to rework the warning at the beginning of rcu_sched_clock_irq()
>>> to complain if more than (say) 10 scheduling-clock interrupts occur on
>>> a given CPU during a single jiffy.
>>>
>>> Independent of Zhen Lei patch.
>>>
>>> Thoughts?
>>>
>>> 							Thanx, Paul
>>>
>>>> Thanks.
>>>>
>>>>
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> 							Thanx, Paul
>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>
>>>>>>> Zhen Lei (2):
>>>>>>>   rcu: Add RCU stall diagnosis information
>>>>>>>   doc: Document CONFIG_RCU_CPU_STALL_CPUTIME=y stall information
>>>>>>>
>>>>>>>  Documentation/RCU/stallwarn.rst               | 88 +++++++++++++++++++
>>>>>>>  .../admin-guide/kernel-parameters.txt         |  6 ++
>>>>>>>  kernel/rcu/Kconfig.debug                      | 11 +++
>>>>>>>  kernel/rcu/rcu.h                              |  1 +
>>>>>>>  kernel/rcu/tree.c                             | 17 ++++
>>>>>>>  kernel/rcu/tree.h                             | 19 ++++
>>>>>>>  kernel/rcu/tree_stall.h                       | 29 ++++++
>>>>>>>  kernel/rcu/update.c                           |  2 +
>>>>>>>  8 files changed, 173 insertions(+)
>>>>>>>
>>>>>>> -- 
>>>>>>> 2.25.1
>>>>>>>
>>> .
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
> .
>

[v6,0/2] rcu: Add RCU stall diagnosis information

Message

Comments