diff mbox series

rcu/torture: Improve badness extraction from console logs

Message ID 20230831012257.1189964-1-joel@joelfernandes.org (mailing list archive)
State New
Headers show
Series rcu/torture: Improve badness extraction from console logs | expand

Commit Message

Joel Fernandes Aug. 31, 2023, 1:22 a.m. UTC
Currently console.log.diags contains an output like follows:
[ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.542385] Call Trace:

This is not very useful and the Call trace is desired. Improve the
script by Extracting more lines after each grep match.

With this the above becomes:

Issue 1:
[ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.326661] Modules linked in:
[ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
[ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.386691] Code: 48 63 c7 48 8b 0
[ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
[ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
[ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
[ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
[ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
[ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
[ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
[ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
[ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2457.542385] Call Trace:
[ 2457.546756]  <TASK>
[ 2457.550349]  ? __warn+0x7b/0x100
[ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
-------------------------------------
Issue 2:
[ 2457.542385] Call Trace:
[ 2457.546756]  <TASK>
[ 2457.550349]  ? __warn+0x7b/0x100
[ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.574948]  ? report_bug+0x99/0xc0
[ 2457.593824]  ? handle_bug+0x3c/0x70
[ 2457.599534]  ? exc_invalid_op+0x13/0x60
[ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
[ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
[ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
[ 2457.705658]  rcu_tasks_kthread+0x114/0x200
[ 2457.712450]  ? wait_woken+0x70/0x70
[ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
[ 2457.746221]  kthread+0x130/0x160
[ 2457.751487]  ? set_kthread_struct+0x40/0x40
[ 2457.758178]  ret_from_fork+0x22/0x30
[ 2457.763909]  </TASK>
[ 2457.767546] irq event stamp: 29544441
[ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
[ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
-------------------------------------

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 .../rcutorture/bin/console-badness.sh         | 24 ++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

Comments

Paul E. McKenney Sept. 1, 2023, 2:49 p.m. UTC | #1
On Thu, Aug 31, 2023 at 01:22:56AM +0000, Joel Fernandes (Google) wrote:
> Currently console.log.diags contains an output like follows:
> [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.542385] Call Trace:
> 
> This is not very useful and the Call trace is desired. Improve the
> script by Extracting more lines after each grep match.
> 
> With this the above becomes:
> 
> Issue 1:
> [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.326661] Modules linked in:
> [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.386691] Code: 48 63 c7 48 8b 0
> [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
> [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2457.542385] Call Trace:
> [ 2457.546756]  <TASK>
> [ 2457.550349]  ? __warn+0x7b/0x100
> [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> -------------------------------------
> Issue 2:
> [ 2457.542385] Call Trace:
> [ 2457.546756]  <TASK>
> [ 2457.550349]  ? __warn+0x7b/0x100
> [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.574948]  ? report_bug+0x99/0xc0
> [ 2457.593824]  ? handle_bug+0x3c/0x70
> [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> [ 2457.712450]  ? wait_woken+0x70/0x70
> [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> [ 2457.746221]  kthread+0x130/0x160
> [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> [ 2457.758178]  ret_from_fork+0x22/0x30
> [ 2457.763909]  </TASK>
> [ 2457.767546] irq event stamp: 29544441
> [ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> [ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> -------------------------------------
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Nice!!!

One request, though.  If I am reading this correctly, if a console.log
file is filled with splats, then console.log.diags will replicate most
of console.log.  Could you please limit something, perhaps the number
of lines, the number of splats, or some such?

Just for context, my use case for the current setup is to run
kvm-find-errors.sh, which puts each console.log.diags/console.log pair
into my editor.  I copy an appropriate string from the console.log.diags,
and then paste that string to search for it in the corresponding
console.log file, which is what is edited next.

This script does the same with any build errors.

							Thanx, Paul

> ---
>  .../rcutorture/bin/console-badness.sh         | 24 ++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh
> index aad51e7c0183..d28efcd86b64 100755
> --- a/tools/testing/selftests/rcutorture/bin/console-badness.sh
> +++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh
> @@ -9,10 +9,32 @@
>  # Copyright (C) 2020 Facebook, Inc.
>  #
>  # Authors: Paul E. McKenney <paulmck@kernel.org>
> +INPUT_DATA=$(< /dev/stdin)
>  
> -grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
> +# Get the line numbers for all the grep matches
> +GREP_LINES="$(echo "$INPUT_DATA" |
> +grep -n -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
>  grep -v 'ODEBUG: ' |
>  grep -v 'This means that this is a DEBUG kernel and it is' |
>  grep -v 'Warning: unable to open an initial console' |
>  grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr.*the init process!' |
>  grep -v 'NOHZ tick-stop error: Non-RCU local softirq work is pending, handler'
> +)"
> +
> +# Exit if no grep matches
> +if [ ! -n "$GREP_LINES" ]; then exit 0; fi
> +
> +# Go through each line of GREP_LINES, extract the line number and then
> +# print from that line and 20 lines after that line. Do that for each
> +# grep match.
> +issue_num=1
> +while IFS= read -r line; do
> +    # Extract the line number from the line
> +    num=$(echo "$line" | awk -F: '{print $1}')
> +    # Print 20 lines after the matched line
> +    echo "Issue $issue_num:"
> +    issue_num="$(($issue_num + 1))"
> +    echo "$INPUT_DATA" | sed -n "${num},$(($num + 20))p"
> +    echo "-------------------------------------"
> +done <<< "$GREP_LINES"
> +
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
>
Joel Fernandes Sept. 1, 2023, 6:04 p.m. UTC | #2
On Fri, Sep 1, 2023 at 10:49 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 31, 2023 at 01:22:56AM +0000, Joel Fernandes (Google) wrote:
> > Currently console.log.diags contains an output like follows:
> > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.542385] Call Trace:
> >
> > This is not very useful and the Call trace is desired. Improve the
> > script by Extracting more lines after each grep match.
> >
> > With this the above becomes:
> >
> > Issue 1:
> > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.326661] Modules linked in:
> > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
> > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 2457.542385] Call Trace:
> > [ 2457.546756]  <TASK>
> > [ 2457.550349]  ? __warn+0x7b/0x100
> > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > -------------------------------------
> > Issue 2:
> > [ 2457.542385] Call Trace:
> > [ 2457.546756]  <TASK>
> > [ 2457.550349]  ? __warn+0x7b/0x100
> > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.574948]  ? report_bug+0x99/0xc0
> > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > [ 2457.712450]  ? wait_woken+0x70/0x70
> > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > [ 2457.746221]  kthread+0x130/0x160
> > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > [ 2457.758178]  ret_from_fork+0x22/0x30
> > [ 2457.763909]  </TASK>
> > [ 2457.767546] irq event stamp: 29544441
> > [ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > [ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > -------------------------------------
> >
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
>
> Nice!!!
>
> One request, though.  If I am reading this correctly, if a console.log
> file is filled with splats, then console.log.diags will replicate most
> of console.log.  Could you please limit something, perhaps the number
> of lines, the number of splats, or some such?
>
> Just for context, my use case for the current setup is to run
> kvm-find-errors.sh, which puts each console.log.diags/console.log pair
> into my editor.  I copy an appropriate string from the console.log.diags,
> and then paste that string to search for it in the corresponding
> console.log file, which is what is edited next.

I could add a summary on the top for your usecase (which keeps the
original output), and then the more detailed splats below it. Would
that work?

So the whole file would look something like:
=====x8==========
Summary of console.log lines showing issues:
Line 300: [ 2457.293734] WARNING: CPU: 2 PID: 13 at
kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
Line 400:[ 2457.542385] Call Trace:

Detailed list of issues:
1. Issue on line 300:
[ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061
rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.326661] Modules linked in:
[ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
[ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.386691] Code: 48 63 c7 48 8b 0
[ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
[ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
[ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
[ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
[ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
[ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
[ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000)
knlGS:0000000000000000
[ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
[ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2457.542385] Call Trace:
[ 2457.546756]  <TASK>
[ 2457.550349]  ? __warn+0x7b/0x100
[ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
-------------------------------------
2. Issue on Line 500:
[ 2457.542385] Call Trace:
[ 2457.546756]  <TASK>
[ 2457.550349]  ? __warn+0x7b/0x100
[ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.574948]  ? report_bug+0x99/0xc0
[ 2457.593824]  ? handle_bug+0x3c/0x70
[ 2457.599534]  ? exc_invalid_op+0x13/0x60
[ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
[ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
[ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
[ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
[ 2457.705658]  rcu_tasks_kthread+0x114/0x200
[ 2457.712450]  ? wait_woken+0x70/0x70
[ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
[ 2457.746221]  kthread+0x130/0x160
[ 2457.751487]  ? set_kthread_struct+0x40/0x40
[ 2457.758178]  ret_from_fork+0x22/0x30
[ 2457.763909]  </TASK>
[ 2457.767546] irq event stamp: 29544441
[ 2457.773344] hardirqs last  enabled at (29544451):
[<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
[ 2457.786967] hardirqs last disabled at (29544460):
[<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
-------------------------------------
=====x8==========>
Paul E. McKenney Sept. 2, 2023, 1:53 p.m. UTC | #3
On Fri, Sep 01, 2023 at 02:04:24PM -0400, Joel Fernandes wrote:
> On Fri, Sep 1, 2023 at 10:49 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 31, 2023 at 01:22:56AM +0000, Joel Fernandes (Google) wrote:
> > > Currently console.log.diags contains an output like follows:
> > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.542385] Call Trace:
> > >
> > > This is not very useful and the Call trace is desired. Improve the
> > > script by Extracting more lines after each grep match.
> > >
> > > With this the above becomes:
> > >
> > > Issue 1:
> > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.326661] Modules linked in:
> > > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
> > > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [ 2457.542385] Call Trace:
> > > [ 2457.546756]  <TASK>
> > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > -------------------------------------
> > > Issue 2:
> > > [ 2457.542385] Call Trace:
> > > [ 2457.546756]  <TASK>
> > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.574948]  ? report_bug+0x99/0xc0
> > > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > > [ 2457.712450]  ? wait_woken+0x70/0x70
> > > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > > [ 2457.746221]  kthread+0x130/0x160
> > > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > > [ 2457.758178]  ret_from_fork+0x22/0x30
> > > [ 2457.763909]  </TASK>
> > > [ 2457.767546] irq event stamp: 29544441
> > > [ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > > [ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > > -------------------------------------
> > >
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> >
> > Nice!!!
> >
> > One request, though.  If I am reading this correctly, if a console.log
> > file is filled with splats, then console.log.diags will replicate most
> > of console.log.  Could you please limit something, perhaps the number
> > of lines, the number of splats, or some such?
> >
> > Just for context, my use case for the current setup is to run
> > kvm-find-errors.sh, which puts each console.log.diags/console.log pair
> > into my editor.  I copy an appropriate string from the console.log.diags,
> > and then paste that string to search for it in the corresponding
> > console.log file, which is what is edited next.
> 
> I could add a summary on the top for your usecase (which keeps the
> original output), and then the more detailed splats below it. Would
> that work?

That summary would normally be at the top of the file anyway, right?
Either way, what I don't want is many megabytes of splats from console.log
duplicated into console.log.diags, especially in those (admittedly sad)
cases where the console.log file is the largest file in the results
directory.  Are you really going to be reading carefully after (say)
the fifth splat?  ;-)

							Thanx, Paul

> So the whole file would look something like:
> =====x8==========
> Summary of console.log lines showing issues:
> Line 300: [ 2457.293734] WARNING: CPU: 2 PID: 13 at
> kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> Line 400:[ 2457.542385] Call Trace:
> 
> Detailed list of issues:
> 1. Issue on line 300:
> [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061
> rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.326661] Modules linked in:
> [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.386691] Code: 48 63 c7 48 8b 0
> [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000)
> knlGS:0000000000000000
> [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2457.542385] Call Trace:
> [ 2457.546756]  <TASK>
> [ 2457.550349]  ? __warn+0x7b/0x100
> [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> -------------------------------------
> 2. Issue on Line 500:
> [ 2457.542385] Call Trace:
> [ 2457.546756]  <TASK>
> [ 2457.550349]  ? __warn+0x7b/0x100
> [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.574948]  ? report_bug+0x99/0xc0
> [ 2457.593824]  ? handle_bug+0x3c/0x70
> [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> [ 2457.712450]  ? wait_woken+0x70/0x70
> [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> [ 2457.746221]  kthread+0x130/0x160
> [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> [ 2457.758178]  ret_from_fork+0x22/0x30
> [ 2457.763909]  </TASK>
> [ 2457.767546] irq event stamp: 29544441
> [ 2457.773344] hardirqs last  enabled at (29544451):
> [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> [ 2457.786967] hardirqs last disabled at (29544460):
> [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> -------------------------------------
> =====x8==========>
Joel Fernandes Sept. 2, 2023, 6:28 p.m. UTC | #4
On Sat, Sep 2, 2023 at 9:53 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Sep 01, 2023 at 02:04:24PM -0400, Joel Fernandes wrote:
> > On Fri, Sep 1, 2023 at 10:49 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 31, 2023 at 01:22:56AM +0000, Joel Fernandes (Google) wrote:
> > > > Currently console.log.diags contains an output like follows:
> > > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > [ 2457.542385] Call Trace:
> > > >
> > > > This is not very useful and the Call trace is desired. Improve the
> > > > script by Extracting more lines after each grep match.
> > > >
> > > > With this the above becomes:
> > > >
> > > > Issue 1:
> > > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > [ 2457.326661] Modules linked in:
> > > > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > > > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > > > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > > > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > > > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > > > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > > > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > > > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > > > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
> > > > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > > > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > [ 2457.542385] Call Trace:
> > > > [ 2457.546756]  <TASK>
> > > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > -------------------------------------
> > > > Issue 2:
> > > > [ 2457.542385] Call Trace:
> > > > [ 2457.546756]  <TASK>
> > > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > [ 2457.574948]  ? report_bug+0x99/0xc0
> > > > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > > > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > > > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > > > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > > > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > > > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > > > [ 2457.712450]  ? wait_woken+0x70/0x70
> > > > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > > > [ 2457.746221]  kthread+0x130/0x160
> > > > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > > > [ 2457.758178]  ret_from_fork+0x22/0x30
> > > > [ 2457.763909]  </TASK>
> > > > [ 2457.767546] irq event stamp: 29544441
> > > > [ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > > > [ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > > > -------------------------------------
> > > >
> > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > >
> > > Nice!!!
> > >
> > > One request, though.  If I am reading this correctly, if a console.log
> > > file is filled with splats, then console.log.diags will replicate most
> > > of console.log.  Could you please limit something, perhaps the number
> > > of lines, the number of splats, or some such?
> > >
> > > Just for context, my use case for the current setup is to run
> > > kvm-find-errors.sh, which puts each console.log.diags/console.log pair
> > > into my editor.  I copy an appropriate string from the console.log.diags,
> > > and then paste that string to search for it in the corresponding
> > > console.log file, which is what is edited next.
> >
> > I could add a summary on the top for your usecase (which keeps the
> > original output), and then the more detailed splats below it. Would
> > that work?
>
> That summary would normally be at the top of the file anyway, right?

Not after this patch. I removed the summary in my patch from the
console .diags in favor of the detailed ones. I can add it back.

> Either way, what I don't want is many megabytes of splats from console.log
> duplicated into console.log.diags, especially in those (admittedly sad)
> cases where the console.log file is the largest file in the results
> directory.  Are you really going to be reading carefully after (say)
> the fifth splat?  ;-)

I could limit it to just the first 5, with the existing unlimited
summary at the top.

Let me know if that works or does not work for you. :-)

thanks,

 - Joel


>
>                                                         Thanx, Paul
>
> > So the whole file would look something like:
> > =====x8==========
> > Summary of console.log lines showing issues:
> > Line 300: [ 2457.293734] WARNING: CPU: 2 PID: 13 at
> > kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > Line 400:[ 2457.542385] Call Trace:
> >
> > Detailed list of issues:
> > 1. Issue on line 300:
> > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061
> > rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.326661] Modules linked in:
> > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000)
> > knlGS:0000000000000000
> > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 2457.542385] Call Trace:
> > [ 2457.546756]  <TASK>
> > [ 2457.550349]  ? __warn+0x7b/0x100
> > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > -------------------------------------
> > 2. Issue on Line 500:
> > [ 2457.542385] Call Trace:
> > [ 2457.546756]  <TASK>
> > [ 2457.550349]  ? __warn+0x7b/0x100
> > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.574948]  ? report_bug+0x99/0xc0
> > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > [ 2457.712450]  ? wait_woken+0x70/0x70
> > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > [ 2457.746221]  kthread+0x130/0x160
> > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > [ 2457.758178]  ret_from_fork+0x22/0x30
> > [ 2457.763909]  </TASK>
> > [ 2457.767546] irq event stamp: 29544441
> > [ 2457.773344] hardirqs last  enabled at (29544451):
> > [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > [ 2457.786967] hardirqs last disabled at (29544460):
> > [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > -------------------------------------
> > =====x8==========>
Paul E. McKenney Sept. 2, 2023, 7:24 p.m. UTC | #5
On Sat, Sep 02, 2023 at 02:28:41PM -0400, Joel Fernandes wrote:
> On Sat, Sep 2, 2023 at 9:53 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Sep 01, 2023 at 02:04:24PM -0400, Joel Fernandes wrote:
> > > On Fri, Sep 1, 2023 at 10:49 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 31, 2023 at 01:22:56AM +0000, Joel Fernandes (Google) wrote:
> > > > > Currently console.log.diags contains an output like follows:
> > > > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > [ 2457.542385] Call Trace:
> > > > >
> > > > > This is not very useful and the Call trace is desired. Improve the
> > > > > script by Extracting more lines after each grep match.
> > > > >
> > > > > With this the above becomes:
> > > > >
> > > > > Issue 1:
> > > > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > [ 2457.326661] Modules linked in:
> > > > > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > > > > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > > > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > > > > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > > > > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > > > > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > > > > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > > > > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > > > > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > > > > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000) knlGS:0000000000000000
> > > > > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > > > > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > [ 2457.542385] Call Trace:
> > > > > [ 2457.546756]  <TASK>
> > > > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > -------------------------------------
> > > > > Issue 2:
> > > > > [ 2457.542385] Call Trace:
> > > > > [ 2457.546756]  <TASK>
> > > > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > [ 2457.574948]  ? report_bug+0x99/0xc0
> > > > > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > > > > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > > > > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > > > > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > > > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > > > > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > > > > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > > > > [ 2457.712450]  ? wait_woken+0x70/0x70
> > > > > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > > > > [ 2457.746221]  kthread+0x130/0x160
> > > > > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > > > > [ 2457.758178]  ret_from_fork+0x22/0x30
> > > > > [ 2457.763909]  </TASK>
> > > > > [ 2457.767546] irq event stamp: 29544441
> > > > > [ 2457.773344] hardirqs last  enabled at (29544451): [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > > > > [ 2457.786967] hardirqs last disabled at (29544460): [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > > > > -------------------------------------
> > > > >
> > > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > > >
> > > > Nice!!!
> > > >
> > > > One request, though.  If I am reading this correctly, if a console.log
> > > > file is filled with splats, then console.log.diags will replicate most
> > > > of console.log.  Could you please limit something, perhaps the number
> > > > of lines, the number of splats, or some such?
> > > >
> > > > Just for context, my use case for the current setup is to run
> > > > kvm-find-errors.sh, which puts each console.log.diags/console.log pair
> > > > into my editor.  I copy an appropriate string from the console.log.diags,
> > > > and then paste that string to search for it in the corresponding
> > > > console.log file, which is what is edited next.
> > >
> > > I could add a summary on the top for your usecase (which keeps the
> > > original output), and then the more detailed splats below it. Would
> > > that work?
> >
> > That summary would normally be at the top of the file anyway, right?
> 
> Not after this patch. I removed the summary in my patch from the
> console .diags in favor of the detailed ones. I can add it back.

My point is rather that the first line of the existing summary would
likely appear in the first few lines of the new output.

> > Either way, what I don't want is many megabytes of splats from console.log
> > duplicated into console.log.diags, especially in those (admittedly sad)
> > cases where the console.log file is the largest file in the results
> > directory.  Are you really going to be reading carefully after (say)
> > the fifth splat?  ;-)
> 
> I could limit it to just the first 5, with the existing unlimited
> summary at the top.

How about limiting the existing summary to (say) 10 lines?  I do not
recall every caring about more than that.  The only reason that I did
not ever limit it myself is that the summary is unlikely to be more than
a few percent of the full console.log.

> Let me know if that works or does not work for you. :-)

I do believe that we might be converging.  ;-)

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> 
> >
> >                                                         Thanx, Paul
> >
> > > So the whole file would look something like:
> > > =====x8==========
> > > Summary of console.log lines showing issues:
> > > Line 300: [ 2457.293734] WARNING: CPU: 2 PID: 13 at
> > > kernel/rcu/tasks.h:1061 rcu_tasks_trace_pregp_step+0x4a/0x50
> > > Line 400:[ 2457.542385] Call Trace:
> > >
> > > Detailed list of issues:
> > > 1. Issue on line 300:
> > > [ 2457.293734] WARNING: CPU: 2 PID: 13 at kernel/rcu/tasks.h:1061
> > > rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.326661] Modules linked in:
> > > [ 2457.334818] CPU: 2 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.128+ #381
> > > [ 2457.349782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > [ 2457.373309] RIP: 0010:rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.386691] Code: 48 63 c7 48 8b 0
> > > [ 2457.421803] RSP: 0018:ffffa80fc0073e40 EFLAGS: 00010202
> > > [ 2457.431940] RAX: ffff8db91f580000 RBX: 000000000001b900 RCX: 0000000000000003
> > > [ 2457.443206] RDX: 0000000000000008 RSI: ffffffffac6bebd8 RDI: 0000000000000003
> > > [ 2457.454428] RBP: 0000000000000004 R08: 0000000000000001 R09: 0000000000000001
> > > [ 2457.465668] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff8db902d87f40
> > > [ 2457.476971] R13: ffffffffac556620 R14: ffffffffac556630 R15: ffff8db9011a3200
> > > [ 2457.488251] FS:  0000000000000000(0000) GS:ffff8db91f500000(0000)
> > > knlGS:0000000000000000
> > > [ 2457.500834] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 2457.509602] CR2: 0000000000000000 CR3: 0000000002cbc000 CR4: 00000000000006e0
> > > [ 2457.520378] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [ 2457.531440] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > [ 2457.542385] Call Trace:
> > > [ 2457.546756]  <TASK>
> > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > -------------------------------------
> > > 2. Issue on Line 500:
> > > [ 2457.542385] Call Trace:
> > > [ 2457.546756]  <TASK>
> > > [ 2457.550349]  ? __warn+0x7b/0x100
> > > [ 2457.567214]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.574948]  ? report_bug+0x99/0xc0
> > > [ 2457.593824]  ? handle_bug+0x3c/0x70
> > > [ 2457.599534]  ? exc_invalid_op+0x13/0x60
> > > [ 2457.625729]  ? asm_exc_invalid_op+0x16/0x20
> > > [ 2457.632249]  ? rcu_tasks_trace_pregp_step+0x4a/0x50
> > > [ 2457.660010]  rcu_tasks_wait_gp+0x54/0x360
> > > [ 2457.677761]  ? _raw_spin_unlock_irqrestore+0x2b/0x60
> > > [ 2457.705658]  rcu_tasks_kthread+0x114/0x200
> > > [ 2457.712450]  ? wait_woken+0x70/0x70
> > > [ 2457.727283]  ? synchronize_rcu_tasks_rude+0x10/0x10
> > > [ 2457.746221]  kthread+0x130/0x160
> > > [ 2457.751487]  ? set_kthread_struct+0x40/0x40
> > > [ 2457.758178]  ret_from_fork+0x22/0x30
> > > [ 2457.763909]  </TASK>
> > > [ 2457.767546] irq event stamp: 29544441
> > > [ 2457.773344] hardirqs last  enabled at (29544451):
> > > [<ffffffffaace6cbd>] __up_console_sem+0x4d/0x60
> > > [ 2457.786967] hardirqs last disabled at (29544460):
> > > [<ffffffffaace6ca2>] __up_console_sem+0x32/0x60
> > > -------------------------------------
> > > =====x8==========>
Joel Fernandes Sept. 3, 2023, 12:10 a.m. UTC | #6
On Sat, Sep 2, 2023 at 3:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
[...]
> > > Either way, what I don't want is many megabytes of splats from console.log
> > > duplicated into console.log.diags, especially in those (admittedly sad)
> > > cases where the console.log file is the largest file in the results
> > > directory.  Are you really going to be reading carefully after (say)
> > > the fifth splat?  ;-)
> >
> > I could limit it to just the first 5, with the existing unlimited
> > summary at the top.
>
> How about limiting the existing summary to (say) 10 lines?  I do not
> recall every caring about more than that.  The only reason that I did
> not ever limit it myself is that the summary is unlikely to be more than
> a few percent of the full console.log.
>
> > Let me know if that works or does not work for you. :-)
>
> I do believe that we might be converging.  ;-)

Ok, so 10 lines of summary (similar to existing), and 10 detailed
splats below that :-).  If that's not Ok, let me know. Otherwise I
will send the new patch soon. Thanks!

 - Joel
Paul E. McKenney Sept. 3, 2023, 1:38 a.m. UTC | #7
On Sat, Sep 02, 2023 at 08:10:13PM -0400, Joel Fernandes wrote:
> On Sat, Sep 2, 2023 at 3:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> [...]
> > > > Either way, what I don't want is many megabytes of splats from console.log
> > > > duplicated into console.log.diags, especially in those (admittedly sad)
> > > > cases where the console.log file is the largest file in the results
> > > > directory.  Are you really going to be reading carefully after (say)
> > > > the fifth splat?  ;-)
> > >
> > > I could limit it to just the first 5, with the existing unlimited
> > > summary at the top.
> >
> > How about limiting the existing summary to (say) 10 lines?  I do not
> > recall every caring about more than that.  The only reason that I did
> > not ever limit it myself is that the summary is unlikely to be more than
> > a few percent of the full console.log.
> >
> > > Let me know if that works or does not work for you. :-)
> >
> > I do believe that we might be converging.  ;-)
> 
> Ok, so 10 lines of summary (similar to existing), and 10 detailed
> splats below that :-).  If that's not Ok, let me know. Otherwise I
> will send the new patch soon. Thanks!

That works!

						Thanx, Paul
diff mbox series

Patch

diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh
index aad51e7c0183..d28efcd86b64 100755
--- a/tools/testing/selftests/rcutorture/bin/console-badness.sh
+++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh
@@ -9,10 +9,32 @@ 
 # Copyright (C) 2020 Facebook, Inc.
 #
 # Authors: Paul E. McKenney <paulmck@kernel.org>
+INPUT_DATA=$(< /dev/stdin)
 
-grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
+# Get the line numbers for all the grep matches
+GREP_LINES="$(echo "$INPUT_DATA" |
+grep -n -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
 grep -v 'ODEBUG: ' |
 grep -v 'This means that this is a DEBUG kernel and it is' |
 grep -v 'Warning: unable to open an initial console' |
 grep -v 'Warning: Failed to add ttynull console. No stdin, stdout, and stderr.*the init process!' |
 grep -v 'NOHZ tick-stop error: Non-RCU local softirq work is pending, handler'
+)"
+
+# Exit if no grep matches
+if [ ! -n "$GREP_LINES" ]; then exit 0; fi
+
+# Go through each line of GREP_LINES, extract the line number and then
+# print from that line and 20 lines after that line. Do that for each
+# grep match.
+issue_num=1
+while IFS= read -r line; do
+    # Extract the line number from the line
+    num=$(echo "$line" | awk -F: '{print $1}')
+    # Print 20 lines after the matched line
+    echo "Issue $issue_num:"
+    issue_num="$(($issue_num + 1))"
+    echo "$INPUT_DATA" | sed -n "${num},$(($num + 20))p"
+    echo "-------------------------------------"
+done <<< "$GREP_LINES"
+