Message ID | 20230301034702.3449755-1-zhengyejian1@huawei.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
Series | [RFC] tracing/ring-buffer: Drop inappropriate WARN in rb_set_head_page() | expand |
On Wed, 1 Mar 2023 11:47:02 +0800 Zheng Yejian <zhengyejian1@huawei.com> wrote: > Following WARNING appears several times during test on v5.10 but > mainline kernel should have the same problem. However I currently > can't find the reproduction method. > > WARNING: CPU: 29 PID: 686834 at kernel/trace/ring_buffer.c:1357 > rb_set_head_page+0x168/0x264 > Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/trace/ring_buffer.c?h=linux-5.10.y#n1357 > Call trace: > rb_set_head_page+0x168/0x264 > rb_per_cpu_empty+0x34/0x15c > ring_buffer_empty_cpu.part.0.isra.0+0x1a4/0x3f0 > ring_buffer_empty_cpu+0x74/0xb4 > __find_next_entry+0x14c/0x2f4 > trace_find_next_entry_inc+0x48/0x13c > tracing_read_pipe+0x2c8/0x6b4 > vfs_read+0x144/0x324 > ksys_read+0x104/0x220 > __arm64_sys_read+0x54/0x70 > el0_svc_common.constprop.0+0xd8/0x37c > do_el0_svc+0x50/0x120 > el0_svc+0x24/0x3c > el0_sync_handler+0x17c/0x180 > el0_sync+0x160/0x180 > > The WARNING appears because rb_set_head_page() didn't grab the header > after three loops traversing buffer pages. This was not considered > to be expected, as comment said, writer possibly moves the header in > one loop. > > However, supposing writer keeps moving the header, we may miss more > loops and it seems normal not to grab the header within three loops > in rb_set_head_page(). Therefore drop that RB_WARN_ON(). It's normal to grab the header in two loops. I only made it three in case I was wrong. If it took 4 tries, something is wrong. Just returning NULL without setting the header will cause bugs elsewhere. -- Steve > > Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> > --- > kernel/trace/ring_buffer.c | 6 +----- > 1 file changed, 1 insertion(+), 5 deletions(-) > > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c > index af50d931b020..cbfa306570d3 100644 > --- a/kernel/trace/ring_buffer.c > +++ b/kernel/trace/ring_buffer.c > @@ -1471,9 +1471,7 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer) > page = head = cpu_buffer->head_page; > /* > * It is possible that the writer moves the header behind > - * where we started, and we miss in one loop. > - * A second loop should grab the header, but we'll do > - * three loops just because I'm paranoid. > + * where we started, so we try three loops to grab the header. > */ > for (i = 0; i < 3; i++) { > do { > @@ -1485,8 +1483,6 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer) > } while (page != head); > } > > - RB_WARN_ON(cpu_buffer, 1); > - > return NULL; > } >
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index af50d931b020..cbfa306570d3 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -1471,9 +1471,7 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer) page = head = cpu_buffer->head_page; /* * It is possible that the writer moves the header behind - * where we started, and we miss in one loop. - * A second loop should grab the header, but we'll do - * three loops just because I'm paranoid. + * where we started, so we try three loops to grab the header. */ for (i = 0; i < 3; i++) { do { @@ -1485,8 +1483,6 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer) } while (page != head); } - RB_WARN_ON(cpu_buffer, 1); - return NULL; }
Following WARNING appears several times during test on v5.10 but mainline kernel should have the same problem. However I currently can't find the reproduction method. WARNING: CPU: 29 PID: 686834 at kernel/trace/ring_buffer.c:1357 rb_set_head_page+0x168/0x264 Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/trace/ring_buffer.c?h=linux-5.10.y#n1357 Call trace: rb_set_head_page+0x168/0x264 rb_per_cpu_empty+0x34/0x15c ring_buffer_empty_cpu.part.0.isra.0+0x1a4/0x3f0 ring_buffer_empty_cpu+0x74/0xb4 __find_next_entry+0x14c/0x2f4 trace_find_next_entry_inc+0x48/0x13c tracing_read_pipe+0x2c8/0x6b4 vfs_read+0x144/0x324 ksys_read+0x104/0x220 __arm64_sys_read+0x54/0x70 el0_svc_common.constprop.0+0xd8/0x37c do_el0_svc+0x50/0x120 el0_svc+0x24/0x3c el0_sync_handler+0x17c/0x180 el0_sync+0x160/0x180 The WARNING appears because rb_set_head_page() didn't grab the header after three loops traversing buffer pages. This was not considered to be expected, as comment said, writer possibly moves the header in one loop. However, supposing writer keeps moving the header, we may miss more loops and it seems normal not to grab the header within three loops in rb_set_head_page(). Therefore drop that RB_WARN_ON(). Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> --- kernel/trace/ring_buffer.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)