Message ID | 169262755804.106231.8245792908363050528.stgit@devnote2 (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | tracing: Fix to avoid wakeup loop in splice read of per-cpu buffer | expand |
On Mon, 21 Aug 2023 23:19:18 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > ftrace user can set 0 or small number to the 'buffer_percent' for quick > response for the ring buffer. In that case wait_on_pipe() will return > before filling a page of the ring buffer. That is too soon for splice() > because ring_buffer_read_page() will fail again. > This leads unnecessary loop in tracing_buffers_splice_read(). > > Set a minimum percentage of the buffer which is enough to fill a page to > wait_on_pipe() to avoid this situation. > > Fixes: 03329f993978 ("tracing: Add tracefs file buffer_percentage") > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> So this is what I found when I played with virtio-trace. I think now trace-cmd has richer feature, so it can be replaced with that, maybe what we need is a selftest that ensures trace_pipe_raw supports splice correctly. BTW, even with this fix, blocking splice still causes a strange behavior. If I set '400' to buffer_size_kb (so 100 pages) and '1' to buffer_percent, splice always returns 8192 (2 pages) to read. But I expected that should return 4096 (1 page). This means splice() waits longer than I thought. I think the fullfilled percentage calculation will be a bit wrong. Thank you, > --- > kernel/trace/trace.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index b8870078ef58..88448e8d8214 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -8462,6 +8462,8 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > /* did we read anything? */ > if (!spd.nr_pages) { > long wait_index; > + size_t nr_pages; > + size_t full; > > if (ret) > goto out; > @@ -8472,7 +8474,15 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > > wait_index = READ_ONCE(iter->wait_index); > > - ret = wait_on_pipe(iter, iter->tr->buffer_percent); > + /* For splice, we have to ensure at least 1 page is filled */ > + nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); > + if (nr_pages * iter->tr->buffer_percent < 100) { > + full = nr_pages + 99; > + do_div(full, nr_pages); > + } else > + full = iter->tr->buffer_percent; > + > + ret = wait_on_pipe(iter, full); > if (ret) > goto out; > >
On Tue, 22 Aug 2023 00:16:39 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > BTW, even with this fix, blocking splice still causes a strange behavior. > If I set '400' to buffer_size_kb (so 100 pages) and '1' to buffer_percent, > splice always returns 8192 (2 pages) to read. But I expected that should > return 4096 (1 page). This means splice() waits longer than I thought. > > I think the fullfilled percentage calculation will be a bit wrong. Note, the percentage is when to wake up the task. If between the wakeup and the read/splice, another ring buffer page gets filled more, then it will give that as well. The buffer_percent is just how long to wait, not for how much to read. Now if you test this with just adding enough to fill one page, and it doesn't wake up the reader, then that would be a bug. -- Steve
On Mon, 21 Aug 2023 11:27:03 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Tue, 22 Aug 2023 00:16:39 +0900 > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > BTW, even with this fix, blocking splice still causes a strange behavior. > > If I set '400' to buffer_size_kb (so 100 pages) and '1' to buffer_percent, > > splice always returns 8192 (2 pages) to read. But I expected that should > > return 4096 (1 page). This means splice() waits longer than I thought. > > > > I think the fullfilled percentage calculation will be a bit wrong. > > Note, the percentage is when to wake up the task. If between the wakeup and > the read/splice, another ring buffer page gets filled more, then it will > give that as well. The buffer_percent is just how long to wait, not for how > much to read. Yes, but I used the trace_marker as Zheng did for testing, and I saw the buffer is filled more than 4096 bytes via tracefs/per_cpu/cpu*/stats. > > Now if you test this with just adding enough to fill one page, and it > doesn't wake up the reader, then that would be a bug. Yes. Run trace_agent with 1% buffer_percent == 1 page; # echo 400 > /sys/kernel/tracing/buffer_size_kb # echo 1 > /sys/kernel/tracing/buffer_percent # trace-agent & Ready to read in the host side, $ for i in `seq 0 7`; do cat trace-path-cpu$i.out > trace-data.$i & done $ echo 1 > agent-ctl-path.in Write events on per-cpu buffer in the guest # for i in `seq 1 1000`; do echo "test event data $i" | tee /sys/kernel/tracing/trace_marker > /dev/null; done But trace_agent doesn't wake up. The data is still there. # grep bytes /sys/kernel/tracing/per_cpu/cpu*/stats /sys/kernel/tracing/per_cpu/cpu0/stats:bytes: 5936 /sys/kernel/tracing/per_cpu/cpu1/stats:bytes: 5584 /sys/kernel/tracing/per_cpu/cpu2/stats:bytes: 6368 /sys/kernel/tracing/per_cpu/cpu3/stats:bytes: 4704 /sys/kernel/tracing/per_cpu/cpu4/stats:bytes: 5972 /sys/kernel/tracing/per_cpu/cpu5/stats:bytes: 5620 /sys/kernel/tracing/per_cpu/cpu6/stats:bytes: 6588 /sys/kernel/tracing/per_cpu/cpu7/stats:bytes: 3496 And write more events via trace_marker again # for i in `seq 1001 2000`; do echo "test event data $i" | tee /sys/kernel/tracing/trace_marker > /dev/null; done The data is read; # grep bytes /sys/kernel/tracing/per_cpu/cpu*/stats /sys/kernel/tracing/per_cpu/cpu0/stats:bytes: 3220 /sys/kernel/tracing/per_cpu/cpu1/stats:bytes: 3960 /sys/kernel/tracing/per_cpu/cpu2/stats:bytes: 2420 /sys/kernel/tracing/per_cpu/cpu3/stats:bytes: 2024 /sys/kernel/tracing/per_cpu/cpu4/stats:bytes: 2912 /sys/kernel/tracing/per_cpu/cpu5/stats:bytes: 1064 /sys/kernel/tracing/per_cpu/cpu6/stats:bytes: 5004 /sys/kernel/tracing/per_cpu/cpu7/stats:bytes: 2684 I think this full_hit() function is somewhat wrong. static __always_inline bool full_hit(struct trace_buffer *buffer, int cpu, int full) { ... return (dirty * 100) > (full * nr_pages); } Ah, I also found a mistake on this patch too. Thanks,
On Mon, 21 Aug 2023 23:19:18 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > ftrace user can set 0 or small number to the 'buffer_percent' for quick > response for the ring buffer. In that case wait_on_pipe() will return > before filling a page of the ring buffer. That is too soon for splice() > because ring_buffer_read_page() will fail again. > This leads unnecessary loop in tracing_buffers_splice_read(). > > Set a minimum percentage of the buffer which is enough to fill a page to > wait_on_pipe() to avoid this situation. > > Fixes: 03329f993978 ("tracing: Add tracefs file buffer_percentage") > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > kernel/trace/trace.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index b8870078ef58..88448e8d8214 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -8462,6 +8462,8 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > /* did we read anything? */ > if (!spd.nr_pages) { > long wait_index; > + size_t nr_pages; > + size_t full; > > if (ret) > goto out; > @@ -8472,7 +8474,15 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > > wait_index = READ_ONCE(iter->wait_index); > > - ret = wait_on_pipe(iter, iter->tr->buffer_percent); > + /* For splice, we have to ensure at least 1 page is filled */ > + nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); > + if (nr_pages * iter->tr->buffer_percent < 100) { > + full = nr_pages + 99; > + do_div(full, nr_pages); > + } else > + full = iter->tr->buffer_percent; Ah I must have to take a sleep well. What I need is to ensure full >= 1. static __always_inline bool full_hit(struct trace_buffer *buffer, int cpu, int full) { ... return (dirty * 100) > (full * nr_pages); } If dirty = 0, this always false. But I think if full == 0, this should return true. If dirty = 1, - nr_pages < 100, this is always true and that is good. - nr_pages > 100, even if full is 1 (smallest), it doesn't true. But that is OK because dirty page number will be increased later. - nr_pages == 100 is the corner case. I think this should be return (dirty * 100) >= (full * nr_pages); Let me update the patch. Thank you, > + > + ret = wait_on_pipe(iter, full); > if (ret) > goto out; > >
Hi Masami, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.5-rc7 next-20230821] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/tracing-Fix-to-avoid-wakeup-loop-in-splice-read-of-per-cpu-buffer/20230821-222307 base: linus/master patch link: https://lore.kernel.org/r/169262755804.106231.8245792908363050528.stgit%40devnote2 patch subject: [PATCH] tracing: Fix to avoid wakeup loop in splice read of per-cpu buffer config: mips-allyesconfig (https://download.01.org/0day-ci/archive/20230822/202308221154.hil5JaxI-lkp@intel.com/config) compiler: mips-linux-gcc (GCC) 12.3.0 reproduce: (https://download.01.org/0day-ci/archive/20230822/202308221154.hil5JaxI-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202308221154.hil5JaxI-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from arch/mips/include/asm/div64.h:89, from include/linux/math.h:6, from include/linux/math64.h:6, from include/linux/time64.h:5, from include/linux/restart_block.h:10, from include/linux/thread_info.h:14, from include/asm-generic/preempt.h:5, from ./arch/mips/include/generated/asm/preempt.h:1, from include/linux/preempt.h:79, from include/linux/spinlock.h:56, from include/linux/mmzone.h:8, from include/linux/gfp.h:7, from include/linux/mm.h:7, from include/linux/ring_buffer.h:5, from kernel/trace/trace.c:15: kernel/trace/trace.c: In function 'tracing_buffers_splice_read': >> include/asm-generic/div64.h:222:35: warning: comparison of distinct pointer types lacks a cast 222 | (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \ | ^~ kernel/trace/trace.c:8481:25: note: in expansion of macro 'do_div' 8481 | do_div(full, nr_pages); | ^~~~~~ In file included from arch/mips/include/asm/bug.h:5, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:6: >> include/asm-generic/div64.h:234:32: warning: right shift count >= width of type [-Wshift-count-overflow] 234 | } else if (likely(((n) >> 32) == 0)) { \ | ^~ include/linux/compiler.h:76:45: note: in definition of macro 'likely' 76 | # define likely(x) __builtin_expect(!!(x), 1) | ^ kernel/trace/trace.c:8481:25: note: in expansion of macro 'do_div' 8481 | do_div(full, nr_pages); | ^~~~~~ vim +222 include/asm-generic/div64.h ^1da177e4c3f41 Linus Torvalds 2005-04-16 215 ^1da177e4c3f41 Linus Torvalds 2005-04-16 216 /* The unnecessary pointer compare is there ^1da177e4c3f41 Linus Torvalds 2005-04-16 217 * to check for type safety (n must be 64bit) ^1da177e4c3f41 Linus Torvalds 2005-04-16 218 */ ^1da177e4c3f41 Linus Torvalds 2005-04-16 219 # define do_div(n,base) ({ \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 220 uint32_t __base = (base); \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 221 uint32_t __rem; \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 @222 (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \ 911918aa7ef6f8 Nicolas Pitre 2015-11-02 223 if (__builtin_constant_p(__base) && \ 911918aa7ef6f8 Nicolas Pitre 2015-11-02 224 is_power_of_2(__base)) { \ 911918aa7ef6f8 Nicolas Pitre 2015-11-02 225 __rem = (n) & (__base - 1); \ 911918aa7ef6f8 Nicolas Pitre 2015-11-02 226 (n) >>= ilog2(__base); \ c747ce4706190e Geert Uytterhoeven 2021-08-11 227 } else if (__builtin_constant_p(__base) && \ 461a5e51060c93 Nicolas Pitre 2015-10-30 228 __base != 0) { \ 461a5e51060c93 Nicolas Pitre 2015-10-30 229 uint32_t __res_lo, __n_lo = (n); \ 461a5e51060c93 Nicolas Pitre 2015-10-30 230 (n) = __div64_const32(n, __base); \ 461a5e51060c93 Nicolas Pitre 2015-10-30 231 /* the remainder can be computed with 32-bit regs */ \ 461a5e51060c93 Nicolas Pitre 2015-10-30 232 __res_lo = (n); \ 461a5e51060c93 Nicolas Pitre 2015-10-30 233 __rem = __n_lo - __res_lo * __base; \ 911918aa7ef6f8 Nicolas Pitre 2015-11-02 @234 } else if (likely(((n) >> 32) == 0)) { \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 235 __rem = (uint32_t)(n) % __base; \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 236 (n) = (uint32_t)(n) / __base; \ c747ce4706190e Geert Uytterhoeven 2021-08-11 237 } else { \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 238 __rem = __div64_32(&(n), __base); \ c747ce4706190e Geert Uytterhoeven 2021-08-11 239 } \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 240 __rem; \ ^1da177e4c3f41 Linus Torvalds 2005-04-16 241 }) ^1da177e4c3f41 Linus Torvalds 2005-04-16 242
Hi Masami, kernel test robot noticed the following build errors: [auto build test ERROR on linus/master] [also build test ERROR on v6.5-rc7 next-20230821] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/tracing-Fix-to-avoid-wakeup-loop-in-splice-read-of-per-cpu-buffer/20230821-222307 base: linus/master patch link: https://lore.kernel.org/r/169262755804.106231.8245792908363050528.stgit%40devnote2 patch subject: [PATCH] tracing: Fix to avoid wakeup loop in splice read of per-cpu buffer config: arm-randconfig-r004-20230822 (https://download.01.org/0day-ci/archive/20230822/202308221429.cY9ifC65-lkp@intel.com/config) compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a) reproduce: (https://download.01.org/0day-ci/archive/20230822/202308221429.cY9ifC65-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202308221429.cY9ifC65-lkp@intel.com/ All error/warnings (new ones prefixed by >>): >> kernel/trace/trace.c:8481:4: warning: comparison of distinct pointer types ('typeof ((full)) *' (aka 'unsigned int *') and 'uint64_t *' (aka 'unsigned long long *')) [-Wcompare-distinct-pointer-types] 8481 | do_div(full, nr_pages); | ^~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/div64.h:222:28: note: expanded from macro 'do_div' 222 | (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \ | ~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~ >> kernel/trace/trace.c:8481:4: error: incompatible pointer types passing 'size_t *' (aka 'unsigned int *') to parameter of type 'uint64_t *' (aka 'unsigned long long *') [-Werror,-Wincompatible-pointer-types] 8481 | do_div(full, nr_pages); | ^~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/div64.h:238:22: note: expanded from macro 'do_div' 238 | __rem = __div64_32(&(n), __base); \ | ^~~~ arch/arm/include/asm/div64.h:24:45: note: passing argument to parameter 'n' here 24 | static inline uint32_t __div64_32(uint64_t *n, uint32_t base) | ^ >> kernel/trace/trace.c:8481:4: warning: shift count >= width of type [-Wshift-count-overflow] 8481 | do_div(full, nr_pages); | ^~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/div64.h:234:25: note: expanded from macro 'do_div' 234 | } else if (likely(((n) >> 32) == 0)) { \ | ^ ~~ include/linux/compiler.h:76:40: note: expanded from macro 'likely' 76 | # define likely(x) __builtin_expect(!!(x), 1) | ^ 2 warnings and 1 error generated. vim +8481 kernel/trace/trace.c 8400 8401 if (*ppos & (PAGE_SIZE - 1)) 8402 return -EINVAL; 8403 8404 if (len & (PAGE_SIZE - 1)) { 8405 if (len < PAGE_SIZE) 8406 return -EINVAL; 8407 len &= PAGE_MASK; 8408 } 8409 8410 if (splice_grow_spd(pipe, &spd)) 8411 return -ENOMEM; 8412 8413 again: 8414 trace_access_lock(iter->cpu_file); 8415 entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file); 8416 8417 for (i = 0; i < spd.nr_pages_max && len && entries; i++, len -= PAGE_SIZE) { 8418 struct page *page; 8419 int r; 8420 8421 ref = kzalloc(sizeof(*ref), GFP_KERNEL); 8422 if (!ref) { 8423 ret = -ENOMEM; 8424 break; 8425 } 8426 8427 refcount_set(&ref->refcount, 1); 8428 ref->buffer = iter->array_buffer->buffer; 8429 ref->page = ring_buffer_alloc_read_page(ref->buffer, iter->cpu_file); 8430 if (IS_ERR(ref->page)) { 8431 ret = PTR_ERR(ref->page); 8432 ref->page = NULL; 8433 kfree(ref); 8434 break; 8435 } 8436 ref->cpu = iter->cpu_file; 8437 8438 r = ring_buffer_read_page(ref->buffer, &ref->page, 8439 len, iter->cpu_file, 1); 8440 if (r < 0) { 8441 ring_buffer_free_read_page(ref->buffer, ref->cpu, 8442 ref->page); 8443 kfree(ref); 8444 break; 8445 } 8446 8447 page = virt_to_page(ref->page); 8448 8449 spd.pages[i] = page; 8450 spd.partial[i].len = PAGE_SIZE; 8451 spd.partial[i].offset = 0; 8452 spd.partial[i].private = (unsigned long)ref; 8453 spd.nr_pages++; 8454 *ppos += PAGE_SIZE; 8455 8456 entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file); 8457 } 8458 8459 trace_access_unlock(iter->cpu_file); 8460 spd.nr_pages = i; 8461 8462 /* did we read anything? */ 8463 if (!spd.nr_pages) { 8464 long wait_index; 8465 size_t nr_pages; 8466 size_t full; 8467 8468 if (ret) 8469 goto out; 8470 8471 ret = -EAGAIN; 8472 if ((file->f_flags & O_NONBLOCK) || (flags & SPLICE_F_NONBLOCK)) 8473 goto out; 8474 8475 wait_index = READ_ONCE(iter->wait_index); 8476 8477 /* For splice, we have to ensure at least 1 page is filled */ 8478 nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); 8479 if (nr_pages * iter->tr->buffer_percent < 100) { 8480 full = nr_pages + 99; > 8481 do_div(full, nr_pages); 8482 } else 8483 full = iter->tr->buffer_percent; 8484 8485 ret = wait_on_pipe(iter, full); 8486 if (ret) 8487 goto out; 8488 8489 /* No need to wait after waking up when tracing is off */ 8490 if (!tracer_tracing_is_on(iter->tr)) 8491 goto out; 8492 8493 /* Make sure we see the new wait_index */ 8494 smp_rmb(); 8495 if (wait_index != iter->wait_index) 8496 goto out; 8497 8498 goto again; 8499 } 8500 8501 ret = splice_to_pipe(pipe, &spd); 8502 out: 8503 splice_shrink_spd(&spd); 8504 8505 return ret; 8506 } 8507
On Mon, 21 Aug 2023 23:19:18 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > if (!spd.nr_pages) { > long wait_index; > + size_t nr_pages; > + size_t full; size_t is usually considered "long" (machine word length). > > if (ret) > goto out; > @@ -8472,7 +8474,15 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > > wait_index = READ_ONCE(iter->wait_index); > > - ret = wait_on_pipe(iter, iter->tr->buffer_percent); > + /* For splice, we have to ensure at least 1 page is filled */ > + nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); > + if (nr_pages * iter->tr->buffer_percent < 100) { > + full = nr_pages + 99; > + do_div(full, nr_pages); No need for do_div() as full is not 64 bit on 32 bit machines. That's why the kernel test robot is complaining. -- Steve > + } else > + full = iter->tr->buffer_percent; > + > + ret = wait_on_pipe(iter, full);
On Tue, 22 Aug 2023 04:59:37 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Mon, 21 Aug 2023 23:19:18 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > if (!spd.nr_pages) { > > long wait_index; > > + size_t nr_pages; > > + size_t full; > > size_t is usually considered "long" (machine word length). > > > > > if (ret) > > goto out; > > @@ -8472,7 +8474,15 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, > > > > wait_index = READ_ONCE(iter->wait_index); > > > > - ret = wait_on_pipe(iter, iter->tr->buffer_percent); > > + /* For splice, we have to ensure at least 1 page is filled */ > > + nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); > > + if (nr_pages * iter->tr->buffer_percent < 100) { > > + full = nr_pages + 99; > > + do_div(full, nr_pages); > > No need for do_div() as full is not 64 bit on 32 bit machines. > > That's why the kernel test robot is complaining. Thanks for the comment. I decided to set full = 1, maybe we don't need to set the percent to be the page size because full_hit() checks it by the number of dirty pages. :) I'm testing a new one. Thanks, > > -- Steve > > > + } else > > + full = iter->tr->buffer_percent; > > + > > + ret = wait_on_pipe(iter, full);
Hi Masami, kernel test robot noticed the following build warnings: [auto build test WARNING on linus/master] [also build test WARNING on v6.5-rc7 next-20230823] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/tracing-Fix-to-avoid-wakeup-loop-in-splice-read-of-per-cpu-buffer/20230821-222307 base: linus/master patch link: https://lore.kernel.org/r/169262755804.106231.8245792908363050528.stgit%40devnote2 patch subject: [PATCH] tracing: Fix to avoid wakeup loop in splice read of per-cpu buffer config: mips-randconfig-r131-20230823 (https://download.01.org/0day-ci/archive/20230823/202308232056.401OxapL-lkp@intel.com/config) compiler: mipsel-linux-gcc (GCC) 13.2.0 reproduce: (https://download.01.org/0day-ci/archive/20230823/202308232056.401OxapL-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202308232056.401OxapL-lkp@intel.com/ sparse warnings: (new ones prefixed by >>) kernel/trace/trace.c:446:28: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct trace_export **list @@ got struct trace_export [noderef] __rcu ** @@ kernel/trace/trace.c:446:28: sparse: expected struct trace_export **list kernel/trace/trace.c:446:28: sparse: got struct trace_export [noderef] __rcu ** kernel/trace/trace.c:460:33: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct trace_export **list @@ got struct trace_export [noderef] __rcu ** @@ kernel/trace/trace.c:460:33: sparse: expected struct trace_export **list kernel/trace/trace.c:460:33: sparse: got struct trace_export [noderef] __rcu ** kernel/trace/trace.c:2959:38: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct event_filter *filter @@ got struct event_filter [noderef] __rcu *filter @@ kernel/trace/trace.c:2959:38: sparse: expected struct event_filter *filter kernel/trace/trace.c:2959:38: sparse: got struct event_filter [noderef] __rcu *filter >> kernel/trace/trace.c:8481:25: sparse: sparse: incompatible types in comparison expression (different type sizes): >> kernel/trace/trace.c:8481:25: sparse: unsigned int * >> kernel/trace/trace.c:8481:25: sparse: unsigned long long [usertype] * kernel/trace/trace.c:400:9: sparse: sparse: incompatible types in comparison expression (different address spaces): kernel/trace/trace.c:400:9: sparse: struct trace_export [noderef] __rcu * kernel/trace/trace.c:400:9: sparse: struct trace_export * kernel/trace/trace.c:415:9: sparse: sparse: incompatible types in comparison expression (different address spaces): kernel/trace/trace.c:415:9: sparse: struct trace_export [noderef] __rcu * kernel/trace/trace.c:415:9: sparse: struct trace_export * >> kernel/trace/trace.c:8481:25: sparse: sparse: shift too big (32) for type unsigned int vim +8481 kernel/trace/trace.c 8400 8401 if (*ppos & (PAGE_SIZE - 1)) 8402 return -EINVAL; 8403 8404 if (len & (PAGE_SIZE - 1)) { 8405 if (len < PAGE_SIZE) 8406 return -EINVAL; 8407 len &= PAGE_MASK; 8408 } 8409 8410 if (splice_grow_spd(pipe, &spd)) 8411 return -ENOMEM; 8412 8413 again: 8414 trace_access_lock(iter->cpu_file); 8415 entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file); 8416 8417 for (i = 0; i < spd.nr_pages_max && len && entries; i++, len -= PAGE_SIZE) { 8418 struct page *page; 8419 int r; 8420 8421 ref = kzalloc(sizeof(*ref), GFP_KERNEL); 8422 if (!ref) { 8423 ret = -ENOMEM; 8424 break; 8425 } 8426 8427 refcount_set(&ref->refcount, 1); 8428 ref->buffer = iter->array_buffer->buffer; 8429 ref->page = ring_buffer_alloc_read_page(ref->buffer, iter->cpu_file); 8430 if (IS_ERR(ref->page)) { 8431 ret = PTR_ERR(ref->page); 8432 ref->page = NULL; 8433 kfree(ref); 8434 break; 8435 } 8436 ref->cpu = iter->cpu_file; 8437 8438 r = ring_buffer_read_page(ref->buffer, &ref->page, 8439 len, iter->cpu_file, 1); 8440 if (r < 0) { 8441 ring_buffer_free_read_page(ref->buffer, ref->cpu, 8442 ref->page); 8443 kfree(ref); 8444 break; 8445 } 8446 8447 page = virt_to_page(ref->page); 8448 8449 spd.pages[i] = page; 8450 spd.partial[i].len = PAGE_SIZE; 8451 spd.partial[i].offset = 0; 8452 spd.partial[i].private = (unsigned long)ref; 8453 spd.nr_pages++; 8454 *ppos += PAGE_SIZE; 8455 8456 entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file); 8457 } 8458 8459 trace_access_unlock(iter->cpu_file); 8460 spd.nr_pages = i; 8461 8462 /* did we read anything? */ 8463 if (!spd.nr_pages) { 8464 long wait_index; 8465 size_t nr_pages; 8466 size_t full; 8467 8468 if (ret) 8469 goto out; 8470 8471 ret = -EAGAIN; 8472 if ((file->f_flags & O_NONBLOCK) || (flags & SPLICE_F_NONBLOCK)) 8473 goto out; 8474 8475 wait_index = READ_ONCE(iter->wait_index); 8476 8477 /* For splice, we have to ensure at least 1 page is filled */ 8478 nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); 8479 if (nr_pages * iter->tr->buffer_percent < 100) { 8480 full = nr_pages + 99; > 8481 do_div(full, nr_pages); 8482 } else 8483 full = iter->tr->buffer_percent; 8484 8485 ret = wait_on_pipe(iter, full); 8486 if (ret) 8487 goto out; 8488 8489 /* No need to wait after waking up when tracing is off */ 8490 if (!tracer_tracing_is_on(iter->tr)) 8491 goto out; 8492 8493 /* Make sure we see the new wait_index */ 8494 smp_rmb(); 8495 if (wait_index != iter->wait_index) 8496 goto out; 8497 8498 goto again; 8499 } 8500 8501 ret = splice_to_pipe(pipe, &spd); 8502 out: 8503 splice_shrink_spd(&spd); 8504 8505 return ret; 8506 } 8507
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index b8870078ef58..88448e8d8214 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -8462,6 +8462,8 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, /* did we read anything? */ if (!spd.nr_pages) { long wait_index; + size_t nr_pages; + size_t full; if (ret) goto out; @@ -8472,7 +8474,15 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, wait_index = READ_ONCE(iter->wait_index); - ret = wait_on_pipe(iter, iter->tr->buffer_percent); + /* For splice, we have to ensure at least 1 page is filled */ + nr_pages = ring_buffer_nr_pages(iter->array_buffer->buffer, iter->cpu_file); + if (nr_pages * iter->tr->buffer_percent < 100) { + full = nr_pages + 99; + do_div(full, nr_pages); + } else + full = iter->tr->buffer_percent; + + ret = wait_on_pipe(iter, full); if (ret) goto out;