Message ID | 20231219185631.809766769@goodmis.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 2f84b39f48476615186af5b3220ad5a2c756679b |
Headers | show |
Series | ring-buffer/tracing: Allow ring buffer to have bigger sub buffers | expand |
On Tue, 19 Dec 2023 13:54:29 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > From: "Steven Rostedt (Google)" <rostedt@goodmis.org> > > Using page order for deciding what the size of the ring buffer sub buffers > are is exposing a bit too much of the implementation. Although the sub > buffers are only allocated in orders of pages, allow the user to specify > the minimum size of each sub-buffer via kilobytes like they can with the > buffer size itself. > > If the user specifies 3 via: > > echo 3 > buffer_subbuf_size_kb > > Then the sub-buffer size will round up to 4kb (on a 4kb page size system). > > If they specify: > > echo 6 > buffer_subbuf_size_kb > > The sub-buffer size will become 8kb. I think this is better interface. Can we apply this earlier in the series to avoid rewriting the document and test code? Thank you, > > and so on. > > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> > --- > Documentation/trace/ftrace.rst | 46 ++++++++----------- > kernel/trace/trace.c | 38 +++++++++------ > ...fer_order.tc => ringbuffer_subbuf_size.tc} | 18 ++++---- > 3 files changed, 54 insertions(+), 48 deletions(-) > rename tools/testing/selftests/ftrace/test.d/00basic/{ringbuffer_order.tc => ringbuffer_subbuf_size.tc} (85%) > > diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst > index 231d26ceedb0..933e7efb9f1b 100644 > --- a/Documentation/trace/ftrace.rst > +++ b/Documentation/trace/ftrace.rst > @@ -203,32 +203,26 @@ of ftrace. Here is a list of some of the key files: > > This displays the total combined size of all the trace buffers. > > - buffer_subbuf_order: > - > - This sets or displays the sub buffer page size order. The ring buffer > - is broken up into several same size "sub buffers". An event can not be > - bigger than the size of the sub buffer. Normally, the sub buffer is > - the size of the architecture's page (4K on x86). The sub buffer also > - contains meta data at the start which also limits the size of an event. > - That means when the sub buffer is a page size, no event can be larger > - than the page size minus the sub buffer meta data. > - > - The buffer_subbuf_order allows the user to change the size of the sub > - buffer. As the sub buffer is a set of pages by the power of 2, thus > - the sub buffer total size is defined by the order: > - > - order size > - ---- ---- > - 0 PAGE_SIZE > - 1 PAGE_SIZE * 2 > - 2 PAGE_SIZE * 4 > - 3 PAGE_SIZE * 8 > - > - Changing the order will change the sub buffer size allowing for events > - to be larger than the page size. > - > - Note: When changing the order, tracing is stopped and any data in the > - ring buffer and the snapshot buffer will be discarded. > + buffer_subbuf_size_kb: > + > + This sets or displays the sub buffer size. The ring buffer is broken up > + into several same size "sub buffers". An event can not be bigger than > + the size of the sub buffer. Normally, the sub buffer is the size of the > + architecture's page (4K on x86). The sub buffer also contains meta data > + at the start which also limits the size of an event. That means when > + the sub buffer is a page size, no event can be larger than the page > + size minus the sub buffer meta data. > + > + Note, the buffer_subbuf_size_kb is a way for the user to specify the > + minimum size of the subbuffer. The kernel may make it bigger due to the > + implementation details, or simply fail the operation if the kernel can > + not handle the request. > + > + Changing the sub buffer size allows for events to be larger than the > + page size. > + > + Note: When changing the sub-buffer size, tracing is stopped and any > + data in the ring buffer and the snapshot buffer will be discarded. > > free_buffer: > > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c > index 82303bd2bba1..46dbe22121e9 100644 > --- a/kernel/trace/trace.c > +++ b/kernel/trace/trace.c > @@ -9384,42 +9384,54 @@ static const struct file_operations buffer_percent_fops = { > }; > > static ssize_t > -buffer_order_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) > +buffer_subbuf_size_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) > { > struct trace_array *tr = filp->private_data; > + size_t size; > char buf[64]; > + int order; > int r; > > - r = sprintf(buf, "%d\n", ring_buffer_subbuf_order_get(tr->array_buffer.buffer)); > + order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer); > + size = (PAGE_SIZE << order) / 1024; > + > + r = sprintf(buf, "%zd\n", size); > > return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); > } > > static ssize_t > -buffer_order_write(struct file *filp, const char __user *ubuf, > - size_t cnt, loff_t *ppos) > +buffer_subbuf_size_write(struct file *filp, const char __user *ubuf, > + size_t cnt, loff_t *ppos) > { > struct trace_array *tr = filp->private_data; > unsigned long val; > int old_order; > + int order; > + int pages; > int ret; > > ret = kstrtoul_from_user(ubuf, cnt, 10, &val); > if (ret) > return ret; > > + val *= 1024; /* value passed in is in KB */ > + > + pages = DIV_ROUND_UP(val, PAGE_SIZE); > + order = fls(pages - 1); > + > /* limit between 1 and 128 system pages */ > - if (val < 0 || val > 7) > + if (order < 0 || order > 7) > return -EINVAL; > > /* Do not allow tracing while changing the order of the ring buffer */ > tracing_stop_tr(tr); > > old_order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer); > - if (old_order == val) > + if (old_order == order) > goto out; > > - ret = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, val); > + ret = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, order); > if (ret) > goto out; > > @@ -9428,7 +9440,7 @@ buffer_order_write(struct file *filp, const char __user *ubuf, > if (!tr->allocated_snapshot) > goto out_max; > > - ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, val); > + ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, order); > if (ret) { > /* Put back the old order */ > cnt = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, old_order); > @@ -9460,10 +9472,10 @@ buffer_order_write(struct file *filp, const char __user *ubuf, > return cnt; > } > > -static const struct file_operations buffer_order_fops = { > +static const struct file_operations buffer_subbuf_size_fops = { > .open = tracing_open_generic_tr, > - .read = buffer_order_read, > - .write = buffer_order_write, > + .read = buffer_subbuf_size_read, > + .write = buffer_subbuf_size_write, > .release = tracing_release_generic_tr, > .llseek = default_llseek, > }; > @@ -9934,8 +9946,8 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) > trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, > tr, &buffer_percent_fops); > > - trace_create_file("buffer_subbuf_order", TRACE_MODE_WRITE, d_tracer, > - tr, &buffer_order_fops); > + trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, > + tr, &buffer_subbuf_size_fops); > > create_trace_options_dir(tr); > > diff --git a/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc b/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc > similarity index 85% > rename from tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc > rename to tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc > index ecbcc810e6c1..d44d09a33a74 100644 > --- a/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc > +++ b/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc > @@ -1,7 +1,7 @@ > #!/bin/sh > # SPDX-License-Identifier: GPL-2.0 > -# description: Change the ringbuffer sub-buffer order > -# requires: buffer_subbuf_order > +# description: Change the ringbuffer sub-buffer size > +# requires: buffer_subbuf_size_kb > # flags: instance > > get_buffer_data_size() { > @@ -52,8 +52,8 @@ write_buffer() { > } > > test_buffer() { > - orde=$1 > - page_size=$((4096<<order)) > + size_kb=$1 > + page_size=$((size_kb*1024)) > > size=`get_buffer_data_size` > > @@ -82,14 +82,14 @@ test_buffer() { > fi > } > > -ORIG=`cat buffer_subbuf_order` > +ORIG=`cat buffer_subbuf_size_kb` > > -# Could test bigger orders than 3, but then creating the string > +# Could test bigger sizes than 32K, but then creating the string > # to write into the ring buffer takes too long > -for a in 0 1 2 3 ; do > - echo $a > buffer_subbuf_order > +for a in 4 8 16 32 ; do > + echo $a > buffer_subbuf_size_kb > test_buffer $a > done > > -echo $ORIG > buffer_subbuf_order > +echo $ORIG > buffer_subbuf_size_kb > > -- > 2.42.0 > >
On Thu, 21 Dec 2023 09:26:21 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > If the user specifies 3 via: > > > > echo 3 > buffer_subbuf_size_kb > > > > Then the sub-buffer size will round up to 4kb (on a 4kb page size system). > > > > If they specify: > > > > echo 6 > buffer_subbuf_size_kb > > > > The sub-buffer size will become 8kb. > > I think this is better interface. Can we apply this earlier in the series > to avoid rewriting the document and test code? I kept it separate for testing purposes. Through out all this, it was a good way to make sure the two approaches were compatible. I still like to keep them separate as that's the way it was developed. It's good to keep that history. -- Steve
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 231d26ceedb0..933e7efb9f1b 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -203,32 +203,26 @@ of ftrace. Here is a list of some of the key files: This displays the total combined size of all the trace buffers. - buffer_subbuf_order: - - This sets or displays the sub buffer page size order. The ring buffer - is broken up into several same size "sub buffers". An event can not be - bigger than the size of the sub buffer. Normally, the sub buffer is - the size of the architecture's page (4K on x86). The sub buffer also - contains meta data at the start which also limits the size of an event. - That means when the sub buffer is a page size, no event can be larger - than the page size minus the sub buffer meta data. - - The buffer_subbuf_order allows the user to change the size of the sub - buffer. As the sub buffer is a set of pages by the power of 2, thus - the sub buffer total size is defined by the order: - - order size - ---- ---- - 0 PAGE_SIZE - 1 PAGE_SIZE * 2 - 2 PAGE_SIZE * 4 - 3 PAGE_SIZE * 8 - - Changing the order will change the sub buffer size allowing for events - to be larger than the page size. - - Note: When changing the order, tracing is stopped and any data in the - ring buffer and the snapshot buffer will be discarded. + buffer_subbuf_size_kb: + + This sets or displays the sub buffer size. The ring buffer is broken up + into several same size "sub buffers". An event can not be bigger than + the size of the sub buffer. Normally, the sub buffer is the size of the + architecture's page (4K on x86). The sub buffer also contains meta data + at the start which also limits the size of an event. That means when + the sub buffer is a page size, no event can be larger than the page + size minus the sub buffer meta data. + + Note, the buffer_subbuf_size_kb is a way for the user to specify the + minimum size of the subbuffer. The kernel may make it bigger due to the + implementation details, or simply fail the operation if the kernel can + not handle the request. + + Changing the sub buffer size allows for events to be larger than the + page size. + + Note: When changing the sub-buffer size, tracing is stopped and any + data in the ring buffer and the snapshot buffer will be discarded. free_buffer: diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 82303bd2bba1..46dbe22121e9 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -9384,42 +9384,54 @@ static const struct file_operations buffer_percent_fops = { }; static ssize_t -buffer_order_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) +buffer_subbuf_size_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { struct trace_array *tr = filp->private_data; + size_t size; char buf[64]; + int order; int r; - r = sprintf(buf, "%d\n", ring_buffer_subbuf_order_get(tr->array_buffer.buffer)); + order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer); + size = (PAGE_SIZE << order) / 1024; + + r = sprintf(buf, "%zd\n", size); return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); } static ssize_t -buffer_order_write(struct file *filp, const char __user *ubuf, - size_t cnt, loff_t *ppos) +buffer_subbuf_size_write(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) { struct trace_array *tr = filp->private_data; unsigned long val; int old_order; + int order; + int pages; int ret; ret = kstrtoul_from_user(ubuf, cnt, 10, &val); if (ret) return ret; + val *= 1024; /* value passed in is in KB */ + + pages = DIV_ROUND_UP(val, PAGE_SIZE); + order = fls(pages - 1); + /* limit between 1 and 128 system pages */ - if (val < 0 || val > 7) + if (order < 0 || order > 7) return -EINVAL; /* Do not allow tracing while changing the order of the ring buffer */ tracing_stop_tr(tr); old_order = ring_buffer_subbuf_order_get(tr->array_buffer.buffer); - if (old_order == val) + if (old_order == order) goto out; - ret = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, val); + ret = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, order); if (ret) goto out; @@ -9428,7 +9440,7 @@ buffer_order_write(struct file *filp, const char __user *ubuf, if (!tr->allocated_snapshot) goto out_max; - ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, val); + ret = ring_buffer_subbuf_order_set(tr->max_buffer.buffer, order); if (ret) { /* Put back the old order */ cnt = ring_buffer_subbuf_order_set(tr->array_buffer.buffer, old_order); @@ -9460,10 +9472,10 @@ buffer_order_write(struct file *filp, const char __user *ubuf, return cnt; } -static const struct file_operations buffer_order_fops = { +static const struct file_operations buffer_subbuf_size_fops = { .open = tracing_open_generic_tr, - .read = buffer_order_read, - .write = buffer_order_write, + .read = buffer_subbuf_size_read, + .write = buffer_subbuf_size_write, .release = tracing_release_generic_tr, .llseek = default_llseek, }; @@ -9934,8 +9946,8 @@ init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, tr, &buffer_percent_fops); - trace_create_file("buffer_subbuf_order", TRACE_MODE_WRITE, d_tracer, - tr, &buffer_order_fops); + trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, + tr, &buffer_subbuf_size_fops); create_trace_options_dir(tr); diff --git a/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc b/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc similarity index 85% rename from tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc rename to tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc index ecbcc810e6c1..d44d09a33a74 100644 --- a/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_order.tc +++ b/tools/testing/selftests/ftrace/test.d/00basic/ringbuffer_subbuf_size.tc @@ -1,7 +1,7 @@ #!/bin/sh # SPDX-License-Identifier: GPL-2.0 -# description: Change the ringbuffer sub-buffer order -# requires: buffer_subbuf_order +# description: Change the ringbuffer sub-buffer size +# requires: buffer_subbuf_size_kb # flags: instance get_buffer_data_size() { @@ -52,8 +52,8 @@ write_buffer() { } test_buffer() { - orde=$1 - page_size=$((4096<<order)) + size_kb=$1 + page_size=$((size_kb*1024)) size=`get_buffer_data_size` @@ -82,14 +82,14 @@ test_buffer() { fi } -ORIG=`cat buffer_subbuf_order` +ORIG=`cat buffer_subbuf_size_kb` -# Could test bigger orders than 3, but then creating the string +# Could test bigger sizes than 32K, but then creating the string # to write into the ring buffer takes too long -for a in 0 1 2 3 ; do - echo $a > buffer_subbuf_order +for a in 4 8 16 32 ; do + echo $a > buffer_subbuf_size_kb test_buffer $a done -echo $ORIG > buffer_subbuf_order +echo $ORIG > buffer_subbuf_size_kb