Message ID | 20240122073629.2594271-1-dongtai.guo@linux.dev (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | percpu: improve percpu_alloc_percpu_fail event trace |
On Mon, 22 Jan 2024 15:36:29 +0800
George Guo <dongtai.guo@linux.dev> wrote:

> From: George Guo <guodongtai@kylinos.cn>
>
> Add do_warn, warn_limit fields to the output of the
> percpu_alloc_percpu_fail ftrace event.
>
> This is required to percpu_alloc failed with no warning showing.

You mean to state;

In order to know why percpu_alloc failed but produces no warnings,
the do_warn and warn_limit should be traced to let the user know it
was rate-limited.

Or something like that?

Honestly, I don't think that the trace event is the proper place to do
that. The trace event just shows that it did fail. If you are
confused to why it doesn't print to dmesg, then you can simply add a
kprobe to see those values as well.

-- Steve

>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> ---

On Mon, 22 Jan 2024 10:57:00 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 22 Jan 2024 15:36:29 +0800
> George Guo <dongtai.guo@linux.dev> wrote:
>
> > From: George Guo <guodongtai@kylinos.cn>
> >
> > Add do_warn, warn_limit fields to the output of the
> > percpu_alloc_percpu_fail ftrace event.
> >
> > This is required to percpu_alloc failed with no warning showing.
>
> You mean to state;
>
> In order to know why percpu_alloc failed but produces no warnings,
> the do_warn and warn_limit should be traced to let the user know it
> was rate-limited.
>
> Or something like that?
>
> Honestly, I don't think that the trace event is the proper place to do
> that. The trace event just shows that it did fail. If you are
> confused to why it doesn't print to dmesg, then you can simply add a
> kprobe to see those values as well.
>
> -- Steve
>
> >
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > ---

There are two reasons of percpu_alloc failed without warnings:

1. do_warn is false
2. do_warn is true and warn_limit is reached the limit.

Showing do_warn and warn_limit makes things simple, maybe dont need
kprobe again.

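The two cases listed above can be sketched as a small userspace model of the `if (do_warn && warn_limit)` check on the failure path. This is only an illustration, not kernel code: the enum, its labels, and the function name `alloc_fail_outcome` are hypothetical, and the sketch assumes the warning budget drops by one each time a warning is printed, which the quoted hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical labels for this sketch; these are not kernel identifiers. */
enum warn_outcome {
	WARN_PRINTED,		/* the pr_warn() branch would be taken */
	WARN_SUPPRESSED,	/* case 1: do_warn is false */
	WARN_RATE_LIMITED,	/* case 2: do_warn is true, warn_limit is 0 */
};

/*
 * Models the `if (do_warn && warn_limit)` condition from the patch context.
 * Assumption: the budget in *warn_limit is decremented once per printed
 * warning; a suppressed or rate-limited failure leaves it unchanged.
 */
static enum warn_outcome alloc_fail_outcome(bool do_warn, int *warn_limit)
{
	if (!do_warn)
		return WARN_SUPPRESSED;
	if (*warn_limit == 0)
		return WARN_RATE_LIMITED;
	(*warn_limit)--;
	return WARN_PRINTED;
}
```

With a starting budget of 2 and do_warn set, the first two calls return WARN_PRINTED and the third returns WARN_RATE_LIMITED. Both silent outcomes look the same in dmesg, which is the ambiguity the proposed fields are meant to resolve.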
On Tue, 23 Jan 2024 09:44:43 +0800
George Guo <dongtai.guo@linux.dev> wrote:

> There are two reasons of percpu_alloc failed without warnings:
>
> 1. do_warn is false
> 2. do_warn is true and warn_limit is reached the limit.

Yes I know the reasons.

>
> Showing do_warn and warn_limit makes things simple, maybe dont need
> kprobe again.

It's up to the maintainers of that code to decide if it's worth it or not,
but honestly, my opinion it is not.

The trace event in question is to trace that percpu_alloc failed and why.
It's not there to determine why it did not produce a printk message.

-- Steve

Hello,

On Mon, Jan 22, 2024 at 08:55:39PM -0500, Steven Rostedt wrote:
> On Tue, 23 Jan 2024 09:44:43 +0800
> George Guo <dongtai.guo@linux.dev> wrote:
>
> > There are two reasons of percpu_alloc failed without warnings:
> >
> > 1. do_warn is false
> > 2. do_warn is true and warn_limit is reached the limit.
>
> Yes I know the reasons.
>
> >
> > Showing do_warn and warn_limit makes things simple, maybe dont need
> > kprobe again.
>
> It's up to the maintainers of that code to decide if it's worth it or not,
> but honestly, my opinion it is not.
>

I agree, I don't think this is a worthwhile change. If we do change
this, I'd like it to be more actionable in some way and as a result
something we can fix or tune accordingly.

George is this a common problem you're seeing?

> The trace event in question is to trace that percpu_alloc failed and why.
> It's not there to determine why it did not produce a printk message.
>
> -- Steve

Thanks,
Dennis

diff --git a/include/trace/events/percpu.h b/include/trace/events/percpu.h
index 5b8211ca8950..c5f412e84bb8 100644
--- a/include/trace/events/percpu.h
+++ b/include/trace/events/percpu.h
@@ -75,15 +75,18 @@ TRACE_EVENT(percpu_free_percpu,
 
 TRACE_EVENT(percpu_alloc_percpu_fail,
 
-	TP_PROTO(bool reserved, bool is_atomic, size_t size, size_t align),
+	TP_PROTO(bool reserved, bool is_atomic, size_t size, size_t align,
+		 bool do_warn, int warn_limit),
 
-	TP_ARGS(reserved, is_atomic, size, align),
+	TP_ARGS(reserved, is_atomic, size, align, do_warn, warn_limit),
 
 	TP_STRUCT__entry(
-		__field( bool, reserved )
-		__field( bool, is_atomic )
-		__field( size_t, size )
-		__field( size_t, align )
+		__field(bool, reserved)
+		__field(bool, is_atomic)
+		__field(size_t, size)
+		__field(size_t, align)
+		__field(bool, do_warn)
+		__field(int, warn_limit)
 	),
 
 	TP_fast_assign(
@@ -91,11 +94,14 @@ TRACE_EVENT(percpu_alloc_percpu_fail,
 		__entry->is_atomic = is_atomic;
 		__entry->size = size;
 		__entry->align = align;
+		__entry->do_warn = do_warn;
+		__entry->warn_limit = warn_limit;
 	),
 
-	TP_printk("reserved=%d is_atomic=%d size=%zu align=%zu",
+	TP_printk("reserved=%d is_atomic=%d size=%zu align=%zu do_warn=%d, warn_limit=%d",
 		  __entry->reserved, __entry->is_atomic,
-		  __entry->size, __entry->align)
+		  __entry->size, __entry->align,
+		  __entry->do_warn, __entry->warn_limit)
 );
 
 TRACE_EVENT(percpu_create_chunk,
diff --git a/mm/percpu.c b/mm/percpu.c
index 4e11fc1e6def..ac5b48268c99 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1886,7 +1886,7 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,
 fail_unlock:
 	spin_unlock_irqrestore(&pcpu_lock, flags);
 fail:
-	trace_percpu_alloc_percpu_fail(reserved, is_atomic, size, align);
+	trace_percpu_alloc_percpu_fail(reserved, is_atomic, size, align, do_warn, warn_limit);
 
 	if (do_warn && warn_limit) {
 		pr_warn("allocation failed, size=%zu align=%zu atomic=%d, %s\n",
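
To make the proposed output concrete, the sketch below renders one event line in userspace with the same format string the patch passes to TP_printk(). It is an illustration only: the helper name `format_alloc_fail` is hypothetical, the real event text is produced by the kernel tracing infrastructure rather than snprintf(), and any argument values passed in are made-up samples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats one line using the format string copied
 * from the patched TP_printk(). Returns whatever snprintf() returns.
 */
static int format_alloc_fail(char *buf, size_t len,
			     bool reserved, bool is_atomic,
			     size_t size, size_t align,
			     bool do_warn, int warn_limit)
{
	return snprintf(buf, len,
			"reserved=%d is_atomic=%d size=%zu align=%zu do_warn=%d, warn_limit=%d",
			reserved, is_atomic, size, align, do_warn, warn_limit);
}
```

Called with a 128-byte buffer and the sample values (false, true, 64, 8, true, 0), this yields "reserved=0 is_atomic=1 size=64 align=8 do_warn=1, warn_limit=0". Note that the patched format string places a comma after the do_warn field only, while the other fields are separated by spaces alone.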