mbox series

[PATCHv2,00/10] ext4: Improve FC trace events

Message ID cover.1646922487.git.riteshh@linux.ibm.com (mailing list archive)
Headers show
Series ext4: Improve FC trace events | expand

Message

Ritesh Harjani March 10, 2022, 3:58 p.m. UTC
Hello,

Please find the v2 of this patch series.

Note:- I still couldn't figure out how to expose EXT4_FC_REASON_MAX in patch-2
which (I think) might be (only) needed by trace-cmd or perf record for trace_ext4_fc_stats.
But it seems "cat /sys/kernel/debug/tracing/trace_pipe" gives the right output
for ext4_fc_stats trace event (as shown below).

So with above reasoning, do you think we should take these patches in?
And we can later see how to provide EXT4_FC_REASON_MAX definition available to
libtraceevent?

Either ways, please let me know your opinion around this.

<output of cat /sys/kernel/debug/tracing/trace_pipe> (shows FALLOC_RANGE:5)
=====================================================
jbd2/loop2-8-2219    [000] .....  1883.771539: ext4_fc_stats: dev 7,2 fc ineligible reasons:
XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:5, INODE_JOURNAL_DATA:0 num_commits:22, ineligible: 4, numblks: 22


Changes since RFC
================
RFC -> v2
1. Added new patch-5 ("ext4: Return early for non-eligible fast_commit track events")
2. Removed a trace event in ext4_fc_track_template() (which was added in RFC)
   from patch-6 and added patch-7 to add the tid info in callers of
   ext4_fc_track_template(). (As per review comments from Jan)

Tested this with xfstests -g "quick"


[RFC]: https://lore.kernel.org/linux-ext4/cover.1645558375.git.riteshh@linux.ibm.com/

Ritesh Harjani (10):
  ext4: Remove unused enum EXT4_FC_COMMIT_FAILED
  ext4: Fix ext4_fc_stats trace point
  ext4: Convert ext4_fc_track_dentry type events to use event class
  ext4: Do not call FC trace event in ext4_fc_commit() if FS does not support FC
  ext4: Return early for non-eligible fast_commit track events
  ext4: Add new trace event in ext4_fc_cleanup
  ext4: Add transaction tid info in fc_track events
  ext4: Add commit_tid info in jbd debug log
  ext4: Add commit tid info in ext4_fc_commit_start/stop trace events
  ext4: Fix remaining two trace events to use same printk convention

 fs/ext4/fast_commit.c       |  95 ++++++++----
 fs/ext4/fast_commit.h       |   1 -
 include/trace/events/ext4.h | 297 +++++++++++++++++++++++-------------
 3 files changed, 257 insertions(+), 136 deletions(-)

Cc: Steven Rostedt <rostedt@goodmis.org>

--
2.31.1

Comments

Steven Rostedt March 10, 2022, 4:05 p.m. UTC | #1
On Thu, 10 Mar 2022 21:28:54 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> Note:- I still couldn't figure out how to expose EXT4_FC_REASON_MAX in patch-2
> which (I think) might be (only) needed by trace-cmd or perf record for trace_ext4_fc_stats.
> But it seems "cat /sys/kernel/debug/tracing/trace_pipe" gives the right output
> for ext4_fc_stats trace event (as shown below).
> 
> So with above reasoning, do you think we should take these patches in?
> And we can later see how to provide EXT4_FC_REASON_MAX definition available to
> libtraceevent?

I don't see EXT4_FC_REASON_MAX being used in the TP_printk(). If it isn't
used there, it doesn't need to be exposed. Or did I miss something?

-- Steve
Ritesh Harjani March 10, 2022, 5:07 p.m. UTC | #2
On 22/03/10 11:05AM, Steven Rostedt wrote:
> On Thu, 10 Mar 2022 21:28:54 +0530
> Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> > Note:- I still couldn't figure out how to expose EXT4_FC_REASON_MAX in patch-2
> > which (I think) might be (only) needed by trace-cmd or perf record for trace_ext4_fc_stats.
> > But it seems "cat /sys/kernel/debug/tracing/trace_pipe" gives the right output
> > for ext4_fc_stats trace event (as shown below).
> >
> > So with above reasoning, do you think we should take these patches in?
> > And we can later see how to provide EXT4_FC_REASON_MAX definition available to
> > libtraceevent?
>
> I don't see EXT4_FC_REASON_MAX being used in the TP_printk(). If it isn't
> used there, it doesn't need to be exposed. Or did I miss something?

I was mentioning about EXT4_FC_REASON_MAX used in TP_STRUCT__entry.
When I hard code EXT4_FC_REASON_MAX to 9 in TP_STRUCT__entry, I could
see proper values using trace-cmd. Otherwise I see all 0 (when using trace-cmd
or perf record).

+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)

Should we anyway hard code this to 9. Since we are anyway printing all the
9 elements of array values individually.

+	TP_printk("dev %d,%d fc ineligible reasons:\n"
+		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
+		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
+		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
+		  __entry->fc_commits, __entry->fc_ineligible_commits,
+		  __entry->fc_numblks)


Thanks
-ritesh
Steven Rostedt March 11, 2022, 12:39 a.m. UTC | #3
On Thu, 10 Mar 2022 22:37:31 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> On 22/03/10 11:05AM, Steven Rostedt wrote:
> > On Thu, 10 Mar 2022 21:28:54 +0530
> > Ritesh Harjani <riteshh@linux.ibm.com> wrote:
> >  
> > > Note:- I still couldn't figure out how to expose EXT4_FC_REASON_MAX in patch-2
> > > which (I think) might be (only) needed by trace-cmd or perf record for trace_ext4_fc_stats.
> > > But it seems "cat /sys/kernel/debug/tracing/trace_pipe" gives the right output
> > > for ext4_fc_stats trace event (as shown below).
> > >
> > > So with above reasoning, do you think we should take these patches in?
> > > And we can later see how to provide EXT4_FC_REASON_MAX definition available to
> > > libtraceevent?  
> >
> > I don't see EXT4_FC_REASON_MAX being used in the TP_printk(). If it isn't
> > used there, it doesn't need to be exposed. Or did I miss something?  
> 
> I was mentioning about EXT4_FC_REASON_MAX used in TP_STRUCT__entry.
> When I hard code EXT4_FC_REASON_MAX to 9 in TP_STRUCT__entry, I could
> see proper values using trace-cmd. Otherwise I see all 0 (when using trace-cmd
> or perf record).
> 
> +	TP_STRUCT__entry(
> +		__field(dev_t, dev)
> +		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)

Ah, I bet it's showing up in the format portion and not the print fmt part
of the format file.

Just to confirm, can you do the following:

# cat /sys/kernel/tracing/events/ext4/ext4_fc_commit_stop/format

and show me what it outputs.

Thanks,

-- Steve


> 
> Should we anyway hard code this to 9. Since we are anyway printing all the
> 9 elements of array values individually.
> 
> +	TP_printk("dev %d,%d fc ineligible reasons:\n"
> +		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
> +		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> +		  __entry->fc_commits, __entry->fc_ineligible_commits,
> +		  __entry->fc_numblks)
> 
> 
> Thanks
> -ritesh
Ritesh Harjani March 11, 2022, 2:19 a.m. UTC | #4
On 22/03/10 07:39PM, Steven Rostedt wrote:
> On Thu, 10 Mar 2022 22:37:31 +0530
> Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> > On 22/03/10 11:05AM, Steven Rostedt wrote:
> > > On Thu, 10 Mar 2022 21:28:54 +0530
> > > Ritesh Harjani <riteshh@linux.ibm.com> wrote:
> > >
> > > > Note:- I still couldn't figure out how to expose EXT4_FC_REASON_MAX in patch-2
> > > > which (I think) might be (only) needed by trace-cmd or perf record for trace_ext4_fc_stats.
> > > > But it seems "cat /sys/kernel/debug/tracing/trace_pipe" gives the right output
> > > > for ext4_fc_stats trace event (as shown below).
> > > >
> > > > So with above reasoning, do you think we should take these patches in?
> > > > And we can later see how to provide EXT4_FC_REASON_MAX definition available to
> > > > libtraceevent?
> > >
> > > I don't see EXT4_FC_REASON_MAX being used in the TP_printk(). If it isn't
> > > used there, it doesn't need to be exposed. Or did I miss something?
> >
> > I was mentioning about EXT4_FC_REASON_MAX used in TP_STRUCT__entry.
> > When I hard code EXT4_FC_REASON_MAX to 9 in TP_STRUCT__entry, I could
> > see proper values using trace-cmd. Otherwise I see all 0 (when using trace-cmd
> > or perf record).
> >
> > +	TP_STRUCT__entry(
> > +		__field(dev_t, dev)
> > +		__array(unsigned int, fc_ineligible_rc, EXT4_FC_REASON_MAX)
>
> Ah, I bet it's showing up in the format portion and not the print fmt part
> of the format file.
>
> Just to confirm, can you do the following:
>
> # cat /sys/kernel/tracing/events/ext4/ext4_fc_commit_stop/format

I think you meant ext4_fc_stats.

>
> and show me what it outputs.

root@qemu:/home/qemu# cat /sys/kernel/tracing/events/ext4/ext4_fc_stats/format
name: ext4_fc_stats
ID: 986
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:dev_t dev;        offset:8;       size:4; signed:0;
        field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX];        offset:12;      size:36;        signed:0;
        field:unsigned long fc_commits; offset:48;      size:8; signed:0;
        field:unsigned long fc_ineligible_commits;      offset:56;      size:8; signed:0;
        field:unsigned long fc_numblks; offset:64;      size:8; signed:0;

print fmt: "dev %d,%d fc ineligible reasons:
%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u num_commits:%lu, ineligible: %lu, numblks: %lu", ((unsigned int) ((REC->dev) >> 20)), ((unsigned int) ((REC->dev) & ((1U << 20) - 1))), __print_symbolic(0, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[0], __print_symbolic(1, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[1], __print_symbolic(2, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[2], __print_symbolic(3, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[3], __print_symbolic(4, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[4], __print_symbolic(5, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[5], __print_symbolic(6, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[6], __print_symbolic(7, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[7], __print_symbolic(8, { 0, "XATTR"}, { 1, "CROSS_RENAME"}, { 2, "JOURNAL_FLAG_CHANGE"}, { 3, "NO_MEM"}, { 4, "SWAP_BOOT"}, { 5, "RESIZE"}, { 6, "RENAME_DIR"}, { 7, "FALLOC_RANGE"}, { 8, "INODE_JOURNAL_DATA"}), REC->fc_ineligible_rc[8], REC->fc_commits, REC->fc_ineligible_commits, REC->fc_numblks


output of ext4_fc_stats (FALLOC_RANGE:0 v/s FALLOC_RANGE:13)
==========================================================================
<perf-report or trace-cmd report>
          xfs_io  8336 [003] 42950.923784:        ext4:ext4_fc_stats: dev 7,2 fc ineligible reasons:
XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:0, INODE_JOURNAL_DATA:0 num_commits:22, ineligible: 12, numblks: 22



<cat /sys/kernel/debug/tracing/trace_pipe>
          xfs_io-8336    [003] ..... 42951.224155: ext4_fc_stats: dev 7,2 fc ineligible reasons:
XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:13, INODE_JOURNAL_DATA:0 num_commits:22, ineligible: 12, numblks: 22


Thanks
-ritesh


>
> Thanks,
>
> -- Steve
>
>
> >
> > Should we anyway hard code this to 9. Since we are anyway printing all the
> > 9 elements of array values individually.
> >
> > +	TP_printk("dev %d,%d fc ineligible reasons:\n"
> > +		  "%s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u, %s:%u "
> > +		  "num_commits:%lu, ineligible: %lu, numblks: %lu",
> > +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
> > +		  FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
> > +		  __entry->fc_commits, __entry->fc_ineligible_commits,
> > +		  __entry->fc_numblks)
> >
> >
> > Thanks
> > -ritesh
>
Steven Rostedt March 11, 2022, 2:33 a.m. UTC | #5
On Fri, 11 Mar 2022 07:49:31 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> > # cat /sys/kernel/tracing/events/ext4/ext4_fc_commit_stop/format  
> 
> I think you meant ext4_fc_stats.

Sure.

> 
> >
> > and show me what it outputs.  
> 
> root@qemu:/home/qemu# cat /sys/kernel/tracing/events/ext4/ext4_fc_stats/format
> name: ext4_fc_stats
> ID: 986
> format:
>         field:unsigned short common_type;       offset:0;       size:2; signed:0;
>         field:unsigned char common_flags;       offset:2;       size:1; signed:0;
>         field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
>         field:int common_pid;   offset:4;       size:4; signed:1;
> 
>         field:dev_t dev;        offset:8;       size:4; signed:0;
>         field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX];        offset:12;      size:36;        signed:0;

Bah, the above tells us how many items, and the TRACE_DEFINE_ENUM() doesn't
modify this part of the file.

I could update it to do so though.

-- Steve


>         field:unsigned long fc_commits; offset:48;      size:8; signed:0;
>         field:unsigned long fc_ineligible_commits;      offset:56;      size:8; signed:0;
>         field:unsigned long fc_numblks; offset:64;      size:8; signed:0;
Ritesh Harjani March 11, 2022, 3:14 a.m. UTC | #6
On 22/03/10 09:33PM, Steven Rostedt wrote:
> On Fri, 11 Mar 2022 07:49:31 +0530
> Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> > > # cat /sys/kernel/tracing/events/ext4/ext4_fc_commit_stop/format
> >
> > I think you meant ext4_fc_stats.
>
> Sure.
>
> >
> > >
> > > and show me what it outputs.
> >
> > root@qemu:/home/qemu# cat /sys/kernel/tracing/events/ext4/ext4_fc_stats/format
> > name: ext4_fc_stats
> > ID: 986
> > format:
> >         field:unsigned short common_type;       offset:0;       size:2; signed:0;
> >         field:unsigned char common_flags;       offset:2;       size:1; signed:0;
> >         field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
> >         field:int common_pid;   offset:4;       size:4; signed:1;
> >
> >         field:dev_t dev;        offset:8;       size:4; signed:0;
> >         field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX];        offset:12;      size:36;        signed:0;
>
> Bah, the above tells us how many items, and the TRACE_DEFINE_ENUM() doesn't
> modify this part of the file.

Then shall I just define TRACE_DEFINE_ENUM(EXT4_FC_REASON_MAX) in this patch.
Would that be the correct approach? Like how we have defined other enums.
We haven't yet defined EXT4_FC_REASON_MAX in current patch.
(as I saw it doesn't affect TP_STRUCT__entry())


>
> I could update it to do so though.

Please let me know if you have any patch for me to try.

Thanks
-ritesh


>
> -- Steve
>
>
> >         field:unsigned long fc_commits; offset:48;      size:8; signed:0;
> >         field:unsigned long fc_ineligible_commits;      offset:56;      size:8; signed:0;
> >         field:unsigned long fc_numblks; offset:64;      size:8; signed:0;
>
Steven Rostedt March 11, 2022, 4:32 a.m. UTC | #7
On Fri, 11 Mar 2022 08:44:31 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> > I could update it to do so though.  
> 
> Please let me know if you have any patch for me to try.

Can you try this?

-- Steve

From 392b91c598da2a8c5bbaebad08cd0410f4607bf4 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Date: Thu, 10 Mar 2022 23:27:38 -0500
Subject: [PATCH] tracing: Have TRACE_DEFINE_ENUM affect trace event types as
 well

The macro TRACE_DEFINE_ENUM is used to convert enums in the kernel to
their actual value when they are exported to user space via the trace
event format file.

Currently only the enums in the "print fmt" (TP_printk in the TRACE_EVENT
macro) have the enums converted. But the enums can be used to denote array
size:

        field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX]; offset:12;      size:36;        signed:0;

The EXT4_FC_REASON_MAX has no meaning to userspace but it needs to know
that information to know how to parse the array.

Have the array indexes also be parsed as well.

Link: https://lore.kernel.org/all/cover.1646922487.git.riteshh@linux.ibm.com/

Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 38afd66d80e3..ae9a3b8481f5 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2633,6 +2633,33 @@ static void update_event_printk(struct trace_event_call *call,
 	}
 }
 
+static void update_event_fields(struct trace_event_call *call,
+				struct trace_eval_map *map)
+{
+	struct ftrace_event_field *field;
+	struct list_head *head;
+	char *ptr;
+	int len = strlen(map->eval_string);
+
+	head = trace_get_fields(call);
+	list_for_each_entry(field, head, link) {
+		ptr = strchr(field->type, '[');
+		if (!ptr)
+			continue;
+		ptr++;
+
+		if (!isalpha(*ptr) && *ptr != '_')
+			continue;
+
+		if (strncmp(map->eval_string, ptr, len) != 0)
+			continue;
+
+		ptr = eval_replace(ptr, map, len);
+		/* enum/sizeof string smaller than value */
+		WARN_ON_ONCE(!ptr);
+	}
+}
+
 void trace_event_eval_update(struct trace_eval_map **map, int len)
 {
 	struct trace_event_call *call, *p;
@@ -2668,6 +2695,7 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
 					first = false;
 				}
 				update_event_printk(call, map[i]);
+				update_event_fields(call, map[i]);
 			}
 		}
 	}
Ritesh Harjani March 11, 2022, 5:12 a.m. UTC | #8
On 22/03/10 11:32PM, Steven Rostedt wrote:
> On Fri, 11 Mar 2022 08:44:31 +0530
> Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> > > I could update it to do so though.
> >
> > Please let me know if you have any patch for me to try.
>
> Can you try this?

Thanks a lot Steve for quick patch.
I tested your patch and it seems to be working fine.

Below are the details -

root@qemu:/home/qemu# cat /sys/kernel/tracing/events/ext4/ext4_fc_stats/format
name: ext4_fc_stats
ID: 986
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:dev_t dev;        offset:8;       size:4; signed:0;
        field:unsigned int fc_ineligible_rc[9]; offset:12;      size:36;        signed:0;
^^^^ It's now taking the enum value of EXT4_FC_REASON_MAX in array field too.

Also output from trace-cmd and perf script shows the right data

xfs_io  1856 [000]   173.411127:        ext4:ext4_fc_stats: dev 7,2 fc ineligible reasons:
XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0, RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:2, INODE_JOURNAL_DATA:0 num_commits:0, ineligible: 1, numblks: 0

>
> -- Steve
>
> From 392b91c598da2a8c5bbaebad08cd0410f4607bf4 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> Date: Thu, 10 Mar 2022 23:27:38 -0500
> Subject: [PATCH] tracing: Have TRACE_DEFINE_ENUM affect trace event types as
>  well
>
> The macro TRACE_DEFINE_ENUM is used to convert enums in the kernel to
> their actual value when they are exported to user space via the trace
> event format file.
>
> Currently only the enums in the "print fmt" (TP_printk in the TRACE_EVENT
> macro) have the enums converted. But the enums can be used to denote array
> size:
>
>         field:unsigned int fc_ineligible_rc[EXT4_FC_REASON_MAX]; offset:12;      size:36;        signed:0;
>
> The EXT4_FC_REASON_MAX has no meaning to userspace but it needs to know
> that information to know how to parse the array.
>
> Have the array indexes also be parsed as well.
>
> Link: https://lore.kernel.org/all/cover.1646922487.git.riteshh@linux.ibm.com/
>
> Reported-by: Ritesh Harjani <riteshh@linux.ibm.com>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_events.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)

You may add below, if you like:-

Reported-and-tested-by: Ritesh Harjani <riteshh@linux.ibm.com>

-ritesh


>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 38afd66d80e3..ae9a3b8481f5 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2633,6 +2633,33 @@ static void update_event_printk(struct trace_event_call *call,
>  	}
>  }
>
> +static void update_event_fields(struct trace_event_call *call,
> +				struct trace_eval_map *map)
> +{
> +	struct ftrace_event_field *field;
> +	struct list_head *head;
> +	char *ptr;
> +	int len = strlen(map->eval_string);
> +
> +	head = trace_get_fields(call);
> +	list_for_each_entry(field, head, link) {
> +		ptr = strchr(field->type, '[');
> +		if (!ptr)
> +			continue;
> +		ptr++;
> +
> +		if (!isalpha(*ptr) && *ptr != '_')
> +			continue;
> +
> +		if (strncmp(map->eval_string, ptr, len) != 0)
> +			continue;
> +
> +		ptr = eval_replace(ptr, map, len);
> +		/* enum/sizeof string smaller than value */
> +		WARN_ON_ONCE(!ptr);
> +	}
> +}
> +
>  void trace_event_eval_update(struct trace_eval_map **map, int len)
>  {
>  	struct trace_event_call *call, *p;
> @@ -2668,6 +2695,7 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
>  					first = false;
>  				}
>  				update_event_printk(call, map[i]);
> +				update_event_fields(call, map[i]);
>  			}
>  		}
>  	}
> --
> 2.34.1
>
Steven Rostedt March 11, 2022, 2:45 p.m. UTC | #9
On Fri, 11 Mar 2022 10:42:49 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> You may add below, if you like:-
> 
> Reported-and-tested-by: Ritesh Harjani <riteshh@linux.ibm.com>

Will do. Thanks for testing.

I'll be adding this for the next merge window. I don't think this is
something that needs to be added to this rc release nor stable. Do you
agree?

-- Steve
Ritesh Harjani March 11, 2022, 3:03 p.m. UTC | #10
On 22/03/11 09:45AM, Steven Rostedt wrote:
> On Fri, 11 Mar 2022 10:42:49 +0530
> Ritesh Harjani <riteshh@linux.ibm.com> wrote:
>
> > You may add below, if you like:-
> >
> > Reported-and-tested-by: Ritesh Harjani <riteshh@linux.ibm.com>
>
> Will do. Thanks for testing.
>
> I'll be adding this for the next merge window. I don't think this is
> something that needs to be added to this rc release nor stable. Do you
> agree?

If using an enum in TP_STRUCT__entry's __array field doesn't cause any side
effect other than it just can't be decoded by userspace perf record / trace-cmd,
then I guess it should be ok.

But for this PATCH 2/10 "ext4: Fix ext4_fc_stats trace point", will be
needed to be Cc'd to stable tree as discussed before, as it tries to
dereference some sbi pointer from the tracing ring buffer. Then hopefully the
only problem with previous kernel version would be that ext4_fc_stats(), won't
show proper values for array entries in older kernel version where this patch
of trace_events is not found.
But cat /sys/kernel/debug/tracing/trace_pipe should be able to show the right values.


From my side, I will send a v3 of this patch series with just EXT4_FC_REASON_MAX
defined using TRACE_DEFINE_ENUM.

Thanks again for your help :)

-ritesh

>
> -- Steve
Steven Rostedt March 11, 2022, 4:42 p.m. UTC | #11
On Fri, 11 Mar 2022 20:33:57 +0530
Ritesh Harjani <riteshh@linux.ibm.com> wrote:

> On 22/03/11 09:45AM, Steven Rostedt wrote:
> > On Fri, 11 Mar 2022 10:42:49 +0530
> > Ritesh Harjani <riteshh@linux.ibm.com> wrote:
> >  
> > > You may add below, if you like:-
> > >
> > > Reported-and-tested-by: Ritesh Harjani <riteshh@linux.ibm.com>  
> >
> > Will do. Thanks for testing.
> >
> > I'll be adding this for the next merge window. I don't think this is
> > something that needs to be added to this rc release nor stable. Do you
> > agree?  
> 
> If using an enum in TP_STRUCT__entry's __array field doesn't cause any side
> effect other than it just can't be decoded by userspace perf record / trace-cmd,
> then I guess it should be ok.

Right. It only causes trace-cmd and perf to not be able to parse the field.
But that's not really a regression, as it never was able to parse an enum
defining an array size.

> 
> But for this PATCH 2/10 "ext4: Fix ext4_fc_stats trace point", will be
> needed to be Cc'd to stable tree as discussed before, as it tries to
> dereference some sbi pointer from the tracing ring buffer. Then hopefully the
> only problem with previous kernel version would be that ext4_fc_stats(), won't
> show proper values for array entries in older kernel version where this patch
> of trace_events is not found.
> But cat /sys/kernel/debug/tracing/trace_pipe should be able to show the right values.
> 
> 
> >From my side, I will send a v3 of this patch series with just EXT4_FC_REASON_MAX  
> defined using TRACE_DEFINE_ENUM.

OK, I'll just add this for the next merge window. If people complain about
the parser not being able to parse this from user space, then we can either
backport it, or add a plugin that parses it manually in libtraceevent.

> 
> Thanks again for your help :)
> 

No problem. Thanks for the report.

-- Steve