diff mbox series

tracing: Fix uaf issue in tracing_open_file_tr

Message ID 20240426073410.17154-1-Tze-nan.Wu@mediatek.com (mailing list archive)
State Superseded
Headers show
Series tracing: Fix uaf issue in tracing_open_file_tr | expand

Commit Message

Tze-nan Wu April 26, 2024, 7:34 a.m. UTC
"tracing_event_file" is at the risk of use-after-free due to the race of
two functions "tracing_open_file_tr" and "synth_event_release".
Specifically, it could be freed by synth_event_release before
tracing_open_file_tr has the opportunity to access its members.

It's easy to reproduced by first executing script 'A' and then script 'B'
in my environment.

  Script 'A':
    echo "hello int aaa" > /sys/kernel/tracing/synthetic_events
    while :
    do
      echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable
    done

  Script 'B':
    echo > /sys/kernel/tracing/synthetic/events

  My environment:
    arm64 + generic and swtag based KASAN both enabled + kernel-6.6.18

KASAN report
==================================================================
BUG: KASAN: slab-use-after-free in tracing_open_file_tr
Read of size 8 at addr 9*ffff********** by task sh/3485
Pointer tag: [9*], memory tag: [fe]

CPU: * PID: 3485 Comm: sh Tainted: ****************
Call trace:
 __hwasan_tag_mismatch
 tracing_open_file_tr
 do_dentry_open
 vfs_open
 path_openat

Freed by task 3490:
 slab_free_freelist_hook
 kmem_cache_free
 event_file_put
 remove_event_file_dir
 __trace_remove_event_call
 trace_remove_event_call
 synth_event_release
 dyn_events_release_all
 synth_events_open

page last allocated via order 0, migratetype Unmovable:
 __slab_alloc
 kmem_cache_alloc
 trace_create_new_event
 trace_add_event_call
 register_synth_event
 __create_synth_event
 create_or_delete_synth_event
 trace_parse_run_command
 synth_events_write
 vfs_write

Based on the assumption that eventfs_inode will persist throughout the
execution of the tracing_open_file_tr function,
we can fix this issue by incrementing the reference count of
trace_event_file once it is assigned to eventfs_inode->data.
The reference count will then be decremented during the release phase of
eventfs_inode.

This ensures that trace_event_file is not prematurely freed while the
tracing_open_file_tr function is being called.

Fixes: bb32500fb9b7 ("tracing: Have trace_event_file have ref counters")
Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan wu <Tze-nan.Wu@mediatek.com>
---
BTW, I've also attempted to reproduce the same issue in another
environment (described below).
It's also reproducible but in a lower rate.

another environment:
  x86 + ubuntu + generic kasan enabled + kernel-6.9-rc4

After applying the patch, KASAN no longer reports any issues when
following the same reproduction steps in my arm64 environment. 
However, I am concerned about potential side effects that the patch might introduce.
Additionally, the newly added function "is_entry_event_callback" may not
be reliable in my opinion,
as the callback function used in the comparison could change in future. 
Nonetheless, this is the best solution I can come up with now.

Looking for any suggestion or solution, appreciate.

 fs/tracefs/event_inode.c     | 3 +++
 include/linux/trace_events.h | 4 ++++
 kernel/trace/trace_events.c  | 6 ++++++
 3 files changed, 13 insertions(+)

Comments

Steven Rostedt April 29, 2024, 12:28 a.m. UTC | #1
On Fri, 26 Apr 2024 15:34:08 +0800
Tze-nan wu <Tze-nan.Wu@mediatek.com> wrote:

> "tracing_event_file" is at the risk of use-after-free due to the race of
> two functions "tracing_open_file_tr" and "synth_event_release".
> Specifically, it could be freed by synth_event_release before
> tracing_open_file_tr has the opportunity to access its members.
> 
> It's easy to reproduced by first executing script 'A' and then script 'B'
> in my environment.
> 
>   Script 'A':
>     echo "hello int aaa" > /sys/kernel/tracing/synthetic_events
>     while :
>     do
>       echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable
>     done
> 
>   Script 'B':
>     echo > /sys/kernel/tracing/synthetic/events

I think you meant:

      echo > /sys/kernel/tracing/synthetic_events

> 
>   My environment:
>     arm64 + generic and swtag based KASAN both enabled + kernel-6.6.18
> 
> KASAN report
> ==================================================================
> BUG: KASAN: slab-use-after-free in tracing_open_file_tr
> Read of size 8 at addr 9*ffff********** by task sh/3485
> Pointer tag: [9*], memory tag: [fe]
> 
> CPU: * PID: 3485 Comm: sh Tainted: ****************
> Call trace:
>  __hwasan_tag_mismatch
>  tracing_open_file_tr
>  do_dentry_open
>  vfs_open
>  path_openat
> 
> Freed by task 3490:
>  slab_free_freelist_hook
>  kmem_cache_free
>  event_file_put
>  remove_event_file_dir
>  __trace_remove_event_call
>  trace_remove_event_call
>  synth_event_release
>  dyn_events_release_all
>  synth_events_open
> 
> page last allocated via order 0, migratetype Unmovable:
>  __slab_alloc
>  kmem_cache_alloc
>  trace_create_new_event
>  trace_add_event_call
>  register_synth_event
>  __create_synth_event
>  create_or_delete_synth_event
>  trace_parse_run_command
>  synth_events_write
>  vfs_write

Thanks for reporting this.

> 
> Based on the assumption that eventfs_inode will persist throughout the
> execution of the tracing_open_file_tr function,
> we can fix this issue by incrementing the reference count of
> trace_event_file once it is assigned to eventfs_inode->data.
> The reference count will then be decremented during the release phase of
> eventfs_inode.
> 
> This ensures that trace_event_file is not prematurely freed while the
> tracing_open_file_tr function is being called.
> 
> Fixes: bb32500fb9b7 ("tracing: Have trace_event_file have ref counters")
> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan wu <Tze-nan.Wu@mediatek.com>
> ---
> BTW, I've also attempted to reproduce the same issue in another
> environment (described below).
> It's also reproducible but in a lower rate.
> 
> another environment:
>   x86 + ubuntu + generic kasan enabled + kernel-6.9-rc4
> 
> After applying the patch, KASAN no longer reports any issues when
> following the same reproduction steps in my arm64 environment. 
> However, I am concerned about potential side effects that the patch might introduce.
> Additionally, the newly added function "is_entry_event_callback" may not
> be reliable in my opinion,
> as the callback function used in the comparison could change in future. 
> Nonetheless, this is the best solution I can come up with now.
> 
> Looking for any suggestion or solution, appreciate.

Yeah, I do not think eventfs should be involved in this. It needs to be
protected at a higher level (in the synthetic/dynamic event code).

I'm just coming back from Japan, and I'll need to take a deeper look at
this after I recover from my jetlag.

Thanks,

-- Steve
Steven Rostedt April 29, 2024, 6:46 p.m. UTC | #2
On Sun, 28 Apr 2024 20:28:37 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > Looking for any suggestion or solution, appreciate.  
> 
> Yeah, I do not think eventfs should be involved in this. It needs to be
> protected at a higher level (in the synthetic/dynamic event code).
> 
> I'm just coming back from Japan, and I'll need to take a deeper look at
> this after I recover from my jetlag.

OK, so I guess the eventfs nodes need an optional release callback. Here's
the right way to do that. I added a "release" function to the passed in
entry array that allows for calling a release function when the
eventfs_inode is freed. Then in code for creating events, I call
event_file_get() on the file being assigned and have the freeing of the
"enable" file have the release function that will call event_file_put() on
that file structure.

Does this fix it for you?

-- Steve

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 894c6ca1e500..dc97c19f9e0a 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -84,10 +84,17 @@ enum {
 static void release_ei(struct kref *ref)
 {
 	struct eventfs_inode *ei = container_of(ref, struct eventfs_inode, kref);
+	const struct eventfs_entry *entry;
 	struct eventfs_root_inode *rei;
 
 	WARN_ON_ONCE(!ei->is_freed);
 
+	for (int i = 0; i < ei->nr_entries; i++) {
+		entry = &ei->entries[i];
+		if (entry->release)
+			entry->release(entry->name, ei->data);
+	}
+
 	kfree(ei->entry_attrs);
 	kfree_const(ei->name);
 	if (ei->is_events) {
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 7a5fe17b6bf9..d03f74658716 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -62,6 +62,8 @@ struct eventfs_file;
 typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
 				const struct file_operations **fops);
 
+typedef void (*eventfs_release)(const char *name, void *data);
+
 /**
  * struct eventfs_entry - dynamically created eventfs file call back handler
  * @name:	Then name of the dynamic file in an eventfs directory
@@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
 struct eventfs_entry {
 	const char			*name;
 	eventfs_callback		callback;
+	eventfs_release			release;
 };
 
 struct eventfs_inode;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 52f75c36bbca..d14c84281f2b 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2552,6 +2552,14 @@ static int event_callback(const char *name, umode_t *mode, void **data,
 	return 0;
 }
 
+/* The file is incremented on creation and freeing the enable file decrements it */
+static void event_release(const char *name, void *data)
+{
+	struct trace_event_file *file = data;
+
+	event_file_put(file);
+}
+
 static int
 event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 {
@@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 		{
 			.name		= "enable",
 			.callback	= event_callback,
+			.release	= event_release,
 		},
 		{
 			.name		= "filter",
@@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 		return ret;
 	}
 
+	/* Gets decremented on freeing of the "enable" file */
+	event_file_get(file);
+
 	return 0;
 }
Tze-nan Wu May 2, 2024, 3:10 a.m. UTC | #3
On Mon, 2024-04-29 at 14:46 -0400, Steven Rostedt wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Sun, 28 Apr 2024 20:28:37 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > > Looking for any suggestion or solution, appreciate.  
> > 
> > Yeah, I do not think eventfs should be involved in this. It needs
> to be
> > protected at a higher level (in the synthetic/dynamic event code).
> > 
> > I'm just coming back from Japan, and I'll need to take a deeper
> look at
> > this after I recover from my jetlag.
> 
> OK, so I guess the eventfs nodes need an optional release callback.
> Here's
> the right way to do that. I added a "release" function to the passed
> in
> entry array that allows for calling a release function when the
> eventfs_inode is freed. Then in code for creating events, I call
> event_file_get() on the file being assigned and have the freeing of
> the
> "enable" file have the release function that will call
> event_file_put() on
> that file structure.
> 
> Does this fix it for you?
> 
Sorry for my late reply, I'm testing the patch on my machine now. 
Test will be done in four hours.

There's something I'm worrying about in the patch,
what I'm worrying about is commented in the code below.

/kernel/trace/trace_events.c:
  static int
  event_create_dir(struct eventfs_inode *parent, 
  struct trace_event_file *file) 
  {
        ...
        ...
        ...
        nr_entries = ARRAY_SIZE(event_entries);

        name = trace_event_name(call);

        +event_file_get(file);        // Line A
            ^^^^^^^^^^^^^
        // Should we move the "event_file_get" to here, instead  
        // of calling it at line C?
        // Due to Line B could eventually invoke "event_file_put".
        //   eventfs_create_dir -> free_ei ->put_ei -> kref_put 
        //  -> release_ei -> event_release -> event_file_put
        // Not sure if this is a potential risk? If Line B do call   
        // event_file_put,"event_file_put" will be called prior to
        // "event_file_get", could corrupt the reference of the file.

        ei = eventfs_create_dir(name, e_events,    // Line B 
             event_entries, nr_entries, file);
        if (IS_ERR(ei)) {
                pr_warn("Could not create tracefs '%s' directory\n", 
                name);
                return -1;
        }
        file->ei = ei;

        ret = event_define_fields(call);
        if (ret < 0) {
                pr_warn("Could not initialize trace point events/%s\n",
name);
                return ret;
                   ^^^^^^^^^          
       // Maybe we chould have similar concern if we return here.
       // Due to the event_inode had been created, but we did not call 
       // event_file_get. 
       // Could it lead to some issues in the future while freeing 
       // event_indoe?
        }


        -event_file_get(file);       //Line C
        return 0;
  }

Thanks,
Tze-nan

> -- Steve
> 
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 894c6ca1e500..dc97c19f9e0a 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -84,10 +84,17 @@ enum {
>  static void release_ei(struct kref *ref)
>  {
>  struct eventfs_inode *ei = container_of(ref, struct eventfs_inode,
> kref);
> +const struct eventfs_entry *entry;
>  struct eventfs_root_inode *rei;
>  
>  WARN_ON_ONCE(!ei->is_freed);
>  
> +for (int i = 0; i < ei->nr_entries; i++) {
> +entry = &ei->entries[i];
> +if (entry->release)
> +entry->release(entry->name, ei->data);
> +}
> +
>  kfree(ei->entry_attrs);
>  kfree_const(ei->name);
>  if (ei->is_events) {
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 7a5fe17b6bf9..d03f74658716 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -62,6 +62,8 @@ struct eventfs_file;
>  typedef int (*eventfs_callback)(const char *name, umode_t *mode,
> void **data,
>  const struct file_operations **fops);
>  
> +typedef void (*eventfs_release)(const char *name, void *data);
> +
>  /**
>   * struct eventfs_entry - dynamically created eventfs file call back
> handler
>   * @name:Then name of the dynamic file in an eventfs directory
> @@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name,
> umode_t *mode, void **data,
>  struct eventfs_entry {
>  const char*name;
>  eventfs_callbackcallback;
> +eventfs_releaserelease;
>  };
>  
>  struct eventfs_inode;
> diff --git a/kernel/trace/trace_events.c
> b/kernel/trace/trace_events.c
> index 52f75c36bbca..d14c84281f2b 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2552,6 +2552,14 @@ static int event_callback(const char *name,
> umode_t *mode, void **data,
>  return 0;
>  }
>  
> +/* The file is incremented on creation and freeing the enable file
> decrements it */
> +static void event_release(const char *name, void *data)
> +{
> +struct trace_event_file *file = data;
> +
> +event_file_put(file);
> +}
> +
>  static int
>  event_create_dir(struct eventfs_inode *parent, struct
> trace_event_file *file)
>  {
> @@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  {
>  .name= "enable",
>  .callback= event_callback,
> +.release= event_release,
>  },
>  {
>  .name= "filter",
> @@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  return ret;
>  }
>  
> +/* Gets decremented on freeing of the "enable" file */
> +event_file_get(file);
> +
>  return 0;
>  }
>
Steven Rostedt May 2, 2024, 3:50 a.m. UTC | #4
On Thu, 2 May 2024 03:10:24 +0000
Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:

> >   
> Sorry for my late reply, I'm testing the patch on my machine now. 
> Test will be done in four hours.
> 
> There's something I'm worrying about in the patch,
> what I'm worrying about is commented in the code below.
> 
> /kernel/trace/trace_events.c:
>   static int
>   event_create_dir(struct eventfs_inode *parent, 
>   struct trace_event_file *file) 
>   {
>         ...
>         ...
>         ...
>         nr_entries = ARRAY_SIZE(event_entries);
> 
>         name = trace_event_name(call);
> 
>         +event_file_get(file);        // Line A
>             ^^^^^^^^^^^^^
>         // Should we move the "event_file_get" to here, instead  
>         // of calling it at line C?
>         // Due to Line B could eventually invoke "event_file_put".
>         //   eventfs_create_dir -> free_ei ->put_ei -> kref_put 
>         //  -> release_ei -> event_release -> event_file_put
>         // Not sure if this is a potential risk? If Line B do call   
>         // event_file_put,"event_file_put" will be called prior to
>         // "event_file_get", could corrupt the reference of the file.

No, but you do bring up a good point. The release should not be called on
error, but it looks like it possibly can be.

> 
>         ei = eventfs_create_dir(name, e_events,    // Line B 
>              event_entries, nr_entries, file);
>         if (IS_ERR(ei)) {
>                 pr_warn("Could not create tracefs '%s' directory\n", 
>                 name);
>                 return -1;
>         }
>         file->ei = ei;
> 
>         ret = event_define_fields(call);
>         if (ret < 0) {
>                 pr_warn("Could not initialize trace point events/%s\n",
> name);
>                 return ret;
>                    ^^^^^^^^^          
>        // Maybe we chould have similar concern if we return here.
>        // Due to the event_inode had been created, but we did not call 
>        // event_file_get. 
>        // Could it lead to some issues in the future while freeing 
>        // event_indoe?
>         }
> 
> 
>         -event_file_get(file);       //Line C
>         return 0;
>   }

This prevents the release() function from being called on failure of
creating the ei.

Can you try this patch instead?

-- Steve

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 894c6ca1e500..f5510e26f0f6 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -84,10 +84,17 @@ enum {
 static void release_ei(struct kref *ref)
 {
 	struct eventfs_inode *ei = container_of(ref, struct eventfs_inode, kref);
+	const struct eventfs_entry *entry;
 	struct eventfs_root_inode *rei;
 
 	WARN_ON_ONCE(!ei->is_freed);
 
+	for (int i = 0; i < ei->nr_entries; i++) {
+		entry = &ei->entries[i];
+		if (entry->release)
+			entry->release(entry->name, ei->data);
+	}
+
 	kfree(ei->entry_attrs);
 	kfree_const(ei->name);
 	if (ei->is_events) {
@@ -112,6 +119,18 @@ static inline void free_ei(struct eventfs_inode *ei)
 	}
 }
 
+/*
+ * Called when creation of an ei fails, do not call release() functions.
+ */
+static inline void cleanup_ei(struct eventfs_inode *ei)
+{
+	if (ei) {
+		/* Set nr_entries to 0 to prevent release() function being called */
+		ei->nr_entries = 0;
+		free_ei(ei);
+	}
+}
+
 static inline struct eventfs_inode *get_ei(struct eventfs_inode *ei)
 {
 	if (ei)
@@ -734,7 +753,7 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
 
 	/* Was the parent freed? */
 	if (list_empty(&ei->list)) {
-		free_ei(ei);
+		cleanup_ei(ei);
 		ei = NULL;
 	}
 	return ei;
@@ -835,7 +854,7 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
 	return ei;
 
  fail:
-	free_ei(ei);
+	cleanup_ei(ei);
 	tracefs_failed_creating(dentry);
 	return ERR_PTR(-ENOMEM);
 }
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 7a5fe17b6bf9..d03f74658716 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -62,6 +62,8 @@ struct eventfs_file;
 typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
 				const struct file_operations **fops);
 
+typedef void (*eventfs_release)(const char *name, void *data);
+
 /**
  * struct eventfs_entry - dynamically created eventfs file call back handler
  * @name:	Then name of the dynamic file in an eventfs directory
@@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
 struct eventfs_entry {
 	const char			*name;
 	eventfs_callback		callback;
+	eventfs_release			release;
 };
 
 struct eventfs_inode;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 52f75c36bbca..6ef29eba90ce 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2552,6 +2552,14 @@ static int event_callback(const char *name, umode_t *mode, void **data,
 	return 0;
 }
 
+/* The file is incremented on creation and freeing the enable file decrements it */
+static void event_release(const char *name, void *data)
+{
+	struct trace_event_file *file = data;
+
+	event_file_put(file);
+}
+
 static int
 event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 {
@@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 		{
 			.name		= "enable",
 			.callback	= event_callback,
+			.release	= event_release,
 		},
 		{
 			.name		= "filter",
@@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 		return ret;
 	}
 
+	/* Gets decremented on freeing of the "enable" file */
+	event_file_get(file);
+
 	return 0;
 }
Tze-nan Wu May 2, 2024, 4:23 a.m. UTC | #5
On Wed, 2024-05-01 at 23:50 -0400, Steven Rostedt wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Thu, 2 May 2024 03:10:24 +0000
> Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:
> 
> > >   
> > Sorry for my late reply, I'm testing the patch on my machine now. 
> > Test will be done in four hours.
> > 
> > There's something I'm worrying about in the patch,
> > what I'm worrying about is commented in the code below.
> > 
> > /kernel/trace/trace_events.c:
> >   static int
> >   event_create_dir(struct eventfs_inode *parent, 
> >   struct trace_event_file *file) 
> >   {
> >         ...
> >         ...
> >         ...
> >         nr_entries = ARRAY_SIZE(event_entries);
> > 
> >         name = trace_event_name(call);
> > 
> >         +event_file_get(file);        // Line A
> >             ^^^^^^^^^^^^^
> >         // Should we move the "event_file_get" to here, instead  
> >         // of calling it at line C?
> >         // Due to Line B could eventually invoke "event_file_put".
> >         //   eventfs_create_dir -> free_ei ->put_ei -> kref_put 
> >         //  -> release_ei -> event_release -> event_file_put
> >         // Not sure if this is a potential risk? If Line B do
> call   
> >         // event_file_put,"event_file_put" will be called prior to
> >         // "event_file_get", could corrupt the reference of the
> file.
> 
> No, but you do bring up a good point. The release should not be
> called on
> error, but it looks like it possibly can be.
> 
> > 
> >         ei = eventfs_create_dir(name, e_events,    // Line B 
> >              event_entries, nr_entries, file);
> >         if (IS_ERR(ei)) {
> >                 pr_warn("Could not create tracefs '%s'
> directory\n", 
> >                 name);
> >                 return -1;
> >         }
> >         file->ei = ei;
> > 
> >         ret = event_define_fields(call);
> >         if (ret < 0) {
> >                 pr_warn("Could not initialize trace point
> events/%s\n",
> > name);
> >                 return ret;
> >                    ^^^^^^^^^          
> >        // Maybe we chould have similar concern if we return here.
> >        // Due to the event_inode had been created, but we did not
> call 
> >        // event_file_get. 
> >        // Could it lead to some issues in the future while freeing 
> >        // event_indoe?
> >         }
> > 
> > 
> >         -event_file_get(file);       //Line C
> >         return 0;
> >   }
> 
> This prevents the release() function from being called on failure of
> creating the ei.
> 
> Can you try this patch instead?
> -- Steve
> 
Sure, will reply the mail again after the test finish ASAP.
-- Tze-nan

> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 894c6ca1e500..f5510e26f0f6 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -84,10 +84,17 @@ enum {
>  static void release_ei(struct kref *ref)
>  {
>  struct eventfs_inode *ei = container_of(ref, struct eventfs_inode,
> kref);
> +const struct eventfs_entry *entry;
>  struct eventfs_root_inode *rei;
>  
>  WARN_ON_ONCE(!ei->is_freed);
>  
> +for (int i = 0; i < ei->nr_entries; i++) {
> +entry = &ei->entries[i];
> +if (entry->release)
> +entry->release(entry->name, ei->data);
> +}
> +
>  kfree(ei->entry_attrs);
>  kfree_const(ei->name);
>  if (ei->is_events) {
> @@ -112,6 +119,18 @@ static inline void free_ei(struct eventfs_inode
> *ei)
>  }
>  }
>  
> +/*
> + * Called when creation of an ei fails, do not call release()
> functions.
> + */
> +static inline void cleanup_ei(struct eventfs_inode *ei)
> +{
> +if (ei) {
> +/* Set nr_entries to 0 to prevent release() function being called */
> +ei->nr_entries = 0;
> +free_ei(ei);
> +}
> +}
> +
>  static inline struct eventfs_inode *get_ei(struct eventfs_inode *ei)
>  {
>  if (ei)
> @@ -734,7 +753,7 @@ struct eventfs_inode *eventfs_create_dir(const
> char *name, struct eventfs_inode
>  
>  /* Was the parent freed? */
>  if (list_empty(&ei->list)) {
> -free_ei(ei);
> +cleanup_ei(ei);
>  ei = NULL;
>  }
>  return ei;
> @@ -835,7 +854,7 @@ struct eventfs_inode
> *eventfs_create_events_dir(const char *name, struct dentry
>  return ei;
>  
>   fail:
> -free_ei(ei);
> +cleanup_ei(ei);
>  tracefs_failed_creating(dentry);
>  return ERR_PTR(-ENOMEM);
>  }
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 7a5fe17b6bf9..d03f74658716 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -62,6 +62,8 @@ struct eventfs_file;
>  typedef int (*eventfs_callback)(const char *name, umode_t *mode,
> void **data,
>  const struct file_operations **fops);
>  
> +typedef void (*eventfs_release)(const char *name, void *data);
> +
>  /**
>   * struct eventfs_entry - dynamically created eventfs file call back
> handler
>   * @name:Then name of the dynamic file in an eventfs directory
> @@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name,
> umode_t *mode, void **data,
>  struct eventfs_entry {
>  const char*name;
>  eventfs_callbackcallback;
> +eventfs_releaserelease;
>  };
>  
>  struct eventfs_inode;
> diff --git a/kernel/trace/trace_events.c
> b/kernel/trace/trace_events.c
> index 52f75c36bbca..6ef29eba90ce 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2552,6 +2552,14 @@ static int event_callback(const char *name,
> umode_t *mode, void **data,
>  return 0;
>  }
>  
> +/* The file is incremented on creation and freeing the enable file
> decrements it */
> +static void event_release(const char *name, void *data)
> +{
> +struct trace_event_file *file = data;
> +
> +event_file_put(file);
> +}
> +
>  static int
>  event_create_dir(struct eventfs_inode *parent, struct
> trace_event_file *file)
>  {
> @@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  {
>  .name= "enable",
>  .callback= event_callback,
> +.release= event_release,
>  },
>  {
>  .name= "filter",
> @@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  return ret;
>  }
>  
> +/* Gets decremented on freeing of the "enable" file */
> +event_file_get(file);
> +
>  return 0;
>  }
>
Tze-nan Wu May 2, 2024, 6:49 a.m. UTC | #6
On Wed, 2024-05-01 at 23:50 -0400, Steven Rostedt wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Thu, 2 May 2024 03:10:24 +0000
> Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:
> 
> > >   
> > Sorry for my late reply, I'm testing the patch on my machine now. 
> > Test will be done in four hours.
> > 
> > There's something I'm worrying about in the patch,
> > what I'm worrying about is commented in the code below.
> > 
> > /kernel/trace/trace_events.c:
> >   static int
> >   event_create_dir(struct eventfs_inode *parent, 
> >   struct trace_event_file *file) 
> >   {
> >         ...
> >         ...
> >         ...
> >         nr_entries = ARRAY_SIZE(event_entries);
> > 
> >         name = trace_event_name(call);
> > 
> >         +event_file_get(file);        // Line A
> >             ^^^^^^^^^^^^^
> >         // Should we move the "event_file_get" to here, instead  
> >         // of calling it at line C?
> >         // Due to Line B could eventually invoke "event_file_put".
> >         //   eventfs_create_dir -> free_ei ->put_ei -> kref_put 
> >         //  -> release_ei -> event_release -> event_file_put
> >         // Not sure if this is a potential risk? If Line B do
> call   
> >         // event_file_put,"event_file_put" will be called prior to
> >         // "event_file_get", could corrupt the reference of the
> file.
> 
> No, but you do bring up a good point. The release should not be
> called on
> error, but it looks like it possibly can be.
> 
> > 
> >         ei = eventfs_create_dir(name, e_events,    // Line B 
> >              event_entries, nr_entries, file);
> >         if (IS_ERR(ei)) {
> >                 pr_warn("Could not create tracefs '%s'
> directory\n", 
> >                 name);
> >                 return -1;
> >         }
> >         file->ei = ei;
> > 
> >         ret = event_define_fields(call);
> >         if (ret < 0) {
> >                 pr_warn("Could not initialize trace point
> events/%s\n",
> > name);
> >                 return ret;
> >                    ^^^^^^^^^          
> >        // Maybe we chould have similar concern if we return here.
> >        // Due to the event_inode had been created, but we did not
> call 
> >        // event_file_get. 
> >        // Could it lead to some issues in the future while freeing 
> >        // event_indoe?
> >         }
> > 
> > 
> >         -event_file_get(file);       //Line C
> >         return 0;
> >   }
> 
> This prevents the release() function from being called on failure of
> creating the ei.
> 
> Can you try this patch instead?
> 
> -- Steve
> 
Good news, this patch works, the test has passed, no more Kasan report
in my environment.

my environment:
 arm64 + kasan + swtag based kasan + kernel-6.6.18

Really appreciate, and learn a lot from the patch.

--Tze-nan
> diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
> index 894c6ca1e500..f5510e26f0f6 100644
> --- a/fs/tracefs/event_inode.c
> +++ b/fs/tracefs/event_inode.c
> @@ -84,10 +84,17 @@ enum {
>  static void release_ei(struct kref *ref)
>  {
>  struct eventfs_inode *ei = container_of(ref, struct eventfs_inode,
> kref);
> +const struct eventfs_entry *entry;
>  struct eventfs_root_inode *rei;
>  
>  WARN_ON_ONCE(!ei->is_freed);
>  
> +for (int i = 0; i < ei->nr_entries; i++) {
> +entry = &ei->entries[i];
> +if (entry->release)
> +entry->release(entry->name, ei->data);
> +}
> +
>  kfree(ei->entry_attrs);
>  kfree_const(ei->name);
>  if (ei->is_events) {
> @@ -112,6 +119,18 @@ static inline void free_ei(struct eventfs_inode
> *ei)
>  }
>  }
>  
> +/*
> + * Called when creation of an ei fails, do not call release()
> functions.
> + */
> +static inline void cleanup_ei(struct eventfs_inode *ei)
> +{
> +if (ei) {
> +/* Set nr_entries to 0 to prevent release() function being called */
> +ei->nr_entries = 0;
> +free_ei(ei);
> +}
> +}
> +
>  static inline struct eventfs_inode *get_ei(struct eventfs_inode *ei)
>  {
>  if (ei)
> @@ -734,7 +753,7 @@ struct eventfs_inode *eventfs_create_dir(const
> char *name, struct eventfs_inode
>  
>  /* Was the parent freed? */
>  if (list_empty(&ei->list)) {
> -free_ei(ei);
> +cleanup_ei(ei);
>  ei = NULL;
>  }
>  return ei;
> @@ -835,7 +854,7 @@ struct eventfs_inode
> *eventfs_create_events_dir(const char *name, struct dentry
>  return ei;
>  
>   fail:
> -free_ei(ei);
> +cleanup_ei(ei);
>  tracefs_failed_creating(dentry);
>  return ERR_PTR(-ENOMEM);
>  }
> diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
> index 7a5fe17b6bf9..d03f74658716 100644
> --- a/include/linux/tracefs.h
> +++ b/include/linux/tracefs.h
> @@ -62,6 +62,8 @@ struct eventfs_file;
>  typedef int (*eventfs_callback)(const char *name, umode_t *mode,
> void **data,
>  const struct file_operations **fops);
>  
> +typedef void (*eventfs_release)(const char *name, void *data);
> +
>  /**
>   * struct eventfs_entry - dynamically created eventfs file call back
> handler
>   * @name:Then name of the dynamic file in an eventfs directory
> @@ -72,6 +74,7 @@ typedef int (*eventfs_callback)(const char *name,
> umode_t *mode, void **data,
>  struct eventfs_entry {
>  const char*name;
>  eventfs_callbackcallback;
> +eventfs_releaserelease;
>  };
>  
>  struct eventfs_inode;
> diff --git a/kernel/trace/trace_events.c
> b/kernel/trace/trace_events.c
> index 52f75c36bbca..6ef29eba90ce 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2552,6 +2552,14 @@ static int event_callback(const char *name,
> umode_t *mode, void **data,
>  return 0;
>  }
>  
> +/* The file is incremented on creation and freeing the enable file
> decrements it */
> +static void event_release(const char *name, void *data)
> +{
> +struct trace_event_file *file = data;
> +
> +event_file_put(file);
> +}
> +
>  static int
>  event_create_dir(struct eventfs_inode *parent, struct
> trace_event_file *file)
>  {
> @@ -2566,6 +2574,7 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  {
>  .name= "enable",
>  .callback= event_callback,
> +.release= event_release,
>  },
>  {
>  .name= "filter",
> @@ -2634,6 +2643,9 @@ event_create_dir(struct eventfs_inode *parent,
> struct trace_event_file *file)
>  return ret;
>  }
>  
> +/* Gets decremented on freeing of the "enable" file */
> +event_file_get(file);
> +
>  return 0;
>  }
>
Steven Rostedt May 2, 2024, 2:09 p.m. UTC | #7
On Thu, 2 May 2024 06:49:18 +0000
Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:

> Good news, this patch works, the test has passed, no more Kasan report
> in my environment.

Great to hear!

> 
> my environment:
>  arm64 + kasan + swtag based kasan + kernel-6.6.18
> 
> Really appreciate, and learn a lot from the patch.

Can I add:

Tested-by: Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com>

?

-- Steve
Tze-nan Wu May 3, 2024, 1:20 a.m. UTC | #8
On Thu, 2024-05-02 at 10:09 -0400, Steven Rostedt wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Thu, 2 May 2024 06:49:18 +0000
> Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:
> 
> > Good news, this patch works, the test has passed, no more Kasan
> report
> > in my environment.
> 
> Great to hear!
> 
> > 
> > my environment:
> >  arm64 + kasan + swtag based kasan + kernel-6.6.18
> > 
> > Really appreciate, and learn a lot from the patch.
> 
> Can I add:
> 
> Tested-by: Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com>
> 
> ?
> 
> -- Steve

Certainly, glad to be list as a tester.

-- Tze-nan
diff mbox series

Patch

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 894c6ca1e500..fd49c0f88500 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -20,6 +20,7 @@ 
 #include <linux/workqueue.h>
 #include <linux/security.h>
 #include <linux/tracefs.h>
+#include <linux/trace_events.h>
 #include <linux/kref.h>
 #include <linux/delay.h>
 #include "internal.h"
@@ -90,6 +91,8 @@  static void release_ei(struct kref *ref)
 
 	kfree(ei->entry_attrs);
 	kfree_const(ei->name);
+	if (ei->nr_entries && is_entry_event_callback(ei->entries[0]))
+		event_file_put(ei->data);
 	if (ei->is_events) {
 		rei = get_root_inode(ei);
 		kfree_rcu(rei, ei.rcu);
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 6f9bdfb09d1d..602e87682ee2 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -9,6 +9,7 @@ 
 #include <linux/hardirq.h>
 #include <linux/perf_event.h>
 #include <linux/tracepoint.h>
+#include <linux/tracefs.h>
 
 struct trace_array;
 struct array_buffer;
@@ -505,6 +506,7 @@  extern struct trace_event_file *trace_get_event_file(const char *instance,
 						     const char *system,
 						     const char *event);
 extern void trace_put_event_file(struct trace_event_file *file);
+bool is_entry_event_callback(struct eventfs_entry entry);
 
 #define MAX_DYNEVENT_CMD_LEN	(2048)
 
@@ -731,6 +733,8 @@  extern void
 event_triggers_post_call(struct trace_event_file *file,
 			 enum event_trigger_type tt);
 
+extern void event_file_put(struct trace_event_file *file);
+
 bool trace_event_ignore_this_pid(struct trace_event_file *trace_file);
 
 bool __trace_trigger_soft_disabled(struct trace_event_file *file);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 52f75c36bbca..de01676b59ff 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2626,6 +2626,7 @@  event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file)
 		return -1;
 	}
 
+	event_file_get(file);
 	file->ei = ei;
 
 	ret = event_define_fields(call);
@@ -3420,6 +3421,11 @@  void trace_put_event_file(struct trace_event_file *file)
 }
 EXPORT_SYMBOL_GPL(trace_put_event_file);
 
+bool is_entry_event_callback(struct eventfs_entry entry)
+{
+	return entry.callback == event_callback;
+}
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 /* Avoid typos */