diff mbox series

[v2] kthread: dynamically allocate memory to store kthread's full name

Message ID 20211120112850.46047-1-laoar.shao@gmail.com (mailing list archive)
State Not Applicable
Headers show
Series [v2] kthread: dynamically allocate memory to store kthread's full name | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Yafang Shao Nov. 20, 2021, 11:28 a.m. UTC
When I was implementing a new per-cpu kthread cfs_migration, I found the
comm of it "cfs_migration/%u" is truncated due to the limitation of
TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are
all with the same name "cfs_migration/1", which will confuse the user. This
issue is not critical, because we can get the corresponding CPU from the
task's Cpus_allowed. But for kthreads correspoinding to other hardware
devices, it is not easy to get the detailed device info from task comm,
for example,

    jbd2/nvme0n1p2-
    xfs-reclaim/sdf

Currently there are so many truncated kthreads:

    rcu_tasks_kthre
    rcu_tasks_rude_
    rcu_tasks_trace
    poll_mpt3sas0_s
    ext4-rsv-conver
    xfs-reclaim/sd{a, b, c, ...}
    xfs-blockgc/sd{a, b, c, ...}
    xfs-inodegc/sd{a, b, c, ...}
    audit_send_repl
    ecryptfs-kthrea
    vfio-irqfd-clea
    jbd2/nvme0n1p2-
    ...

We can shorten these names to work around this problem, but it may be
not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for
example, it is a nice name, and it is not a good idea to shorten it.

One possible way to fix this issue is extending the task comm size, but
as task->comm is used in lots of places, that may cause some potential
buffer overflows. Another more conservative approach is introducing a new
pointer to store kthread's full name if it is truncated, which won't
introduce too much overhead as it is in the non-critical path. Finally we
make a dicision to use the second approach. See also the discussions in
this thread:
https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/

After this change, the full name of these truncated kthreads will be
displayed via /proc/[pid]/comm:

    rcu_tasks_kthread
    rcu_tasks_rude_kthread
    rcu_tasks_trace_kthread
    poll_mpt3sas0_statu
    ext4-rsv-conversion
    xfs-reclaim/sdf1
    xfs-blockgc/sdf1
    xfs-inodegc/sdf1
    audit_send_reply
    ecryptfs-kthread
    vfio-irqfd-cleanup
    jbd2/nvme0n1p2-8

Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>

---

Changes since v1:
1. leave it turncated when out of memory (Kees & Petr)
2. do null check in free_kthread_struct (Petr)
---
 fs/proc/array.c         |  3 +++
 include/linux/kthread.h |  1 +
 kernel/kthread.c        | 32 ++++++++++++++++++++++++++++++--
 3 files changed, 34 insertions(+), 2 deletions(-)

Comments

Petr Mladek Nov. 25, 2021, 9:18 a.m. UTC | #1
On Sat 2021-11-20 11:28:50, Yafang Shao wrote:
> When I was implementing a new per-cpu kthread cfs_migration, I found the
> comm of it "cfs_migration/%u" is truncated due to the limitation of
> TASK_COMM_LEN.
> 
> One possible way to fix this issue is extending the task comm size, but
> as task->comm is used in lots of places, that may cause some potential
> buffer overflows. Another more conservative approach is introducing a new
> pointer to store kthread's full name if it is truncated, which won't
> introduce too much overhead as it is in the non-critical path. Finally we
> make a dicision to use the second approach. See also the discussions in
> this thread:
> https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
> 
> After this change, the full name of these truncated kthreads will be
> displayed via /proc/[pid]/comm:

> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Looks good to me:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
David Hildenbrand Nov. 25, 2021, 9:36 a.m. UTC | #2
On 20.11.21 12:28, Yafang Shao wrote:
> When I was implementing a new per-cpu kthread cfs_migration, I found the
> comm of it "cfs_migration/%u" is truncated due to the limitation of
> TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are
> all with the same name "cfs_migration/1", which will confuse the user. This
> issue is not critical, because we can get the corresponding CPU from the
> task's Cpus_allowed. But for kthreads correspoinding to other hardware
> devices, it is not easy to get the detailed device info from task comm,
> for example,
> 
>     jbd2/nvme0n1p2-
>     xfs-reclaim/sdf
> 
> Currently there are so many truncated kthreads:
> 
>     rcu_tasks_kthre
>     rcu_tasks_rude_
>     rcu_tasks_trace
>     poll_mpt3sas0_s
>     ext4-rsv-conver
>     xfs-reclaim/sd{a, b, c, ...}
>     xfs-blockgc/sd{a, b, c, ...}
>     xfs-inodegc/sd{a, b, c, ...}
>     audit_send_repl
>     ecryptfs-kthrea
>     vfio-irqfd-clea
>     jbd2/nvme0n1p2-
>     ...
> 
> We can shorten these names to work around this problem, but it may be
> not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for
> example, it is a nice name, and it is not a good idea to shorten it.
> 
> One possible way to fix this issue is extending the task comm size, but
> as task->comm is used in lots of places, that may cause some potential
> buffer overflows. Another more conservative approach is introducing a new
> pointer to store kthread's full name if it is truncated, which won't
> introduce too much overhead as it is in the non-critical path. Finally we
> make a dicision to use the second approach. See also the discussions in
> this thread:
> https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
> 
> After this change, the full name of these truncated kthreads will be
> displayed via /proc/[pid]/comm:
> 
>     rcu_tasks_kthread
>     rcu_tasks_rude_kthread
>     rcu_tasks_trace_kthread
>     poll_mpt3sas0_statu
>     ext4-rsv-conversion
>     xfs-reclaim/sdf1
>     xfs-blockgc/sdf1
>     xfs-inodegc/sdf1
>     audit_send_reply
>     ecryptfs-kthread
>     vfio-irqfd-cleanup
>     jbd2/nvme0n1p2-8

I do wonder if that could break some user space that assumes these names
have maximum length ..

But LGTM

Reviewed-by: David Hildenbrand <david@redhat.com>
Petr Mladek Nov. 25, 2021, 2:46 p.m. UTC | #3
On Thu 2021-11-25 10:36:49, David Hildenbrand wrote:
> On 20.11.21 12:28, Yafang Shao wrote:
> > When I was implementing a new per-cpu kthread cfs_migration, I found the
> > comm of it "cfs_migration/%u" is truncated due to the limitation of
> > TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are
> > all with the same name "cfs_migration/1", which will confuse the user. This
> > issue is not critical, because we can get the corresponding CPU from the
> > task's Cpus_allowed. But for kthreads correspoinding to other hardware
> > devices, it is not easy to get the detailed device info from task comm,
> > for example,
> > 
> >     jbd2/nvme0n1p2-
> >     xfs-reclaim/sdf
> > 
> > Currently there are so many truncated kthreads:
> > 
> >     rcu_tasks_kthre
> >     rcu_tasks_rude_
> >     rcu_tasks_trace
> >     poll_mpt3sas0_s
> >     ext4-rsv-conver
> >     xfs-reclaim/sd{a, b, c, ...}
> >     xfs-blockgc/sd{a, b, c, ...}
> >     xfs-inodegc/sd{a, b, c, ...}
> >     audit_send_repl
> >     ecryptfs-kthrea
> >     vfio-irqfd-clea
> >     jbd2/nvme0n1p2-
> >     ...
> > 
> > We can shorten these names to work around this problem, but it may be
> > not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for
> > example, it is a nice name, and it is not a good idea to shorten it.
> > 
> > One possible way to fix this issue is extending the task comm size, but
> > as task->comm is used in lots of places, that may cause some potential
> > buffer overflows. Another more conservative approach is introducing a new
> > pointer to store kthread's full name if it is truncated, which won't
> > introduce too much overhead as it is in the non-critical path. Finally we
> > make a dicision to use the second approach. See also the discussions in
> > this thread:
> > https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
> > 
> > After this change, the full name of these truncated kthreads will be
> > displayed via /proc/[pid]/comm:
> > 
> >     rcu_tasks_kthread
> >     rcu_tasks_rude_kthread
> >     rcu_tasks_trace_kthread
> >     poll_mpt3sas0_statu
> >     ext4-rsv-conversion
> >     xfs-reclaim/sdf1
> >     xfs-blockgc/sdf1
> >     xfs-inodegc/sdf1
> >     audit_send_reply
> >     ecryptfs-kthread
> >     vfio-irqfd-cleanup
> >     jbd2/nvme0n1p2-8
> 
> I do wonder if that could break some user space that assumes these names
> have maximum length ..

There is high chance that we will be on the safe side. Workqueue
kthreads already provided longer names. They are even dynamic
because the currently handled workqueue name is part of the name,
see wq_worker_comm().

Best Regards,
Petr
David Hildenbrand Nov. 25, 2021, 2:47 p.m. UTC | #4
On 25.11.21 15:46, Petr Mladek wrote:
> On Thu 2021-11-25 10:36:49, David Hildenbrand wrote:
>> On 20.11.21 12:28, Yafang Shao wrote:
>>> When I was implementing a new per-cpu kthread cfs_migration, I found the
>>> comm of it "cfs_migration/%u" is truncated due to the limitation of
>>> TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19 are
>>> all with the same name "cfs_migration/1", which will confuse the user. This
>>> issue is not critical, because we can get the corresponding CPU from the
>>> task's Cpus_allowed. But for kthreads correspoinding to other hardware
>>> devices, it is not easy to get the detailed device info from task comm,
>>> for example,
>>>
>>>     jbd2/nvme0n1p2-
>>>     xfs-reclaim/sdf
>>>
>>> Currently there are so many truncated kthreads:
>>>
>>>     rcu_tasks_kthre
>>>     rcu_tasks_rude_
>>>     rcu_tasks_trace
>>>     poll_mpt3sas0_s
>>>     ext4-rsv-conver
>>>     xfs-reclaim/sd{a, b, c, ...}
>>>     xfs-blockgc/sd{a, b, c, ...}
>>>     xfs-inodegc/sd{a, b, c, ...}
>>>     audit_send_repl
>>>     ecryptfs-kthrea
>>>     vfio-irqfd-clea
>>>     jbd2/nvme0n1p2-
>>>     ...
>>>
>>> We can shorten these names to work around this problem, but it may be
>>> not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-' for
>>> example, it is a nice name, and it is not a good idea to shorten it.
>>>
>>> One possible way to fix this issue is extending the task comm size, but
>>> as task->comm is used in lots of places, that may cause some potential
>>> buffer overflows. Another more conservative approach is introducing a new
>>> pointer to store kthread's full name if it is truncated, which won't
>>> introduce too much overhead as it is in the non-critical path. Finally we
>>> make a dicision to use the second approach. See also the discussions in
>>> this thread:
>>> https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
>>>
>>> After this change, the full name of these truncated kthreads will be
>>> displayed via /proc/[pid]/comm:
>>>
>>>     rcu_tasks_kthread
>>>     rcu_tasks_rude_kthread
>>>     rcu_tasks_trace_kthread
>>>     poll_mpt3sas0_statu
>>>     ext4-rsv-conversion
>>>     xfs-reclaim/sdf1
>>>     xfs-blockgc/sdf1
>>>     xfs-inodegc/sdf1
>>>     audit_send_reply
>>>     ecryptfs-kthread
>>>     vfio-irqfd-cleanup
>>>     jbd2/nvme0n1p2-8
>>
>> I do wonder if that could break some user space that assumes these names
>> have maximum length ..
> 
> There is high chance that we will be on the safe side. Workqueue
> kthreads already provided longer names. They are even dynamic
> because the currently handled workqueue name is part of the name,
> see wq_worker_comm().

Great, thanks!
diff mbox series

Patch

diff --git a/fs/proc/array.c b/fs/proc/array.c
index ff869a66b34e..4321aa63835d 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -92,6 +92,7 @@ 
 #include <linux/string_helpers.h>
 #include <linux/user_namespace.h>
 #include <linux/fs_struct.h>
+#include <linux/kthread.h>
 
 #include <asm/processor.h>
 #include "internal.h"
@@ -102,6 +103,8 @@  void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
 
 	if (p->flags & PF_WQ_WORKER)
 		wq_worker_comm(tcomm, sizeof(tcomm), p);
+	else if (p->flags & PF_KTHREAD)
+		get_kthread_comm(tcomm, sizeof(tcomm), p);
 	else
 		__get_task_comm(tcomm, sizeof(tcomm), p);
 
diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 346b0f269161..2a5c04494663 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -33,6 +33,7 @@  struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
 					  unsigned int cpu,
 					  const char *namefmt);
 
+void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk);
 void set_kthread_struct(struct task_struct *p);
 
 void kthread_set_per_cpu(struct task_struct *k, int cpu);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 7113003fab63..a70cd5dc94e3 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -60,6 +60,8 @@  struct kthread {
 #ifdef CONFIG_BLK_CGROUP
 	struct cgroup_subsys_state *blkcg_css;
 #endif
+	/* To store the full name if task comm is truncated. */
+	char *full_name;
 };
 
 enum KTHREAD_BITS {
@@ -93,6 +95,18 @@  static inline struct kthread *__to_kthread(struct task_struct *p)
 	return kthread;
 }
 
+void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
+{
+	struct kthread *kthread = to_kthread(tsk);
+
+	if (!kthread || !kthread->full_name) {
+		__get_task_comm(buf, buf_size, tsk);
+		return;
+	}
+
+	strscpy_pad(buf, kthread->full_name, buf_size);
+}
+
 void set_kthread_struct(struct task_struct *p)
 {
 	struct kthread *kthread;
@@ -118,9 +132,13 @@  void free_kthread_struct(struct task_struct *k)
 	 * or if kmalloc() in kthread() failed.
 	 */
 	kthread = to_kthread(k);
+	if (!kthread)
+		return;
+
 #ifdef CONFIG_BLK_CGROUP
-	WARN_ON_ONCE(kthread && kthread->blkcg_css);
+	WARN_ON_ONCE(kthread->blkcg_css);
 #endif
+	kfree(kthread->full_name);
 	kfree(kthread);
 }
 
@@ -406,12 +424,22 @@  struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 	task = create->result;
 	if (!IS_ERR(task)) {
 		char name[TASK_COMM_LEN];
+		va_list aq;
+		int len;
 
 		/*
 		 * task is already visible to other tasks, so updating
 		 * COMM must be protected.
 		 */
-		vsnprintf(name, sizeof(name), namefmt, args);
+		va_copy(aq, args);
+		len = vsnprintf(name, sizeof(name), namefmt, aq);
+		va_end(aq);
+		if (len >= TASK_COMM_LEN) {
+			struct kthread *kthread = to_kthread(task);
+
+			/* leave it truncated when out of memory. */
+			kthread->full_name = kvasprintf(GFP_KERNEL, namefmt, args);
+		}
 		set_task_comm(task, name);
 	}
 	kfree(create);