mbox series

[v7,00/11] extend task comm from 16 to 24

Message ID 20211101060419.4682-1-laoar.shao@gmail.com (mailing list archive)
Headers show
Series extend task comm from 16 to 24 | expand

Message

Yafang Shao Nov. 1, 2021, 6:04 a.m. UTC
There're many truncated kthreads in the kernel, which may make trouble
for the user, for example, the user can't get detailed device
information from the task comm.

This patchset tries to improve this problem fundamentally by extending
the task comm size from 16 to 24, which is a very simple way. 

In order to do that, we have to do some cleanups first.

1. Make the copy of task comm always safe no matter what the task
   comm size is. For example,

      Unsafe                 Safe
      strlcpy                strscpy_pad
      strncpy                strscpy_pad
      bpf_probe_read_kernel  bpf_probe_read_kernel_str
                             bpf_core_read_str
                             bpf_get_current_comm
                             perf_event__prepare_comm
                             prctl(2)

   After this step, the comm size change won't make any trouble to the
   kernel or the in-tree tools for example perf, BPF programs.

2. Cleanup some old hard-coded 16
   Actually we don't need to convert all of them to TASK_COMM_LEN or
   TASK_COMM_LEN_16, what we really care about is if the convert can
   make the code more reasonable or easier to understand. For
   example, some in-tree tools read the comm from sched:sched_switch
   tracepoint, as it is derived from the kernel, we'd better make them
   consistent with the kernel.

3. Extend the task comm size from 16 to 24
   task_struct is growing rather regularly by 8 bytes. This size change
   should be acceptable. We used to think about extending the size for
   CONFIG_BASE_FULL only, but that would be a burden for maintenance
   and introduce code complexity.

4. Print a warning if the kthread comm is still truncated.

5. What will happen to the out-of-tree tools after this change?
   If the tool get task comm through kernel API, for example prctl(2),
   bpf_get_current_comm() and etc, then it doesn't matter how large the
   user buffer is, because it will always get a string with a nul
   terminator. While if it gets the task comm through direct string copy,
   the user tool must make sure the copied string has a nul terminator
   itself. As TASK_COMM_LEN is not exposed to userspace, there's no
   reason that it must require a fixed-size task comm.

Changes since v6:
Various suggestion from Kees:
- replace strscpy_pad() with the helper get_task_comm()
- fix typo
- replace BUILD_BUG_ON() with __must_be_array()
- don't change the size of pr_fname
- merge two samples/bpf/ patches to one patch
- keep TASK_COMM_LEN_16 per

Changes since v5:
- extend the comm size for both CONFIG_BASE_{FULL, SMALL} that could
  make the code more simple and easier to maintain.
- avoid changing too much hard-coded 16 in BPF programs per Andrii.

Changes since v4:
- introduce TASK_COMM_LEN_16 and TASK_COMM_LEN_24 per Steven
- replace hard-coded 16 with TASK_COMM_LEN_16 per Kees
- use strscpy_pad() instead of strlcpy()/strncpy() per Kees
- make perf test adopt to task comm size change per Arnaldo and Mathieu
- fix warning reported by kernel test robot

Changes since v3:
- fixes -Wstringop-truncation warning reported by kernel test robot

Changes since v2:
- avoid change UAPI code per Kees
- remove the description of out of tree code from commit log per Peter

Changes since v1:
- extend task comm to 24bytes, per Petr
- improve the warning per Petr
- make the checkpatch warning a separate patch


Yafang Shao (11):
  fs/exec: make __set_task_comm always set a nul terminated string
  fs/exec: make __get_task_comm always get a nul terminated string
  sched.h: use __must_be_array instead of BUILD_BUG_ON in get_task_comm
  drivers/infiniband: make setup_ctxt always get a nul terminated task
    comm
  fs/binfmt_elf: make prpsinfo always get a nul terminated task comm
  samples/bpf/test_overhead_kprobe_kern: make it adopt to task comm size
    change
  tools/bpf/bpftool/skeleton: make it adopt to task comm size change
  tools/perf/test: make perf test adopt to task comm size change
  tools/testing/selftests/bpf: make it adopt to task comm size change
  sched.h: extend task comm from 16 to 24
  kernel/kthread: show a warning if kthread's comm is truncated

 drivers/infiniband/hw/qib/qib.h               |  2 +-
 drivers/infiniband/hw/qib/qib_file_ops.c      |  2 +-
 fs/binfmt_elf.c                               |  2 +-
 fs/exec.c                                     |  5 ++--
 include/linux/sched.h                         | 16 +++++++-----
 kernel/kthread.c                              |  7 ++++-
 samples/bpf/offwaketime_kern.c                |  4 +--
 samples/bpf/test_overhead_kprobe_kern.c       | 11 ++++----
 samples/bpf/test_overhead_tp_kern.c           |  5 ++--
 tools/bpf/bpftool/skeleton/pid_iter.bpf.c     |  4 +--
 tools/include/linux/sched.h                   | 11 ++++++++
 tools/perf/tests/evsel-tp-sched.c             | 26 ++++++++++++++-----
 .../selftests/bpf/progs/test_stacktrace_map.c |  6 ++---
 .../selftests/bpf/progs/test_tracepoint.c     |  6 ++---
 14 files changed, 72 insertions(+), 35 deletions(-)
 create mode 100644 tools/include/linux/sched.h

Comments

Matthew Wilcox (Oracle) Nov. 1, 2021, 12:44 p.m. UTC | #1
On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> There're many truncated kthreads in the kernel, which may make trouble
> for the user, for example, the user can't get detailed device
> information from the task comm.
> 
> This patchset tries to improve this problem fundamentally by extending
> the task comm size from 16 to 24, which is a very simple way. 

It can't be that simple if we're on v7 and at 11 patches!

It would be helpful if you included links to earlier postings.  I can
only find v5 and v6 in my inbox, so I fear I'm going to re-ask some
questions which were already answered.

Why can't we shorten the names of these kthreads?  You didn't
give any examples, so I can't suggest any possibilities.
Yafang Shao Nov. 1, 2021, 1:12 p.m. UTC | #2
On Mon, Nov 1, 2021 at 8:46 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> > There're many truncated kthreads in the kernel, which may make trouble
> > for the user, for example, the user can't get detailed device
> > information from the task comm.
> >
> > This patchset tries to improve this problem fundamentally by extending
> > the task comm size from 16 to 24, which is a very simple way.
>
> It can't be that simple if we're on v7 and at 11 patches!
>

Most of the changes are because of hard-coded 16 that can't be easily grepped.
In these 11 patches, patch #1, #2, #4, #5, #6, #7 and #9 are cleanups,
which can be a different patchset.

The core changes of these patchset are patch #3, #8 and #10.

#11 can also be a seperate patch.

> It would be helpful if you included links to earlier postings.  I can
> only find v5 and v6 in my inbox, so I fear I'm going to re-ask some
> questions which were already answered.
>

v1: https://lore.kernel.org/lkml/20210929115036.4851-1-laoar.shao@gmail.com/
v2: https://lore.kernel.org/lkml/20211007120752.5195-1-laoar.shao@gmail.com/
v3: https://lore.kernel.org/lkml/20211010102429.99577-1-laoar.shao@gmail.com/
v4: https://lore.kernel.org/lkml/20211013102346.179642-1-laoar.shao@gmail.com/
v5: https://lore.kernel.org/lkml/20211021034516.4400-1-laoar.shao@gmail.com/
v6: https://lore.kernel.org/lkml/20211025083315.4752-1-laoar.shao@gmail.com/


> Why can't we shorten the names of these kthreads?  You didn't
> give any examples, so I can't suggest any possibilities.
>

Take 'jbd2/nvme0n1p2-' for example, that's a nice name, which gives a
good description via the name.
And I don't think it is a good idea to shorten its name.
Petr Mladek Nov. 1, 2021, 2:07 p.m. UTC | #3
On Mon 2021-11-01 06:04:08, Yafang Shao wrote:
> There're many truncated kthreads in the kernel, which may make trouble
> for the user, for example, the user can't get detailed device
> information from the task comm.
> 
> This patchset tries to improve this problem fundamentally by extending
> the task comm size from 16 to 24, which is a very simple way. 
> 
> In order to do that, we have to do some cleanups first.
> 
> 1. Make the copy of task comm always safe no matter what the task
>    comm size is. For example,
> 
>       Unsafe                 Safe
>       strlcpy                strscpy_pad
>       strncpy                strscpy_pad
>       bpf_probe_read_kernel  bpf_probe_read_kernel_str
>                              bpf_core_read_str
>                              bpf_get_current_comm
>                              perf_event__prepare_comm
>                              prctl(2)
> 
>    After this step, the comm size change won't make any trouble to the
>    kernel or the in-tree tools for example perf, BPF programs.
> 
> 2. Cleanup some old hard-coded 16
>    Actually we don't need to convert all of them to TASK_COMM_LEN or
>    TASK_COMM_LEN_16, what we really care about is if the convert can
>    make the code more reasonable or easier to understand. For
>    example, some in-tree tools read the comm from sched:sched_switch
>    tracepoint, as it is derived from the kernel, we'd better make them
>    consistent with the kernel.

The above changes make sense even if we do not extend comm[] array in
task_struct.


> 3. Extend the task comm size from 16 to 24
>    task_struct is growing rather regularly by 8 bytes. This size change
>    should be acceptable. We used to think about extending the size for
>    CONFIG_BASE_FULL only, but that would be a burden for maintenance
>    and introduce code complexity.
> 
> 4. Print a warning if the kthread comm is still truncated.
> 
> 5. What will happen to the out-of-tree tools after this change?
>    If the tool get task comm through kernel API, for example prctl(2),
>    bpf_get_current_comm() and etc, then it doesn't matter how large the
>    user buffer is, because it will always get a string with a nul
>    terminator. While if it gets the task comm through direct string copy,
>    the user tool must make sure the copied string has a nul terminator
>    itself. As TASK_COMM_LEN is not exposed to userspace, there's no
>    reason that it must require a fixed-size task comm.

The amount of code that has to be updated is really high. I am pretty
sure that there are more potential buffer overflows left.

You did not commented on the concerns in the thread
https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/

Several people suggested to use a more conservative approach. I mean
to keep comm[16] as is and add a new pointer to the full name. The buffer
for the long name might be dynamically allocated only when needed.

The pointer might be either in task_struct or struct kthread. It might
be used the same way as the full name stored by workqueue kthreads.

The advantage of the separate pointer:

   + would work for names longer than 32
   + will not open security holes in code

Best Regards,
Petr
Yafang Shao Nov. 1, 2021, 2:34 p.m. UTC | #4
On Mon, Nov 1, 2021 at 10:07 PM Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2021-11-01 06:04:08, Yafang Shao wrote:
> > There're many truncated kthreads in the kernel, which may make trouble
> > for the user, for example, the user can't get detailed device
> > information from the task comm.
> >
> > This patchset tries to improve this problem fundamentally by extending
> > the task comm size from 16 to 24, which is a very simple way.
> >
> > In order to do that, we have to do some cleanups first.
> >
> > 1. Make the copy of task comm always safe no matter what the task
> >    comm size is. For example,
> >
> >       Unsafe                 Safe
> >       strlcpy                strscpy_pad
> >       strncpy                strscpy_pad
> >       bpf_probe_read_kernel  bpf_probe_read_kernel_str
> >                              bpf_core_read_str
> >                              bpf_get_current_comm
> >                              perf_event__prepare_comm
> >                              prctl(2)
> >
> >    After this step, the comm size change won't make any trouble to the
> >    kernel or the in-tree tools for example perf, BPF programs.
> >
> > 2. Cleanup some old hard-coded 16
> >    Actually we don't need to convert all of them to TASK_COMM_LEN or
> >    TASK_COMM_LEN_16, what we really care about is if the convert can
> >    make the code more reasonable or easier to understand. For
> >    example, some in-tree tools read the comm from sched:sched_switch
> >    tracepoint, as it is derived from the kernel, we'd better make them
> >    consistent with the kernel.
>
> The above changes make sense even if we do not extend comm[] array in
> task_struct.
>
>
> > 3. Extend the task comm size from 16 to 24
> >    task_struct is growing rather regularly by 8 bytes. This size change
> >    should be acceptable. We used to think about extending the size for
> >    CONFIG_BASE_FULL only, but that would be a burden for maintenance
> >    and introduce code complexity.
> >
> > 4. Print a warning if the kthread comm is still truncated.
> >
> > 5. What will happen to the out-of-tree tools after this change?
> >    If the tool get task comm through kernel API, for example prctl(2),
> >    bpf_get_current_comm() and etc, then it doesn't matter how large the
> >    user buffer is, because it will always get a string with a nul
> >    terminator. While if it gets the task comm through direct string copy,
> >    the user tool must make sure the copied string has a nul terminator
> >    itself. As TASK_COMM_LEN is not exposed to userspace, there's no
> >    reason that it must require a fixed-size task comm.
>
> The amount of code that has to be updated is really high. I am pretty
> sure that there are more potential buffer overflows left.
>
> You did not commented on the concerns in the thread
> https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/
>

I thought Steven[1] and  Kees[2] have already clearly explained why we
do it like that, so I didn't give any more words on it.

[1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/
[2]. https://lore.kernel.org/all/202110251406.56F87A3522@keescook/

> Several people suggested to use a more conservative approach.

Yes, they are Al[3] and Alexei[4].

[3]. https://lore.kernel.org/lkml/YVkmaSUxbg%2FJtBHb@zeniv-ca.linux.org.uk/
[4]. https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/

> I mean
> to keep comm[16] as is and add a new pointer to the full name. The buffer
> for the long name might be dynamically allocated only when needed.
>

That would add a new allocation in the fork() for the threads with a long name.
I'm not sure if it is worth it.

> The pointer might be either in task_struct or struct kthread. It might
> be used the same way as the full name stored by workqueue kthreads.
>

If we decide to do it like that, I think we'd better add it in
task_struct, then it will work for all tasks.

> The advantage of the separate pointer:
>
>    + would work for names longer than 32
>    + will not open security holes in code
>

Yes, those are the advantages.  And the disadvantage of it is:

 - new allocation in fork()
Petr Mladek Nov. 1, 2021, 4:02 p.m. UTC | #5
On Mon 2021-11-01 22:34:30, Yafang Shao wrote:
> On Mon, Nov 1, 2021 at 10:07 PM Petr Mladek <pmladek@suse.com> wrote:
> > On Mon 2021-11-01 06:04:08, Yafang Shao wrote:
> > > 4. Print a warning if the kthread comm is still truncated.
> > >
> > > 5. What will happen to the out-of-tree tools after this change?
> > >    If the tool get task comm through kernel API, for example prctl(2),
> > >    bpf_get_current_comm() and etc, then it doesn't matter how large the
> > >    user buffer is, because it will always get a string with a nul
> > >    terminator. While if it gets the task comm through direct string copy,
> > >    the user tool must make sure the copied string has a nul terminator
> > >    itself. As TASK_COMM_LEN is not exposed to userspace, there's no
> > >    reason that it must require a fixed-size task comm.
> >
> > The amount of code that has to be updated is really high. I am pretty
> > sure that there are more potential buffer overflows left.
> >
> > You did not commented on the concerns in the thread
> > https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/
> >
> I thought Steven[1] and  Kees[2] have already clearly explained why we
> do it like that, so I didn't give any more words on it.
> 
> [1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/

Steven was against switching task->comm[16] into a dynamically
allocated pointer. But he was not against storing longer names
separately.

> [2]. https://lore.kernel.org/all/202110251406.56F87A3522@keescook/

Honestly, I am a bit confused by Kees' answer. IMHO, he agreed that
switching task->comm[16] into a pointer was not worth it.

But I am not sure what he meant by "Agreed -- this is a small change
for what is already an "uncommon" corner case."


> > Several people suggested to use a more conservative approach.
> 
> Yes, they are Al[3] and Alexei[4].
> 
> [3]. https://lore.kernel.org/lkml/YVkmaSUxbg%2FJtBHb@zeniv-ca.linux.org.uk/

IMHO, Al suggested to store the long name separately and return it
by proc_task_name() when available.


> [4]. https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/

Alexei used dentry->d_iname as an exaxmple. struct dentry uses
d_iname[DNAME_INLINE_LEN] for short names. And dynamically
allocated d_name for long names, see *__d_alloc() implementation.

> > I mean
> > to keep comm[16] as is and add a new pointer to the full name. The buffer
> > for the long name might be dynamically allocated only when needed.
> >
> 
> That would add a new allocation in the fork() for the threads with a long name.
> I'm not sure if it is worth it.

The allocation will be done only when needed. IMHO, the performance is
important only for userspace processes. I am not aware of any kernel
subsystem that would heavily create and destroy kthreads.


> > The pointer might be either in task_struct or struct kthread. It might
> > be used the same way as the full name stored by workqueue kthreads.
> >
> 
> If we decide to do it like that, I think we'd better add it in
> task_struct, then it will work for all tasks.

Is it really needed for userspace processes? For example, ps shows
the information from /proc/*/cmdline instead.


> > The advantage of the separate pointer:
> >
> >    + would work for names longer than 32
> >    + will not open security holes in code
> >
> 
> Yes, those are the advantages.  And the disadvantage of it is:
> 
>  - new allocation in fork()

It should not be a problem if we do it only when necessary and only
for kthreads.

Best Regards,
Petr
Steven Rostedt Nov. 1, 2021, 4:06 p.m. UTC | #6
On Mon, 1 Nov 2021 17:02:12 +0100
Petr Mladek <pmladek@suse.com> wrote:

> > I thought Steven[1] and  Kees[2] have already clearly explained why we
> > do it like that, so I didn't give any more words on it.
> > 
> > [1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/  
> 
> Steven was against switching task->comm[16] into a dynamically
> allocated pointer. But he was not against storing longer names
> separately.

Just to be clear. I was recommending that the comm[16] would still behave
like it does today. Where it is truncated. But if the name is longer, it
could be stored in a separate location if the caller wanted to know the
full name.

-- Steve
Yafang Shao Nov. 2, 2021, 1:09 a.m. UTC | #7
On Tue, Nov 2, 2021 at 12:02 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Mon 2021-11-01 22:34:30, Yafang Shao wrote:
> > On Mon, Nov 1, 2021 at 10:07 PM Petr Mladek <pmladek@suse.com> wrote:
> > > On Mon 2021-11-01 06:04:08, Yafang Shao wrote:
> > > > 4. Print a warning if the kthread comm is still truncated.
> > > >
> > > > 5. What will happen to the out-of-tree tools after this change?
> > > >    If the tool get task comm through kernel API, for example prctl(2),
> > > >    bpf_get_current_comm() and etc, then it doesn't matter how large the
> > > >    user buffer is, because it will always get a string with a nul
> > > >    terminator. While if it gets the task comm through direct string copy,
> > > >    the user tool must make sure the copied string has a nul terminator
> > > >    itself. As TASK_COMM_LEN is not exposed to userspace, there's no
> > > >    reason that it must require a fixed-size task comm.
> > >
> > > The amount of code that has to be updated is really high. I am pretty
> > > sure that there are more potential buffer overflows left.
> > >
> > > You did not commented on the concerns in the thread
> > > https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/
> > >
> > I thought Steven[1] and  Kees[2] have already clearly explained why we
> > do it like that, so I didn't give any more words on it.
> >
> > [1]. https://lore.kernel.org/all/20211025170503.59830a43@gandalf.local.home/
>
> Steven was against switching task->comm[16] into a dynamically
> allocated pointer. But he was not against storing longer names
> separately.
>
> > [2]. https://lore.kernel.org/all/202110251406.56F87A3522@keescook/
>
> Honestly, I am a bit confused by Kees' answer. IMHO, he agreed that
> switching task->comm[16] into a pointer was not worth it.
>
> But I am not sure what he meant by "Agreed -- this is a small change
> for what is already an "uncommon" corner case."
>
>
> > > Several people suggested to use a more conservative approach.
> >
> > Yes, they are Al[3] and Alexei[4].
> >
> > [3]. https://lore.kernel.org/lkml/YVkmaSUxbg%2FJtBHb@zeniv-ca.linux.org.uk/
>
> IMHO, Al suggested to store the long name separately and return it
> by proc_task_name() when available.
>
>
> > [4]. https://lore.kernel.org/all/CAADnVQKm0Ljj-w5PbkAu1ugLFnZRRPt-Vk-J7AhXxDD5xVompA@mail.gmail.com/
>
> Alexei used dentry->d_iname as an exaxmple. struct dentry uses
> d_iname[DNAME_INLINE_LEN] for short names. And dynamically
> allocated d_name for long names, see *__d_alloc() implementation.
>

Thanks for the summary.
So with Stenven's new reply[1], the opinion in common is storing long
names into a separate place. And no one is against it now.

[1]. https://lore.kernel.org/lkml/20211101120636.3cfc5afa@gandalf.local.home/

> > > I mean
> > > to keep comm[16] as is and add a new pointer to the full name. The buffer
> > > for the long name might be dynamically allocated only when needed.
> > >
> >
> > That would add a new allocation in the fork() for the threads with a long name.
> > I'm not sure if it is worth it.
>
> The allocation will be done only when needed. IMHO, the performance is
> important only for userspace processes. I am not aware of any kernel
> subsystem that would heavily create and destroy kthreads.
>

XFS may create many kthreads with longer names, especially if there're
many partitions in the disk.
For example,
    xfs-reclaim/sd{a, b, c, ...}
    xfs-blockgc/sd{a, b, c, ...}
    xfs-inodegc/sd{a, b, c, ...}

They are supposed to be created at boot time, and shouldn't be heavily
created and destroyed.

>
> > > The pointer might be either in task_struct or struct kthread. It might
> > > be used the same way as the full name stored by workqueue kthreads.
> > >
> >
> > If we decide to do it like that, I think we'd better add it in
> > task_struct, then it will work for all tasks.
>
> Is it really needed for userspace processes? For example, ps shows
> the information from /proc/*/cmdline instead.
>

Right. The userspace processes can be obtained from /proc/*/cmdline.

>
> > > The advantage of the separate pointer:
> > >
> > >    + would work for names longer than 32
> > >    + will not open security holes in code
> > >
> >
> > Yes, those are the advantages.  And the disadvantage of it is:
> >
> >  - new allocation in fork()
>
> It should not be a problem if we do it only when necessary and only
> for kthreads.
>

So if no one against, I will do it in two steps,

1. Send the task comm cleanups in a separate patchset named "task comm cleanups"
    This patchset includes patch #1, #2, #4,  #5, #6, #7 and #9.
    Cleaning them up can make it less error prone, and it will be
helpful if we want to extend task comm in the future :)

2.  Keep the current comm[16] as-is and introduce a separate pointer
to store kthread's long name
     Now we only care about kthread, so we can put the pointer into a
kthread specific struct.
     For example in the struct kthread, or in kthread->data (which may
conflict with workqueue).

     And then dynamically allocate a longer name if it is truncated,
for example,
     __kthread_create_on_node
         len = vsnprintf(name, sizeof(name), namefmt, args);
         if (len >= TASK_COMM_LEN) {
             /* create a longer name */
         }

     And then we modify proc_task_name(), so the user can get
kthread's longer name via /proc/[pid]/comm.

     And then free the allocated memory when the kthread is destroyed.

--
Thanks
Yafang
Steven Rostedt Nov. 2, 2021, 1:18 a.m. UTC | #8
On Tue, 2 Nov 2021 09:09:50 +0800
Yafang Shao <laoar.shao@gmail.com> wrote:

> So if no one against, I will do it in two steps,
> 
> 1. Send the task comm cleanups in a separate patchset named "task comm cleanups"
>     This patchset includes patch #1, #2, #4,  #5, #6, #7 and #9.
>     Cleaning them up can make it less error prone, and it will be
> helpful if we want to extend task comm in the future :)

Agreed.

> 
> 2.  Keep the current comm[16] as-is and introduce a separate pointer
> to store kthread's long name

I'm OK with this. Hopefully more would chime in too.

>      Now we only care about kthread, so we can put the pointer into a
> kthread specific struct.
>      For example in the struct kthread, or in kthread->data (which may
> conflict with workqueue).

No, add a new field to the structure. "full_name" or something like that.
I'm guessing it should be NULL if the name fits in TASK_COMM_LEN and
allocated if the name had to be truncated.

Do not overload data with this. That will just make things confusing.
There's not that many kthreads, where an addition of an 8 byte pointer is
going to cause issues.

> 
>      And then dynamically allocate a longer name if it is truncated,
> for example,
>      __kthread_create_on_node
>          len = vsnprintf(name, sizeof(name), namefmt, args);
>          if (len >= TASK_COMM_LEN) {
>              /* create a longer name */

And make sure you have it fail the kthread allocation if it fails to
allocate.

>          }
> 
>      And then we modify proc_task_name(), so the user can get
> kthread's longer name via /proc/[pid]/comm.

Agreed.

> 
>      And then free the allocated memory when the kthread is destroyed.

Correct.

Thanks,

-- Steve
Yafang Shao Nov. 2, 2021, 1:26 a.m. UTC | #9
On Tue, Nov 2, 2021 at 9:18 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 2 Nov 2021 09:09:50 +0800
> Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > So if no one against, I will do it in two steps,
> >
> > 1. Send the task comm cleanups in a separate patchset named "task comm cleanups"
> >     This patchset includes patch #1, #2, #4,  #5, #6, #7 and #9.
> >     Cleaning them up can make it less error prone, and it will be
> > helpful if we want to extend task comm in the future :)
>
> Agreed.
>
> >
> > 2.  Keep the current comm[16] as-is and introduce a separate pointer
> > to store kthread's long name
>
> I'm OK with this. Hopefully more would chime in too.
>
> >      Now we only care about kthread, so we can put the pointer into a
> > kthread specific struct.
> >      For example in the struct kthread, or in kthread->data (which may
> > conflict with workqueue).
>
> No, add a new field to the structure. "full_name" or something like that.
> I'm guessing it should be NULL if the name fits in TASK_COMM_LEN and
> allocated if the name had to be truncated.
>
> Do not overload data with this. That will just make things confusing.
> There's not that many kthreads, where an addition of an 8 byte pointer is
> going to cause issues.
>

Sure, I will add a new field named "full_name", which only be
allocated if the kthread's comm is truncated.

> >
> >      And then dynamically allocate a longer name if it is truncated,
> > for example,
> >      __kthread_create_on_node
> >          len = vsnprintf(name, sizeof(name), namefmt, args);
> >          if (len >= TASK_COMM_LEN) {
> >              /* create a longer name */
>
> And make sure you have it fail the kthread allocation if it fails to
> allocate.
>
> >          }
> >
> >      And then we modify proc_task_name(), so the user can get
> > kthread's longer name via /proc/[pid]/comm.
>
> Agreed.
>
> >
> >      And then free the allocated memory when the kthread is destroyed.
>
> Correct.
>
> Thanks,
>
> -- Steve
Petr Mladek Nov. 2, 2021, 7:56 a.m. UTC | #10
On Tue 2021-11-02 09:26:35, Yafang Shao wrote:
> On Tue, Nov 2, 2021 at 9:18 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Tue, 2 Nov 2021 09:09:50 +0800
> > Yafang Shao <laoar.shao@gmail.com> wrote:
> > >      Now we only care about kthread, so we can put the pointer into a
> > > kthread specific struct.
> > >      For example in the struct kthread, or in kthread->data (which may
> > > conflict with workqueue).
> >
> > No, add a new field to the structure. "full_name" or something like that.
> > I'm guessing it should be NULL if the name fits in TASK_COMM_LEN and
> > allocated if the name had to be truncated.
> >
> > Do not overload data with this. That will just make things confusing.
> > There's not that many kthreads, where an addition of an 8 byte pointer is
> > going to cause issues.
> 
> Sure, I will add a new field named "full_name", which only be
> allocated if the kthread's comm is truncated.

The plan looks good to me.

One more thing. It should obsolete the workqueue-specific solution.
It would be great to clean up the workqueue code as the next step.

Best Regards,
Petr
David Hildenbrand Nov. 2, 2021, 9:26 a.m. UTC | #11
>> 2.  Keep the current comm[16] as-is and introduce a separate pointer
>> to store kthread's long name
> 
> I'm OK with this. Hopefully more would chime in too.

What you propose sounds sane to me.
Yafang Shao Nov. 2, 2021, 1:48 p.m. UTC | #12
On Tue, Nov 2, 2021 at 3:56 PM Petr Mladek <pmladek@suse.com> wrote:
>
> On Tue 2021-11-02 09:26:35, Yafang Shao wrote:
> > On Tue, Nov 2, 2021 at 9:18 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > On Tue, 2 Nov 2021 09:09:50 +0800
> > > Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >      Now we only care about kthread, so we can put the pointer into a
> > > > kthread specific struct.
> > > >      For example in the struct kthread, or in kthread->data (which may
> > > > conflict with workqueue).
> > >
> > > No, add a new field to the structure. "full_name" or something like that.
> > > I'm guessing it should be NULL if the name fits in TASK_COMM_LEN and
> > > allocated if the name had to be truncated.
> > >
> > > Do not overload data with this. That will just make things confusing.
> > > There's not that many kthreads, where an addition of an 8 byte pointer is
> > > going to cause issues.
> >
> > Sure, I will add a new field named "full_name", which only be
> > allocated if the kthread's comm is truncated.
>
> The plan looks good to me.
>
> One more thing. It should obsolete the workqueue-specific solution.
> It would be great to clean up the workqueue code as the next step.
>

Agreed. The worker comm can be replaced by the new kthread full_name.
I will do it in the next step.
Michał Mirosław Nov. 4, 2021, 1:37 a.m. UTC | #13
On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> There're many truncated kthreads in the kernel, which may make trouble
> for the user, for example, the user can't get detailed device
> information from the task comm.
> 
> This patchset tries to improve this problem fundamentally by extending
> the task comm size from 16 to 24, which is a very simple way. 
[...]

Hi,

I've tried something like this a few years back. My attempt got mostly
lost in the mailing lists, but I'm still carrying the patches in my
tree [1]. My target was userspace thread names, and it turned out more
involved than I had time for.

[1] https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a
    and its parents

Best Regards
Michał Mirosław
Yafang Shao Nov. 5, 2021, 6:34 a.m. UTC | #14
On Thu, Nov 4, 2021 at 9:37 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
>
> On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> > There're many truncated kthreads in the kernel, which may make trouble
> > for the user, for example, the user can't get detailed device
> > information from the task comm.
> >
> > This patchset tries to improve this problem fundamentally by extending
> > the task comm size from 16 to 24, which is a very simple way.
> [...]
>
> Hi,
>
> I've tried something like this a few years back. My attempt got mostly
> lost in the mailing lists, but I'm still carrying the patches in my
> tree [1]. My target was userspace thread names, and it turned out more
> involved than I had time for.
>
> [1] https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a
>     and its parents
>

Hi Michal,

Thanks for the information.

I have looked through your patches.  It seems to contain six patches
now and can be divided into three parts per my understanding.

1. extend task comm len
This parts contains below 4 patches:
[prctl: prepare for bigger
TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=cfd99db9cf911bb4d106889aeba1dfe89b6527d0)
[bluetooth: prepare for bigger
TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=ba2805f5196865b81cc6fc938ea53af2c7c2c892)
[taskstats: prepare for bigger
TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=4d29bfedc57b36607915a0171f4864ec504908ca)
[mm: make TASK_COMM_LEN
configurable](https://rere.qmqm.pl/git/?p=linux;a=commit;h=362acc35582445174589184c738c4d86ec7d174b)

What kind of userspace issues makes you extend the task comm length ?
Why not just use /proc/[pid]/cmdline ?

2.  A fix
Below patch:
[procfs: signal /proc/PID/comm write
truncation](https://rere.qmqm.pl/git/?p=linux;a=commit;h=d72027388d4d95db5438a7a574e0a03ae4b5d6d7)

It seems this patch is incomplete ?   I don't know what it means to do.

3. A feature provided for pthread_getname_np
Below patch:
[procfs: lseek(/proc/PID/comm, 0,
SEEK_END)](https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a)

It seems this patch is useful. With this patch the userspace can
directly get the TASK_COMM_LEN through the API.
Michał Mirosław Nov. 5, 2021, 11:57 p.m. UTC | #15
On Fri, Nov 05, 2021 at 02:34:58PM +0800, Yafang Shao wrote:
> On Thu, Nov 4, 2021 at 9:37 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
> >
> > On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> > > There're many truncated kthreads in the kernel, which may make trouble
> > > for the user, for example, the user can't get detailed device
> > > information from the task comm.
> > >
> > > This patchset tries to improve this problem fundamentally by extending
> > > the task comm size from 16 to 24, which is a very simple way.
> > [...]
> >
> > Hi,
> >
> > I've tried something like this a few years back. My attempt got mostly
> > lost in the mailing lists, but I'm still carrying the patches in my
> > tree [1]. My target was userspace thread names, and it turned out more
> > involved than I had time for.
> >
> > [1] https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a
> >     and its parents
> >
> 
> Hi Michal,
> 
> Thanks for the information.
> 
> I have looked through your patches.  It seems to contain six patches
> now and can be divided into three parts per my understanding.
> 
> 1. extend task comm len
> This parts contains below 4 patches:
> [prctl: prepare for bigger
> TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=cfd99db9cf911bb4d106889aeba1dfe89b6527d0)
> [bluetooth: prepare for bigger
> TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=ba2805f5196865b81cc6fc938ea53af2c7c2c892)
> [taskstats: prepare for bigger
> TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=4d29bfedc57b36607915a0171f4864ec504908ca)
> [mm: make TASK_COMM_LEN
> configurable](https://rere.qmqm.pl/git/?p=linux;a=commit;h=362acc35582445174589184c738c4d86ec7d174b)
> 
> What kind of userspace issues makes you extend the task comm length ?
> Why not just use /proc/[pid]/cmdline ?

This was to enable longer thread names (as set by pthread_setname_np()).
Currently its 16 bytes, and that's too short for e.g. Chrome's or Firefox'es
threads. I believe that FreeBSD has 32-byte limit and so I expect that
major portable code is already prepared for bigger thread names.

> 2.  A fix
> Below patch:
> [procfs: signal /proc/PID/comm write
> truncation](https://rere.qmqm.pl/git/?p=linux;a=commit;h=d72027388d4d95db5438a7a574e0a03ae4b5d6d7)
> 
> It seems this patch is incomplete ?   I don't know what it means to do.

Currently writes to /proc/PID/comm are silently truncated. This patch
makes the write() call return the actual number of bytes actually written
and on subsequent calls return -ENOSPC. glibc checks the length in
pthread_setname_np() before write(), so the change is not currently
relevant for it. I don't know/remember what other runtimes do, though.

> 3. A feature provided for pthread_getname_np
> Below patch:
> [procfs: lseek(/proc/PID/comm, 0,
> SEEK_END)](https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a)
> 
> It seems this patch is useful. With this patch the userspace can
> directly get the TASK_COMM_LEN through the API.

This one I'm not really fond of because it abuses lseek() in that it
doesn't move the write pointer. But in case of /proc files this normally
would return EINVAL anyway.

Best Regards
Michał Mirosław
Yafang Shao Nov. 6, 2021, 9:12 a.m. UTC | #16
On Sat, Nov 6, 2021 at 7:57 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
>
> On Fri, Nov 05, 2021 at 02:34:58PM +0800, Yafang Shao wrote:
> > On Thu, Nov 4, 2021 at 9:37 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
> > >
> > > On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> > > > There're many truncated kthreads in the kernel, which may make trouble
> > > > for the user, for example, the user can't get detailed device
> > > > information from the task comm.
> > > >
> > > > This patchset tries to improve this problem fundamentally by extending
> > > > the task comm size from 16 to 24, which is a very simple way.
> > > [...]
> > >
> > > Hi,
> > >
> > > I've tried something like this a few years back. My attempt got mostly
> > > lost in the mailing lists, but I'm still carrying the patches in my
> > > tree [1]. My target was userspace thread names, and it turned out more
> > > involved than I had time for.
> > >
> > > [1] https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a
> > >     and its parents
> > >
> >
> > Hi Michal,
> >
> > Thanks for the information.
> >
> > I have looked through your patches.  It seems to contain six patches
> > now and can be divided into three parts per my understanding.
> >
> > 1. extend task comm len
> > This parts contains below 4 patches:
> > [prctl: prepare for bigger
> > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=cfd99db9cf911bb4d106889aeba1dfe89b6527d0)
> > [bluetooth: prepare for bigger
> > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=ba2805f5196865b81cc6fc938ea53af2c7c2c892)
> > [taskstats: prepare for bigger
> > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=4d29bfedc57b36607915a0171f4864ec504908ca)
> > [mm: make TASK_COMM_LEN
> > configurable](https://rere.qmqm.pl/git/?p=linux;a=commit;h=362acc35582445174589184c738c4d86ec7d174b)
> >
> > What kind of userspace issues makes you extend the task comm length ?
> > Why not just use /proc/[pid]/cmdline ?
>
> This was to enable longer thread names (as set by pthread_setname_np()).
> Currently its 16 bytes, and that's too short for e.g. Chrome's or Firefox'es
> threads. I believe that FreeBSD has 32-byte limit and so I expect that
> major portable code is already prepared for bigger thread names.
>

The comm len in FreeBSD is (19 + 1) bytes[1], but that is still larger
than Linux :)
The task comm is short for many applications, that is why cmdline is
introduced per my understanding, but pthread_{set, get}name_np() is
reading/writing the comm or via prctl(2) rather than reading/writing
the cmdline...

Is the truncated Chrome or Firefox thread comm really harmful or is
extending the task comm just for portable?
Could you pls show me some examples if the short comm is really harmful?

Per my understanding, if the short comm is harmful to applications
then it is worth extending it.
But if it is only for portable code, it may not be worth extending it.

[1]. https://github.com/freebsd/freebsd-src/blob/main/sys/sys/param.h#L126

> > 2.  A fix
> > Below patch:
> > [procfs: signal /proc/PID/comm write
> > truncation](https://rere.qmqm.pl/git/?p=linux;a=commit;h=d72027388d4d95db5438a7a574e0a03ae4b5d6d7)
> >
> > It seems this patch is incomplete ?   I don't know what it means to do.
>
> Currently writes to /proc/PID/comm are silently truncated. This patch
> makes the write() call return the actual number of bytes actually written
> and on subsequent calls return -ENOSPC. glibc checks the length in
> pthread_setname_np() before write(), so the change is not currently
> relevant for it. I don't know/remember what other runtimes do, though.
>
> > 3. A feature provided for pthread_getname_np
> > Below patch:
> > [procfs: lseek(/proc/PID/comm, 0,
> > SEEK_END)](https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a)
> >
> > It seems this patch is useful. With this patch the userspace can
> > directly get the TASK_COMM_LEN through the API.
>
> This one I'm not really fond of because it abuses lseek() in that it
> doesn't move the write pointer. But in case of /proc files this normally
> would return EINVAL anyway.
>

Another possible way is introducing a new PR_GET_COMM_LEN for
prctl(2), but I'm not sure if it is worth it.
Michał Mirosław Nov. 6, 2021, 11:29 a.m. UTC | #17
On Sat, Nov 06, 2021 at 05:12:24PM +0800, Yafang Shao wrote:
> On Sat, Nov 6, 2021 at 7:57 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
> >
> > On Fri, Nov 05, 2021 at 02:34:58PM +0800, Yafang Shao wrote:
> > > On Thu, Nov 4, 2021 at 9:37 AM Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
> > > >
> > > > On Mon, Nov 01, 2021 at 06:04:08AM +0000, Yafang Shao wrote:
> > > > > There're many truncated kthreads in the kernel, which may make trouble
> > > > > for the user, for example, the user can't get detailed device
> > > > > information from the task comm.
> > > > >
> > > > > This patchset tries to improve this problem fundamentally by extending
> > > > > the task comm size from 16 to 24, which is a very simple way.
> > > > [...]
> > > >
> > > > Hi,
> > > >
> > > > I've tried something like this a few years back. My attempt got mostly
> > > > lost in the mailing lists, but I'm still carrying the patches in my
> > > > tree [1]. My target was userspace thread names, and it turned out more
> > > > involved than I had time for.
> > > >
> > > > [1] https://rere.qmqm.pl/git/?p=linux;a=commit;h=2c3814268caf2b1fee6d1a0b61fd1730ce135d4a
> > > >     and its parents
> > > >
> > >
> > > Hi Michal,
> > >
> > > Thanks for the information.
> > >
> > > I have looked through your patches.  It seems to contain six patches
> > > now and can be divided into three parts per my understanding.
> > >
> > > 1. extend task comm len
> > > This parts contains below 4 patches:
> > > [prctl: prepare for bigger
> > > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=cfd99db9cf911bb4d106889aeba1dfe89b6527d0)
> > > [bluetooth: prepare for bigger
> > > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=ba2805f5196865b81cc6fc938ea53af2c7c2c892)
> > > [taskstats: prepare for bigger
> > > TASK_COMM_LEN](https://rere.qmqm.pl/git/?p=linux;a=commit;h=4d29bfedc57b36607915a0171f4864ec504908ca)
> > > [mm: make TASK_COMM_LEN
> > > configurable](https://rere.qmqm.pl/git/?p=linux;a=commit;h=362acc35582445174589184c738c4d86ec7d174b)
> > >
> > > What kind of userspace issues makes you extend the task comm length ?
> > > Why not just use /proc/[pid]/cmdline ?
> >
> > This was to enable longer thread names (as set by pthread_setname_np()).
> > Currently its 16 bytes, and that's too short for e.g. Chrome's or Firefox'es
> > threads. I believe that FreeBSD has 32-byte limit and so I expect that
> > major portable code is already prepared for bigger thread names.
> >
> 
> The comm len in FreeBSD is (19 + 1) bytes[1], but that is still larger
> than Linux :)
> The task comm is short for many applications, that is why cmdline is
> introduced per my understanding, but pthread_{set, get}name_np() is
> reading/writing the comm or via prctl(2) rather than reading/writing
> the cmdline...
> 
> Is the truncated Chrome or Firefox thread comm really harmful or is
> extending the task comm just for portable?
> Could you pls show me some examples if the short comm is really harmful?
> 
> Per my understanding, if the short comm is harmful to applications
> then it is worth extending it.
> But if it is only for portable code, it may not be worth extending it.
> 
> [1]. https://github.com/freebsd/freebsd-src/blob/main/sys/sys/param.h#L126

I don't think it is harmful as in exposing a bug or something. It's just
inconvenient when debugging a system where you can't differentiate
between threads because their names have been cut too short.

Best Regards
Michał Mirosław