[v6,00/12] extend task comm from 16 to 24

Message ID 20211025083315.4752-1-laoar.shao@gmail.com (mailing list archive)

Message

Yafang Shao Oct. 25, 2021, 8:33 a.m. UTC
Many kthread names in the kernel are truncated, which causes trouble
for users: for example, a user cannot get detailed device information
from the task comm.

This patchset addresses the problem at its root by extending the task
comm size from 16 to 24 bytes. Some cleanups are needed first.

1. Make the copy of task comm always safe no matter what the task
   comm size is. For example,

      Unsafe                 Safe
      strlcpy                strscpy_pad
      strncpy                strscpy_pad
      bpf_probe_read_kernel  bpf_probe_read_kernel_str
                             bpf_core_read_str
                             bpf_get_current_comm
                             perf_event__prepare_comm
                             prctl(2)

   After this step, changing the comm size will not cause trouble for
   the kernel or for in-tree tools such as perf and the BPF programs.

2. Clean up some old hard-coded 16s
   We don't actually need to convert all of them to TASK_COMM_LEN or
   TASK_COMM_LEN_16; what matters is whether the conversion makes the
   code more reasonable or easier to understand. For example, some
   in-tree tools read the comm from the sched:sched_switch tracepoint.
   Because that value comes from the kernel, the tools should stay
   consistent with it.

3. Extend the task comm size from 16 to 24
   task_struct regularly grows by 8 bytes at a time, so this size
   change should be acceptable. We considered extending the size for
   CONFIG_BASE_FULL only, but that would add a maintenance burden and
   code complexity.

4. Print a warning if the kthread comm is still truncated.

5. What will happen to the out-of-tree tools after this change?
   If a tool gets the task comm through a kernel API, for example
   prctl(2) or bpf_get_current_comm(), then the size of the user buffer
   does not matter, because the tool always gets a nul-terminated
   string. If it gets the task comm through a direct string copy, the
   tool itself must make sure the copied string is nul-terminated. As
   TASK_COMM_LEN is not exposed to userspace, there is no reason for a
   tool to require a fixed-size task comm.
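
As an illustration of point 5, here is a minimal userspace sketch of
reading the comm through prctl(2). The helper name read_own_comm() is
hypothetical and not part of this series. It assumes a Linux system and
a caller buffer of at least 16 bytes, which is the minimum prctl(2)
documents for PR_GET_NAME.

```c
#include <string.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: read the calling thread's comm into buf.
 * The buffer is zero-filled first, so the result is nul-terminated
 * as long as the kernel writes fewer than size bytes.
 * Returns 0 on success, -1 on error.
 */
static int read_own_comm(char *buf, size_t size)
{
	if (size < 16)		/* documented minimum for PR_GET_NAME */
		return -1;
	memset(buf, 0, size);
	return prctl(PR_GET_NAME, (unsigned long)buf);
}
```

A tool written this way does not hard-code the comm length; it only
relies on the terminator, which is the behaviour described above.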

Changes since v5:
- extend the comm size for both CONFIG_BASE_{FULL, SMALL}, which makes
  the code simpler and easier to maintain.
- avoid changing too many hard-coded 16s in BPF programs, per Andrii.

Changes since v4:
- introduce TASK_COMM_LEN_16 and TASK_COMM_LEN_24 per Steven
- replace hard-coded 16 with TASK_COMM_LEN_16 per Kees
- use strscpy_pad() instead of strlcpy()/strncpy() per Kees
- make perf test adapt to task comm size change per Arnaldo and Mathieu
- fix warning reported by kernel test robot

Changes since v3:
- fix the -Wstringop-truncation warning reported by kernel test robot

Changes since v2:
- avoid changing UAPI code per Kees
- remove the description of out of tree code from commit log per Peter

Changes since v1:
- extend task comm to 24 bytes, per Petr
- improve the warning per Petr
- make the checkpatch warning a separate patch

Yafang Shao (12):
  fs/exec: make __set_task_comm always set a nul ternimated string
  fs/exec: make __get_task_comm always get a nul terminated string
  drivers/connector: make connector comm always nul ternimated
  drivers/infiniband: make setup_ctxt always get a nul terminated task
    comm
  elfcore: make prpsinfo always get a nul terminated task comm
  samples/bpf/test_overhead_kprobe_kern: make it adopt to task comm size
    change
  samples/bpf/offwaketime_kern: make sched_switch tracepoint args adopt
    to comm size change
  tools/bpf/bpftool/skeleton: make it adopt to task comm size change
  tools/perf/test: make perf test adopt to task comm size change
  tools/testing/selftests/bpf: make it adopt to task comm size change
  sched.h: extend task comm from 16 to 24
  kernel/kthread: show a warning if kthread's comm is truncated

 drivers/connector/cn_proc.c                   |  5 +++-
 drivers/infiniband/hw/qib/qib.h               |  2 +-
 drivers/infiniband/hw/qib/qib_file_ops.c      |  2 +-
 fs/binfmt_elf.c                               |  2 +-
 fs/exec.c                                     |  5 ++--
 include/linux/elfcore-compat.h                |  3 ++-
 include/linux/elfcore.h                       |  4 +--
 include/linux/sched.h                         |  9 +++++--
 kernel/kthread.c                              |  7 ++++-
 samples/bpf/offwaketime_kern.c                |  4 +--
 samples/bpf/test_overhead_kprobe_kern.c       | 11 ++++----
 samples/bpf/test_overhead_tp_kern.c           |  5 ++--
 tools/bpf/bpftool/skeleton/pid_iter.bpf.c     |  4 +--
 tools/include/linux/sched.h                   | 11 ++++++++
 tools/perf/tests/evsel-tp-sched.c             | 26 ++++++++++++++-----
 .../selftests/bpf/progs/test_stacktrace_map.c |  6 ++---
 .../selftests/bpf/progs/test_tracepoint.c     |  6 ++---
 17 files changed, 77 insertions(+), 35 deletions(-)
 create mode 100644 tools/include/linux/sched.h
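
The copy rule from step 1 of the cover letter can be sketched in
userspace C as follows. This is not the kernel's strscpy_pad(); the
function comm_copy_pad() is a hypothetical stand-in that mimics the
documented behaviour: never write past the destination, always
terminate, zero-fill the tail, and report truncation (the kernel helper
returns -E2BIG, this sketch returns -1).

```c
#include <string.h>

/*
 * Hypothetical userspace stand-in for strscpy_pad().
 * Copies at most size - 1 bytes of src into dst, always writes a nul
 * terminator, and zero-fills the rest of dst.
 * Returns the number of bytes copied, or -1 if src was truncated.
 */
static int comm_copy_pad(char *dst, const char *src, size_t size)
{
	size_t len = 0;

	if (size == 0)
		return -1;

	while (len < size - 1 && src[len] != '\0') {
		dst[len] = src[len];
		len++;
	}
	memset(dst + len, 0, size - len);	/* terminator plus padding */

	return src[len] == '\0' ? (int)len : -1;
}
```

By contrast, strncpy() leaves the destination unterminated when the
source fills the buffer, which is why it is listed in the unsafe column.
Because the result here depends only on the size argument, the same call
stays correct whether the destination is 16 or 24 bytes.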

Comments

Alexei Starovoitov Oct. 25, 2021, 6:10 p.m. UTC | #1
On Mon, Oct 25, 2021 at 1:33 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> There're many truncated kthreads in the kernel, which may make trouble
> for the user, for example, the user can't get detailed device
> information from the task comm.
>
> This patchset tries to improve this problem fundamentally by extending
> the task comm size from 16 to 24. In order to do that, we have to do
> some cleanups first.

It looks like a churn that doesn't really address the problem.
If we were to allow long names then make it into a pointer and use 16 byte
as an optimized storage for short names. Any longer name would be a pointer.
In other words make it similar to dentry->d_iname.
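
A rough userspace sketch of the layout suggested above, for
illustration only: short names live in a 16-byte inline array and longer
names in separately allocated storage, with one pointer that always
refers to the valid copy. The struct and function names are
hypothetical, malloc() stands in for a kernel allocator, and none of
this is proposed kernel code.

```c
#include <stdlib.h>
#include <string.h>

#define INLINE_LEN 16	/* inline storage, 15 characters plus nul */

struct task_name {
	const char *name;		/* inline_name or heap storage */
	char inline_name[INLINE_LEN];
};

/*
 * Set the name on a fresh or released struct task_name.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int task_name_set(struct task_name *tn, const char *src)
{
	size_t len = strlen(src);
	char *p;

	if (len < INLINE_LEN) {		/* fits inline, no allocation */
		memcpy(tn->inline_name, src, len + 1);
		tn->name = tn->inline_name;
		return 0;
	}

	p = malloc(len + 1);
	if (!p)
		return -1;
	memcpy(p, src, len + 1);
	tn->name = p;
	return 0;
}

/* Free heap storage, if any, and reset the pointer. */
static void task_name_release(struct task_name *tn)
{
	if (tn->name != tn->inline_name)
		free((void *)tn->name);
	tn->name = NULL;
}
```

The sketch also shows the cost raised in the replies: every reader has
to go through the pointer, and setting a long name can fail or sleep in
an allocator, which a fixed array never does.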
Steven Rostedt Oct. 25, 2021, 9:05 p.m. UTC | #2
On Mon, 25 Oct 2021 11:10:09 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> It looks like a churn that doesn't really address the problem.
> If we were to allow long names then make it into a pointer and use 16 byte
> as an optimized storage for short names. Any longer name would be a pointer.
> In other words make it similar to dentry->d_iname.

That would be quite a bigger undertaking too, as it is assumed throughout
the kernel that the task->comm is TASK_COMM_LEN and is nul terminated. And
most locations that save the comm simply use a fixed size string of
TASK_COMM_LEN. Not saying it's not feasible, but it would require a lot more
analysis of the impact by changing such a fundamental part of task struct
from a static to something requiring allocation.

Unless you are suggesting that we truncate like normal the 16 byte names
(to a max of 15 characters), and add a way to hold the entire name for
those locations that understand it.

-- Steve
Kees Cook Oct. 25, 2021, 9:06 p.m. UTC | #3
On Mon, Oct 25, 2021 at 05:05:03PM -0400, Steven Rostedt wrote:
> On Mon, 25 Oct 2021 11:10:09 -0700
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > It looks like a churn that doesn't really address the problem.
> > If we were to allow long names then make it into a pointer and use 16 byte
> > as an optimized storage for short names. Any longer name would be a pointer.
> > In other words make it similar to dentry->d_iname.
> 
> That would be quite a bigger undertaking too, as it is assumed throughout
> the kernel that the task->comm is TASK_COMM_LEN and is nul terminated. And
> most locations that save the comm simply use a fixed size string of
> TASK_COMM_LEN. Not saying it's not feasible, but it would require a lot more
> analysis of the impact by changing such a fundamental part of task struct
> from a static to something requiring allocation.
> 
> Unless you are suggesting that we truncate like normal the 16 byte names
> (to a max of 15 characters), and add a way to hold the entire name for
> those locations that understand it.

Agreed -- this is a small change for what is already an "uncommon"
corner case. I don't think this needs to suddenly become an unbounded
string. :)
Petr Mladek Oct. 26, 2021, 10:35 a.m. UTC | #4
On Mon 2021-10-25 17:05:03, Steven Rostedt wrote:
> On Mon, 25 Oct 2021 11:10:09 -0700
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > It looks like a churn that doesn't really address the problem.
> > If we were to allow long names then make it into a pointer and use 16 byte
> > as an optimized storage for short names. Any longer name would be a pointer.
> > In other words make it similar to dentry->d_iname.
> 
> That would be quite a bigger undertaking too, as it is assumed throughout
> the kernel that the task->comm is TASK_COMM_LEN and is nul terminated. And
> most locations that save the comm simply use a fixed size string of
> TASK_COMM_LEN. Not saying it's not feasible, but it would require a lot more
> analysis of the impact by changing such a fundamental part of task struct
> from a static to something requiring allocation.

I fully agree. The evolution of this patchset clearly shows how many
code paths depend on the existing behavior.


> Unless you are suggesting that we truncate like normal the 16 byte names
> (to a max of 15 characters), and add a way to hold the entire name for
> those locations that understand it.

Yup. If the problem is only with kthreads, it might be possible to
store the pointer into "struct kthread" and update proc_task_name().
It would generalize the solution already used by workqueues.
I think that something like this was mentioned in the discussion
about v1.

Best Regards,
Petr
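
The alternative discussed above (keep the truncated 16-byte comm as is
and store the full name on the side for code that understands it) might
look roughly like this in userspace C. All names here are hypothetical:
struct kthread_name is not the kernel's struct kthread, and malloc()
stands in for a kernel allocator. Truncation is detected from the
vsnprintf() return value, which is the length the full string would have
needed.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define COMM_LEN 16	/* legacy fixed-size comm, 15 characters plus nul */

struct kthread_name {
	char comm[COMM_LEN];	/* always valid, possibly truncated */
	char *full_name;	/* set only when comm was truncated */
};

/* Format the name; keep a full copy only if comm had to be truncated. */
static int kthread_name_init(struct kthread_name *kn, const char *fmt, ...)
{
	va_list args;
	int len;

	kn->full_name = NULL;

	va_start(args, fmt);
	len = vsnprintf(kn->comm, sizeof(kn->comm), fmt, args);
	va_end(args);
	if (len < 0)
		return -1;
	if ((size_t)len < sizeof(kn->comm))
		return 0;		/* name fits, nothing else to store */

	kn->full_name = malloc((size_t)len + 1);
	if (!kn->full_name)
		return -1;
	va_start(args, fmt);
	vsnprintf(kn->full_name, (size_t)len + 1, fmt, args);
	va_end(args);
	return 0;
}

/* Readers that understand long names call this instead of using comm. */
static const char *kthread_name_get(const struct kthread_name *kn)
{
	return kn->full_name ? kn->full_name : kn->comm;
}
```

Existing users of comm keep seeing a fixed-size, nul-terminated string,
while a reader such as proc_task_name() could switch to the accessor.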