mbox series

[v3,00/16] pidfs: provide information after task has been reaped

Message ID 20250305-work-pidfs-kill_on_last_close-v3-0-c8c3d8361705@kernel.org (mailing list archive)
Headers show
Series pidfs: provide information after task has been reaped | expand

Message

Christian Brauner March 5, 2025, 10:08 a.m. UTC
Hey,

Various tools need access to information about a process/task even after
it has already been reaped. For example, systemd's journal logs and uses
such information as the cgroup id and exit status to deal with processes
that have been sent via SCM_PIDFD or SCM_PEERPIDFD. By the time the
pidfd is received the process might have already been reaped.

This series aims to provide information by extending the PIDFD_GET_INFO
ioctl to retrieve the exit code and cgroup id. There might be other
stuff that we would want in the future.

Pidfd polling allows waiting on either task exit or for a task to have
been reaped. The contract for PIDFD_INFO_EXIT is simply that EPOLLHUP
must be observed before exit information can be retrieved, i.e., exit
information is only provided once the task has been reaped.

Note, that if a thread-group leader exits before other threads in the
thread-group then exit information will only be available once the
thread-group is empty. This aligns with wait() as well, where reaping of
a thread-group leader that exited before the thread-group was empty is
delayed until the thread-group is empty.

With PIDFD_INFO_EXIT autoreaping might actually become usable because it
means a parent can ignore SIGCHLD or set SA_NOCLDWAIT and simply use
pidfd polling and PIDFD_INFO_EXIT to get get status information for its
children. The kernel will autocleanup right away instead of delaying.

This includes expansive selftests including for thread-group behior and
multi-threaded exec by a non-thread-group leader thread.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Changes in v3:
- Fix various minor issues.
- Link to v2: https://lore.kernel.org/r/20250304-work-pidfs-kill_on_last_close-v2-0-44fdacfaa7b7@kernel.org

Changes in v2:
- Call pidfs_exit() from release_task().
- Don't provide exit information once the task has exited but once the
  task has been reaped. This makes for simpler semantics. Thus, call
  pidfs_exit() from release_task().
- Link to v1: https://lore.kernel.org/r/20250228-work-pidfs-kill_on_last_close-v1-0-5bd7e6bb428e@kernel.org

---
Christian Brauner (16):
      pidfs: switch to copy_struct_to_user()
      pidfd: rely on automatic cleanup in __pidfd_prepare()
      pidfs: move setting flags into pidfs_alloc_file()
      pidfs: use private inode slab cache
      pidfs: record exit code and cgroupid at exit
      pidfs: allow to retrieve exit information
      selftests/pidfd: fix header inclusion
      pidfs/selftests: ensure correct headers for ioctl handling
      selftests/pidfd: expand common pidfd header
      selftests/pidfd: add first PIDFD_INFO_EXIT selftest
      selftests/pidfd: add second PIDFD_INFO_EXIT selftest
      selftests/pidfd: add third PIDFD_INFO_EXIT selftest
      selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
      selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
      selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
      selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest

 fs/internal.h                                     |   1 +
 fs/libfs.c                                        |   4 +-
 fs/pidfs.c                                        | 182 +++++++-
 include/linux/pidfs.h                             |   1 +
 include/uapi/linux/pidfd.h                        |   3 +-
 kernel/exit.c                                     |   2 +
 kernel/fork.c                                     |  15 +-
 tools/testing/selftests/pidfd/.gitignore          |   2 +
 tools/testing/selftests/pidfd/Makefile            |   4 +-
 tools/testing/selftests/pidfd/pidfd.h             |  93 ++++
 tools/testing/selftests/pidfd/pidfd_exec_helper.c |  12 +
 tools/testing/selftests/pidfd/pidfd_fdinfo_test.c |   1 +
 tools/testing/selftests/pidfd/pidfd_info_test.c   | 497 ++++++++++++++++++++++
 tools/testing/selftests/pidfd/pidfd_open_test.c   |  26 --
 tools/testing/selftests/pidfd/pidfd_setns_test.c  |  45 --
 15 files changed, 783 insertions(+), 105 deletions(-)
---
base-commit: b1e809e7f64ad47dd232ff072d8ef59c1fe414c5
change-id: 20250227-work-pidfs-kill_on_last_close-a23ddf21db47

Comments

Oleg Nesterov March 5, 2025, 12:06 p.m. UTC | #1
On 03/05, Christian Brauner wrote:
>
> Christian Brauner (16):
>       pidfs: switch to copy_struct_to_user()
>       pidfd: rely on automatic cleanup in __pidfd_prepare()
>       pidfs: move setting flags into pidfs_alloc_file()
>       pidfs: use private inode slab cache
>       pidfs: record exit code and cgroupid at exit
>       pidfs: allow to retrieve exit information
>       selftests/pidfd: fix header inclusion
>       pidfs/selftests: ensure correct headers for ioctl handling
>       selftests/pidfd: expand common pidfd header
>       selftests/pidfd: add first PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add second PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add third PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
>       selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest

I see nothing wrong in V3. For 1-6

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Christian Brauner March 5, 2025, 12:20 p.m. UTC | #2
On Wed, Mar 05, 2025 at 01:06:35PM +0100, Oleg Nesterov wrote:
> On 03/05, Christian Brauner wrote:
> >
> > Christian Brauner (16):
> >       pidfs: switch to copy_struct_to_user()
> >       pidfd: rely on automatic cleanup in __pidfd_prepare()
> >       pidfs: move setting flags into pidfs_alloc_file()
> >       pidfs: use private inode slab cache
> >       pidfs: record exit code and cgroupid at exit
> >       pidfs: allow to retrieve exit information
> >       selftests/pidfd: fix header inclusion
> >       pidfs/selftests: ensure correct headers for ioctl handling
> >       selftests/pidfd: expand common pidfd header
> >       selftests/pidfd: add first PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add second PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add third PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
> >       selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest
> 
> I see nothing wrong in V3. For 1-6

Fwiw, I'm quite happy with the tests since we now also have test for
multi-threaded exec behavior with pidfd polling.

> 
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>

Thanks for all the help and the Ack!
There'll be more patches this or next cycle though. ;)