[GIT,PULL] vfs pidfs

Message ID 20250118-vfs-pidfs-5921bfa5632a@brauner (mailing list archive)
State New

Pull-request

git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.pidfs

Message

Christian Brauner Jan. 18, 2025, 1 p.m. UTC
Hey Linus,

/* Summary */

This contains pidfs updates for this cycle:

- Rework inode number allocation

  Recently we received a patchset that aims to enable file handle
  encoding and decoding via name_to_handle_at(2) and
  open_by_handle_at(2).

  A crucial step in the patch series is how to go from inode number to
  struct pid without leaking information into unprivileged contexts. The
  issue is that in order to find a struct pid the pid number in the
  initial pid namespace must be encoded into the file handle via
  name_to_handle_at(2).

  This can be used by containers using a separate pid namespace to learn
  what the pid number of a given process in the initial pid namespace
  is. While this is a weak information leak it could be used in various
  exploits and in general is an ugly wart in the design.

  To solve this problem a new way is needed to look up a struct pid based
  on the inode number allocated for that struct pid. The other part is
  to remove the custom inode number allocation on 32 bit systems, which
  is also an ugly wart that should go away.

  Allocate unique identifiers for struct pid by simply incrementing a 64
  bit counter and insert each struct pid into an rbtree so it can be
  looked up when decoding file handles. This avoids leaking actual pids
  across pid namespaces in file handles.

  On both 64 bit and 32 bit the same 64 bit identifier is used to look up
  struct pid in the rbtree. On 64 bit the unique identifier for struct pid
  simply becomes the inode number. Comparing two pidfds continues to be as
  simple as comparing inode numbers.

  On 32 bit the 64 bit number assigned to struct pid is split into two 32
  bit numbers. The lower 32 bits are used as the inode number and the
  upper 32 bits are used as the inode generation number. Whenever a
  wraparound happens on 32 bit the 64 bit number will be incremented by 2
  so inode numbering starts at 2 again.

  When a wraparound happens on 32 bit multiple pidfds with the same inode
  number are likely to exist. This isn't a problem since before pidfs
  pidfds used the anonymous inode, meaning all pidfds had the same inode
  number. On 32 bit userspace can thus reconstruct the 64 bit identifier
  by retrieving both the inode number and the inode generation number to
  compare, or use file handles. This gives the same guarantees on both 32
  bit and 64 bit.
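
  As a rough illustration of the numbering scheme described above, the
  split into inode number and generation, the skip on wraparound, and
  the userspace-side reconstruction could be sketched as follows. All
  function names are hypothetical and chosen for this example only;
  this is not the code in fs/pidfs.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical and only illustrate the scheme. */

/* Lower 32 bits: what a 32 bit system would report as the inode number. */
static uint32_t id_to_ino(uint64_t id)
{
	return (uint32_t)id;
}

/* Upper 32 bits: what a 32 bit system would report as the generation. */
static uint32_t id_to_gen(uint64_t id)
{
	return (uint32_t)(id >> 32);
}

/* Userspace side: rebuild the 64 bit identifier from both halves. */
static uint64_t id_from_parts(uint32_t ino, uint32_t gen)
{
	return ((uint64_t)gen << 32) | ino;
}

/*
 * Hand out the next identifier from a 64 bit counter. If the lower
 * 32 bits have wrapped around to 0, skip ahead by 2 so that the
 * inode number seen on 32 bit starts at 2 again.
 */
static uint64_t alloc_id(uint64_t *counter)
{
	if (id_to_ino(*counter) == 0)
		*counter += 2;
	return (*counter)++;
}

/* Two identifiers are equal only if both halves match. */
static bool same_id(uint32_t ino_a, uint32_t gen_a,
		    uint32_t ino_b, uint32_t gen_b)
{
	return id_from_parts(ino_a, gen_a) == id_from_parts(ino_b, gen_b);
}
```

  On 32 bit the inode number would come from fstat(2) and the
  generation presumably from the FS_IOC_GETVERSION support listed in
  the shortlog below. On 64 bit the inode number alone already is the
  full identifier.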

- Implement file handle support

  This is based on custom export operation methods which allow pidfs to
  implement permission checking and opening of pidfs file handles
  cleanly without hacking around in the core file handle code too much.
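
  From userspace, encoding a handle for a pidfd would presumably look
  like any other name_to_handle_at(2) call on a file descriptor. A
  minimal sketch, assuming glibc's wrapper and a hypothetical helper
  name (encode_handle); it is not taken from the selftests:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: encode a file handle for whatever @fd refers to.
 * Returns a malloc()ed handle the caller must free(), or NULL on failure.
 */
static struct file_handle *encode_handle(int fd)
{
	struct file_handle *fh;
	int mount_id;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (!fh)
		return NULL;
	fh->handle_bytes = MAX_HANDLE_SZ;

	/* An empty path plus AT_EMPTY_PATH makes the call operate on @fd. */
	if (name_to_handle_at(fd, "", fh, &mount_id, AT_EMPTY_PATH) < 0) {
		free(fh);
		return NULL;
	}
	return fh;
}
```

  Here @fd would be a pidfd, e.g. one obtained via pidfd_open(2). The
  call may fail on kernels without these changes, in which case the
  helper returns NULL. Decoding the handle again would go through
  open_by_handle_at(2).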

- Support bind-mounts

  Similar to nsfs, allow bind-mounting pidfds. This allows pidfds to be
  safely recovered and checked for process recycling.

  Instead of checking d_ops for both nsfs and pidfs we could in a
  follow-up patch add a flag argument to struct dentry_operations that
  functions similarly to file_operations->fop_flags.

/* Testing */

gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37:

  Linux 6.13-rc1 (2024-12-01 14:28:56 -0800)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.pidfs

for you to fetch changes up to 3781680fba3eab0b34b071cb9443fd5ad92d23cf:

  Merge patch series "pidfs: support bind-mounts" (2024-12-22 11:03:19 +0100)

Please consider pulling these changes from the signed vfs-6.14-rc1.pidfs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.14-rc1.pidfs

----------------------------------------------------------------
Christian Brauner (16):
      pidfs: rework inode number allocation
      pidfs: remove 32bit inode number handling
      pidfs: support FS_IOC_GETVERSION
      Merge patch series "pidfs: file handle preliminaries"
      fhandle: simplify error handling
      exportfs: add open method
      fhandle: pull CAP_DAC_READ_SEARCH check into may_decode_fh()
      exportfs: add permission method
      pidfs: implement file handle support
      Merge patch series "pidfs: implement file handle support"
      pidfs: check for valid ioctl commands
      selftests/pidfd: add pidfs file handle selftests
      pidfs: lookup pid through rbtree
      pidfs: allow bind-mounts
      selftests: add pidfd bind-mount tests
      Merge patch series "pidfs: support bind-mounts"

Erin Shepherd (1):
      pseudofs: add support for export_ops

 fs/fhandle.c                                       | 115 +++--
 fs/libfs.c                                         |   1 +
 fs/namespace.c                                     |  10 +-
 fs/pidfs.c                                         | 298 ++++++++++--
 include/linux/exportfs.h                           |  20 +
 include/linux/pid.h                                |   2 +
 include/linux/pidfs.h                              |   3 +
 include/linux/pseudo_fs.h                          |   1 +
 kernel/pid.c                                       |  14 +-
 tools/testing/selftests/pidfd/.gitignore           |   2 +
 tools/testing/selftests/pidfd/Makefile             |   3 +-
 tools/testing/selftests/pidfd/pidfd.h              |  39 ++
 tools/testing/selftests/pidfd/pidfd_bind_mount.c   | 188 ++++++++
 .../selftests/pidfd/pidfd_file_handle_test.c       | 503 +++++++++++++++++++++
 tools/testing/selftests/pidfd/pidfd_setns_test.c   |  47 +-
 tools/testing/selftests/pidfd/pidfd_wait.c         |  47 +-
 16 files changed, 1110 insertions(+), 183 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_bind_mount.c
 create mode 100644 tools/testing/selftests/pidfd/pidfd_file_handle_test.c