[GIT,PULL] vfs libfs

Message ID 20250118-vfs-libfs-675d6c542bcc@brauner (mailing list archive)

Christian Brauner Jan. 18, 2025, 1:08 p.m. UTC
Hey Linus,

/* Summary */

This improves the stable directory offset behavior in various ways.
Stable offsets are needed so that NFS can reliably read directories on
filesystems such as tmpfs:

- Improve the end-of-directory detection

  According to getdents(2), the d_off field in each returned directory
  entry points to the next entry in the directory. The d_off field in
  the last returned entry in the readdir buffer must contain a valid
  offset value, but if it points to an actual directory entry, then
  readdir/getdents can loop.

  Introduce a specific fixed offset value that is placed in the d_off
  field of the last entry in a directory. Some user space applications
  assume that the EOD offset value is larger than the offsets of real
  directory entries, so the largest valid offset value is reserved for
  this purpose. This new value is never allocated by
  simple_offset_add().

  When ->iterate_dir() returns, getdents{64} inserts the ctx->pos value
  into the d_off field of the last valid entry in the readdir buffer.
  When it hits EOD, offset_readdir() sets ctx->pos to the EOD offset
  value so the last entry is updated to point to the EOD marker.

  When trying to read the entry at the EOD offset, offset_readdir()
  terminates immediately.

- Rely on d_children to iterate stable offset directories

  Instead of using the mtree to emit entries in the order of their
  offset values, use it only to map incoming ctx->pos to a starting
  entry. Then use the directory's d_children list, which is already
  maintained properly by the dcache, to find the next child to emit.

- Narrow the range of directory offset values returned by
  simple_offset_add() to 3 .. (S32_MAX - 1) on all platforms. This means
  the allocation behavior is identical on 32-bit systems, 64-bit
  systems, and 32-bit user space on 64-bit kernels. The new range still
  permits over 2 billion concurrent entries per directory.

- Return ENOSPC when the directory offset range is exhausted. In
  practice this error is almost impossible to hit.

- Remove the simple_offset_empty() helper.
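
To make the behavior described above easier to picture, here is a
minimal user-space sketch. It is not the code in fs/libfs.c. Every
toy_* name is hypothetical, the EOD value of INT32_MAX is an assumption
inferred from the allocation range quoted above, a plain linked list
stands in for d_children, and a linear scan stands in for the mtree
lookup. The toy allocator also never reuses offsets, which is a
simplification and not a claim about the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants, chosen to match the range given in the summary. */
enum {
	TOY_OFFSET_MIN = 3,             /* lowest offset given to an entry */
	TOY_OFFSET_MAX = INT32_MAX - 1, /* highest offset given to an entry */
	TOY_OFFSET_EOD = INT32_MAX,     /* reserved marker, never allocated */
};

struct toy_entry {
	int32_t offset;
	struct toy_entry *next;     /* next sibling (stand-in for d_children) */
};

struct toy_dir {
	struct toy_entry *children; /* head of the sibling list */
	int32_t next_offset;        /* next offset to hand out */
};

struct toy_ctx {
	int64_t pos;                /* stand-in for ctx->pos */
};

/* Give a new entry an offset and append it; returns 0 or -ENOSPC. */
static int toy_offset_add(struct toy_dir *dir, struct toy_entry *e)
{
	struct toy_entry **pp = &dir->children;

	if (dir->next_offset > TOY_OFFSET_MAX)
		return -ENOSPC;     /* range exhausted */
	e->offset = dir->next_offset++;
	e->next = NULL;
	while (*pp)
		pp = &(*pp)->next;
	*pp = e;
	return 0;
}

/* Emit up to max offsets starting at ctx->pos; returns the count emitted. */
static size_t toy_readdir(const struct toy_dir *dir, struct toy_ctx *ctx,
			  int32_t *out, size_t max)
{
	const struct toy_entry *e;
	size_t n = 0;

	assert(ctx->pos >= 0 && ctx->pos <= TOY_OFFSET_EOD);
	if (ctx->pos == TOY_OFFSET_EOD)
		return 0;           /* a read at EOD terminates immediately */

	/* Lookup step: map ctx->pos to the first entry at or after it. */
	for (e = dir->children; e && e->offset < ctx->pos; e = e->next)
		;

	/* Iteration step: follow the sibling list from that entry. */
	for (; e && n < max; e = e->next) {
		out[n++] = e->offset;
		ctx->pos = (int64_t)e->offset + 1;
	}

	/* Nothing left: park ctx->pos on the EOD marker. */
	if (!e)
		ctx->pos = TOY_OFFSET_EOD;
	return n;
}
```

A caller would start with ctx.pos = 0 and call toy_readdir() until it
returns 0. Because the final ctx.pos is the reserved EOD value and not
the offset of a real entry, a later read from that position cannot
re-emit an entry.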

/* Testing */

gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37:

  Linux 6.13-rc1 (2024-12-01 14:28:56 -0800)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.libfs

for you to fetch changes up to a0634b457eca16b21a4525bc40cd2db80f52dadc:

  Merge patch series "Improve simple directory offset wrap behavior" (2025-01-04 10:15:58 +0100)

Please consider pulling these changes from the signed vfs-6.14-rc1.libfs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.14-rc1.libfs

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Improve simple directory offset wrap behavior"

Chuck Lever (5):
      libfs: Return ENOSPC when the directory offset range is exhausted
      Revert "libfs: Add simple_offset_empty()"
      Revert "libfs: fix infinite directory reads for offset dir"
      libfs: Replace simple_offset end-of-directory detection
      libfs: Use d_children list to iterate simple_offset directories

 fs/libfs.c         | 162 +++++++++++++++++++++++++----------------------------
 include/linux/fs.h |   1 -
 mm/shmem.c         |   4 +-
 3 files changed, 79 insertions(+), 88 deletions(-)