[v7,00/11] fs: multigrain timestamp redux

Message ID: 20240913-mgtime-v7-0-92d4020e3b00@kernel.org

Jeff Layton Sept. 13, 2024, 1:54 p.m. UTC
Once more into the breach, dear friends!

This is a replacement for the v6 series sitting in Christian's
vfs.mgtime branch. For the uninitiated, the main rationale for this
set is described in the changelog for patch #2.

The kernel test robot reported a performance regression in v6 due to the
changes to current_time(). This patchset addresses that by moving the
ctime floor handling into the timekeeper code, which allows us to avoid
multiple seqcount loops when fetching and converting times. The basic
approach is still the same. The only difference is in where the
timestamp floor is handled, and in how we get new timestamps.

This reduces the changes to fs/inode.c and avoids a lot of the messiness
of handling both timespec64's and ktime_t values.
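
As a rough illustration of the floor idea (a userspace sketch, not code
from this series): coarse timestamps are clamped so they never fall
behind a global floor, and a fine-grained fetch tries to advance that
floor with a compare-and-swap. The names ts_floor, coarse_stamp() and
fine_stamp() are hypothetical, times are plain nanosecond counts passed
in by the caller, and the clock conversion done in the timekeeper is
ignored entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global floor in nanoseconds (stands in for the real floor). */
static _Atomic int64_t ts_floor;

/* Coarse path: never hand out a time earlier than the current floor. */
int64_t coarse_stamp(int64_t coarse_now)
{
	int64_t floor = atomic_load(&ts_floor);

	assert(coarse_now >= 0);
	return coarse_now > floor ? coarse_now : floor;
}

/*
 * Fine path: try to move the floor forward to the fine-grained time.
 * If another caller already installed a later floor, return that value
 * instead so that handed-out timestamps never go backwards.
 */
int64_t fine_stamp(int64_t fine_now)
{
	int64_t old = atomic_load(&ts_floor);

	assert(fine_now >= 0);
	while (fine_now > old) {
		/* On failure, 'old' is reloaded with the current floor. */
		if (atomic_compare_exchange_weak(&ts_floor, &old, fine_now))
			return fine_now;
	}
	return old;
}
```

With the floor starting at zero, fine_stamp(150) installs 150 as the
floor, after which coarse_stamp(100) yields 150 rather than 100.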

The pipe1_threads test shows these averages on my test rig:

    v6.11-rc7                   103561295  (baseline)
    v6.11-rc7 + v6 series        95995565  (~7% slower)
    v6.11-rc7 + v7 series       101357203  (~2% slower)

...so v7 recovers most of the regression seen with v6, ending up about
2% behind the baseline.
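
For anyone wanting to reproduce the percentages, they follow directly
from the averages listed, on the assumption (implied by the "slower"
labels) that a higher score is better. The helper name pct_slower() is
made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage by which 'result' falls short of 'baseline' (higher is better). */
double pct_slower(double baseline, double result)
{
	assert(baseline > 0.0);
	return (baseline - result) / baseline * 100.0;
}

/* Print one line of the comparison, relative to the given baseline. */
void report(const char *label, double baseline, double result)
{
	printf("%-24s %.1f%% slower\n", label, pct_slower(baseline, result));
}
```

Calling report("v6.11-rc7 + v6 series", 103561295, 95995565) and the
same for the v7 figure gives the rounded values quoted in the table.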

The main difference between v6 and v7 is in the first two patches, so
I've dropped the R-b's from those. The rest I left intact.

Note that there is one additional patch in this series (#4) that adds
support for handling delegated timestamps. The patches that make use of
that are in Chuck's nfsd-next branch. Taking that in here should make
that merge easier.

R-b's would be welcome (particularly from the timekeeper folks).

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Changes in v7:
- move the floor value handling into timekeeper for better performance
- Link to v6: https://lore.kernel.org/r/20240715-mgtime-v6-0-48e5d34bd2ba@kernel.org

Changes in v6:
- Normalize timespec64 in inode_set_ctime_to_ts
- use DEFINE_PER_CPU counters for better vfs consistency
- skip ctime cmpxchg if the result means nothing will change
- add trace_ctime_xchg_skip to track skipped ctime updates
- use __print_flags in ctime_ns_xchg tracepoint
- Link to v5: https://lore.kernel.org/r/20240711-mgtime-v5-0-37bb5b465feb@kernel.org

Changes in v5:
- refetch coarse time in coarse_ctime if not returning floor
- timestamp_truncate before swapping new ctime value into place
- track floor value as atomic64_t
- cleanups to Documentation file
- Link to v4: https://lore.kernel.org/r/20240708-mgtime-v4-0-a0f3c6fb57f3@kernel.org

Changes in v4:
- reordered tracepoint fields for better packing
- rework percpu counters again to also count fine grained timestamps
- switch to try_cmpxchg for better efficiency
- Link to v3: https://lore.kernel.org/r/20240705-mgtime-v3-0-85b2daa9b335@kernel.org

Changes in v3:
- Drop the conversion of i_ctime fields to ktime_t, and use an unused bit
  of the i_ctime_nsec field as QUERIED flag.
- Better tracepoints for tracking floor and ctime updates
- Reworked percpu counters to be more useful
- Track floor as monotonic value, which eliminates clock-jump problem
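
The flag trick from the first v3 item works because a nanoseconds value
is always below 1000000000, which fits in 30 bits, so the top bits of a
32-bit field are free. A minimal userspace sketch follows; the choice of
bit 31 and the names QUERIED_FLAG, nsec_pack(), nsec_value() and
nsec_queried() are assumptions for illustration, not taken from the
patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u
/* Hypothetical flag position: bit 31 is never set by a valid nsec value. */
#define QUERIED_FLAG (UINT32_C(1) << 31)

/* Combine a nanoseconds value and the "queried" flag into one 32-bit word. */
uint32_t nsec_pack(uint32_t nsec, bool queried)
{
	assert(nsec < NSEC_PER_SEC);
	return nsec | (queried ? QUERIED_FLAG : 0);
}

/* Recover the nanoseconds value by masking the flag off. */
uint32_t nsec_value(uint32_t packed)
{
	return packed & ~QUERIED_FLAG;
}

/* Report whether the flag bit is set. */
bool nsec_queried(uint32_t packed)
{
	return (packed & QUERIED_FLAG) != 0;
}
```

Keeping the flag in the same word as the nanoseconds means both can be
read or swapped in a single 32-bit atomic operation.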

Changes in v2:
- Added Documentation file
- Link to v1: https://lore.kernel.org/r/20240626-mgtime-v1-0-a189352d0f8f@kernel.org

---
Jeff Layton (11):
      timekeeping: move multigrain timestamp floor handling into timekeeper
      fs: add infrastructure for multigrain timestamps
      fs: have setattr_copy handle multigrain timestamps appropriately
      fs: handle delegated timestamps in setattr_copy_mgtime
      fs: tracepoints around multigrain timestamp events
      fs: add percpu counters for significant multigrain timestamp events
      Documentation: add a new file documenting multigrain timestamps
      xfs: switch to multigrain timestamps
      ext4: switch to multigrain timestamps
      btrfs: convert to multigrain timestamps
      tmpfs: add support for multigrain timestamps

 Documentation/filesystems/index.rst         |   1 +
 Documentation/filesystems/multigrain-ts.rst | 121 +++++++++++++
 fs/attr.c                                   |  60 +++++-
 fs/btrfs/file.c                             |  25 +--
 fs/btrfs/super.c                            |   3 +-
 fs/ext4/super.c                             |   2 +-
 fs/inode.c                                  | 271 +++++++++++++++++++++++++---
 fs/stat.c                                   |  42 ++++-
 fs/xfs/libxfs/xfs_trans_inode.c             |   6 +-
 fs/xfs/xfs_iops.c                           |  10 +-
 fs/xfs/xfs_super.c                          |   2 +-
 include/linux/fs.h                          |  36 +++-
 include/linux/timekeeping.h                 |   4 +
 include/trace/events/timestamp.h            | 124 +++++++++++++
 kernel/time/timekeeping.c                   |  81 +++++++++
 mm/shmem.c                                  |   2 +-
 16 files changed, 717 insertions(+), 73 deletions(-)
---
base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
change-id: 20240913-mgtime-20c98bcda88e

Best regards,

Comments

Christian Brauner Sept. 14, 2024, 1:29 p.m. UTC | #1
On Fri, Sep 13, 2024 at 09:54:09AM GMT, Jeff Layton wrote:
> Once more into the breach, dear friends!

I think this will have to be the v6.13 breach. :/

Jeff Layton Sept. 14, 2024, 1:37 p.m. UTC | #2
On Sat, 2024-09-14 at 15:29 +0200, Christian Brauner wrote:
> On Fri, Sep 13, 2024 at 09:54:09AM GMT, Jeff Layton wrote:
> > Once more into the breach, dear friends!
> 
> I think this will have to be the v6.13 breach. :/

Sigh, understood. This is a 30+ year old problem, after all, so one
more cycle isn't the end of the world.