[00/17] Add folio_end_read

Message ID 20230915183707.2707298-1-willy@infradead.org

Message

Matthew Wilcox Sept. 15, 2023, 6:36 p.m. UTC
The core of this patchset is the new folio_end_read() call which
filesystems can use when finishing a page cache read instead of separate
calls to mark the folio uptodate and unlock it.  As an illustration of
its use, I converted ext4, iomap and the buffer code; more can be
converted.
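
To make the shape of the conversion concrete, here is a sketch based
on the ext4 read completion path (context trimmed; "folio" and "bio"
as in fs/ext4/readpage.c):

	/* Before: two separate atomic operations per folio. */
	if (bio->bi_status)
		folio_clear_uptodate(folio);
	else
		folio_mark_uptodate(folio);
	folio_unlock(folio);

	/* After: one call, which patch 15 turns into a single XOR. */
	folio_end_read(folio, bio->bi_status == 0);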

I think that's useful by itself, but the interesting optimisation is
that we can implement that with a single XOR instruction that sets the
uptodate bit, clears the lock bit, tests the waiter bit and provides a
write memory barrier.  That removes one memory barrier and one atomic
instruction from each page read, which seems worth doing.  That's in
patch 15.
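
Roughly, the combined operation looks like this (a sketch of the end
state after patches 7 and 15; it relies on PG_waiters living in bit 7
of the same byte as PG_locked and PG_uptodate, and architectures can
override the generic fallback):

	/* Generic fallback: one atomic XOR with release semantics,
	 * then test the sign bit of the low byte (PG_waiters). */
	static inline bool xor_unlock_is_negative_byte(unsigned long mask,
			volatile unsigned long *p)
	{
		long old = atomic_long_fetch_xor_release(mask,
							(atomic_long_t *)p);
		return !!(old & BIT(7));
	}

	void folio_end_read(struct folio *folio, bool success)
	{
		unsigned long mask = 1 << PG_locked;

		/* PG_locked is set and PG_uptodate is clear on entry,
		 * so one XOR clears the former and sets the latter. */
		if (likely(success))
			mask |= 1 << PG_uptodate;
		if (folio_xor_flags_has_waiters(folio, mask))
			folio_wake_bit(folio, PG_locked);
	}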

The last two patches could be a separate series, but basically we can
do the same thing with the writeback flag that we already do with the
lock flag: clear it and test the waiters bit at the same time.
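
The idea reduces to the same helper with a different mask (a sketch;
the real code threads this through __folio_end_writeback() because
the flag must be cleared under the i_pages lock for mappings that use
writeback tags):

	/* Clear PG_writeback and test PG_waiters in one XOR. */
	if (folio_xor_flags_has_waiters(folio, 1 << PG_writeback))
		folio_wake_bit(folio, PG_writeback);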

I don't have any performance numbers; I'm hoping Nick might provide some
since PPC seems particularly unhappy with write-after-write hazards.

Matthew Wilcox (Oracle) (17):
  iomap: Hold state_lock over call to ifs_set_range_uptodate()
  iomap: Protect read_bytes_pending with the state_lock
  mm: Add folio_end_read()
  ext4: Use folio_end_read()
  buffer: Use folio_end_read()
  iomap: Use folio_end_read()
  bitops: Add xor_unlock_is_negative_byte()
  alpha: Implement xor_unlock_is_negative_byte
  m68k: Implement xor_unlock_is_negative_byte
  mips: Implement xor_unlock_is_negative_byte
  powerpc: Implement arch_xor_unlock_is_negative_byte on 32-bit
  riscv: Implement xor_unlock_is_negative_byte
  s390: Implement arch_xor_unlock_is_negative_byte
  mm: Delete checks for xor_unlock_is_negative_byte()
  mm: Add folio_xor_flags_has_waiters()
  mm: Make __end_folio_writeback() return void
  mm: Use folio_xor_flags_has_waiters() in folio_end_writeback()

 arch/alpha/include/asm/bitops.h               | 20 +++++
 arch/m68k/include/asm/bitops.h                | 13 ++++
 arch/mips/include/asm/bitops.h                | 25 +++++-
 arch/mips/lib/bitops.c                        | 14 ++++
 arch/powerpc/include/asm/bitops.h             | 21 ++---
 arch/riscv/include/asm/bitops.h               | 12 +++
 arch/s390/include/asm/bitops.h                | 10 +++
 arch/x86/include/asm/bitops.h                 | 11 ++-
 fs/buffer.c                                   | 16 +---
 fs/ext4/readpage.c                            | 14 +---
 fs/iomap/buffered-io.c                        | 55 ++++++++-----
 .../asm-generic/bitops/instrumented-lock.h    | 28 ++++---
 include/asm-generic/bitops/lock.h             | 20 +----
 include/linux/page-flags.h                    | 19 +++++
 include/linux/pagemap.h                       |  1 +
 kernel/kcsan/kcsan_test.c                     |  9 +--
 kernel/kcsan/selftest.c                       |  9 +--
 mm/filemap.c                                  | 77 ++++++++++---------
 mm/kasan/kasan_test.c                         |  8 +-
 mm/page-writeback.c                           | 35 ++++-----
 20 files changed, 248 insertions(+), 169 deletions(-)

Comments

Linus Torvalds Sept. 16, 2023, 12:31 a.m. UTC | #1
On Fri, 15 Sept 2023 at 11:37, Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> I don't have any performance numbers; I'm hoping Nick might provide some
> since PPC seems particularly unhappy with write-after-write hazards.

I suspect you can't see the extra atomic in the IO paths.

The existing trick with bit #7 is there because we do a lot of
lock_page()/unlock_page() pairs even when there is no actual IO, so
it's worth it: unlock_page() has traditionally shown up quite a bit
in profiles. But once you actually do IO, I think the effect is not
measurable.
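
[For context, the bit #7 trick referred to above is the pre-series
folio_unlock(), lightly trimmed:]

	void folio_unlock(struct folio *folio)
	{
		/* Bit 7 lets x86 test the sign bit of the byte. */
		BUILD_BUG_ON(PG_waiters != 7);
		VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
		if (clear_bit_unlock_is_negative_byte(PG_locked,
						folio_flags(folio, 0)))
			folio_wake_bit(folio, PG_locked);
	}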

That said, the series doesn't look *wrong*, although I did note a few
things that just made me go "that looks very strange to me" in there.

                      Linus