[00/23,V3] Repair SWAP-over_NFS

Message ID 164299573337.26253.7538614611220034049.stgit@noble.brown (mailing list archive)

NeilBrown Jan. 24, 2022, 3:48 a.m. UTC
This version of the series addresses the review comments received,
particularly from Christof.
Thanks to all for review and testing.

The patch adding mm/swap.h got a minor conflict when I rebased on
5.17-rc1, suggesting that it could easily get more conflicts in the
future.  It might be good if it could land before 5.17 comes out, to
avoid (some of) those conflicts.

I think (though I haven't checked) that all the NFS patches except
      NFS: rename nfs_direct_IO and use as ->swap_rw
      NFS: swap IO handling is slightly different for O_DIRECT IO
can land independently of the MM patches, and can be moved to the
end of the series.  Maybe they could be held until after 5.18-rc1 if we
agree to proceed with these in the next merge window.

Intro from previous series is below.
Thanks,
NeilBrown

swap-over-NFS currently has a variety of problems.

swap writes call generic_write_checks(), which always fails on a swap
file, so it completely fails.
Even without this, various deadlocks are possible - largely due to
improvements in NFS memory allocation (using NOFS instead of ATOMIC)
which weren't tested against swap-out.

NFS is the only filesystem that has supported fs-based swap IO, and it
hasn't worked for several releases, so now is a convenient time to clean
up the swap-via-filesystem interfaces - we cannot break anything !

So the first few patches here clean up and improve various parts of the
swap-via-filesystem code.  ->swap_activate() is given a cleaner
interface, a new ->swap_rw is introduced instead of burdening
->direct_IO, etc.

Current swap-to-filesystem code only ever submits single-page reads and
writes.  These patches change that to allow multi-page IO when adjacent
requests are submitted.  Writes are also changed to be async rather than
sync.  This substantially speeds up write throughput for swap-over-NFS.
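
As a rough illustration of the multi-page idea only (this is not the
code in mm/page_io.c), the sketch below merges single-page requests
into larger batches when each request directly follows the previous
one.  The names swap_req, swap_batch and coalesce_swap_reqs(), and the
max_pages cap, are hypothetical and do not appear in the patches.

```c
#include <stddef.h>

/* Hypothetical model: one request covers one page-sized swap slot. */
struct swap_req {
	unsigned long slot;	/* page-sized offset within the swap file */
};

/* One batched IO covering nr_pages consecutive slots from start. */
struct swap_batch {
	unsigned long start;
	unsigned int nr_pages;
};

/*
 * Walk the requests in submission order.  A request joins the current
 * batch only if its slot immediately follows the batch's last slot and
 * the batch is still below max_pages; otherwise a new batch is started.
 * Stops early if the output array fills up.  Returns the number of
 * batches written to out.
 */
size_t coalesce_swap_reqs(const struct swap_req *reqs, size_t n,
			  struct swap_batch *out, size_t max_out,
			  unsigned int max_pages)
{
	size_t nb = 0;

	for (size_t i = 0; i < n; i++) {
		if (nb > 0) {
			struct swap_batch *cur = &out[nb - 1];

			if (cur->nr_pages < max_pages &&
			    reqs[i].slot == cur->start + cur->nr_pages) {
				cur->nr_pages++;	/* adjacent: extend batch */
				continue;
			}
		}
		if (nb == max_out)
			break;		/* caller's output array is full */
		out[nb].start = reqs[i].slot;
		out[nb].nr_pages = 1;
		nb++;
	}
	return nb;
}
```

In this toy model, requests for slots 10, 11, 12, 20, 21 and 5 become
three batches (10..12, 20..21 and 5) rather than six single-page IOs.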

Some of the NFS patches can land independently of the MM patches.  A few
require the MM patches to land first.

---

NeilBrown (23):
      MM: create new mm/swap.h header file.
      MM: extend block-plugging to cover all swap reads with read-ahead
      MM: drop swap_set_page_dirty
      MM: move responsibility for setting SWP_FS_OPS to ->swap_activate
      MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space
      MM: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space
      MM: perform async writes to SWP_FS_OPS swap-space using ->swap_rw
      DOC: update documentation for swap_activate and swap_rw
      MM: submit multipage reads for SWP_FS_OPS swap-space
      MM: submit multipage write for SWP_FS_OPS swap-space
      VFS: Add FMODE_CAN_ODIRECT file flag
      NFS: remove IS_SWAPFILE hack
      NFS: rename nfs_direct_IO and use as ->swap_rw
      NFS: swap IO handling is slightly different for O_DIRECT IO
      SUNRPC/call_alloc: async tasks mustn't block waiting for memory
      SUNRPC/auth: async tasks mustn't block waiting for memory
      SUNRPC/xprt: async tasks mustn't block waiting for memory
      SUNRPC: remove scheduling boost for "SWAPPER" tasks.
      NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS
      SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC
      NFSv4: keep state manager thread active if swap is enabled
      NFS: swap-out must always use STABLE writes.
      SUNRPC: lock against ->sock changing during sysfs read


 Documentation/filesystems/locking.rst |  18 ++-
 Documentation/filesystems/vfs.rst     |  17 ++-
 drivers/block/loop.c                  |   4 +-
 fs/cifs/file.c                        |   7 +-
 fs/fcntl.c                            |   9 +-
 fs/nfs/direct.c                       |  56 ++++---
 fs/nfs/file.c                         |  39 +++--
 fs/nfs/nfs4_fs.h                      |   1 +
 fs/nfs/nfs4proc.c                     |  20 +++
 fs/nfs/nfs4state.c                    |  39 ++++-
 fs/nfs/read.c                         |   4 -
 fs/nfs/write.c                        |   2 +
 fs/open.c                             |   9 +-
 fs/overlayfs/file.c                   |  13 +-
 include/linux/fs.h                    |   4 +
 include/linux/nfs_fs.h                |  11 +-
 include/linux/nfs_xdr.h               |   2 +
 include/linux/sunrpc/auth.h           |   1 +
 include/linux/sunrpc/sched.h          |   1 -
 include/linux/swap.h                  |   7 +-
 include/linux/writeback.h             |   7 +
 include/trace/events/sunrpc.h         |   1 -
 mm/madvise.c                          |   8 +-
 mm/memory.c                           |   2 +-
 mm/page_io.c                          | 210 ++++++++++++++++++++------
 mm/swap.h                             |  26 +++-
 mm/swap_state.c                       |  33 ++--
 mm/swapfile.c                         |  13 +-
 mm/vmscan.c                           |  38 +++--
 net/sunrpc/auth.c                     |   8 +-
 net/sunrpc/auth_gss/auth_gss.c        |   6 +-
 net/sunrpc/auth_unix.c                |  10 +-
 net/sunrpc/clnt.c                     |   7 +-
 net/sunrpc/sched.c                    |  29 ++--
 net/sunrpc/sysfs.c                    |   5 +-
 net/sunrpc/xprt.c                     |  19 +--
 net/sunrpc/xprtrdma/transport.c       |  10 +-
 net/sunrpc/xprtsock.c                 |  15 +-
 38 files changed, 505 insertions(+), 206 deletions(-)

Comments

Geert Uytterhoeven Feb. 7, 2022, 5:55 p.m. UTC | #1
Hi Neil,

On Mon, Jan 24, 2022 at 5:40 PM NeilBrown <neilb@suse.de> wrote:
> swap-over-NFS currently has a variety of problems.
>
> swap writes call generic_write_checks(), which always fails on a swap
> file, so it completely fails.
> Even without this, various deadlocks are possible - largely due to
> improvements in NFS memory allocation (using NOFS instead of ATOMIC)
> which weren't tested against swap-out.
>
> NFS is the only filesystem that has supported fs-based swap IO, and it
> hasn't worked for several releases, so now is a convenient time to clean
> up the swap-via-filesystem interfaces - we cannot break anything !
>
> So the first few patches here clean up and improve various parts of the
> swap-via-filesystem code.  ->activate_swap() is given a cleaner
> interface, a new ->swap_rw is introduced instead of burdening
> ->direct_IO, etc.
>
> Current swap-to-filesystem code only ever submits single-page reads and
> writes.  These patches change that to allow multi-page IO when adjacent
> requests are submitted.  Writes are also changed to be async rather than
> sync.  This substantially speeds up write throughput for swap-over-NFS.
>
> Some of the NFS patches can land independently of the MM patches.  A few
> require the MM patches to land first.

Thanks for your series!
Swap over NFS was indeed broken last time I tried[1], but with your
series, it's working again on arm32 (RZ/A1 with 32 MiB of RAM, 100 Mbps
Ethernet and Debian 9 nfsroot). My system was exercised using "apt
update"; the subsequent "apt upgrade" is still running, though (it
took more than 6 hours to build the apt dependency tree, and now it's
trying hard to create a list of packages...).

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

BTW, I think you do want to run scripts/checkpatch.pl on your series,
and improve it by fixing a few of the reported warnings (function
definition arguments should also have an identifier name, missing
data_race() comment, missing SPDX-License-Identifier tag).

[1] https://lore.kernel.org/all/20191230153238.29878-1-geert+renesas@glider.be/

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds