[00/13] Repair SWAP-over-NFS

Message ID 163702956672.25805.16457749992977493579.stgit@noble.brown (mailing list archive)

Message

NeilBrown Nov. 16, 2021, 2:44 a.m. UTC
swap-over-NFS currently has a variety of problems.

Due to a newish test in generic_write_checks(), all writes to swap
currently fail.
With that fixed, there are various sources of deadlocks that can cause
a swapping system to freeze.
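
To illustrate the first problem, here is a toy userspace model, not the
kernel code.  Every toy_* name is hypothetical, and the -ETXTBSY return
value for swap files is an assumption about the check rather than
something stated in this thread.  It shows why running the check only in
the outer user-facing write entry point, and not in the inner
direct-write path that swap-out also uses, lets swap writes through
while ordinary writes to a swap file are still refused:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode; only the swap-file flag matters here. */
struct toy_inode {
    bool is_swapfile;
};

/* Models the check: ordinary writes to an active swap file are refused
 * (assumed error code: -ETXTBSY). */
static int toy_write_checks(const struct toy_inode *inode)
{
    return inode->is_swapfile ? -ETXTBSY : 0;
}

/* Inner direct-write path, shared by user direct writes and swap-out.
 * In this model it does not run the write checks itself. */
static long toy_direct_write(const struct toy_inode *inode, size_t len)
{
    assert(inode != NULL);
    return (long)len;   /* pretend all bytes were written */
}

/* Outer entry point for user writes: the check lives here. */
static long toy_file_write(const struct toy_inode *inode, size_t len)
{
    int err = toy_write_checks(inode);

    if (err)
        return err;
    return toy_direct_write(inode, len);
}
```

With a swap-file inode, toy_file_write() returns the error while
toy_direct_write() still succeeds, which is the behaviour the first
patch title suggests.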

swap has never worked over NFSv4 due to the occasional need to start the
state-management thread - which won't happen when under high memory
pressure.

This series addresses all the problems that I could find, and also
changes writes to be asynchronous, and both reads and writes to use
multi-page RPC requests when possible (the last 2 patches).

This last change causes interesting performance changes.  The rate of
writes to the swap file (measured in K/sec) increases by a factor of
about 5 (not precisely measured).  However, interactive response degrades
noticeably (response times of several seconds, but not minutes).  So
while it seems like it should be a good idea, I'm not sure if we want it
until it is better understood.
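
For anyone wanting a more precise number than "about 5", one possible
way to measure the swap write rate is to sample the pswpout counter in
/proc/vmstat twice and convert the difference to K/sec.  The sketch
below is only that: the helper names are made up, the page size in KiB
is passed in by the caller (4 is a common value, but an assumption), and
it is not how the figures above were obtained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find the "pswpout <n>" line in /proc/vmstat-style text.
 * Returns the counter value (pages swapped out), or -1 if absent. */
static long long parse_pswpout(const char *vmstat_text)
{
    const char *line = vmstat_text;

    while (line && *line) {
        unsigned long long v;

        if (sscanf(line, "pswpout %llu", &v) == 1)
            return (long long)v;
        line = strchr(line, '\n');   /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* KiB/sec written to swap between two samples taken `seconds` apart. */
static double swapout_kib_per_sec(long long before, long long after,
                                  long page_kib, double seconds)
{
    assert(after >= before && seconds > 0.0);
    return (double)(after - before) * (double)page_kib / seconds;
}
```

Read /proc/vmstat into a buffer before and after a fixed interval of the
workload, pass each buffer to parse_pswpout(), and feed the two values
to swapout_kib_per_sec().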

I'd be very happy if others could test out some swapping scenarios to
see how it performs.  I've been using
    stress-ng --brk 2 --stack 2 --bigheap 2
which doesn't give me any insight into whether more useful work is
getting done.

Apart from the last two patches, I think this series is ready.

Thanks,
NeilBrown

---

NeilBrown (13):
      NFS: move generic_write_checks() call from nfs_file_direct_write() to nfs_file_write()
      NFS: do not take i_rwsem for swap IO
      MM: reclaim mustn't enter FS for swap-over-NFS
      SUNRPC/call_alloc: async tasks mustn't block waiting for memory
      SUNRPC/auth: async tasks mustn't block waiting for memory
      SUNRPC/xprt: async tasks mustn't block waiting for memory
      SUNRPC: remove scheduling boost for "SWAPPER" tasks.
      NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS
      SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC
      NFSv4: keep state manager thread active if swap is enabled
      NFS: swap-out must always use STABLE writes.
      MM: use AIO/DIO for reads from SWP_FS_OPS swap-space
      MM: use AIO for DIO writes to swap


 fs/nfs/direct.c                 |  12 +-
 fs/nfs/file.c                   |  21 ++-
 fs/nfs/io.c                     |   9 ++
 fs/nfs/nfs4_fs.h                |   1 +
 fs/nfs/nfs4proc.c               |  20 +++
 fs/nfs/nfs4state.c              |  39 ++++-
 fs/nfs/read.c                   |   4 -
 fs/nfs/write.c                  |   2 +
 include/linux/nfs_fs.h          |   8 +-
 include/linux/nfs_xdr.h         |   2 +
 include/linux/sunrpc/auth.h     |   1 +
 include/linux/sunrpc/sched.h    |   1 -
 include/trace/events/sunrpc.h   |   1 -
 mm/page_io.c                    | 243 +++++++++++++++++++++++++++-----
 mm/vmscan.c                     |  12 +-
 net/sunrpc/auth.c               |   8 +-
 net/sunrpc/auth_gss/auth_gss.c  |   6 +-
 net/sunrpc/auth_unix.c          |  10 +-
 net/sunrpc/clnt.c               |   7 +-
 net/sunrpc/sched.c              |  29 ++--
 net/sunrpc/xprt.c               |  19 +--
 net/sunrpc/xprtrdma/transport.c |  10 +-
 net/sunrpc/xprtsock.c           |   8 ++
 23 files changed, 374 insertions(+), 99 deletions(-)

Comments

Matthew Wilcox Nov. 16, 2021, 3:29 a.m. UTC | #1
On Tue, Nov 16, 2021 at 01:44:04PM +1100, NeilBrown wrote:
> swap-over-NFS currently has a variety of problems.
> 
> Due to a newish test in generic_write_checks(), all writes to swap
> currently fail.

And by "currently", you mean "for over two years" (August 2019).
Does swap-over-NFS (or any other network filesystem) actually have any
users, and should we fix it or rip it out?

NeilBrown Nov. 16, 2021, 3:55 a.m. UTC | #2
On Tue, 16 Nov 2021, Matthew Wilcox wrote:
> On Tue, Nov 16, 2021 at 01:44:04PM +1100, NeilBrown wrote:
> > swap-over-NFS currently has a variety of problems.
> > 
> > Due to a newish test in generic_write_checks(), all writes to swap
> > currently fail.
> 
> And by "currently", you mean "for over two years" (August 2019).

That's about the time scale for "enterprise" releases...
Actually, the earliest patches that impacted swap-over-NFS were more like
4 years ago.  I didn't bother tracking Fixes: tags for everything that
was a fix, as I didn't think it would really help and might encourage
people to backport little bits of the series which I wouldn't recommend.

> Does swap-over-NFS (or any other network filesystem) actually have any
> users, and should we fix it or rip it out?
> 
> 
We have at least one user (why else would I be working on this?).  I
think we have more, though they are presumably still on an earlier
release.

I'd prefer "fix it" over "rip it out".

I don't think any other network filesystem supports swap, but it is
not trivial to grep for.  There must be a 'swap_activate' method, and it
must return 0.  There must also be a direct_IO that works.
The only other network filesystem with swap_activate is cifs.  It
returns 0, but direct_IO returns -EINVAL.
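
The two conditions above can be sketched as a toy userspace check.  The
struct and function names below are hypothetical and the signatures are
heavily simplified; they are not the kernel's address_space_operations.
The "cifs-like" entry simply encodes the behaviour described above
(swap_activate returns 0, direct_IO returns -EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a filesystem's operations. */
struct toy_aops {
    const char *name;
    int (*swap_activate)(void);      /* NULL if not provided */
    long (*direct_io)(size_t len);   /* NULL if not provided */
};

static int activate_ok(void) { return 0; }
static long dio_works(size_t len) { return (long)len; }
static long dio_einval(size_t len) { (void)len; return -EINVAL; }

/* Swap is usable only if swap_activate exists and returns 0,
 * and direct_IO exists and does not fail. */
static bool toy_supports_swap(const struct toy_aops *ops)
{
    assert(ops != NULL);
    if (!ops->swap_activate || ops->swap_activate() != 0)
        return false;
    if (!ops->direct_io || ops->direct_io(4096) < 0)
        return false;
    return true;
}

static const struct toy_aops nfs_like  = { "nfs-like",  activate_ok, dio_works };
static const struct toy_aops cifs_like = { "cifs-like", activate_ok, dio_einval };
static const struct toy_aops no_swap   = { "no-swap",   NULL,        dio_works };
```

In this model only nfs_like passes the check; cifs_like fails on the
direct_IO condition and no_swap on the missing swap_activate.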

NeilBrown