[GIT,PULL] afs: Improvements for v5.8

Message ID 2240660.1591289899@warthog.procyon.org.uk (mailing list archive)
State New, archived

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/afs-next-20200604

Message

David Howells June 4, 2020, 4:58 p.m. UTC
Hi Linus,

Is it too late to put in a pull request for AFS changes?  Apologies - I was
holding off and hoping that I could get Al to review the changes I made to
the core VFS change commit (first in the series) in response to his earlier
review comments.  I have an ack for the Ext4 changes made, though.  If you
would prefer it to be held off at this point, fair enough.

Note that the series also got rebased to -rc7 to remove the dependency on
fix patches that got merged through the net tree.

---

There's one core VFS change which affects a couple of filesystems:

 (1) Make the inode hash table RCU safe and provide some RCU-safe
     accessor functions.  The search can then be done without taking the
     inode_hash_lock.  Care must be taken because the object may be being
     deleted and no wait is made.

 (2) Allow iunique() to avoid taking the inode_hash_lock.

 (3) Allow AFS's callback processing to avoid taking the inode_hash_lock
     when using the inode table to find an inode to notify.

 (4) Improve Ext4's time updating.  Konstantin Khlebnikov said "For now,
     I've plugged this issue with try-lock in ext4 lazy time update.  This
     solution is much better."
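
As a rough illustration of item (1), here is a userspace sketch of a
lockless hash-chain search.  Assumptions: every toy_* name is hypothetical,
C11 atomics stand in for the kernel's RCU primitives, and memory
reclamation (the part RCU actually defers) is not modelled at all.  It is
not the fs/inode.c code.  What it tries to show is the caveat above: a
lockless searcher may find an object that is already being deleted, so it
checks for that and skips the object rather than waiting.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HASH_BUCKETS 8

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct toy_inode {
	unsigned long ino;
	atomic_bool freeing;               /* set once deletion has begun */
	_Atomic(struct toy_inode *) next;  /* hash chain link */
};

static _Atomic(struct toy_inode *) toy_hash[TOY_HASH_BUCKETS];

/*
 * Writer side: publish a new entry at the head of its chain.  A real
 * implementation would still serialise writers against each other with a
 * lock; only the search path goes lockless.
 */
static void toy_insert(struct toy_inode *inode)
{
	_Atomic(struct toy_inode *) *head =
		&toy_hash[inode->ino % TOY_HASH_BUCKETS];

	atomic_store_explicit(&inode->next,
			      atomic_load_explicit(head, memory_order_relaxed),
			      memory_order_relaxed);
	/* Release store: the entry is fully set up before it is visible. */
	atomic_store_explicit(head, inode, memory_order_release);
}

/*
 * Reader side: walk the chain without taking any lock.  An entry that is
 * marked as being freed is skipped; no wait is made for it to go away.
 */
static struct toy_inode *toy_find_lockless(unsigned long ino)
{
	struct toy_inode *p =
		atomic_load_explicit(&toy_hash[ino % TOY_HASH_BUCKETS],
				     memory_order_acquire);

	while (p) {
		if (p->ino == ino &&
		    !atomic_load_explicit(&p->freeing, memory_order_acquire))
			return p;
		p = atomic_load_explicit(&p->next, memory_order_acquire);
	}
	return NULL;
}
```

A caller that gets a pointer back still has to pin or revalidate the object
before relying on it, which is where the "care must be taken" part comes in.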

Then there's a set of changes to make a number of improvements to the AFS
driver:

 (1) Improve callback (i.e. third-party change notification) processing by:

     (a) Relying more on the fact we're doing this under RCU and by using
         fewer locks.  This makes use of the RCU-based inode searching
         outlined above.

     (b) Moving to keeping volumes in a tree indexed by volume ID rather
         than a flat list.

     (c) Making the server and volume records logically part of the cell.
         This means that a server record now points directly at the cell
         and the tree of volumes is there.  This removes an N:M mapping
         table, simplifying things.

 (2) Improve keeping NAT or firewall channels open for the server callbacks
     to reach the client by actively polling the fileserver on a timed
     basis, instead of only doing it when we have an operation to process.

 (3) Improve detection of delayed or lost callbacks by including the
     parent directory in the list of file IDs to be queried when doing a
     bulk status fetch from lookup.  We can then check to see if our copy
     of the directory has changed under us without us getting notified.

 (4) Determine aliasing of cells (such as a cell that is pointed to by a
     DNS alias).  This allows us to avoid having ambiguity due to
     apparently different cells using the same volume and file servers.

 (5) Improve the fileserver rotation to do more probing when it detects
     that all of the addresses to a server are listed as non-responsive.
     It's possible that an address that previously stopped responding has
     become responsive again.
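
To make item (5) a little more concrete, here is a minimal sketch of the
decision it describes.  Assumptions: the toy_* names, the bitmask layout
and the 32-address limit are all invented for the example and are not
taken from fs/afs/rotate.c.  The idea shown is only that "no address is
currently listed as responsive" is treated as a reason to probe again
rather than as a final failure.

```c
#include <stdint.h>

#define TOY_NEED_REPROBE (-1)

/* Hypothetical per-server address list state. */
struct toy_addr_list {
	unsigned int nr_addrs;  /* number of addresses, at most 32 here */
	uint32_t responded;     /* bit n set if address n answered a probe */
};

/*
 * Return the index of the lowest-numbered responsive address, or
 * TOY_NEED_REPROBE if every address is listed as non-responsive, in which
 * case the caller should kick off another round of probing.
 */
static int toy_pick_address(const struct toy_addr_list *al)
{
	uint32_t mask = al->nr_addrs >= 32 ?
		UINT32_MAX : ((UINT32_C(1) << al->nr_addrs) - 1);
	uint32_t usable = al->responded & mask;
	unsigned int i;

	if (!usable)
		return TOY_NEED_REPROBE;

	/* usable is non-zero, so this stops at some i < 32. */
	for (i = 0; !(usable & (UINT32_C(1) << i)); i++)
		;
	return (int)i;
}
```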

Beyond that, lay some foundations for making some calls asynchronous:

 (1) Turn the fileserver cursor struct into a general operation struct.
     Hang the parameters off it rather than keeping them in local
     variables, and hang the results off it rather than off the call struct.

 (2) Implement some general operation handling code and simplify the
     callers of operations that affect a volume or a volume component (such
     as a file).  Most of the operation is now done by core code.

 (3) Operations are supplied with a table of operations to issue different
     variants of RPCs and to manage the completion, where all the required
     data is held in the operation object, thereby allowing these to be
     called from a workqueue.

 (4) Put the standard "if (begin), while(select), call op, end" sequence
     into a canned function that just emulates the current behaviour for
     now.
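
The shape of the four items above can be sketched roughly as follows.
Assumptions: this is a userspace toy, every toy_* name and field is
hypothetical, and the "RPC" is a stub that pretends server 0 is down.  It
is not the afs_operation code; it only shows parameters and results living
in one operation object, a table of per-operation callbacks, and a canned
"begin, while (select), call op, end" helper.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_operation;

/* Per-operation-type table: how to issue the RPC and handle completion. */
struct toy_operation_ops {
	int  (*issue_rpc)(struct toy_operation *op);
	void (*success)(struct toy_operation *op);
};

/* Everything the operation needs hangs off this one object. */
struct toy_operation {
	const struct toy_operation_ops *ops;
	int nr_servers;  /* servers available for rotation */
	int server;      /* index of the currently selected server */
	int error;       /* 0 on success, negative otherwise */
	int arg;         /* parameter */
	int reply;       /* raw reply from the stub RPC */
	int result;      /* result, filled in on completion */
};

/* Select the next server to try; false once we succeeded or ran out. */
static bool toy_select_server(struct toy_operation *op)
{
	if (op->error == 0)
		return false;
	if (op->server + 1 >= op->nr_servers)
		return false;
	op->server++;
	return true;
}

/* The canned sequence: begin, while (select), call op, end. */
static int toy_do_sync_operation(struct toy_operation *op)
{
	op->server = -1;   /* begin: nothing selected yet */
	op->error = -1;    /* begin: nothing attempted yet */
	while (toy_select_server(op))
		op->error = op->ops->issue_rpc(op);
	if (op->error == 0 && op->ops->success)
		op->ops->success(op);   /* end: completion handling */
	return op->error;
}

/* Example operation: the stub fails on server 0 and doubles arg elsewhere. */
static int toy_fetch_issue(struct toy_operation *op)
{
	if (op->server == 0)
		return -1;
	op->reply = op->arg * 2;
	return 0;
}

static void toy_fetch_success(struct toy_operation *op)
{
	op->result = op->reply;
}

static const struct toy_operation_ops toy_fetch_ops = {
	.issue_rpc = toy_fetch_issue,
	.success   = toy_fetch_success,
};
```

Because nothing the callbacks touch lives on the caller's stack apart from
the operation object itself, the same table could in principle be driven
from a workqueue instead of from the synchronous helper.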

There are also some fixes interspersed:

 (1) Don't let the EACCES from ICMP6 mapping reach the user as such, since
     it's confusing as to whether it's a filesystem error.  Convert it to
     EHOSTUNREACH.

 (2) Don't use the epoch value acquired through probing a server.  If we
     have two servers with the same UUID but in different cells, it's hard
     to draw conclusions from them having different epoch values.

 (3) Don't interpret the argument to the CB.ProbeUuid RPC as a fileserver
     UUID and look up a fileserver from it.

 (4) Deal with servers in different cells having the same UUIDs.  In the
     event that a CB.InitCallBackState3 RPC is received, we have to break
     the callback promises for every server record matching that UUID.

 (5) Don't let afs_statfs return values that go below 0.

 (6) Don't use running fileserver probe state to make server selection and
     address selection decisions.  Only make decisions on final state as
     the running state is cleared at the start of probing.
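
Fixes (1) and (5) are small enough to sketch directly.  Assumptions: the
toy_* helpers are hypothetical userspace functions, not the rxrpc or
afs_statfs() code, and the guess that the statfs values go negative
through an unsigned "limit minus usage" subtraction is mine rather than
something stated above.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Fix (1): EACCES coming out of the ICMP6 error mapping would look like a
 * filesystem permission error to the user, so report it as a network
 * reachability error instead.  Other errors pass through unchanged.
 */
static int toy_map_icmp6_error(int err)
{
	return err == EACCES ? EHOSTUNREACH : err;
}

/*
 * Fix (5): clamp at zero rather than letting the subtraction wrap when
 * usage exceeds the limit.
 */
static uint64_t toy_blocks_free(uint64_t max_blocks, uint64_t used_blocks)
{
	return used_blocks < max_blocks ? max_blocks - used_blocks : 0;
}
```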

Tested-by: Marc Dionne <marc.dionne@auristor.com>

Thanks,
David
---
The following changes since commit 9cb1fd0efd195590b828b9b865421ad345a4a145:

  Linux 5.7-rc7 (2020-05-24 15:32:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/afs-next-20200604

for you to fetch changes up to 8409f67b6437c4b327ee95a71081b9c7bfee0b00:

  afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly (2020-06-04 15:37:58 +0100)

----------------------------------------------------------------
AFS Changes

----------------------------------------------------------------
David Howells (27):
      vfs, afs, ext4: Make the inode hash table RCU searchable
      rxrpc: Map the EACCES error produced by some ICMP6 to EHOSTUNREACH
      rxrpc: Adjust /proc/net/rxrpc/calls to display call->debug_id not user_ID
      afs: Always include dir in bulk status fetch from afs_do_lookup()
      afs: Use the serverUnique field in the UVLDB record to reduce rpc ops
      afs: Split the usage count on struct afs_server
      afs: Actively poll fileservers to maintain NAT or firewall openings
      afs: Show more information in /proc/net/afs/servers
      afs: Make callback processing more efficient.
      afs: Set error flag rather than return error from file status decode
      afs: Remove the error argument from afs_protocol_error()
      afs: Rename struct afs_fs_cursor to afs_operation
      afs: Build an abstraction around an "operation" concept
      afs: Don't get epoch from a server because it may be ambiguous
      afs: Fix handling of CB.ProbeUuid cache manager op
      afs: Retain more of the VLDB record for alias detection
      afs: Implement client support for the YFSVL.GetCellName RPC op
      afs: Detect cell aliases 1 - Cells with root volumes
      afs: Detect cell aliases 2 - Cells with no root volumes
      afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op
      afs: Add a tracepoint to track the lifetime of the afs_volume struct
      afs: Reorganise volume and server trees to be rooted on the cell
      afs: Fix the by-UUID server tree to allow servers with the same UUID
      afs: Fix afs_statfs() to not let the values go below zero
      afs: Don't use probe running state to make decisions outside probe code
      afs: Show more a bit more server state in /proc/net/afs/servers
      afs: Adjust the fileserver rotation algorithm to reprobe/retry more quickly

 fs/afs/Makefile            |    2 +
 fs/afs/afs.h               |    3 +-
 fs/afs/afs_vl.h            |    1 +
 fs/afs/callback.c          |  345 ++++--------
 fs/afs/cell.c              |   10 +-
 fs/afs/cmservice.c         |   67 +--
 fs/afs/dir.c               | 1253 ++++++++++++++++++++----------------------
 fs/afs/dir_silly.c         |  190 ++++---
 fs/afs/dynroot.c           |   93 ++++
 fs/afs/file.c              |   62 ++-
 fs/afs/flock.c             |  114 ++--
 fs/afs/fs_operation.c      |  239 ++++++++
 fs/afs/fs_probe.c          |  339 +++++++++---
 fs/afs/fsclient.c          | 1305 +++++++++++++++++---------------------------
 fs/afs/inode.c             |  491 ++++++++---------
 fs/afs/internal.h          |  523 ++++++++++--------
 fs/afs/main.c              |    6 +-
 fs/afs/proc.c              |   42 +-
 fs/afs/protocol_yfs.h      |    2 +-
 fs/afs/rotate.c            |  447 ++++++---------
 fs/afs/rxrpc.c             |   45 +-
 fs/afs/security.c          |    8 +-
 fs/afs/server.c            |  299 ++++++----
 fs/afs/server_list.c       |   40 +-
 fs/afs/super.c             |  107 ++--
 fs/afs/vl_alias.c          |  382 +++++++++++++
 fs/afs/vl_rotate.c         |    4 +
 fs/afs/vlclient.c          |  146 ++++-
 fs/afs/volume.c            |  154 ++++--
 fs/afs/write.c             |  148 +++--
 fs/afs/xattr.c             |  300 +++++-----
 fs/afs/yfsclient.c         |  914 +++++++++++++------------------
 fs/ext4/inode.c            |   44 +-
 fs/inode.c                 |  112 +++-
 include/linux/fs.h         |    3 +
 include/trace/events/afs.h |  111 +++-
 net/rxrpc/peer_event.c     |    3 +
 net/rxrpc/proc.c           |    6 +-
 38 files changed, 4437 insertions(+), 3923 deletions(-)
 create mode 100644 fs/afs/fs_operation.c
 create mode 100644 fs/afs/vl_alias.c

Comments

Al Viro June 5, 2020, 1:50 p.m. UTC | #1
On Thu, Jun 04, 2020 at 05:58:19PM +0100, David Howells wrote:
> Hi Linus,
> 
> Is it too late to put in a pull request for AFS changes?  Apologies - I was
> holding off and hoping that I could get Al to review the changes I made to
> the core VFS change commit (first in the series) in response to his earlier
> review comments.  I have an ack for the Ext4 changes made, though.  If you
> would prefer it to be held off at this point, fair enough.
> 
> Note that the series also got rebased to -rc7 to remove the dependency on
> fix patches that got merged through the net tree.

FWIW, I can live with fs/inode.c part in its current form

Al Viro June 5, 2020, 2:02 p.m. UTC | #2
On Fri, Jun 05, 2020 at 02:50:03PM +0100, Al Viro wrote:
> On Thu, Jun 04, 2020 at 05:58:19PM +0100, David Howells wrote:
> > Hi Linus,
> > 
> > Is it too late to put in a pull request for AFS changes?  Apologies - I was
> > holding off and hoping that I could get Al to review the changes I made to
> > the core VFS change commit (first in the series) in response to his earlier
> > review comments.  I have an ack for the Ext4 changes made, though.  If you
> > would prefer it to be held off at this point, fair enough.
> > 
> > Note that the series also got rebased to -rc7 to remove the dependency on
> > fix patches that got merged through the net tree.
> 
> FWIW, I can live with fs/inode.c part in its current form

Which is to say,
ACKed-by: Al Viro <viro@zeniv.linux.org.uk> (fs/inode.c part)
I have not checked the AFS part of the series and AFAICS
ext4 one at least doesn't make things any worse there.

Linus Torvalds June 5, 2020, 11:33 p.m. UTC | #3
On Thu, Jun 4, 2020 at 9:58 AM David Howells <dhowells@redhat.com> wrote:
>
>  (4) Improve Ext4's time updating.  Konstantin Khlebnikov said "For now,
>      I've plugged this issue with try-lock in ext4 lazy time update.  This
>      solution is much better."

It would have been good to get acks on this from the ext4 people, but
I've merged this as-is (but it's still going through my sanity tests,
so if that triggers something it might get unpulled again).

                  Linus

pr-tracker-bot@kernel.org June 5, 2020, 11:50 p.m. UTC | #4
The pull request you sent on Thu, 04 Jun 2020 17:58:19 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/afs-next-20200604

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9daa0a27a0bce6596be287fb1df372ff80bb1087

Thank you!