[v2,0/5] SUNRPC: Create sysfs files for changing IP

Message ID 20210202184244.288898-1-Anna.Schumaker@Netapp.com (mailing list archive)

Message

Anna Schumaker Feb. 2, 2021, 6:42 p.m. UTC
From: Anna Schumaker <Anna.Schumaker@Netapp.com>

It's possible for an NFS server to go down but come back up with a
different IP address. These patches provide a way for administrators to
handle this issue by providing a new IP address for xprt sockets to
connect to.

Chuck has suggested some ideas for future work that could also use this
interface, such as:
- srcaddr: To move between network devices on the client
- type: "tcp", "rdma", "local"
- bound: 0 for autobind, or the result of the most recent rpcbind query
- connected: either true or false
- last: read-only timestamp of the last operation to use the transport
- device: A symlink to the physical network device

Changes in v2:
- Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
- Rename file from "address" to "dstaddr"

Thoughts?
Anna


Anna Schumaker (5):
  sunrpc: Create a sunrpc directory under /sys/kernel/
  sunrpc: Create a net/ subdirectory in the sunrpc sysfs
  sunrpc: Create per-rpc_clnt sysfs kobjects
  sunrpc: Prepare xs_connect() for taking NULL tasks
  sunrpc: Create a per-rpc_clnt file for managing the destination IP
    address

 include/linux/sunrpc/clnt.h |   1 +
 net/sunrpc/Makefile         |   2 +-
 net/sunrpc/clnt.c           |   5 ++
 net/sunrpc/sunrpc_syms.c    |   8 ++
 net/sunrpc/sysfs.c          | 168 ++++++++++++++++++++++++++++++++++++
 net/sunrpc/sysfs.h          |  22 +++++
 net/sunrpc/xprtsock.c       |   3 +-
 7 files changed, 207 insertions(+), 2 deletions(-)
 create mode 100644 net/sunrpc/sysfs.c
 create mode 100644 net/sunrpc/sysfs.h
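
As a rough illustration of how an administrator's tool might use the
proposed file, the sketch below writes a replacement address into a sysfs
attribute. The function name sunrpc_write_dstaddr() and the choice to pass
the attribute path in from the caller are assumptions made for this
example. The cover letter only states that the file is named "dstaddr" and
lives under /sys/kernel/sunrpc/, so the exact per-client path is left to
the caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a new destination address into a sysfs
 * attribute file. Returns 0 on success or a negative errno value.
 */
static int sunrpc_write_dstaddr(const char *attr_path, const char *addr)
{
	assert(attr_path != NULL && addr != NULL);

	size_t len = strlen(addr);
	int fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* sysfs store handlers receive the whole value in one write(). */
	ssize_t n = write(fd, addr, len);
	int err = (n < 0) ? -errno : (((size_t)n == len) ? 0 : -EIO);

	if (close(fd) < 0 && err == 0)
		err = -errno;
	return err;
}
```

A caller would pass the path of the per-client dstaddr file and the new
address as a string, for example sunrpc_write_dstaddr(path, "192.0.2.10"),
and would need sufficient privileges to write to sysfs.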

Comments

Chuck Lever Feb. 2, 2021, 6:51 p.m. UTC | #1
I want to ensure Dan is aware of this work. Thanks for posting, Anna!

> On Feb 2, 2021, at 1:42 PM, schumaker.anna@gmail.com wrote:
> 
> From: Anna Schumaker <Anna.Schumaker@Netapp.com>
> 
> It's possible for an NFS server to go down but come back up with a
> different IP address. These patches provide a way for administrators to
> handle this issue by providing a new IP address for xprt sockets to
> connect to.
> 
> Chuck has suggested some ideas for future work that could also use this
> interface, such as:
> - srcaddr: To move between network devices on the client
> - type: "tcp", "rdma", "local"
> - bound: 0 for autobind, or the result of the most recent rpcbind query
> - connected: either true or false
> - last: read-only timestamp of the last operation to use the transport
> - device: A symlink to the physical network device
> 
> Changes in v2:
> - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> - Rename file from "address" to "dstaddr"
> 
> Thoughts?
> Anna
> 
> 
> Anna Schumaker (5):
>  sunrpc: Create a sunrpc directory under /sys/kernel/
>  sunrpc: Create a net/ subdirectory in the sunrpc sysfs
>  sunrpc: Create per-rpc_clnt sysfs kobjects
>  sunrpc: Prepare xs_connect() for taking NULL tasks
>  sunrpc: Create a per-rpc_clnt file for managing the destination IP
>    address
> 
> include/linux/sunrpc/clnt.h |   1 +
> net/sunrpc/Makefile         |   2 +-
> net/sunrpc/clnt.c           |   5 ++
> net/sunrpc/sunrpc_syms.c    |   8 ++
> net/sunrpc/sysfs.c          | 168 ++++++++++++++++++++++++++++++++++++
> net/sunrpc/sysfs.h          |  22 +++++
> net/sunrpc/xprtsock.c       |   3 +-
> 7 files changed, 207 insertions(+), 2 deletions(-)
> create mode 100644 net/sunrpc/sysfs.c
> create mode 100644 net/sunrpc/sysfs.h
> 
> -- 
> 2.29.2
> 

--
Chuck Lever
Anna Schumaker Feb. 2, 2021, 6:52 p.m. UTC | #2
You're welcome! I'll try to remember to CC him on future versions

On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> I want to ensure Dan is aware of this work. Thanks for posting, Anna!
>
> > On Feb 2, 2021, at 1:42 PM, schumaker.anna@gmail.com wrote:
> >
> > From: Anna Schumaker <Anna.Schumaker@Netapp.com>
> >
> > It's possible for an NFS server to go down but come back up with a
> > different IP address. These patches provide a way for administrators to
> > handle this issue by providing a new IP address for xprt sockets to
> > connect to.
> >
> > Chuck has suggested some ideas for future work that could also use this
> > interface, such as:
> > - srcaddr: To move between network devices on the client
> > - type: "tcp", "rdma", "local"
> > - bound: 0 for autobind, or the result of the most recent rpcbind query
> > - connected: either true or false
> > - last: read-only timestamp of the last operation to use the transport
> > - device: A symlink to the physical network device
> >
> > Changes in v2:
> > - Put files under /sys/kernel/sunrpc/ instead of /sys/net/sunrpc/
> > - Rename file from "address" to "dstaddr"
> >
> > Thoughts?
> > Anna
> >
> >
> > Anna Schumaker (5):
> >  sunrpc: Create a sunrpc directory under /sys/kernel/
> >  sunrpc: Create a net/ subdirectory in the sunrpc sysfs
> >  sunrpc: Create per-rpc_clnt sysfs kobjects
> >  sunrpc: Prepare xs_connect() for taking NULL tasks
> >  sunrpc: Create a per-rpc_clnt file for managing the destination IP
> >    address
> >
> > include/linux/sunrpc/clnt.h |   1 +
> > net/sunrpc/Makefile         |   2 +-
> > net/sunrpc/clnt.c           |   5 ++
> > net/sunrpc/sunrpc_syms.c    |   8 ++
> > net/sunrpc/sysfs.c          | 168 ++++++++++++++++++++++++++++++++++++
> > net/sunrpc/sysfs.h          |  22 +++++
> > net/sunrpc/xprtsock.c       |   3 +-
> > 7 files changed, 207 insertions(+), 2 deletions(-)
> > create mode 100644 net/sunrpc/sysfs.c
> > create mode 100644 net/sunrpc/sysfs.h
> >
> > --
> > 2.29.2
> >
>
> --
> Chuck Lever
>
>
>
Dan Aloni Feb. 2, 2021, 7:24 p.m. UTC | #3
On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
> You're welcome! I'll try to remember to CC him on future versions
> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > I want to ensure Dan is aware of this work. Thanks for posting, Anna!

Thanks Anna and Chuck. I'm accessing and monitoring the mailing list via
NNTP and I'm also on #linux-nfs for chatting (da-x).

I see srcaddr was already discussed, so the patches I'm planning to send
next will be based on the latest version of your patchset and concern
multipath.

What I'm going for is the following:

- Expose transports that are reachable from xprtmultipath. Each in its
  own sub-directory, with an interface and status representation similar
  to the top directory.
- A way to add/remove transports.
- Inspiration for coding this is various other things in the kernel that
  use configfs, perhaps it can be used here too.

Also, what do you think would be a straightforward way for a userspace
program to find what sunrpc client id serves a mountpoint? If we add an
ioctl for the mountdir AFAIK it would be the first one that the NFS
client supports, so I wonder if there's a better interface that can work
for that.
Chuck Lever Feb. 2, 2021, 7:46 p.m. UTC | #4
> On Feb 2, 2021, at 2:24 PM, Dan Aloni <dan@kernelim.com> wrote:
> 
> On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
>> You're welcome! I'll try to remember to CC him on future versions
>> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>> I want to ensure Dan is aware of this work. Thanks for posting, Anna!
> 
> Thanks Anna and Chuck. I'm accessing and monitoring the mailing list via
> NNTP and I'm also on #linux-nfs for chatting (da-x).
> 
> I see srcaddr was already discussed, so the patches I'm planning to send
> next will be based on the latest version of your patchset and concern
> multipath.
> 
> What I'm going for is the following:
> 
> - Expose transports that are reachable from xprtmultipath. Each in its
>  own sub-directory, with an interface and status representation similar
>  to the top directory.
> - A way to add/remove transports.
> - Inspiration for coding this is various other things in the kernel that
>  use configfs, perhaps it can be used here too.
> 
> Also, what do you think would be a straightforward way for a userspace
> program to find what sunrpc client id serves a mountpoint? If we add an
> ioctl for the mountdir AFAIK it would be the first one that the NFS
> client supports, so I wonder if there's a better interface that can work
> for that.

Has the new mount API been merged? That provides a way to open
a mountpoint and get a file descriptor for it, and then write
commands to it.


--
Chuck Lever
Benjamin Coddington Feb. 2, 2021, 7:49 p.m. UTC | #5
On 2 Feb 2021, at 14:24, Dan Aloni wrote:

> On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
>> You're welcome! I'll try to remember to CC him on future versions
>> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> I want to ensure Dan is aware of this work. Thanks for posting, Anna!
>
> Thanks Anna and Chuck. I'm accessing and monitoring the mailing list via
> NNTP and I'm also on #linux-nfs for chatting (da-x).
>
> I see srcaddr was already discussed, so the patches I'm planning to send
> next will be based on the latest version of your patchset and concern
> multipath.
>
> What I'm going for is the following:
>
> - Expose transports that are reachable from xprtmultipath. Each in its
>   own sub-directory, with an interface and status representation similar
>   to the top directory.
> - A way to add/remove transports.
> - Inspiration for coding this is various other things in the kernel that
>   use configfs, perhaps it can be used here too.
>
> Also, what do you think would be a straightforward way for a userspace
> program to find what sunrpc client id serves a mountpoint? If we add an
> ioctl for the mountdir AFAIK it would be the first one that the NFS
> client supports, so I wonder if there's a better interface that can work
> for that.

I'm a fan of adding an ioctl interface for userspace, but I think we'd
better avoid using NFS itself because it would be nice to someday implement
an NFS "shutdown" for non-responsive servers, but sending any ioctl to the
mountpoint could revalidate it, and we'd hang on the GETATTR.

Maybe we can figure out a way to expose the superblock via sysfs for each
mount.

Ben
Anna Schumaker Feb. 2, 2021, 7:51 p.m. UTC | #6
On Tue, Feb 2, 2021 at 2:48 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>
> > On Feb 2, 2021, at 2:24 PM, Dan Aloni <dan@kernelim.com> wrote:
> >
> > On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
> >> You're welcome! I'll try to remember to CC him on future versions
> >> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>
> >>> I want to ensure Dan is aware of this work. Thanks for posting, Anna!
> >
> > Thanks Anna and Chuck. I'm accessing and monitoring the mailing list via
> > NNTP and I'm also on #linux-nfs for chatting (da-x).
> >
> > I see srcaddr was already discussed, so the patches I'm planning to send
> > next will be based on the latest version of your patchset and concern
> > multipath.
> >
> > What I'm going for is the following:
> >
> > - Expose transports that are reachable from xprtmultipath. Each in its
> >  own sub-directory, with an interface and status representation similar
> >  to the top directory.
> > - A way to add/remove transports.
> > - Inspiration for coding this is various other things in the kernel that
> >  use configfs, perhaps it can be used here too.

Sounds good! I'm looking forward to seeing them

> >
> > Also, what do you think would be a straightforward way for a userspace
> > program to find what sunrpc client id serves a mountpoint? If we add an
> > ioctl for the mountdir AFAIK it would be the first one that the NFS
> > client supports, so I wonder if there's a better interface that can work
> > for that.
>
> Has the new mount API been merged? That provides a way to open
> a mountpoint and get a file descriptor for it, and then write
> commands to it.

I'm pretty sure it was merged a release or two ago (at least, that's
when the fs_context patches went in)

Anna
>
>
> --
> Chuck Lever
>
>
>
Trond Myklebust Feb. 2, 2021, 10:17 p.m. UTC | #7
On Tue, 2021-02-02 at 14:49 -0500, Benjamin Coddington wrote:
> On 2 Feb 2021, at 14:24, Dan Aloni wrote:
> 
> > On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
> > > You're welcome! I'll try to remember to CC him on future versions
> > > On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever
> > > <chuck.lever@oracle.com> wrote:
> > > > 
> > > > I want to ensure Dan is aware of this work. Thanks for posting,
> > > > Anna!
> > 
> > Thanks Anna and Chuck. I'm accessing and monitoring the mailing
> > list via
> > NNTP and I'm also on #linux-nfs for chatting (da-x).
> > 
> > I see srcaddr was already discussed, so the patches I'm planning to
> > send
> > next will be based on the latest version of your patchset and
> > concern
> > multipath.
> > 
> > What I'm going for is the following:
> > 
> > - Expose transports that are reachable from xprtmultipath. Each in
> > its
> >   own sub-directory, with an interface and status representation
> > similar
> >   to the top directory.
> > - A way to add/remove transports.
> > - Inspiration for coding this is various other things in the kernel
> > that
> >   use configfs, perhaps it can be used here too.
> > 
> > Also, what do you think would be a straightforward way for a
> > userspace
> > program to find what sunrpc client id serves a mountpoint? If we
> > add an
> > ioctl for the mountdir AFAIK it would be the first one that the NFS
> > client supports, so I wonder if there's a better interface that can
> > work
> > for that.
> 
> I'm a fan of adding an ioctl interface for userspace, but I think
> we'd
> better avoid using NFS itself because it would be nice to someday
> implement
> an NFS "shutdown" for non-responsive servers, but sending any ioctl
> to the
> mountpoint could revalidate it, and we'd hang on the GETATTR.
> 
> Maybe we can figure out a way to expose the superblock via sysfs for
> each
> mount.

Right. There is potential functionality here that we do not need or
even want to expose via the mount interface. Being able to cancel all
the hung RPC calls in an RPC queue, for instance, is not something you
want to do through fsopen() and friends.
Chuck Lever Feb. 2, 2021, 10:21 p.m. UTC | #8
> On Feb 2, 2021, at 5:17 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Tue, 2021-02-02 at 14:49 -0500, Benjamin Coddington wrote:
>> On 2 Feb 2021, at 14:24, Dan Aloni wrote:
>> 
>>> On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
>>>> You're welcome! I'll try to remember to CC him on future versions
>>>> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever
>>>> <chuck.lever@oracle.com> wrote:
>>>>> 
>>>>> I want to ensure Dan is aware of this work. Thanks for posting,
>>>>> Anna!
>>> 
>>> Thanks Anna and Chuck. I'm accessing and monitoring the mailing
>>> list via
>>> NNTP and I'm also on #linux-nfs for chatting (da-x).
>>> 
>>> I see srcaddr was already discussed, so the patches I'm planning to
>>> send
>>> next will be based on the latest version of your patchset and
>>> concern
>>> multipath.
>>> 
>>> What I'm going for is the following:
>>> 
>>> - Expose transports that are reachable from xprtmultipath. Each in
>>> its
>>>   own sub-directory, with an interface and status representation
>>> similar
>>>   to the top directory.
>>> - A way to add/remove transports.
>>> - Inspiration for coding this is various other things in the kernel
>>> that
>>>   use configfs, perhaps it can be used here too.
>>> 
>>> Also, what do you think would be a straightforward way for a
>>> userspace
>>> program to find what sunrpc client id serves a mountpoint? If we
>>> add an
>>> ioctl for the mountdir AFAIK it would be the first one that the NFS
>>> client supports, so I wonder if there's a better interface that can
>>> work
>>> for that.
>> 
>> I'm a fan of adding an ioctl interface for userspace, but I think
>> we'd
>> better avoid using NFS itself because it would be nice to someday
>> implement
>> an NFS "shutdown" for non-responsive servers, but sending any ioctl
>> to the
>> mountpoint could revalidate it, and we'd hang on the GETATTR.
>> 
>> Maybe we can figure out a way to expose the superblock via sysfs for
>> each
>> mount.
> 
> Right. There is potential functionality here that we do not need or
> even want to expose via the mount interface. Being able to cancel all
> the hung RPC calls in an RPC queue, for instance, is not something you
> want to do through fsopen() and friends.

I thought we were talking only about an ioctl or fsopen cmd that
identifies the transports that are associated with an NFS mount.

Ostensibly a read-only use of that API.


--
Chuck Lever
Trond Myklebust Feb. 2, 2021, 10:24 p.m. UTC | #9
On Tue, 2021-02-02 at 22:21 +0000, Chuck Lever wrote:
> 
> 
> > On Feb 2, 2021, at 5:17 PM, Trond Myklebust <
> > trondmy@hammerspace.com> wrote:
> > 
> > On Tue, 2021-02-02 at 14:49 -0500, Benjamin Coddington wrote:
> > > On 2 Feb 2021, at 14:24, Dan Aloni wrote:
> > > 
> > > > On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
> > > > > You're welcome! I'll try to remember to CC him on future
> > > > > versions
> > > > > On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever
> > > > > <chuck.lever@oracle.com> wrote:
> > > > > > 
> > > > > > I want to ensure Dan is aware of this work. Thanks for
> > > > > > posting,
> > > > > > Anna!
> > > > 
> > > > Thanks Anna and Chuck. I'm accessing and monitoring the mailing
> > > > list via
> > > > NNTP and I'm also on #linux-nfs for chatting (da-x).
> > > > 
> > > > I see srcaddr was already discussed, so the patches I'm
> > > > planning to
> > > > send
> > > > next will be based on the latest version of your patchset and
> > > > concern
> > > > multipath.
> > > > 
> > > > What I'm going for is the following:
> > > > 
> > > > - Expose transports that are reachable from xprtmultipath. Each
> > > > in
> > > > its
> > > >   own sub-directory, with an interface and status
> > > > representation
> > > > similar
> > > >   to the top directory.
> > > > - A way to add/remove transports.
> > > > - Inspiration for coding this is various other things in the
> > > > kernel
> > > > that
> > > >   use configfs, perhaps it can be used here too.
> > > > 
> > > > Also, what do you think would be a straightforward way for a
> > > > userspace
> > > > program to find what sunrpc client id serves a mountpoint? If
> > > > we
> > > > add an
> > > > ioctl for the mountdir AFAIK it would be the first one that the
> > > > NFS
> > > > client supports, so I wonder if there's a better interface that
> > > > can
> > > > work
> > > > for that.
> > > 
> > > I'm a fan of adding an ioctl interface for userspace, but I think
> > > we'd
> > > better avoid using NFS itself because it would be nice to someday
> > > implement
> > > an NFS "shutdown" for non-responsive servers, but sending any
> > > ioctl
> > > to the
> > > mountpoint could revalidate it, and we'd hang on the GETATTR.
> > > 
> > > Maybe we can figure out a way to expose the superblock via sysfs
> > > for
> > > each
> > > mount.
> > 
> > Right. There is potential functionality here that we do not need or
> > even want to expose via the mount interface. Being able to cancel
> > all
> > the hung RPC calls in an RPC queue, for instance, is not something
> > you
> > want to do through fsopen() and friends.
> 
> I thought we were talking only about an ioctl or fsopen cmd that
> identifies the transports that are associated with an NFS mount.
> 
> Ostensibly a read-only use of that API.
> 

I'll let Anna chime in with the details of her use case, but my
understanding has always been that this would be a read/write interface
for changing the properties of those transports on the fly.
Chuck Lever Feb. 2, 2021, 10:31 p.m. UTC | #10
> On Feb 2, 2021, at 5:24 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Tue, 2021-02-02 at 22:21 +0000, Chuck Lever wrote:
>> 
>> 
>>> On Feb 2, 2021, at 5:17 PM, Trond Myklebust <
>>> trondmy@hammerspace.com> wrote:
>>> 
>>> On Tue, 2021-02-02 at 14:49 -0500, Benjamin Coddington wrote:
>>>> On 2 Feb 2021, at 14:24, Dan Aloni wrote:
>>>> 
>>>>> On Tue, Feb 02, 2021 at 01:52:10PM -0500, Anna Schumaker wrote:
>>>>>> You're welcome! I'll try to remember to CC him on future
>>>>>> versions
>>>>>> On Tue, Feb 2, 2021 at 1:51 PM Chuck Lever
>>>>>> <chuck.lever@oracle.com> wrote:
>>>>>>> 
>>>>>>> I want to ensure Dan is aware of this work. Thanks for
>>>>>>> posting,
>>>>>>> Anna!
>>>>> 
>>>>> Thanks Anna and Chuck. I'm accessing and monitoring the mailing
>>>>> list via
>>>>> NNTP and I'm also on #linux-nfs for chatting (da-x).
>>>>> 
>>>>> I see srcaddr was already discussed, so the patches I'm
>>>>> planning to
>>>>> send
>>>>> next will be based on the latest version of your patchset and
>>>>> concern
>>>>> multipath.
>>>>> 
>>>>> What I'm going for is the following:
>>>>> 
>>>>> - Expose transports that are reachable from xprtmultipath. Each
>>>>> in
>>>>> its
>>>>>   own sub-directory, with an interface and status
>>>>> representation
>>>>> similar
>>>>>   to the top directory.
>>>>> - A way to add/remove transports.
>>>>> - Inspiration for coding this is various other things in the
>>>>> kernel
>>>>> that
>>>>>   use configfs, perhaps it can be used here too.
>>>>> 
>>>>> Also, what do you think would be a straightforward way for a
>>>>> userspace
>>>>> program to find what sunrpc client id serves a mountpoint? If
>>>>> we
>>>>> add an
>>>>> ioctl for the mountdir AFAIK it would be the first one that the
>>>>> NFS
>>>>> client supports, so I wonder if there's a better interface that
>>>>> can
>>>>> work
>>>>> for that.
>>>> 
>>>> I'm a fan of adding an ioctl interface for userspace, but I think
>>>> we'd
>>>> better avoid using NFS itself because it would be nice to someday
>>>> implement
>>>> an NFS "shutdown" for non-responsive servers, but sending any
>>>> ioctl
>>>> to the
>>>> mountpoint could revalidate it, and we'd hang on the GETATTR.
>>>> 
>>>> Maybe we can figure out a way to expose the superblock via sysfs
>>>> for
>>>> each
>>>> mount.
>>> 
>>> Right. There is potential functionality here that we do not need or
>>> even want to expose via the mount interface. Being able to cancel
>>> all
>>> the hung RPC calls in an RPC queue, for instance, is not something
>>> you
>>> want to do through fsopen() and friends.
>> 
>> I thought we were talking only about an ioctl or fsopen cmd that
>> identifies the transports that are associated with an NFS mount.
>> 
>> Ostensibly a read-only use of that API.
>> 
> 
> I'll let Anna chime in with the details of her use case, but my
> understanding has always been that this would be a read/write interface
> for changing the properties of those transports on the fly.

Agreed, but Dan's looking for a way to match up an NFS mount to the
/sys directories that Anna is adding to do those manipulations.

So, fsopen() or ioctl() would identify the transports, and then
Anna's API would enable an appropriately privileged user to
change the properties as you indicated. Two separate steps.

If the new API already provides a mechanism to determine which
transports to adjust, then we won't need an ioctl/fsopen at all.


--
Chuck Lever
Dan Aloni Feb. 3, 2021, 9:20 p.m. UTC | #11
On Tue, Feb 02, 2021 at 02:49:38PM -0500, Benjamin Coddington wrote:
> On 2 Feb 2021, at 14:24, Dan Aloni wrote:
> > Also, what do you think would be a straightforward way for a userspace
> > program to find what sunrpc client id serves a mountpoint? If we add an
> > ioctl for the mountdir AFAIK it would be the first one that the NFS
> > client supports, so I wonder if there's a better interface that can work
> > for that.
> 
> I'm a fan of adding an ioctl interface for userspace, but I think we'd
> better avoid using NFS itself because it would be nice to someday implement
> an NFS "shutdown" for non-responsive servers, but sending any ioctl to the
> mountpoint could revalidate it, and we'd hang on the GETATTR.

For that, I was looking into using openat2() with the very recently
added RESOLVE_CACHED flag. However, from some experimentation I see that
it still sleeps on the unresponsive mount in nfs_weak_revalidate(), and
the latter cannot tell whether the LOOKUP_CACHED flag was passed to
d_weak_revalidate().
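
For readers unfamiliar with the call, here is a minimal sketch of a
cached-only open through openat2(). The helper name open_cached_only() and
the local struct open_how_compat are assumptions for this example. The
struct mirrors the three 64-bit fields of the kernel's struct open_how so
that the sketch builds without <linux/openat2.h>, and the fallback values
for SYS_openat2 and RESOLVE_CACHED are the ones used by the kernel headers
on the common architectures. Per the openat2(2) man page, the call fails
with EAGAIN when the lookup cannot be satisfied from the cache. This only
shows the call itself and does not address the nfs_weak_revalidate()
behaviour described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_openat2
#define SYS_openat2 437		/* fallback for older libc headers */
#endif
#ifndef RESOLVE_CACHED
#define RESOLVE_CACHED 0x20	/* value from <linux/openat2.h> */
#endif

/* Local mirror of the kernel's struct open_how (assumption: version 0). */
struct open_how_compat {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
};

/*
 * Hypothetical helper: open path only if the lookup can be served from
 * the dentry cache. Returns a file descriptor or a negative errno value.
 */
static int open_cached_only(const char *path)
{
	struct open_how_compat how;

	assert(path != NULL);
	memset(&how, 0, sizeof(how));
	how.flags = O_RDONLY | O_CLOEXEC;
	how.resolve = RESOLVE_CACHED;

	long fd = syscall(SYS_openat2, AT_FDCWD, path, &how, sizeof(how));
	return fd < 0 ? -errno : (int)fd;
}
```

A result of -EAGAIN means the kernel would have had to block or
revalidate, and -ENOSYS or -EINVAL means the running kernel predates
openat2() or RESOLVE_CACHED.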

> Maybe we can figure out a way to expose the superblock via sysfs for each
> mount.

Essentially this is what the fspick() syscall lets you do. I imagine that
it can be implemented entirely under fs/nfs, using fsconfig() with
FSCONFIG_SET_STRING and a special string such as "report-clients-ids",
causing a list of sunrpc client IDs to get written to the fs_context log.

However even with this interface we may still need to verify that the
path lookup that `fspick` does using `user_path_at` is not blocking on
non-responsive NFS mounts.
Dan Aloni Feb. 14, 2021, 5:41 p.m. UTC | #12
On Wed, Feb 03, 2021 at 11:20:35PM +0200, Dan Aloni wrote:
> On Tue, Feb 02, 2021 at 02:49:38PM -0500, Benjamin Coddington wrote:
> > On 2 Feb 2021, at 14:24, Dan Aloni wrote:
> > > Also, what do you think would be a straightforward way for a userspace
> > > program to find what sunrpc client id serves a mountpoint? If we add an
> > > ioctl for the mountdir AFAIK it would be the first one that the NFS
> > > client supports, so I wonder if there's a better interface that can work
> > > for that.
> > 
> > I'm a fan of adding an ioctl interface for userspace, but I think we'd
> > better avoid using NFS itself because it would be nice to someday implement
> > an NFS "shutdown" for non-responsive servers, but sending any ioctl to the
> > mountpoint could revalidate it, and we'd hang on the GETATTR.
> 
> For that, I was looking into using openat2() with the very recently
> added RESOLVE_CACHED flag. However, from some experimentation I see that
> it still sleeps on the unresponsive mount in nfs_weak_revalidate(), and
> the latter cannot tell whether the LOOKUP_CACHED flag was passed to
> d_weak_revalidate().
> 
> > Maybe we can figure out a way to expose the superblock via sysfs for each
> > mount.
> 
> Essentially this is what the fspick() syscall lets you do. I imagine that
> it can be implemented entirely under fs/nfs, using fsconfig() with
> FSCONFIG_SET_STRING and a special string such as "report-clients-ids",
> causing a list of sunrpc client IDs to get written to the fs_context log.
> 
> However even with this interface we may still need to verify that the
> path lookup that `fspick` does using `user_path_at` is not blocking on
> non-responsive NFS mounts.

Pending a response from Anna about this, in the meantime I've prepared a
patch for the fspick approach. My experiments show that, unlike the ioctl
method, it does not block on hung mounts. I'll repost following comments.

-

Using a flag named "sunrpcid" with FSCONFIG_SET_FLAG following the fspick
syscall, the sunrpc client IDs related to a mountpoint can be determined:

    int fd = fspick(AT_FDCWD, "/mnt/export", FSPICK_CLOEXEC |
                    FSPICK_NO_AUTOMOUNT);
    fsconfig(fd, FSCONFIG_SET_FLAG, "sunrpcid", NULL, 0);

Example output, written to the fs_context log:

    i sunrpcid main 4
    i sunrpcid shared 0
    i sunrpcid acl 5
    i sunrpcid nlm 3
    i sunrpcid -

Here `-` is used as the end-of-list sentinel.

The advantage over adding a potential NFS ioctl is that no `open`
syscall is needed, so the cache revalidation that could leave the query
hanging on an unresponsive server is avoided.
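
To show how a userspace consumer might interpret the records, here is a
minimal parsing sketch. It assumes each record arrives as one line of the
form "i <key> <kind> <hex-id>", with "i <key> -" as the sentinel, which is
inferred from the example output above and from the "%x" format used in
the patch. The names struct sunrpc_id_record and parse_sunrpc_id_line()
are hypothetical, and reading the lines from the fspick file descriptor is
left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed record (hypothetical type for this sketch). */
struct sunrpc_id_record {
	char kind[32];		/* e.g. "main", "shared", "acl", "nlm" */
	unsigned int id;	/* client ID, printed in hexadecimal */
};

/*
 * Parse one log line of the assumed form "i <key> <kind> <hex-id>".
 * Returns 1 for a record, 0 for the end-of-list sentinel "i <key> -",
 * and -1 for any other line (for example an error message).
 */
static int parse_sunrpc_id_line(const char *line,
				struct sunrpc_id_record *out)
{
	char key[32];
	char kind[32];
	unsigned int id = 0;

	assert(line != NULL && out != NULL);

	int fields = sscanf(line, "i %31s %31s %x", key, kind, &id);

	/* The sentinel has a "-" where the kind would be, and no ID. */
	if (fields >= 2 && strcmp(kind, "-") == 0)
		return 0;
	if (fields != 3)
		return -1;

	snprintf(out->kind, sizeof(out->kind), "%s", kind);
	out->id = id;
	return 1;
}
```

A caller would feed each line to parse_sunrpc_id_line() and stop at the
first 0 return. The key is not checked, so the sketch does not depend on
the exact flag name, but any extra prefix the kernel adds to log lines
would need to be stripped first.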

Signed-off-by: Dan Aloni <dan@kernelim.com>
---
 fs/nfs/fs_context.c | 41 +++++++++++++++++++++++++++++++++++++++++
 fs/nfs/internal.h   |  4 ++++
 2 files changed, 45 insertions(+)

diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c
index 06894bcdea2d..a63aeeaaf6ce 100644
--- a/fs/nfs/fs_context.c
+++ b/fs/nfs/fs_context.c
@@ -14,6 +14,7 @@
 #include <linux/fs.h>
 #include <linux/fs_context.h>
 #include <linux/fs_parser.h>
+#include <linux/lockd/lockd.h>
 #include <linux/nfs_fs.h>
 #include <linux/nfs_mount.h>
 #include <linux/nfs4_mount.h>
@@ -76,6 +77,7 @@ enum nfs_param {
 	Opt_softerr,
 	Opt_softreval,
 	Opt_source,
+	Opt_sunrpcid,
 	Opt_tcp,
 	Opt_timeo,
 	Opt_udp,
@@ -161,6 +163,7 @@ static const struct fs_parameter_spec nfs_fs_parameters[] = {
 	fsparam_flag  ("softerr",	Opt_softerr),
 	fsparam_flag  ("softreval",	Opt_softreval),
 	fsparam_string("source",	Opt_source),
+	fsparam_flag  ("sunrpcid",	Opt_sunrpcid),
 	fsparam_flag  ("tcp",		Opt_tcp),
 	fsparam_u32   ("timeo",		Opt_timeo),
 	fsparam_flag  ("udp",		Opt_udp),
@@ -430,6 +433,41 @@ static int nfs_parse_version_string(struct fs_context *fc,
 	return 0;
 }
 
+static void nfs_client_report_sunrpcid(struct fs_context *fc,
+					struct rpc_clnt *clnt,
+					const char *kind)
+{
+	/* Client ID representation here must match /sys/kernel/sunrpc/net! */
+	nfs_resultf(fc, "sunrpcid %s %x", kind, clnt->cl_clid);
+}
+
+static int nfs_client_report_clients(struct fs_context *fc)
+{
+	struct nfs_server *server;
+
+	if (!fc->root) {
+		nfs_errorf(fc, "NFS: no root yet");
+		return 0;
+	}
+
+	server = NFS_SB(fc->root->d_sb);
+	if (!server) {
+		nfs_errorf(fc, "NFS: no superblock yet");
+		return 0;
+	}
+
+	nfs_client_report_sunrpcid(fc, server->client, "main");
+	nfs_client_report_sunrpcid(fc, server->nfs_client->cl_rpcclient, "shared");
+	nfs_client_report_sunrpcid(fc, server->client_acl, "acl");
+
+	if (server->nlm_host != NULL)
+		nfs_client_report_sunrpcid(
+			fc, server->nlm_host->h_rpcclnt, "nlm");
+
+	nfs_resultf(fc, "sunrpcid -");
+	return 0;
+}
+
 /*
  * Parse a single mount parameter.
  */
@@ -778,6 +816,9 @@ static int nfs_fs_context_parse_param(struct fs_context *fc,
 		ctx->sloppy = true;
 		dfprintk(MOUNT, "NFS:   relaxing parsing rules\n");
 		break;
+	case Opt_sunrpcid:
+		nfs_client_report_clients(fc);
+		break;
 	}
 
 	return 0;
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index c8939d2cce1b..fd061304434e 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -160,6 +160,10 @@ struct nfs_fs_context {
 	warnf(fc, fmt, ## __VA_ARGS__) :			\
 	({ dfprintk(fac, fmt "\n", ## __VA_ARGS__); }))
 
+#define nfs_resultf(fc, fmt, ...) ((fc)->log.log ?		\
+	infof(fc, fmt, ## __VA_ARGS__) :			\
+	({ ; }))
+
 static inline struct nfs_fs_context *nfs_fc2context(const struct fs_context *fc)
 {
 	return fc->fs_private;