[v5,00/19] nfs/nfsd: add support for localio

Message ID 20240618201949.81977-1-snitzer@kernel.org (mailing list archive)

Message

Mike Snitzer June 18, 2024, 8:19 p.m. UTC
Hi,

This v5 is rebased on Chuck's nfsd-next (only required one adjustment
in patch 15 to account for new code that dereferences nn->nfsd_serv).

The only other change is patch 19, which adds Documentation/filesystems/nfs/localio.rst.

My git tree is here:
https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/

This v5 is available as both branch nfs-localio-for-6.11 (always tracks
latest) and branch nfs-localio-for-6.11.v5

Branches nfs-localio-for-6.11.v[1234] are also available.

To see the changes from v4 to v5 please do:
git remote add snitzer git://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git
git remote update snitzer
git diff snitzer/nfs-localio-for-6.11.v4 snitzer/nfs-localio-for-6.11.v5

[NOTE: there will be noise due to nfsd-next causing the base kernel to
       move from v6.10-rc2 to v6.10-rc3]

All review and comments are welcome!

Thanks,
Mike

Mike Snitzer (11):
  nfs_common: add NFS LOCALIO protocol extension enablement
  nfs: implement v3 and v4 client support for NFS_LOCALIO_PROGRAM
  nfsd: implement v3 and v4 server support for NFS_LOCALIO_PROGRAM
  nfs/nfsd: consolidate {encode,decode}_opaque_fixed in nfs_xdr.h
  nfs/localio: move managing nfsd_open_local_fh symbol to nfs_common
  nfs/nfsd: ensure localio server always uses its network namespace
  nfsd/localio: manage netns reference in nfsd_open_local_fh
  nfsd: prepare to use SRCU to dereference nn->nfsd_serv
  nfsd: use SRCU to dereference nn->nfsd_serv
  nfsd/localio: use SRCU to dereference nn->nfsd_serv in nfsd_open_local_fh
  nfs: add Documentation/filesystems/nfs/localio.rst

Trond Myklebust (3):
  NFS: Enable localio for non-pNFS I/O
  pnfs/flexfiles: Enable localio for flexfiles I/O
  nfs/localio: use dedicated workqueues for filesystem read and write

Weston Andros Adamson (5):
  nfs: pass nfs_client to nfs_initiate_pgio
  nfs: pass descriptor thru nfs_initiate_pgio path
  nfs: pass struct file to nfs_init_pgio and nfs_init_commit
  sunrpc: add rpcauth_map_to_svc_cred_local
  nfs/nfsd: add "localio" support

 Documentation/filesystems/nfs/localio.rst | 101 +++
 fs/Kconfig                                |   3 +
 fs/nfs/Kconfig                            |  30 +
 fs/nfs/Makefile                           |   1 +
 fs/nfs/blocklayout/blocklayout.c          |   6 +-
 fs/nfs/client.c                           |  15 +-
 fs/nfs/filelayout/filelayout.c            |  16 +-
 fs/nfs/flexfilelayout/flexfilelayout.c    | 131 +++-
 fs/nfs/flexfilelayout/flexfilelayout.h    |   2 +
 fs/nfs/flexfilelayout/flexfilelayoutdev.c |   6 +
 fs/nfs/inode.c                            |  61 +-
 fs/nfs/internal.h                         |  88 ++-
 fs/nfs/localio.c                          | 850 ++++++++++++++++++++++
 fs/nfs/nfs3_fs.h                          |   1 +
 fs/nfs/nfs3client.c                       |  25 +
 fs/nfs/nfs3proc.c                         |   3 +
 fs/nfs/nfs3xdr.c                          |  58 ++
 fs/nfs/nfs4_fs.h                          |   2 +
 fs/nfs/nfs4client.c                       |  23 +
 fs/nfs/nfs4proc.c                         |   3 +
 fs/nfs/nfs4xdr.c                          |  65 +-
 fs/nfs/nfstrace.h                         |  61 ++
 fs/nfs/pagelist.c                         |  32 +-
 fs/nfs/pnfs.c                             |  24 +-
 fs/nfs/pnfs.h                             |   6 +-
 fs/nfs/pnfs_nfs.c                         |   2 +-
 fs/nfs/write.c                            |  13 +-
 fs/nfs_common/Makefile                    |   3 +
 fs/nfs_common/nfslocalio.c                |  71 ++
 fs/nfsd/Kconfig                           |  30 +
 fs/nfsd/Makefile                          |   1 +
 fs/nfsd/filecache.c                       |  15 +-
 fs/nfsd/localio.c                         | 398 ++++++++++
 fs/nfsd/netns.h                           |  16 +-
 fs/nfsd/nfs4state.c                       |  25 +-
 fs/nfsd/nfsctl.c                          |  28 +-
 fs/nfsd/nfsd.h                            |  11 +
 fs/nfsd/nfssvc.c                          | 182 ++++-
 fs/nfsd/trace.h                           |   3 +-
 fs/nfsd/vfs.h                             |   9 +
 fs/nfsd/xdr.h                             |   6 +
 include/linux/nfs.h                       |   2 +
 include/linux/nfs_fs.h                    |   2 +
 include/linux/nfs_fs_sb.h                 |   9 +
 include/linux/nfs_xdr.h                   |  31 +-
 include/linux/nfslocalio.h                |  41 ++
 include/linux/sunrpc/auth.h               |   4 +
 include/uapi/linux/nfs.h                  |   4 +
 net/sunrpc/auth.c                         |  15 +
 49 files changed, 2388 insertions(+), 146 deletions(-)
 create mode 100644 Documentation/filesystems/nfs/localio.rst
 create mode 100644 fs/nfs/localio.c
 create mode 100644 fs/nfs_common/nfslocalio.c
 create mode 100644 fs/nfsd/localio.c
 create mode 100644 include/linux/nfslocalio.h

Comments

Christoph Hellwig June 19, 2024, 5:49 a.m. UTC | #1
What happened to the requirement that all protocol extensions added
to Linux need to be standardized in IETF RFCs?
NeilBrown June 19, 2024, 7:10 a.m. UTC | #2
On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> What happened to the requirement that all protocol extensions added
> to Linux need to be standardized in IETF RFCs?
> 
> 

Is that requirement documented somewhere?  Not that I doubt it, but it
would be nice to know where it is explicit.  I couldn't quickly find
anything in Documentation/

Can we get by without the LOCALIO protocol?

For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
is explicitly documented as being usable to determine if two servers are
the same.

For NFSv4.0 ... I don't think we should encourage that to be used.

For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
server_owner4.  If krb5 was used there would probably be a server
identity in there that could be used.
I think the server could theoretically return an AUTH_SYS verifier in
each RPC reply and that could be used to identify the server.  I'm not
sure that is a good idea though.

Going through the IETF process for something that is entirely private to
Linux seems a bit more than should be necessary..

Thanks,
NeilBrown
Christoph Hellwig June 19, 2024, 7:15 a.m. UTC | #3
On Wed, Jun 19, 2024 at 05:10:10PM +1000, NeilBrown wrote:
> Is that requirement documented somewhere?

Trond has responded with that policy to various in progress features
in the past for the client.  I think it also is a generally very useful
policy.  (Note that we ignore it with the NFSv3 side band protocols,
but that is ancient past)

> Not that I doubt it, but it
> would be nice to know where it is explicit.  I couldn't quickly find
> anything in Documentation/

Agreed.
Jeff Layton June 19, 2024, 10:09 a.m. UTC | #4
On Wed, 2024-06-19 at 17:10 +1000, NeilBrown wrote:
> On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> > What happened to the requirement that all protocol extensions added
> > to Linux need to be standardized in IETF RFCs?
> > 
> > 
> 
> Is that requirement documented somewhere?  Not that I doubt it, but it
> would be nice to know where it is explicit.  I couldn't quickly find
> anything in Documentation/
> 
> Can we get by without the LOCALIO protocol?
> 
> For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> is explicitly documented as being usable to determine if two servers are
> the same.
> 
> For NFSv4.0 ... I don't think we should encourage that to be used.
> 
> For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> server_owner4.  If krb5 was used there would probably be a server
> identity in there that could be used.
> I think the server could theoretically return an AUTH_SYS verifier in
> each RPC reply and that could be used to identify the server.  I'm not
> sure that is a good idea though.
> 

My idea for v3 was that the localio client could do an O_TMPFILE create
on the exported fs and write some random junk to it (a uuid or
something). Construct the filehandle for that and then the client could
try to issue a READ for that filehandle via the NFS server. If it finds
that filehandle and the contents are correct then you're on the same
host. Then you just close the file and it should clean itself up.

This is a little less straightforward and efficient than the localio
protocol that Mike is proposing, but requires no protocol extensions.
 
> Going through the IETF process for something that is entirely private to
> Linux seems a bit more than should be necessary..
> 

Agreed. Given that this is our own protocol extension and we don't have
any expectation of other clients or servers implementing this, I don't
see the point. I do agree that trying to avoid program number conflicts
is a good thing though.
Trond Myklebust June 19, 2024, 2:02 p.m. UTC | #5
On Tue, 2024-06-18 at 22:49 -0700, Christoph Hellwig wrote:
> What happened to the requirement that all protocol extensions added
> to Linux need to be standardized in IETF RFCs?
> 

The point of the side band protocol here is literally just to discover
if the server on the other end of the connection is me, myself and I.
IOW: did the IP + port that was used to set up a connection end up,
through the magic of routing, connecting to a knfsd service that is
running on the same machine as the client.

The only requirement for interoperability with other servers is that we
don't break them when probing. Hence the side band protocol, which uses
the fact that it is an RPC program with a value that will be ignored by
all other servers except the Linux servers that implement it.
Otherwise, the protocol is private to the Linux client and knfsd.
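
The probing behaviour described above can be illustrated with a small model. This is only a sketch under stated assumptions: the `Server` class and `probe_localio` helper are hypothetical names invented for illustration, the LOCALIO program number is a placeholder (0x20000002 is the value proposed for v6 in this thread, not an assigned number), and the real client and knfsd code paths are not shown. The `accept_stat` values are the ones defined for ONC RPC in RFC 5531.

```python
from enum import IntEnum


class AcceptStat(IntEnum):
    """accept_stat values for an accepted RPC call (RFC 5531)."""
    SUCCESS = 0
    PROG_UNAVAIL = 1
    PROG_MISMATCH = 2
    PROC_UNAVAIL = 3
    GARBAGE_ARGS = 4
    SYSTEM_ERR = 5


NFS_PROGRAM = 100003          # well-known NFS RPC program number
LOCALIO_PROGRAM = 0x20000002  # placeholder for this sketch, not an assigned number


class Server:
    """Hypothetical stand-in for an RPC server and the programs it exports."""

    def __init__(self, programs):
        self.programs = set(programs)

    def call(self, program):
        # A server that does not know the program rejects the call
        # without any other side effect.
        if program in self.programs:
            return AcceptStat.SUCCESS
        return AcceptStat.PROG_UNAVAIL


def probe_localio(server):
    """Return True only if the server accepts the side-band program."""
    return server.call(LOCALIO_PROGRAM) == AcceptStat.SUCCESS


linux_knfsd = Server({NFS_PROGRAM, LOCALIO_PROGRAM})
other_server = Server({NFS_PROGRAM})

print(probe_localio(linux_knfsd))   # side band available
print(probe_localio(other_server))  # probe is rejected, regular NFS still works
```

In this model a server that lacks the side-band program simply answers PROG_UNAVAIL, and the client would carry on with ordinary NFS traffic.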

So, if the consensus is that this still needs to go through the IETF,
then fine, we can do that, and register the side band program name with
IANA.

If there is a better way to determine that we're talking to our own
server (which may be running in a container with its own network
namespace) then I'm all ears.
Mike Snitzer June 19, 2024, 5:57 p.m. UTC | #6
On Wed, Jun 19, 2024 at 05:10:10PM +1000, NeilBrown wrote:
> On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> > What happened to the requirement that all protocol extensions added
> > to Linux need to be standardized in IETF RFCs?
> > 
> > 
> 
> Is that requirement documented somewhere?  Not that I doubt it, but it
> would be nice to know where it is explicit.  I couldn't quickly find
> anything in Documentation/
> 
> Can we get by without the LOCALIO protocol?
> 
> For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> is explicitly documented as being usable to determine if two servers are
> the same.

My first approach was to (ab)use EXCHANGE_ID. It worked, but it
required exporting a symbol to query the hash table local to
nfs4state, etc.  It wasn't very clean. Could it have been made
clean? I guess, but in the end I elected to solve both v3 and v4.x in
the same way using the LOCALIO protocol.

> For NFSv4.0 ... I don't think we should encourage that to be used.
> 
> For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> server_owner4.  If krb5 was used there would probably be a server
> identity in there that could be used.
> I think the server could theoretically return an AUTH_SYS verifier in
> each RPC reply and that could be used to identify the server.  I'm not
> sure that is a good idea though.
> 
> Going through the IETF process for something that is entirely private to
> Linux seems a bit more than should be necessary..

I have to believe Christoph didn't appreciate this LOCALIO protocol is
an entirely private implementation detail to Linux (that allows client
and server handshake).  I've clarified that in Documentation (for v6).

Mike
Chuck Lever June 19, 2024, 6:04 p.m. UTC | #7
> On Jun 19, 2024, at 1:57 PM, Mike Snitzer <snitzer@kernel.org> wrote:
> 
> On Wed, Jun 19, 2024 at 05:10:10PM +1000, NeilBrown wrote:
>> On Wed, 19 Jun 2024, Christoph Hellwig wrote:
>>> What happened to the requirement that all protocol extensions added
>>> to Linux need to be standardized in IETF RFCs?
>>> 
>>> 
>> 
>> Is that requirement documented somewhere?  Not that I doubt it, but it
>> would be nice to know where it is explicit.  I couldn't quickly find
>> anything in Documentation/
>> 
>> Can we get by without the LOCALIO protocol?
>> 
>> For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
>> is explicitly documented as being usable to determine if two servers are
>> the same.
> 
> My first approach was to (ab)use EXCHANGE_ID. It worked, but it
> required exporting a symbol to query the hash table local to
> nfs4state, etc.  It wasn't very clean.. could it have been made
> clean?: I guess... but in the end I elected to solve both v3 and v4.x in
> the same way using LOCALIO protocol.
> 
>> For NFSv4.0 ... I don't think we should encourage that to be used.
>> 
>> For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
>> 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
>> server_owner4.  If krb5 was used there would probably be a server
>> identity in there that could be used.
>> I think the server could theoretically return an AUTH_SYS verifier in
>> each RPC reply and that could be used to identify the server.  I'm not
>> sure that is a good idea though.
>> 
>> Going through the IETF process for something that is entirely private to
>> Linux seems a bit more than should be necessary..
> 
> I have to believe Christoph didn't appreciate this LOCALIO protocol is
> an entirely private implementation detail to Linux (that allows client
> and server handshake).  I've clarified that in Documentation (for v6).

Even though this is a private protocol, you don't want some
other NFS implementation re-using that RPC program number
for its own purposes.

I think registering the RPC program number and name with
IANA is going to save everyone some potential headaches
and won't be an arduous process.


--
Chuck Lever
Mike Snitzer June 19, 2024, 6:13 p.m. UTC | #8
On Wed, Jun 19, 2024 at 06:04:46PM +0000, Chuck Lever III wrote:
> 
> 
> > On Jun 19, 2024, at 1:57 PM, Mike Snitzer <snitzer@kernel.org> wrote:
> > 
> > On Wed, Jun 19, 2024 at 05:10:10PM +1000, NeilBrown wrote:
> >> On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> >>> What happened to the requirement that all protocol extensions added
> >>> to Linux need to be standardized in IETF RFCs?
> >>> 
> >>> 
> >> 
> >> Is that requirement documented somewhere?  Not that I doubt it, but it
> >> would be nice to know where it is explicit.  I couldn't quickly find
> >> anything in Documentation/
> >> 
> >> Can we get by without the LOCALIO protocol?
> >> 
> >> For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> >> is explicitly documented as being usable to determine if two servers are
> >> the same.
> > 
> > My first approach was to (ab)use EXCHANGE_ID. It worked, but it
> > required exporting a symbol to query the hash table local to
> > nfs4state, etc.  It wasn't very clean.. could it have been made
> > clean?: I guess... but in the end I elected to solve both v3 and v4.x in
> > the same way using LOCALIO protocol.
> > 
> >> For NFSv4.0 ... I don't think we should encourage that to be used.
> >> 
> >> For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> >> 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> >> server_owner4.  If krb5 was used there would probably be a server
> >> identity in there that could be used.
> >> I think the server could theoretically return an AUTH_SYS verifier in
> >> each RPC reply and that could be used to identify the server.  I'm not
> >> sure that is a good idea though.
> >> 
> >> Going through the IETF process for something that is entirely private to
> >> Linux seems a bit more than should be necessary..
> > 
> > I have to believe Christoph didn't appreciate this LOCALIO protocol is
> > an entirely private implementation detail to Linux (that allows client
> > and server handshake).  I've clarified that in Documentation (for v6).
> 
> Even though this is a private protocol, you don't want some
> other NFS implementation re-using that RPC program number
> for its own purposes.
> 
> I think registering the RPC program number and name with
> IANA is going to save everyone some potential headaches
> and won't be an arduous process.

I fully agree, I will work on it. If you have hints for the best place
to start I'd welcome any help getting the process started.

In v6 I switch to using rpc program number 0x20000002
Chuck Lever June 19, 2024, 6:22 p.m. UTC | #9
> On Jun 19, 2024, at 2:13 PM, Mike Snitzer <snitzer@kernel.org> wrote:
> 
> On Wed, Jun 19, 2024 at 06:04:46PM +0000, Chuck Lever III wrote:
>> 
>> 
>>> On Jun 19, 2024, at 1:57 PM, Mike Snitzer <snitzer@kernel.org> wrote:
>>> 
>>> On Wed, Jun 19, 2024 at 05:10:10PM +1000, NeilBrown wrote:
>>>> On Wed, 19 Jun 2024, Christoph Hellwig wrote:
>>>>> What happened to the requirement that all protocol extensions added
>>>>> to Linux need to be standardized in IETF RFCs?
>>>>> 
>>>>> 
>>>> 
>>>> Is that requirement documented somewhere?  Not that I doubt it, but it
>>>> would be nice to know where it is explicit.  I couldn't quickly find
>>>> anything in Documentation/
>>>> 
>>>> Can we get by without the LOCALIO protocol?
>>>> 
>>>> For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
>>>> is explicitly documented as being usable to determine if two servers are
>>>> the same.
>>> 
>>> My first approach was to (ab)use EXCHANGE_ID. It worked, but it
>>> required exporting a symbol to query the hash table local to
>>> nfs4state, etc.  It wasn't very clean.. could it have been made
>>> clean?: I guess... but in the end I elected to solve both v3 and v4.x in
>>> the same way using LOCALIO protocol.
>>> 
>>>> For NFSv4.0 ... I don't think we should encourage that to be used.
>>>> 
>>>> For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
>>>> 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
>>>> server_owner4.  If krb5 was used there would probably be a server
>>>> identity in there that could be used.
>>>> I think the server could theoretically return an AUTH_SYS verifier in
>>>> each RPC reply and that could be used to identify the server.  I'm not
>>>> sure that is a good idea though.
>>>> 
>>>> Going through the IETF process for something that is entirely private to
>>>> Linux seems a bit more than should be necessary..
>>> 
>>> I have to believe Christoph didn't appreciate this LOCALIO protocol is
>>> an entirely private implementation detail to Linux (that allows client
>>> and server handshake).  I've clarified that in Documentation (for v6).
>> 
>> Even though this is a private protocol, you don't want some
>> other NFS implementation re-using that RPC program number
>> for its own purposes.
>> 
>> I think registering the RPC program number and name with
>> IANA is going to save everyone some potential headaches
>> and won't be an arduous process.
> 
> I fully agree, I will work on it. If you have hints for the best place
> to start I'd welcome any help getting the process started.

See Appendix B of RFC 5531.

https://www.rfc-editor.org/rfc/rfc5531.html


> In v6 I switch to using rpc program number 0x20000002 

"Specific numbers cannot be requested. Numbers are
assigned on a First Come First Served basis." You
can use whatever you like until one is assigned,
knowing that the risk is it is almost certainly
not going to be the same value that IANA will give
you.
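
To see where a placeholder such as 0x20000002 falls, the RPC program number ranges can be written down as a lookup table. This is a sketch that assumes the range table in RFC 5531 reads as transcribed below; check the RFC before relying on it. The function name `classify_program_number` is hypothetical.

```python
# RPC program number ranges, transcribed from RFC 5531 (assumed accurate).
RANGES = [
    (0x00000000, 0x00000000, "reserved"),
    (0x00000001, 0x1FFFFFFF, "assigned by IANA"),
    (0x20000000, 0x3FFFFFFF, "defined by local administrator"),
    (0x40000000, 0x5FFFFFFF, "transient"),
    (0x60000000, 0x7EFFFFFF, "reserved"),
    (0x7F000000, 0x7FFFFFFF, "assignment outstanding"),
    (0x80000000, 0xFFFFFFFF, "reserved"),
]


def classify_program_number(number):
    """Return the name of the range that contains a 32-bit program number."""
    for low, high, name in RANGES:
        if low <= number <= high:
            return name
    raise ValueError(f"not a 32-bit program number: {number:#x}")


print(classify_program_number(0x20000002))
print(classify_program_number(100003))  # the NFS program number
```

Under that table, 0x20000002 sits in the locally administered block rather than the IANA-assigned one, which is consistent with the point that a registered number will almost certainly differ.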


--
Chuck Lever
NeilBrown June 19, 2024, 9:09 p.m. UTC | #10
On Wed, 19 Jun 2024, Jeff Layton wrote:
> On Wed, 2024-06-19 at 17:10 +1000, NeilBrown wrote:
> > On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> > > What happened to the requirement that all protocol extensions added
> > > to Linux need to be standardized in IETF RFCs?
> > > 
> > > 
> > 
> > Is that requirement documented somewhere?  Not that I doubt it, but it
> > would be nice to know where it is explicit.  I couldn't quickly find
> > anything in Documentation/
> > 
> > Can we get by without the LOCALIO protocol?
> > 
> > For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> > is explicitly documented as being usable to determine if two servers are
> > the same.
> > 
> > For NFSv4.0 ... I don't think we should encourage that to be used.
> > 
> > For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> > 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> > server_owner4.  If krb5 was used there would probably be a server
> > identity in there that could be used.
> > I think the server could theoretically return an AUTH_SYS verifier in
> > each RPC reply and that could be used to identify the server.  I'm not
> > sure that is a good idea though.
> > 
> 
> My idea for v3 was that the localio client could do an O_TMPFILE create
> on the exported fs and write some random junk to it (a uuid or
> something). Construct the filehandle for that and then the client could
> try to issue a READ for that filehandle via the NFS server. If it finds
> that filehandle and the contents are correct then you're on the same
> host. Then you just close the file and it should clean itself up.

I can't see how this would work, but maybe I don't have a good enough
imagination.

The high-level view of the proposed protocol is:
  - client asks remote server to identify itself.
  - server returns an identity
  - client uses local-sideband to ask each local server if it has the
    given identity.
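
The three steps above can be sketched as follows. This is a simplified model, not the patch series' implementation: `NfsdInstance`, `get_uuid` and `is_local` are hypothetical names, the identity is assumed to be a UUID, and the list of local servers stands in for whatever local side band the client uses.

```python
import uuid


class NfsdInstance:
    """Hypothetical stand-in for one nfsd server instance."""

    def __init__(self):
        # Assumption: each server instance has its own random identity.
        self.uuid = uuid.uuid4()

    def get_uuid(self):
        # Steps 1 and 2: the client asks, the server returns its identity.
        return self.uuid


def is_local(remote_identity, local_servers):
    # Step 3: ask each local server whether it has the given identity.
    return any(server.uuid == remote_identity for server in local_servers)


local_servers = [NfsdInstance(), NfsdInstance()]
remote_server = NfsdInstance()

print(is_local(local_servers[1].get_uuid(), local_servers))
print(is_local(remote_server.get_uuid(), local_servers))
```

Keeping a list of local instances, rather than a single one, reflects the case raised in the thread of servers running in containers with their own network namespaces.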

I don't see where an O_TMPFILE could fit into this, or how a different
high-level approach would be any better.

For NFSv3 the client could ask with a new Program or Version or
Procedure, or all three.  Or it could ask with a new file-handle or path
name.  I imagine using a webnfs (rfc2054) multi-component lookup on the
public filehandle for "/linux/config/server-id" and getting back a
filehandle which encodes the server ID somehow.  All these seem credible
options and it is not clear that any one is better than any other.

For NFSv4.1 I think that LOCALIO looks a lot like trunking and so using
exactly the same mechanism to determine if two servers are the same is a
good idea.
But then LOCALIO also looks a lot like a new pNFS/DS protocol so maybe
we should specify that protocol and use GETDEVICELIST or GETDEVICEINFO
to find the identity of the server.

> 
> This is a little less straightforward and efficient than the localio
> protocol that Mike is proposing, but requires no protocol extensions.

I think that if we use anything other than the server-id in the
EXCHANGE_ID response, then we are defining a new protocol as it is a new
request which we expect existing servers to ignore or fail, even though
they have never been tested to ignore/fail that particular request.

Of all the options I would guess that a new version for an existing
protocol would be safest as that is the most likely to have been tested.
A new RPC program is probably conceptually simplest.  A little hack in
LOOKUPv3 to detect the public filehandle etc is probably the easiest to
code, and a new pnfs/ds protocol is probably the cleanest overall
except that it doesn't support NFSv3.

My purpose in all this is not to replace Mike's LOCALIO protocol, but to
explore the solution space to ensure there is nothing that is obviously
better.  As yet, I don't think there is.

NeilBrown


>  
> > Going through the IETF process for something that is entirely private to
> > Linux seems a bit more than should be necessary..
> > 
> 
> Agreed. Given that this is our own protocol extension and we don't have
> any expectation of other clients or servers implementing this, I don't
> see the point. I do agree that trying to avoid program number conflicts
> is a good thing though.
> -- 
> Jeff Layton <jlayton@kernel.org>
>
Jeff Layton June 19, 2024, 10:28 p.m. UTC | #11
On Thu, 2024-06-20 at 07:09 +1000, NeilBrown wrote:
> On Wed, 19 Jun 2024, Jeff Layton wrote:
> > On Wed, 2024-06-19 at 17:10 +1000, NeilBrown wrote:
> > > On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> > > > What happened to the requirement that all protocol extensions added
> > > > to Linux need to be standardized in IETF RFCs?
> > > > 
> > > > 
> > > 
> > > Is that requirement documented somewhere?  Not that I doubt it, but it
> > > would be nice to know where it is explicit.  I couldn't quickly find
> > > anything in Documentation/
> > > 
> > > Can we get by without the LOCALIO protocol?
> > > 
> > > For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> > > is explicitly documented as being usable to determine if two servers are
> > > the same.
> > > 
> > > For NFSv4.0 ... I don't think we should encourage that to be used.
> > > 
> > > For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> > > 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> > > server_owner4.  If krb5 was used there would probably be a server
> > > identity in there that could be used.
> > > I think the server could theoretically return an AUTH_SYS verifier in
> > > each RPC reply and that could be used to identify the server.  I'm not
> > > sure that is a good idea though.
> > > 
> > 
> > My idea for v3 was that the localio client could do an O_TMPFILE create
> > on the exported fs and write some random junk to it (a uuid or
> > something). Construct the filehandle for that and then the client could
> > try to issue a READ for that filehandle via the NFS server. If it finds
> > that filehandle and the contents are correct then you're on the same
> > host. Then you just close the file and it should clean itself up.
> 
> I can't see how this would work, but maybe I don't have a good enough
> imagination.
> 

Maybe I didn't explain it well:

Basically the idea was to create a "unique" file in a filesystem
exported by the local nfsd, and then see if it's accessible at the
expected filehandle via a v3 READ and has the expected contents. If it
is then you can assume localio is possible. O_TMPFILE would just make
it simple to clean up the file after you were done, and would avoid
adding spurious entries to the exported directory tree.

The problem with that method is that it's hard to make it work well
with containers. You'd need to be able to predict which net namespace's
server you were talking to, or somehow make the file be exported by all
of them. That alone makes it difficult to implement.


> The high-level view of the proposed protocol is:
>   - client asks remote server to identify itself.
>   - server returns an identity
>   - client uses local-sideband to ask each local server if it has the
>     given identity.
> 
> I don't see where an O_TMPFILE could fit into this, or how a different
> high-level approach would be any better.
>
> For NFSv3 the client could ask with a new Program or Version or
> Procedure, or all three.  Or it could ask with a new file-handle or path
> name.  I imagine using a webnfs (rfc2054) multi-component lookup on the
> public filehandle for "/linux/config/server-id" and getting back a
> filehandle which encodes the server ID somehow.  All these seem credible
> options and it is not clear that any one is better than any other.
> 
> For NFSv4.1 I think that LOCALIO looks a lot like trunking and so using
> exactly the same mechanism to determine if two servers are the same is a
> good idea.
> But then LOCALIO also looks a lot like a new pNFS/DS protocol so maybe
> we should specify that protocol and use GETDEVICELIST or GETDEVICEINFO
> to find the identity of the server.
>
> > 
> > This is a little less straightforward and efficient than the localio
> > protocol that Mike is proposing, but requires no protocol extensions.
> 
> I think that if we use anything other than the server-id in the
> EXCHANGE_ID response, then we are defining a new protocol as it is a new
> request which we expect existing servers to ignore or fail, even though
> they have never been tested to ignore/fail that particular request.
> 
> Of all the options I would guess that a new version for an existing
> protocol would be safest as that is the most likely to have been tested.
> A new RPC program is probably conceptually simplest.  A little hack in
> LOOKUPv3 to detect the public filehandle etc is probably the easiest to
> code, and a new pnfs/ds protocol is probably the cleanest overall
> except that it doesn't support NFSv3.
> 
> My purpose in all this is not to replace Mike's LOCALIO protocol, but to
> explore the solution space to ensure there is nothing that is obviously
> better.  As yet, I don't think there is.
> 
> 

Agreed. Thanks for laying out some alternatives! It's good to consider
other possibilities.

> >  
> > > Going through the IETF process for something that is entirely private to
> > > Linux seems a bit more than should be necessary..
> > > 
> > 
> > Agreed. Given that this is our own protocol extension and we don't have
> > any expectation of other clients or servers implementing this, I don't
> > see the point. I do agree that trying to avoid program number conflicts
> > is a good thing though.
> > -- 
> > Jeff Layton <jlayton@kernel.org>
> > 
>
Mike Snitzer June 19, 2024, 10:46 p.m. UTC | #12
On Thu, Jun 20, 2024 at 07:09:23AM +1000, NeilBrown wrote:
> On Wed, 19 Jun 2024, Jeff Layton wrote:
> > On Wed, 2024-06-19 at 17:10 +1000, NeilBrown wrote:
> > > On Wed, 19 Jun 2024, Christoph Hellwig wrote:
> > > > What happened to the requirement that all protocol extensions added
> > > > to Linux need to be standardized in IETF RFCs?
> > > > 
> > > > 
> > > 
> > > Is that requirement documented somewhere?  Not that I doubt it, but it
> > > would be nice to know where it is explicit.  I couldn't quickly find
> > > anything in Documentation/
> > > 
> > > Can we get by without the LOCALIO protocol?
> > > 
> > > For NFSv4.1 we could use the server_owner4 returned by EXCHANGE_ID.  It
> > > is explicitly documented as being usable to determine if two servers are
> > > the same.
> > > 
> > > For NFSv4.0 ... I don't think we should encourage that to be used.
> > > 
> > > For NFSv3 it is harder.  I'm not as ready to deprecate it as I am for
> > > 4.0.  There is nothing in NFSv3 or MOUNT or NLM that is comparable to
> > > server_owner4.  If krb5 was used there would probably be a server
> > > identity in there that could be used.
> > > I think the server could theoretically return an AUTH_SYS verifier in
> > > each RPC reply and that could be used to identify the server.  I'm not
> > > sure that is a good idea though.
> > > 
> > 
> > My idea for v3 was that the localio client could do an O_TMPFILE create
> > on the exported fs and write some random junk to it (a uuid or
> > something). Construct the filehandle for that and then the client could
> > try to issue a READ for that filehandle via the NFS server. If it finds
> > that filehandle and the contents are correct then you're on the same
> > host. Then you just close the file and it should clean itself up.
> 
> I can't see how this would work, but maybe I don't have a good enough
> imagination.
> 
> The high-level view of the proposed protocol is:
>   - client asks remote server to identify itself.
>   - server returns an identity
>   - client uses local-sideband to ask each local server if it has the
>     given identity.
> 
> I don't see where an O_TMPFILE could fit into this, or how a different
> high-level approach would be any better.
> 
> For NFSv3 the client could ask with a new Program or Version or
> Procedure, or all three.  Or it could ask with a new file-handle or path
> name.  I imagine using a webnfs (rfc2054) multi-component lookup on the
> public filehandle for "/linux/config/server-id" and getting back a
> filehandle which encodes the server ID somehow.  All these seem credible
> options and it is not clear that any one is better than any other.
> 
> For NFSv4.1 I think that LOCALIO looks a lot like trunking and so using
> exactly the same mechanism to determine if two servers are the same is a
> good idea.
> But then LOCALIO also looks a lot like a new pNFS/DS protocol so maybe
> we should specify that protocol and use GETDEVICELIST or GETDEVICEINFO
> to find the identity of the server.

Easy enough to switch the RPC call used.  If either GETDEVICELIST or
GETDEVICEINFO can convey a UUID it sounds fine to me.  But for v4
EXCHANGE_ID already exists.

> > This is a little less straightforward and efficient than the localio
> > protocol that Mike is proposing, but requires no protocol extensions.
> 
> I think that if we use anything other than the server-id in the
> EXCHANGE_ID response, then we are defining a new protocol as it is a new
> request which we expect existing servers to ignore or fail, even though
> they have never been tested to ignore/fail that particular request.
> 
> Of all the options I would guess that a new version for an existing
> protocol would be safest as that is the most likely to have been tested.
> A new RPC program is probably conceptually simplest.  A little hack in
> LOOKUPv3 to detect the public filehandle etc is probably the easiest to
> code, and a new pnfs/ds protocol is probably the cleanest overall
> except that it doesn't support NFSv3.

NFSv3 support is pretty important. So when faced with no options for
v3, I decided to implement LOCALIO (with Trond's encouragement) and
just have both NFS versions use it.

I _can_ frame the v4 support in terms of EXCHANGE_ID (and have already
implemented it before writing LOCALIO, patches aren't on the internet
but I can unearth that work if needed).  But I'd still maintain the
nfsd_uuids list, and have nfs_localio's nfsd_uuid_is_local() look up
the UUID that was embedded in the v4 EXCHANGE_ID payload...

But yes, we'd still need LOCALIO's GETUUID rpc for v3.  So EXCHANGE_ID
really doesn't buy much (because we'd still need an IANA registered
rpc program number).

> My purpose in all this is not to replace Mike's LOCALIO protocol, but to
> explore the solution space to ensure there is nothing that is obviously
> better.  As yet, I don't think there is.

Thanks, I really appreciate your professionalism and attention to
detail.  Pleasure working with you again Neil!

Mike
Christoph Hellwig June 20, 2024, 5:16 a.m. UTC | #13
On Thu, Jun 20, 2024 at 07:09:23AM +1000, NeilBrown wrote:
> I don't see where an O_TMPFILE could fit into this, or how a different
> high-level approach would be any better.

... especially given that O_TMPFILE requires an open but unlinked
file, which v3 can't really support at all, and even for v4 would
require quite a bit of work (although it would be very useful).
Christoph Hellwig June 20, 2024, 5:18 a.m. UTC | #14
On Wed, Jun 19, 2024 at 01:57:00PM -0400, Mike Snitzer wrote:
> > Going through the IETF process for something that is entirely private to
> > Linux seems a bit more than should be necessary..
> 
> I have to believe Christoph didn't appreciate this LOCALIO protocol is
> an entirely private implementation detail to Linux (that allows client
> and server handshake).  I've clarified that in Documentation (for v6).

Well, it still has XDR and code point registrations.