[-,v2] NFSv4: handle EINVAL from EXCHANGE_ID better.
diff mbox

Message ID 878tasc3ag.fsf@notabene.neil.brown.name
State New
Headers show

Commit Message

NeilBrown March 15, 2018, 11:44 p.m. UTC
nfs4_proc_exchange_id() can return -EINVAL if the server
reported NFS4INVAL (which I have seen in a packet trace),
or nfs4_check_cl_exchange_flags() exchange flags detects
a problem.
Each of these mean that NFSv4.1 later cannot be use, but
they should not prevent fallback to NFSv4.0.

Currently this EINVAL error is returned by nfs4_proc_exchange_id() to
nfs41_discover_server_trunking(), and thence to
nfs4_discover_server_trunking().
nfs4_discover_server_trunking() doesn't understand EINVAL, so converts
it to EIO which causes mount.nfs to think something is horribly wrong
and to give up.

It would be a more graceful failure if nfs4_discover_server_trunking()
mapped -EINVAL to -EPROTONOSUPPORT - a failure to negotiate a client
ID clearly shows that NFSv4.1 cannot be supported, but isn't as
general a failure as EIO.

Signed-off-by: NeilBrown <neilb@suse.com>
---

Sorry - a bit too trigger-happy with that first version of the patch.

NeilBrown

 fs/nfs/nfs4state.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Tigran Mkrtchyan March 16, 2018, 9:31 a.m. UTC | #1
Hi Neil,

according to rfc5661, NFS4ERR_INVAL is returned by the server if it
thinks that client sends an invalid request (e.g. points to a client bug)
or server misinterpret it (broken server).

With your change instead of failing the mount, client will silently go for
v4.0, even v4.1 mount was requested and produce undesirable behavior, e.g.
proxy-io instead of pnfs. I fill prefer fail-fast instead of long debug
sessions.

On the other hand, I understand, that it's not always possible to fix server
or clients in production environment and time-to-time workarounds are necessary.


Tigran.


----- Original Message -----
> From: "NeilBrown" <neilb@suse.com>
> To: "Trond Myklebust" <trond.myklebust@primarydata.com>, "Anna Schumaker" <anna.schumaker@netapp.com>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Friday, March 16, 2018 12:44:23 AM
> Subject: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.

> nfs4_proc_exchange_id() can return -EINVAL if the server
> reported NFS4INVAL (which I have seen in a packet trace),
> or nfs4_check_cl_exchange_flags() exchange flags detects
> a problem.
> Each of these mean that NFSv4.1 later cannot be use, but
> they should not prevent fallback to NFSv4.0.
> 
> Currently this EINVAL error is returned by nfs4_proc_exchange_id() to
> nfs41_discover_server_trunking(), and thence to
> nfs4_discover_server_trunking().
> nfs4_discover_server_trunking() doesn't understand EINVAL, so converts
> it to EIO which causes mount.nfs to think something is horribly wrong
> and to give up.
> 
> It would be a more graceful failure if nfs4_discover_server_trunking()
> mapped -EINVAL to -EPROTONOSUPPORT - a failure to negotiate a client
> ID clearly shows that NFSv4.1 cannot be supported, but isn't as
> general a failure as EIO.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> 
> Sorry - a bit too trigger-happy with that first version of the patch.
> 
> NeilBrown
> 
> fs/nfs/nfs4state.c | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 91a4d4eeb235..b988e460553d 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2219,6 +2219,8 @@ int nfs4_discover_server_trunking(struct nfs_client *clp,
> 		clnt = clp->cl_rpcclient;
> 		goto again;
> 
> +	case -NFS4ERR_INVAL:
> +		/* Server confused - assume this minor isn't supported */
> 	case -NFS4ERR_MINOR_VERS_MISMATCH:
> 		status = -EPROTONOSUPPORT;
> 		break;
> --
> 2.14.0.rc0.dirty
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Trond Myklebust March 16, 2018, 12:10 p.m. UTC | #2
On Fri, 2018-03-16 at 10:31 +0100, Mkrtchyan, Tigran wrote:
> Hi Neil,

> 

> according to rfc5661, NFS4ERR_INVAL is returned by the server if it

> thinks that client sends an invalid request (e.g. points to a client

> bug)

> or server misinterpret it (broken server).

> 

> With your change instead of failing the mount, client will silently

> go for

> v4.0, even v4.1 mount was requested and produce undesirable behavior,

> e.g.

> proxy-io instead of pnfs. I fill prefer fail-fast instead of long

> debug

> sessions.

> 

> On the other hand, I understand, that it's not always possible to fix

> server

> or clients in production environment and time-to-time workarounds are

> necessary.

> 

> 


I'd tend to agree with Tigran. Hiding server bugs, should not be a
priority and particularly not in this case, where the workaround is
simple: either turn off version negotiation altogether, or edit
/etc/nfsmount.conf to negotiate a different set of versions.

What we might want to do, is make it easier to allow the user to detect
that this is indeed a server bug and is not a problem with the
arguments supplied to the "mount" utility. Perhaps we might have the
kernel log something in the syslogs?

Cheers
  Trond

> Tigran.

> 

> 

> ----- Original Message -----

> > From: "NeilBrown" <neilb@suse.com>

> > To: "Trond Myklebust" <trond.myklebust@primarydata.com>, "Anna

> > Schumaker" <anna.schumaker@netapp.com>

> > Cc: "linux-nfs" <linux-nfs@vger.kernel.org>

> > Sent: Friday, March 16, 2018 12:44:23 AM

> > Subject: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.

> > nfs4_proc_exchange_id() can return -EINVAL if the server

> > reported NFS4INVAL (which I have seen in a packet trace),

> > or nfs4_check_cl_exchange_flags() exchange flags detects

> > a problem.

> > Each of these mean that NFSv4.1 later cannot be use, but

> > they should not prevent fallback to NFSv4.0.

> > 

> > Currently this EINVAL error is returned by nfs4_proc_exchange_id()

> > to

> > nfs41_discover_server_trunking(), and thence to

> > nfs4_discover_server_trunking().

> > nfs4_discover_server_trunking() doesn't understand EINVAL, so

> > converts

> > it to EIO which causes mount.nfs to think something is horribly

> > wrong

> > and to give up.

> > 

> > It would be a more graceful failure if

> > nfs4_discover_server_trunking()

> > mapped -EINVAL to -EPROTONOSUPPORT - a failure to negotiate a

> > client

> > ID clearly shows that NFSv4.1 cannot be supported, but isn't as

> > general a failure as EIO.

> > 

> > Signed-off-by: NeilBrown <neilb@suse.com>

> > ---

> > 

> > Sorry - a bit too trigger-happy with that first version of the

> > patch.

> > 

> > NeilBrown

> > 

> > fs/nfs/nfs4state.c | 2 ++

> > 1 file changed, 2 insertions(+)

> > 

> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c

> > index 91a4d4eeb235..b988e460553d 100644

> > --- a/fs/nfs/nfs4state.c

> > +++ b/fs/nfs/nfs4state.c

> > @@ -2219,6 +2219,8 @@ int nfs4_discover_server_trunking(struct

> > nfs_client *clp,

> > 		clnt = clp->cl_rpcclient;

> > 		goto again;

> > 

> > +	case -NFS4ERR_INVAL:

> > +		/* Server confused - assume this minor isn't

> > supported */

> > 	case -NFS4ERR_MINOR_VERS_MISMATCH:

> > 		status = -EPROTONOSUPPORT;

> > 		break;

> > --

> > 2.14.0.rc0.dirty

> 

> --

> To unsubscribe from this list: send the line "unsubscribe linux-nfs"

> in

> the body of a message to majordomo@vger.kernel.org

> More majordomo info at  http://vger.kernel.org/majordomo-info.html

> 

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
NeilBrown March 20, 2018, 12:09 a.m. UTC | #3
On Fri, Mar 16 2018, Mkrtchyan, Tigran wrote:

> Hi Neil,
>
> according to rfc5661, NFS4ERR_INVAL is returned by the server if it
> thinks that client sends an invalid request (e.g. points to a client bug)
> or server misinterpret it (broken server).

Agreed.

>
> With your change instead of failing the mount, client will silently go for
> v4.0, even v4.1 mount was requested and produce undesirable behavior, e.g.
> proxy-io instead of pnfs. I fill prefer fail-fast instead of long debug
> sessions.

I don't quite agree.  With my change, the client will behave exactly the
same way as if the server explicitly didn't support v4.1
So if you explicitly ask for v4.1, you will get a fail-fast.
If you ask for "mount with whatever protocol seems to work" (the
default), then you will get a protocol that works - not 4.1 in this
case.

>
> On the other hand, I understand, that it's not always possible to fix server
> or clients in production environment and time-to-time workarounds are
> necessary.

Yes.  We are not overly eager to work-around broken implementations, but
my memory is that we do it when it brings measurable benefits without
unreasonable compromises.

In this can I am suggesting that change that results in a user-visible
error of EPROTONOSUPPORT instead of EIO - in a case where the server
doesn't respond to out v4.1 in the way we think it should.  I don't
think there is any compromise there.

Thanks,
NeilBrown


>
>
> Tigran.
>
>
> ----- Original Message -----
>> From: "NeilBrown" <neilb@suse.com>
>> To: "Trond Myklebust" <trond.myklebust@primarydata.com>, "Anna Schumaker" <anna.schumaker@netapp.com>
>> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
>> Sent: Friday, March 16, 2018 12:44:23 AM
>> Subject: [PATCH - v2] NFSv4: handle EINVAL from EXCHANGE_ID better.
>
>> nfs4_proc_exchange_id() can return -EINVAL if the server
>> reported NFS4INVAL (which I have seen in a packet trace),
>> or nfs4_check_cl_exchange_flags() exchange flags detects
>> a problem.
>> Each of these mean that NFSv4.1 later cannot be use, but
>> they should not prevent fallback to NFSv4.0.
>> 
>> Currently this EINVAL error is returned by nfs4_proc_exchange_id() to
>> nfs41_discover_server_trunking(), and thence to
>> nfs4_discover_server_trunking().
>> nfs4_discover_server_trunking() doesn't understand EINVAL, so converts
>> it to EIO which causes mount.nfs to think something is horribly wrong
>> and to give up.
>> 
>> It would be a more graceful failure if nfs4_discover_server_trunking()
>> mapped -EINVAL to -EPROTONOSUPPORT - a failure to negotiate a client
>> ID clearly shows that NFSv4.1 cannot be supported, but isn't as
>> general a failure as EIO.
>> 
>> Signed-off-by: NeilBrown <neilb@suse.com>
>> ---
>> 
>> Sorry - a bit too trigger-happy with that first version of the patch.
>> 
>> NeilBrown
>> 
>> fs/nfs/nfs4state.c | 2 ++
>> 1 file changed, 2 insertions(+)
>> 
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index 91a4d4eeb235..b988e460553d 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -2219,6 +2219,8 @@ int nfs4_discover_server_trunking(struct nfs_client *clp,
>> 		clnt = clp->cl_rpcclient;
>> 		goto again;
>> 
>> +	case -NFS4ERR_INVAL:
>> +		/* Server confused - assume this minor isn't supported */
>> 	case -NFS4ERR_MINOR_VERS_MISMATCH:
>> 		status = -EPROTONOSUPPORT;
>> 		break;
>> --
>> 2.14.0.rc0.dirty
J. Bruce Fields March 20, 2018, 9:48 p.m. UTC | #4
On Fri, Mar 16, 2018 at 10:44:23AM +1100, NeilBrown wrote:
> 
> nfs4_proc_exchange_id() can return -EINVAL if the server
> reported NFS4INVAL (which I have seen in a packet trace),

Can you say which server this was?  (One we can fix?)

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
NeilBrown March 20, 2018, 10:12 p.m. UTC | #5
On Tue, Mar 20 2018, J. Bruce Fields wrote:

> On Fri, Mar 16, 2018 at 10:44:23AM +1100, NeilBrown wrote:
>> 
>> nfs4_proc_exchange_id() can return -EINVAL if the server
>> reported NFS4INVAL (which I have seen in a packet trace),
>
> Can you say which server this was?  (One we can fix?)

Unfortunately not - all I know is that it is a bit old and isn't SUSE
(which doen't really narrow it down at all).
I've asked for more details and will pass on anything I discover.

Thanks,
NeilBrown
NeilBrown April 3, 2018, 12:41 a.m. UTC | #6
On Tue, Mar 20 2018, J. Bruce Fields wrote:

> On Fri, Mar 16, 2018 at 10:44:23AM +1100, NeilBrown wrote:
>> 
>> nfs4_proc_exchange_id() can return -EINVAL if the server
>> reported NFS4INVAL (which I have seen in a packet trace),
>
> Can you say which server this was?  (One we can fix?)

It was Linux 2.6.32 (on armv5), and presumably anything before
Commit: 357f54d6b382 ("NFS fix the setting of exchange id flag")
which landed in 2.6.38.

These servers return NFS4ERR_INVAL when EXCHGID4_FLAG_BIND_PRINC_STATEID
is set in the exchange_id flags, which Linux has done since
Commit: 4f0b429df104 ("NFSv4.1: Enable state protection")
in 3.11.

Maybe NFS should retry without that flag if it gets EINVAL??  Or it could
just say that NFSv4.1 isn't supported.  I still don't think EIO is justified.

Thanks,
NeilBrown
J. Bruce Fields April 3, 2018, 4:02 p.m. UTC | #7
On Tue, Apr 03, 2018 at 10:41:56AM +1000, NeilBrown wrote:
> On Tue, Mar 20 2018, J. Bruce Fields wrote:
> 
> > On Fri, Mar 16, 2018 at 10:44:23AM +1100, NeilBrown wrote:
> >> 
> >> nfs4_proc_exchange_id() can return -EINVAL if the server
> >> reported NFS4INVAL (which I have seen in a packet trace),
> >
> > Can you say which server this was?  (One we can fix?)
> 
> It was Linux 2.6.32 (on armv5), and presumably anything before
> Commit: 357f54d6b382 ("NFS fix the setting of exchange id flag")
> which landed in 2.6.38.

The kernel defaulted to keeping 4.1 off at that point, probably for good
reason--that default didn't change for another 4 or 5 years.  I suspect
this is only the first of multiple 4.1 bugs you'd run into in a server
of that vintage.  So the solution is probably just to keep 4.1 off.

--b.

> 
> These servers return NFS4ERR_INVAL when EXCHGID4_FLAG_BIND_PRINC_STATEID
> is set in the exchange_id flags, which Linux has done since
> Commit: 4f0b429df104 ("NFSv4.1: Enable state protection")
> in 3.11.
> 
> Maybe NFS should retry without that flag if it gets EINVAL??  Or it could
> just say that NFSv4.1 isn't supported.  I still don't think EIO is justified.
> 
> Thanks,
> NeilBrown


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 91a4d4eeb235..b988e460553d 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2219,6 +2219,8 @@  int nfs4_discover_server_trunking(struct nfs_client *clp,
 		clnt = clp->cl_rpcclient;
 		goto again;
 
+	case -NFS4ERR_INVAL:
+		/* Server confused - assume this minor isn't supported */
 	case -NFS4ERR_MINOR_VERS_MISMATCH:
 		status = -EPROTONOSUPPORT;
 		break;