[v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED

Message ID 115c01d5c66d$5dcd7ae0$196870a0$@gmail.com
State New
Series
  • [v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED

Commit Message

Robert Milkowski Jan. 8, 2020, 9:48 p.m. UTC
From: Robert Milkowski <rmilkowski@gmail.com>

Currently, if an NFS server returns NFS4ERR_EXPIRED to open() and similar
calls, we return EIO to applications without even trying to recover.

Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
---
 fs/nfs/nfs4proc.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Robert Milkowski Jan. 17, 2020, 4:12 p.m. UTC | #1
Anyone please?


-----Original Message-----
From: Robert Milkowski <rmilkowski@gmail.com> 
Sent: 08 January 2020 21:48
To: linux-nfs@vger.kernel.org
Cc: 'Trond Myklebust' <trondmy@hammerspace.com>; 'Chuck Lever'
<chuck.lever@oracle.com>; 'Anna Schumaker' <anna.schumaker@netapp.com>;
linux-kernel@vger.kernel.org
Subject: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED

From: Robert Milkowski <rmilkowski@gmail.com>

Currently, if an nfs server returns NFS4ERR_EXPIRED to open(), etc.
we return EIO to applications without even trying to recover.

Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
---
 fs/nfs/nfs4proc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..2478405 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
 						stateid);
 				goto wait_on_recovery;
 			}
+			if (state == NULL) {
+				nfs4_schedule_lease_recovery(clp);
+				goto wait_on_recovery;
+			}
 			/* Fall through */
 		case -NFS4ERR_OPENMODE:
 			if (inode) {
--
1.8.3.1
Trond Myklebust Jan. 17, 2020, 5:24 p.m. UTC | #2
On Fri, 2020-01-17 at 16:12 +0000, Robert Milkowski wrote:
> Anyone please?
> 
> 
> -----Original Message-----
> From: Robert Milkowski <rmilkowski@gmail.com> 
> Sent: 08 January 2020 21:48
> To: linux-nfs@vger.kernel.org
> Cc: 'Trond Myklebust' <trondmy@hammerspace.com>; 'Chuck Lever'
> <chuck.lever@oracle.com>; 'Anna Schumaker' <anna.schumaker@netapp.com
> >;
> linux-kernel@vger.kernel.org
> Subject: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED
> 
> From: Robert Milkowski <rmilkowski@gmail.com>
> 
> Currently, if an nfs server returns NFS4ERR_EXPIRED to open(), etc.
> we return EIO to applications without even trying to recover.
> 
> Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle
> revoke/expiry
> of a single stateid")
> Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> ---
>  fs/nfs/nfs4proc.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index
> 76d3716..2478405
> 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct
> nfs_server
> *server,
>  						stateid);
>  				goto wait_on_recovery;
>  			}
> +			if (state == NULL) {
> +				nfs4_schedule_lease_recovery(clp);
> +				goto wait_on_recovery;
> +			}
>  			/* Fall through */
>  		case -NFS4ERR_OPENMODE:
>  			if (inode) {
> --
> 1.8.3.1
> 
> 

Does this apply to any case other than NFS4ERR_EXPIRED in the specific
case of nfs4_do_open()? I can't see that it does. It looks to me as if
the open recovery routines already have their own handling of this
case.

If so, why not just add it as a special case in the nfs4_do_open()
error handling? Otherwise this patch will end up overriding other
generic cases where we have an inode, but no open state.

Note that _nfs4_do_open() already waits for lease recovery, so we only
need the call to nfs4_schedule_lease_recovery().
Robert Milkowski Jan. 22, 2020, 2:20 p.m. UTC | #3
> -----Original Message-----
> From: Trond Myklebust <trondmy@hammerspace.com>
> Sent: 17 January 2020 17:24
> To: linux-nfs@vger.kernel.org; rmilkowski@gmail.com
> Cc: anna.schumaker@netapp.com; linux-kernel@vger.kernel.org;
> chuck.lever@oracle.com
> Subject: Re: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED
> 
> On Fri, 2020-01-17 at 16:12 +0000, Robert Milkowski wrote:
> > Anyone please?
> >
> >
> > -----Original Message-----
> > From: Robert Milkowski <rmilkowski@gmail.com>
> > Sent: 08 January 2020 21:48
> > To: linux-nfs@vger.kernel.org
> > Cc: 'Trond Myklebust' <trondmy@hammerspace.com>; 'Chuck Lever'
> > <chuck.lever@oracle.com>; 'Anna Schumaker' <anna.schumaker@netapp.com
> > >;
> > linux-kernel@vger.kernel.org
> > Subject: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED
> >
> > From: Robert Milkowski <rmilkowski@gmail.com>
> >
> > Currently, if an nfs server returns NFS4ERR_EXPIRED to open(), etc.
> > we return EIO to applications without even trying to recover.
> >
> > Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle
> > revoke/expiry of a single stateid")
> > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > ---
> >  fs/nfs/nfs4proc.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index
> > 76d3716..2478405
> > 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct
> > nfs_server *server,
> >  						stateid);
> >  				goto wait_on_recovery;
> >  			}
> > +			if (state == NULL) {
> > +				nfs4_schedule_lease_recovery(clp);
> > +				goto wait_on_recovery;
> > +			}
> >  			/* Fall through */
> >  		case -NFS4ERR_OPENMODE:
> >  			if (inode) {
> > --
> > 1.8.3.1
> >
> >
> 
> Does this apply to any case other than NFS4ERR_EXPIRED in the specific
> case of nfs4_do_open()? I can't see that it does. It looks to me as if
> the open recovery routines already have their own handling of this case.

I only observed the issue with open(). After further
review I think you are right and it only applies to nfs4_do_open().


> 
> If so, why not just add it as a special case in the nfs4_do_open() error
> handling? Otherwise this patch will end up overriding other generic
> cases where we have an inode, but no open state.
> 

Fair point.
So perhaps, a few lines further down, instead of:

			if (inode) {
...
			if (state == NULL) {
				break;
			}

There should be:

			if (inode) {
...
			if (state == NULL) {
				nfs4_schedule_lease_recovery(clp);
				goto wait_on_recovery;
			}



This way we know that inode cannot be null at this point, and it's a case where both inode and state are NULL.
This would be a little bit more general in case we reach this point.

But if you think it is better to move it to nfs4_do_open() then I've just tested the following patch:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..b7c4044 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3187,6 +3187,11 @@ static struct nfs4_state *nfs4_do_open(struct inode *dir,
                        exception.retry = 1;
                        continue;
                }
+               if (status == -NFS4ERR_EXPIRED) {
+                       nfs4_schedule_lease_recovery(server->nfs_client);
+                       exception.retry = 1;
+                       continue;
+               }
                if (status == -EAGAIN) {
                        /* We must have found a delegation */
                        exception.retry = 1;



Please let me know which way you want to proceed and I will submit an updated patch.



> Note that _nfs4_do_open() already waits for lease recovery, so we only
> need the call to nfs4_schedule_lease_recovery().
>

Yep
Trond Myklebust Jan. 23, 2020, 7:33 p.m. UTC | #4
On Wed, 2020-01-22 at 14:20 +0000, Robert Milkowski wrote:
> > -----Original Message-----
> > From: Trond Myklebust <trondmy@hammerspace.com>
> > Sent: 17 January 2020 17:24
> > To: linux-nfs@vger.kernel.org; rmilkowski@gmail.com
> > Cc: anna.schumaker@netapp.com; linux-kernel@vger.kernel.org;
> > chuck.lever@oracle.com
> > Subject: Re: [PATCH v2] NFSv4: try lease recovery on
> > NFS4ERR_EXPIRED
> > 
> > On Fri, 2020-01-17 at 16:12 +0000, Robert Milkowski wrote:
> > > Anyone please?
> > > 
> > > 
> > > -----Original Message-----
> > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > Sent: 08 January 2020 21:48
> > > To: linux-nfs@vger.kernel.org
> > > Cc: 'Trond Myklebust' <trondmy@hammerspace.com>; 'Chuck Lever'
> > > <chuck.lever@oracle.com>; 'Anna Schumaker' <
> > > anna.schumaker@netapp.com
> > > > ;
> > > linux-kernel@vger.kernel.org
> > > Subject: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED
> > > 
> > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > 
> > > Currently, if an nfs server returns NFS4ERR_EXPIRED to open(),
> > > etc.
> > > we return EIO to applications without even trying to recover.
> > > 
> > > Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle
> > > revoke/expiry of a single stateid")
> > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > > ---
> > >  fs/nfs/nfs4proc.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index
> > > 76d3716..2478405
> > > 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct
> > > nfs_server *server,
> > >  						stateid);
> > >  				goto wait_on_recovery;
> > >  			}
> > > +			if (state == NULL) {
> > > +				nfs4_schedule_lease_recovery(clp);
> > > +				goto wait_on_recovery;
> > > +			}
> > >  			/* Fall through */
> > >  		case -NFS4ERR_OPENMODE:
> > >  			if (inode) {
> > > --
> > > 1.8.3.1
> > > 
> > > 
> > 
> > Does this apply to any case other than NFS4ERR_EXPIRED in the
> > specific
> > case of nfs4_do_open()? I can't see that it does. It looks to me as
> > if
> > the open recovery routines already have their own handling of this
> > case.
> 
> I only observed the issue with open(). After further
> review I think you are right and it only applies to nfs4_do_open().
> 
> 
> > If so, why not just add it as a special case in the nfs4_do_open()
> > error
> > handling? Otherwise this patch will end up overriding other generic
> > cases where we have an inode, but no open state.
> > 
> 
> Fair point.
> So perhaps, few lines further instead of:
> 
> 			if (inode) {
> ...
> 			if (state == NULL) {
> 					break;
> 			}
> 
> There should be:
> 
> 			if (inode) {
> ...
> 			if (state == NULL) {
> 				nfs4_schedule_lease_recovery(clp);
> 				goto wait_on_recovery;
> 			}
> 
> 
> 
> This way we know that inode cannot be null at this point, and it's a
> case where both inode and state are NULL.
> This would be a little bit more general in case we reach this point.
> 
> But if you think it is better to move it to nfs4_do_open() then I've
> just tested the following patch:
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..b7c4044 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -3187,6 +3187,11 @@ static struct nfs4_state *nfs4_do_open(struct
> inode *dir,
>                         exception.retry = 1;
>                         continue;
>                 }
> +               if (status == -NFS4ERR_EXPIRED) {
> +                       nfs4_schedule_lease_recovery(server-
> >nfs_client);
> +                       exception.retry = 1;
> +                       continue;
> +               }
>                 if (status == -EAGAIN) {
>                         /* We must have found a delegation */
>                         exception.retry = 1;
> 

This looks like what I'm asking for, yes. That seems like the minimal
patch that addresses the problem you're describing.
Robert Milkowski Jan. 27, 2020, 2:46 p.m. UTC | #5
On Thu, 23 Jan 2020 at 19:33, Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Wed, 2020-01-22 at 14:20 +0000, Robert Milkowski wrote:
> > > -----Original Message-----
> > > From: Trond Myklebust <trondmy@hammerspace.com>
> > > Sent: 17 January 2020 17:24
> > > To: linux-nfs@vger.kernel.org; rmilkowski@gmail.com
> > > Cc: anna.schumaker@netapp.com; linux-kernel@vger.kernel.org;
> > > chuck.lever@oracle.com
> > > Subject: Re: [PATCH v2] NFSv4: try lease recovery on
> > > NFS4ERR_EXPIRED
> > >
> > > On Fri, 2020-01-17 at 16:12 +0000, Robert Milkowski wrote:
> > > > Anyone please?
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > > Sent: 08 January 2020 21:48
> > > > To: linux-nfs@vger.kernel.org
> > > > Cc: 'Trond Myklebust' <trondmy@hammerspace.com>; 'Chuck Lever'
> > > > <chuck.lever@oracle.com>; 'Anna Schumaker' <
> > > > anna.schumaker@netapp.com
> > > > > ;
> > > > linux-kernel@vger.kernel.org
> > > > Subject: [PATCH v2] NFSv4: try lease recovery on NFS4ERR_EXPIRED
> > > >
> > > > From: Robert Milkowski <rmilkowski@gmail.com>
> > > >
> > > > Currently, if an nfs server returns NFS4ERR_EXPIRED to open(),
> > > > etc.
> > > > we return EIO to applications without even trying to recover.
> > > >
> > > > Fixes: 272289a3df72 ("NFSv4: nfs4_do_handle_exception() handle
> > > > revoke/expiry of a single stateid")
> > > > Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
> > > > ---
> > > >  fs/nfs/nfs4proc.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index
> > > > 76d3716..2478405
> > > > 100644
> > > > --- a/fs/nfs/nfs4proc.c
> > > > +++ b/fs/nfs/nfs4proc.c
> > > > @@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct
> > > > nfs_server *server,
> > > >                                           stateid);
> > > >                           goto wait_on_recovery;
> > > >                   }
> > > > +                 if (state == NULL) {
> > > > +                         nfs4_schedule_lease_recovery(clp);
> > > > +                         goto wait_on_recovery;
> > > > +                 }
> > > >                   /* Fall through */
> > > >           case -NFS4ERR_OPENMODE:
> > > >                   if (inode) {
> > > > --
> > > > 1.8.3.1
> > > >
> > > >
> > >
> > > Does this apply to any case other than NFS4ERR_EXPIRED in the
> > > specific
> > > case of nfs4_do_open()? I can't see that it does. It looks to me as
> > > if
> > > the open recovery routines already have their own handling of this
> > > case.
> >
> > I only observed the issue with open(). After further
> > review I think you are right and it only applies to nfs4_do_open().
> >
> >
> > > If so, why not just add it as a special case in the nfs4_do_open()
> > > error
> > > handling? Otherwise this patch will end up overriding other generic
> > > cases where we have an inode, but no open state.
> > >
> >
> > Fair point.
> > So perhaps, few lines further instead of:
> >
> >                       if (inode) {
> > ...
> >                       if (state == NULL) {
> >                                       break;
> >                       }
> >
> > There should be:
> >
> >                       if (inode) {
> > ...
> >                       if (state == NULL) {
> >                               nfs4_schedule_lease_recovery(clp);
> >                               goto wait_on_recovery;
> >                       }
> >
> >
> >
> > This way we know that inode cannot be null at this point, and it's a
> > case where both inode and state are NULL.
> > This would be a little bit more general in case we reach this point.
> >
> > But if you think it is better to move it to nfs4_do_open() then I've
> > just tested the following patch:
> >
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 76d3716..b7c4044 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -3187,6 +3187,11 @@ static struct nfs4_state *nfs4_do_open(struct
> > inode *dir,
> >                         exception.retry = 1;
> >                         continue;
> >                 }
> > +               if (status == -NFS4ERR_EXPIRED) {
> > +                       nfs4_schedule_lease_recovery(server-
> > >nfs_client);
> > +                       exception.retry = 1;
> > +                       continue;
> > +               }
> >                 if (status == -EAGAIN) {
> >                         /* We must have found a delegation */
> >                         exception.retry = 1;
> >
>
> This looks like what I'm asking for, yes. That seems like the minimal
> patch that addresses the problem you're describing.
>

Ok, will submit later today or tomorrow.
Thanks.

Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 76d3716..2478405 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -481,6 +481,10 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
 						stateid);
 				goto wait_on_recovery;
 			}
+			if (state == NULL) {
+				nfs4_schedule_lease_recovery(clp);
+				goto wait_on_recovery;
+			}
 			/* Fall through */
 		case -NFS4ERR_OPENMODE:
 			if (inode) {