diff mbox

[RFC] nfsd: fix error handling for clients that fail to return the layout

Message ID 1471368867-29362-1-git-send-email-jlayton@redhat.com (mailing list archive)
State Deferred, archived
Headers show

Commit Message

Jeff Layton Aug. 16, 2016, 5:34 p.m. UTC
Currently, when the client fails to return the layout we'll eventually
give up trying but leave the layout in place. What we really need to
do here is fence the client in this case. Have it fall through to that
code in that case instead of into the NFS4ERR_NOMATCHING_LAYOUT case.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfsd/nfs4layouts.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Note that this patch is untested, other than for compilation as I
don't have a block/scsi pnfs setup on which to do so. Still, I think
it makes more sense to fence clients that don't return the layout
instead of just giving up.

Comments

J. Bruce Fields Aug. 16, 2016, 9:17 p.m. UTC | #1
On Tue, Aug 16, 2016 at 01:34:27PM -0400, Jeff Layton wrote:
> Currently, when the client fails to return the layout we'll eventually
> give up trying but leave the layout in place.

Maybe I'm not reading the code right, but I think the layout is
eventually removed unconditionally in every case, by
nfsd4_cb_layout_release--were you seeing something else?

> What we really need to
> do here is fence the client in this case. Have it fall through to that
> code in that case instead of into the NFS4ERR_NOMATCHING_LAYOUT case.

So the only change here is to fence in the case a client keeps
responding with DELAY, right?

That does seem like an improvement.

I wonder if the result is completely correct.

In the list_empty(&ls->ls_layouts) case, shouldn't we also call
trace_layout_recall_done()?

Does it really make sense to retry the callback in the case the callback
succeeds but the client hasn't returned yet?

If the client returns the layout but returns a status other than 0,
DELAY, or NOMATCHING_LAYOUT, is it really correct to fence it?

If trunking's in effect and we have to change the callback connection
while waiting for the return, do we do the right thing?  (Looking at
it... Actually, I think nfsd4_cb_sequence_done should handle these cases
for us, OK, maybe I'm less worried.)

--b.

> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/nfsd/nfs4layouts.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> Note that this patch is untested, other than for compilation as I
> don't have a block/scsi pnfs setup on which to do so. Still, I think
> it makes more sense to fence clients that don't return the layout
> instead of just giving up.
> 
> diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
> index 42aace4fc4c8..596205d939a1 100644
> --- a/fs/nfsd/nfs4layouts.c
> +++ b/fs/nfsd/nfs4layouts.c
> @@ -686,10 +686,6 @@ nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
>  			return 0;
>  		}
>  		/* Fallthrough */
> -	case -NFS4ERR_NOMATCHING_LAYOUT:
> -		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
> -		task->tk_status = 0;
> -		return 1;
>  	default:
>  		/*
>  		 * Unknown error or non-responding client, we'll need to fence.
> @@ -702,6 +698,10 @@ nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
>  		else
>  			nfsd4_cb_layout_fail(ls);
>  		return -1;
> +	case -NFS4ERR_NOMATCHING_LAYOUT:
> +		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
> +		task->tk_status = 0;
> +		return 1;
>  	}
>  }
>  
> -- 
> 2.7.4
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton Aug. 17, 2016, 1:28 p.m. UTC | #2
On Tue, 2016-08-16 at 17:17 -0400, J. Bruce Fields wrote:
> On Tue, Aug 16, 2016 at 01:34:27PM -0400, Jeff Layton wrote:
> > 
> > Currently, when the client fails to return the layout we'll
> > eventually
> > give up trying but leave the layout in place.
> 
> Maybe I'm not reading the code right, but I think the layout is
> eventually removed unconditionally in every case, by
> nfsd4_cb_layout_release--were you seeing something else?
> 

Yes, you're correct. Still, we'll be revoking the layout record in the
server, but the client could still think it has it and it could
continue writing to the storage. If we're revoking the layout without
the client explicitly acknowledging that it's no longer using it, then
we really do have to fence as well.

> > 
> > What we really need to
> > do here is fence the client in this case. Have it fall through to
> > that
> > code in that case instead of into the NFS4ERR_NOMATCHING_LAYOUT
> > case.
> 
> So the only change here is to fence in the case a client keeps
> responding with DELAY, right?
> 

That, or if it is returning 0 and not following up with a LAYOUTRETURN.

> That does seem like an improvement.
> 
> I wonder if the result is completely correct.
> 
> In the list_empty(&ls->ls_layouts) case, shouldn't we also call
> trace_layout_recall_done()?
> 
> Does it really make sense to retry the callback in the case the
> callback
> succeeds but the client hasn't returned yet?
> 

No it doesn't, but fixing that is a little more difficult. Right now,
we time out the recall in the ->done operation. If we don't retry the
call then we don't have a mechanism to handle the timeout.

There are several ways to fix that, but they're all pretty ugly,
AFAICT. Any thoughts on how you'd like to handle the timeout?

> If the client returns the layout but returns a status other than 0,
> DELAY, or NOMATCHING_LAYOUT, is it really correct to fence it?
> 

Probably not. Some of the error cases (e.g. NFS4ERR_BADHANDLE) could
probably be considered the same as NFS4ERR_NOMATCHING_LAYOUT. That
said, fencing is generally the safer option when we're in doubt.

> If trunking's in effect and we have to change the callback connection
> while waiting for the return, do we do the right thing?  (Looking at
> it... Actually, I think nfsd4_cb_sequence_done should handle these
> cases
> for us, OK, maybe I'm less worried.)
> 

Yeah, that should all be handled in the CB_SEQUENCE op.

> --b.
> 
> > 
> > 
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/nfsd/nfs4layouts.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > Note that this patch is untested, other than for compilation as I
> > don't have a block/scsi pnfs setup on which to do so. Still, I
> > think
> > it makes more sense to fence clients that don't return the layout
> > instead of just giving up.
> > 
> > diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
> > index 42aace4fc4c8..596205d939a1 100644
> > --- a/fs/nfsd/nfs4layouts.c
> > +++ b/fs/nfsd/nfs4layouts.c
> > @@ -686,10 +686,6 @@ nfsd4_cb_layout_done(struct nfsd4_callback
> > *cb, struct rpc_task *task)
> >  			return 0;
> >  		}
> >  		/* Fallthrough */
> > -	case -NFS4ERR_NOMATCHING_LAYOUT:
> > -		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
> > -		task->tk_status = 0;
> > -		return 1;
> >  	default:
> >  		/*
> >  		 * Unknown error or non-responding client, we'll
> > need to fence.
> > @@ -702,6 +698,10 @@ nfsd4_cb_layout_done(struct nfsd4_callback
> > *cb, struct rpc_task *task)
> >  		else
> >  			nfsd4_cb_layout_fail(ls);
> >  		return -1;
> > +	case -NFS4ERR_NOMATCHING_LAYOUT:
> > +		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
> > +		task->tk_status = 0;
> > +		return 1;
> >  	}
> >  }
> >  
> > -- 
> > 2.7.4
J. Bruce Fields Aug. 17, 2016, 2:24 p.m. UTC | #3
On Wed, Aug 17, 2016 at 09:28:14AM -0400, Jeff Layton wrote:
> On Tue, 2016-08-16 at 17:17 -0400, J. Bruce Fields wrote:
> > On Tue, Aug 16, 2016 at 01:34:27PM -0400, Jeff Layton wrote:
> > > 
> > > Currently, when the client fails to return the layout we'll
> > > eventually
> > > give up trying but leave the layout in place.
> > 
> > Maybe I'm not reading the code right, but I think the layout is
> > eventually removed unconditionally in every case, by
> > nfsd4_cb_layout_release--were you seeing something else?
> > 
> 
> Yes, you're correct. Still, we'll be revoking the layout record in the
> server, but the client could still think it has it and it could
> continue writing to the storage. If we're revoking the layout without
> the client explicitly acknowledging that it's no longer using it, then
> we really do have to fence as well.

Got it.  Could just use some changelog clarification.

> > > What we really need to
> > > do here is fence the client in this case. Have it fall through to
> > > that
> > > code in that case instead of into the NFS4ERR_NOMATCHING_LAYOUT
> > > case.
> > 
> > So the only change here is to fence in the case a client keeps
> > responding with DELAY, right?
> > 
> 
> That, or if it is returning 0 and not following up with a LAYOUTRETURN.
> 
> > That does seem like an improvement.
> > 
> > I wonder if the result is completely correct.
> > 
> > In the list_empty(&ls->ls_layouts) case, shouldn't we also call
> > trace_layout_recall_done()?
> > 
> > Does it really make sense to retry the callback in the case the
> > callback
> > succeeds but the client hasn't returned yet?
> > 
> 
> No it doesn't, but fixing that is a little more difficult. Right now,
> we time out the recall in the ->done operation. If we don't retry the
> call then we don't have a mechanism to handle the timeout.
> 
> There are several ways to fix that, but they're all pretty ugly,
> AFAICT. Any thoughts on how you'd like to handle the timeout?

The delegation code keeps the delegation return or revocation mostly
separate from the callback.  Revocation is handled by the laundromat.

I'd prefer delegation and layout timeouts were handled in roughly the
same way.

As you know it's taken a long time to shake races out of the delegation
code.  So if the current layout recall approach is simpler, and if the
only drawback is redundant callbacks, then I guess there's no rush to
rewrite everything.

> > If the client returns the layout but returns a status other than 0,
> > DELAY, or NOMATCHING_LAYOUT, is it really correct to fence it?
> > 
> 
> Probably not. Some of the error cases (e.g. NFS4ERR_BADHANDLE) could
> probably be considered the same as NFS4ERR_NOMATCHING_LAYOUT. That
> said, fencing is generally the safer option when we're in doubt.
> 
> > If trunking's in effect and we have to change the callback connection
> > while waiting for the return, do we do the right thing?  (Looking at
> > it... Actually, I think nfsd4_cb_sequence_done should handle these
> > cases
> > for us, OK, maybe I'm less worried.)
> > 
> 
> Yeah, that should all be handled in the CB_SEQUENCE op.

That said, we need pynfs tests or similar to test those codepaths, there
are probably lurking bugs.  That'd be a prerequisite for any rewrite of
the layout revocation.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 42aace4fc4c8..596205d939a1 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -686,10 +686,6 @@  nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
 			return 0;
 		}
 		/* Fallthrough */
-	case -NFS4ERR_NOMATCHING_LAYOUT:
-		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
-		task->tk_status = 0;
-		return 1;
 	default:
 		/*
 		 * Unknown error or non-responding client, we'll need to fence.
@@ -702,6 +698,10 @@  nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
 		else
 			nfsd4_cb_layout_fail(ls);
 		return -1;
+	case -NFS4ERR_NOMATCHING_LAYOUT:
+		trace_layout_recall_done(&ls->ls_stid.sc_stateid);
+		task->tk_status = 0;
+		return 1;
 	}
 }