[v1] nfs: Don't increment lock sequence ID after NFS4ERR_MOVED

Message ID 20170122190429.7337.77928.stgit@manet.1015granger.net (mailing list archive)
State New, archived

Commit Message

Chuck Lever Jan. 22, 2017, 7:04 p.m. UTC
Xuan Qi reports that the Linux NFSv4 client failed to lock a file
that was migrated. The steps he observed on the wire:

1. The client sent a LOCK request
2. The server replied NFS4ERR_MOVED
3. The client switched to the destination server
4. The client sent the LOCK request again with a bumped
   lock sequence ID
5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID

RFC 3530 section 8.1.5 provides a list of NFS errors which do not
bump a lock sequence ID.

However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.

Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
---
 include/linux/nfs4.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Chuck Lever Jan. 23, 2017, 3:01 p.m. UTC | #1
> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Xuan Qi reports that the Linux NFSv4 client failed to lock a file
> that was migrated. The steps he observed on the wire:
> 
> 1. The client sent a LOCK request
> 2. The server replied NFS4ERR_MOVED
> 3. The client switched to the destination server
> 4. The client sent the LOCK request again with a bumped
>   lock sequence ID
> 5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID

The list of steps could be more clear:

1. The client sent a LOCK request to the source server
2. The source server replied NFS4ERR_MOVED
3. The client switched to the destination server
4. The client sent the same LOCK request to the destination
   server with a bumped lock sequence ID
5. The destination server rejected the LOCK request with
   NFS4ERR_BAD_SEQID
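
The five steps above can be replayed in a small userspace model. This is only a sketch under stated assumptions: the toy_ names, the starting seqid values, and the rule that the destination accepts exactly "last seqid + 1" are illustrative and are not taken from the kernel client or from any server. The status code numbers are the ones assigned in RFC 7530.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status codes numbered as in RFC 7530; only the ones used here. */
enum { NFS4_OK = 0, NFS4ERR_MOVED = 10019, NFS4ERR_BAD_SEQID = 10026 };

/* Hypothetical lock-owner state held by the destination server after
 * the migration: the last seqid the source server actually processed. */
struct toy_lock_owner {
	uint32_t last_seqid;
};

/* Hypothetical destination-side check: accept only last_seqid + 1. */
static uint32_t toy_dest_lock(struct toy_lock_owner *lo, uint32_t seqid)
{
	assert(lo != NULL);
	if (seqid != lo->last_seqid + 1)
		return NFS4ERR_BAD_SEQID;
	lo->last_seqid = seqid;
	return NFS4_OK;
}

/* Replay steps 1-5. moved_bumps_seqid selects the client behaviour:
 * true models the client before the patch, false models it after. */
static uint32_t toy_replay_steps(bool moved_bumps_seqid)
{
	struct toy_lock_owner dest = { .last_seqid = 4 }; /* migrated state */
	uint32_t seqid = 5;               /* step 1: LOCK sent to the source */
	uint32_t status = NFS4ERR_MOVED;  /* step 2: source did not process it */

	/* The client decides whether this reply consumed the seqid. */
	if (status != NFS4ERR_MOVED || moved_bumps_seqid)
		seqid++;

	/* Steps 3-5: same LOCK, now sent to the destination server. */
	return toy_dest_lock(&dest, seqid);
}
```

In this model toy_replay_steps(true) ends in NFS4ERR_BAD_SEQID, matching what was seen on the wire, while toy_replay_steps(false) is accepted by the destination.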


> RFC 3530 section 8.1.5 provides a list of NFS errors which do not
> bump a lock sequence ID.
> 
> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
> 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
> 
> Reported-by: Xuan Qi <xuan.qi@oracle.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> Cc: stable@vger.kernel.org # v3.7+
> ---
> include/linux/nfs4.h |    3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> index bca5363..1b1ca04 100644
> --- a/include/linux/nfs4.h
> +++ b/include/linux/nfs4.h
> @@ -282,7 +282,7 @@ enum nfsstat4 {
> 
> static inline bool seqid_mutating_err(u32 err)
> {
> -	/* rfc 3530 section 8.1.5: */
> +	/* See RFC 7530, section 9.1.7 */
> 	switch (err) {
> 	case NFS4ERR_STALE_CLIENTID:
> 	case NFS4ERR_STALE_STATEID:
> @@ -291,6 +291,7 @@ static inline bool seqid_mutating_err(u32 err)
> 	case NFS4ERR_BADXDR:
> 	case NFS4ERR_RESOURCE:
> 	case NFS4ERR_NOFILEHANDLE:
> +	case NFS4ERR_MOVED:
> 		return false;
> 	};
> 	return true;
> 
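
For readers who want to try the check outside the kernel tree, here is a standalone userspace sketch of the patched helper. It is not the kernel source: the toy_ prefix, the use of uint32_t in place of u32, and the explicit default label are assumptions made for illustration. The exception list is the one given in RFC 7530, section 9.1.7, and the status code numbers are those assigned in RFC 7530.

```c
#include <stdbool.h>
#include <stdint.h>

/* NFSv4.0 status codes, numbered as in RFC 7530 (subset). */
enum toy_nfsstat4 {
	NFS4_OK                = 0,
	NFS4ERR_DENIED         = 10010,
	NFS4ERR_RESOURCE       = 10018,
	NFS4ERR_MOVED          = 10019,
	NFS4ERR_NOFILEHANDLE   = 10020,
	NFS4ERR_STALE_CLIENTID = 10022,
	NFS4ERR_STALE_STATEID  = 10023,
	NFS4ERR_BAD_STATEID    = 10025,
	NFS4ERR_BAD_SEQID      = 10026,
	NFS4ERR_BADXDR         = 10036,
};

/* Returns false for the statuses that RFC 7530, section 9.1.7 lists as
 * exceptions, i.e. replies after which the client must NOT advance the
 * sequence ID. Every other status, including NFS4_OK, advances it. */
static bool toy_seqid_mutating_err(uint32_t err)
{
	switch (err) {
	case NFS4ERR_STALE_CLIENTID:
	case NFS4ERR_STALE_STATEID:
	case NFS4ERR_BAD_STATEID:
	case NFS4ERR_BAD_SEQID:
	case NFS4ERR_BADXDR:
	case NFS4ERR_RESOURCE:
	case NFS4ERR_NOFILEHANDLE:
	case NFS4ERR_MOVED:	/* the case added by this patch */
		return false;
	default:
		return true;
	}
}
```

With this list, a LOCK that fails with NFS4ERR_DENIED still consumes a sequence ID, while one that fails with NFS4ERR_MOVED does not.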

--
Chuck Lever



J. Bruce Fields Jan. 23, 2017, 4:49 p.m. UTC | #2
On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
> 
> > On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> > Xuan Qi reports that the Linux NFSv4 client failed to lock a file
> > that was migrated. The steps he observed on the wire:
> > 
> > 1. The client sent a LOCK request
> > 2. The server replied NFS4ERR_MOVED
> > 3. The client switched to the destination server
> > 4. The client sent the LOCK request again with a bumped
> >   lock sequence ID
> > 5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID
> 
> The list of steps could be more clear:
> 
> 1. The client sent a LOCK request to the source server
> 2. The source server replied NFS4ERR_MOVED
> 3. The client switched to the destination server
> 4. The client sent the same LOCK request to the destination
>    server with a bumped lock sequence ID
> 5. The destination server rejected the LOCK request with
>    NFS4ERR_BAD_SEQID
> 
> 
> > RFC 3530 section 8.1.5 provides a list of NFS errors which do not
> > bump a lock sequence ID.
> > 
> > However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
> > 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.

I guess we figured the backwards-incompatible change was OK since
essentially the Solaris server is the first we know of to be making real
use of NFS4ERR_MOVED?

And probably it's required for their implementation because the old
server no longer has the ability to update the state once it's reached
the point of returning ERR_MOVED.

OK, makes sense to me, I think.

--b.

Chuck Lever Jan. 24, 2017, 7:06 p.m. UTC | #3
> On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
> 
> On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
>> 
>>> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>> Xuan Qi reports that the Linux NFSv4 client failed to lock a file
>>> that was migrated. The steps he observed on the wire:
>>> 
>>> 1. The client sent a LOCK request
>>> 2. The server replied NFS4ERR_MOVED
>>> 3. The client switched to the destination server
>>> 4. The client sent the LOCK request again with a bumped
>>>  lock sequence ID
>>> 5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID
>> 
>> The list of steps could be more clear:
>> 
>> 1. The client sent a LOCK request to the source server
>> 2. The source server replied NFS4ERR_MOVED
>> 3. The client switched to the destination server
>> 4. The client sent the same LOCK request to the destination
>>   server with a bumped lock sequence ID
>> 5. The destination server rejected the LOCK request with
>>   NFS4ERR_BAD_SEQID
>> 
>> 
>>> RFC 3530 section 8.1.5 provides a list of NFS errors which do not
>>> bump a lock sequence ID.
>>> 
>>> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
>>> 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
> 
> I guess we figured the backwards-incompatible change was OK since
> essentially the Solaris server is the first we know of to be making real
> use of NFS4ERR_MOVED?
> 
> And probably it's required for their implementation because the old
> server no longer has the ability to update the state once it's reached
> the point of returning ERR_MOVED.
> 
> OK, makes sense to me, I think.

Hi Bruce-

Does this mean you will take this patch, or should
I just add your Reviewed-by: ?


--
Chuck Lever



J. Bruce Fields Jan. 24, 2017, 7:15 p.m. UTC | #4
On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
> 
> > On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
> > 
> > On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
> >> 
> >>> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >>> 
> >>> Xuan Qi reports that the Linux NFSv4 client failed to lock a file
> >>> that was migrated. The steps he observed on the wire:
> >>> 
> >>> 1. The client sent a LOCK request
> >>> 2. The server replied NFS4ERR_MOVED
> >>> 3. The client switched to the destination server
> >>> 4. The client sent the LOCK request again with a bumped
> >>>  lock sequence ID
> >>> 5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID
> >> 
> >> The list of steps could be more clear:
> >> 
> >> 1. The client sent a LOCK request to the source server
> >> 2. The source server replied NFS4ERR_MOVED
> >> 3. The client switched to the destination server
> >> 4. The client sent the same LOCK request to the destination
> >>   server with a bumped lock sequence ID
> >> 5. The destination server rejected the LOCK request with
> >>   NFS4ERR_BAD_SEQID
> >> 
> >> 
> >>> RFC 3530 section 8.1.5 provides a list of NFS errors which do not
> >>> bump a lock sequence ID.
> >>> 
> >>> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
> >>> 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
> > 
> > I guess we figured the backwards-incompatible change was OK since
> > essentially the Solaris server is the first we know of to be making real
> > use of NFS4ERR_MOVED?
> > 
> > And probably it's required for their implementation because the old
> > server no longer has the ability to update the state once it's reached
> > the point of returning ERR_MOVED.
> > 
> > OK, makes sense to me, I think.
> 
> Hi Bruce-
> 
> Does this mean you will take this patch, or should
> I just add your Reviewed-by: ?

I can take it if nobody objects.  Mind if I append the above to the
changelog?  (Just want to document why we think the apparently
backwards-incompatible change is OK.)

--b.

Chuck Lever Jan. 24, 2017, 7:31 p.m. UTC | #5
> On Jan 24, 2017, at 2:15 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
>> 
>>> On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
>>> 
>>> On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
>>>> 
>>>>> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>> 
>>>>> Xuan Qi reports that the Linux NFSv4 client failed to lock a file
>>>>> that was migrated. The steps he observed on the wire:
>>>>> 
>>>>> 1. The client sent a LOCK request
>>>>> 2. The server replied NFS4ERR_MOVED
>>>>> 3. The client switched to the destination server
>>>>> 4. The client sent the LOCK request again with a bumped
>>>>> lock sequence ID
>>>>> 5. The server rejected the LOCK request with NFS4ERR_BAD_SEQID
>>>> 
>>>> The list of steps could be more clear:
>>>> 
>>>> 1. The client sent a LOCK request to the source server
>>>> 2. The source server replied NFS4ERR_MOVED
>>>> 3. The client switched to the destination server
>>>> 4. The client sent the same LOCK request to the destination
>>>>  server with a bumped lock sequence ID
>>>> 5. The destination server rejected the LOCK request with
>>>>  NFS4ERR_BAD_SEQID
>>>> 
>>>> 
>>>>> RFC 3530 section 8.1.5 provides a list of NFS errors which do not
>>>>> bump a lock sequence ID.
>>>>> 
>>>>> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
>>>>> 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
>>> 
>>> I guess we figured the backwards-incompatible change was OK since
>>> essentially the Solaris server is the first we know of to be making real
>>> use of NFS4ERR_MOVED?
>>> 
>>> And probably it's required for their implementation because the old
>>> server no longer has the ability to update the state once it's reached
>>> the point of returning ERR_MOVED.
>>> 
>>> OK, makes sense to me, I think.
>> 
>> Hi Bruce-
>> 
>> Does this mean you will take this patch, or should
>> I just add your Reviewed-by: ?
> 
> I can take it if nobody objects.  Mind if I append the above to the
> changelog?  (Just want to document why we think the apparently
> backwards-incompatible change is OK.)

Adding a justification is OK with me, and please replace the
list of steps with my updated list above.

However, your explanation implies that Solaris is the only server
that might need this fix. Actually _any_ server that supports
transparent state migration needs clients to get this fix. Lock
operations on a file that has moved are not able to update the
sequence ID on the destination server.

This backwards-compatible change is OK because:

- No servers in the wild support migration yet, thus
  NFS4ERR_MOVED is never returned by existing servers

- Clients that do not support migration should never receive
  NFS4ERR_MOVED on a state-mutating operation

In other words, this change is necessary only for clients that
support TSM.

Salt to taste.


--
Chuck Lever



J. Bruce Fields Jan. 24, 2017, 7:41 p.m. UTC | #6
On Tue, Jan 24, 2017 at 02:31:37PM -0500, Chuck Lever wrote:
> Adding a justification is OK with me, and please replace the
> list of steps with my updated list above.
> 
> However, your explanation implies that Solaris is the only server
> that might need this fix. Actually _any_ server that supports
> transparent state migration needs clients to get this fix. Lock
> operations on a file that has moved are not able to update the
> sequence ID on the destination server.

Are you sure?  Couldn't an implementation include a server-to-server
protocol that allowed the source and destination server to share stateid
information?

But even if that's possible, it may be unnecessarily complicated, so I
agree I shouldn't be claiming it's a Solaris-specific issue (though it
may be worth documenting that's who first hit this).

--b.

J. Bruce Fields Jan. 24, 2017, 7:53 p.m. UTC | #7
On Tue, Jan 24, 2017 at 02:41:40PM -0500, J. Bruce Fields wrote:
> On Tue, Jan 24, 2017 at 02:31:37PM -0500, Chuck Lever wrote:
> > Adding a justification is OK with me, and please replace the
> > list of steps with my updated list above.
> > 
> > However, your explanation implies that Solaris is the only server
> > that might need this fix. Actually _any_ server that supports
> > transparent state migration needs clients to get this fix. Lock
> > operations on a file that has moved are not able to update the
> > sequence ID on the destination server.
> 
> Are you sure?  Couldn't an implementation include a server-to-server
> protocol that allowed the source and destination server to share stateid
> information?
> 
> But even if that's possible, it may be unnecessarily complicated, so I
> agree I shouldn't be claiming it's a Solaris-specific issue (though it
> may be worth documenting that's who first hit this).
> 
> --b.
> 
> > This backwards-compatible change is OK because:
> > 
> > - No servers in the wild support migration yet, thus
> > NFS4ERR_MOVED is never returned by existing servers

I think you mean "transparent state migration" there?

A server supporting non-transparent state migration could return
NFS4ERR_MOVED on a LOCK operation, but the client won't be able to use
that stateid afterwards in that case anyway.

> > - Clients that do not support migration should never receive
> > NFS4ERR_MOVED on a state-mutating operation

I didn't think there was a way for clients to advertise non-support for
migration?

But such clients could never recover from MOVED anyway, so we're not
making things worse for them.

--b.

Chuck Lever Jan. 24, 2017, 7:54 p.m. UTC | #8
> On Jan 24, 2017, at 2:41 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Tue, Jan 24, 2017 at 02:31:37PM -0500, Chuck Lever wrote:
>> Adding a justification is OK with me, and please replace the
>> list of steps with my updated list above.
>> 
>> However, your explanation implies that Solaris is the only server
>> that might need this fix. Actually _any_ server that supports
>> transparent state migration needs clients to get this fix. Lock
>> operations on a file that has moved are not able to update the
>> sequence ID on the destination server.
> 
> Are you sure?

I'm pretty sure.


> Couldn't an implementation include a server-to-server
> protocol that allowed the source and destination server to share stateid
> information?

"Migration" means that the filesystem's name space and data
content have moved, and are no longer accessible on this server.

"Transparent State Migration" means that the filesystem's
state was moved with its name space and data content.

NFS4ERR_MOVED here means specifically that the file and its state
are no longer managed or accessible at the local server. It
kind of implies categorically that the above situation is not
in play.

What you are describing is something that is not Transparent
State Migration. Sounds more like replication. In which case,
the server would report NFS4ERR_MOVED on a LOOKUP at the
root of the filesystem, and not on a LOCK request. A client
can recognize the difference between these two and react
accordingly.


--
Chuck Lever



Chuck Lever Jan. 24, 2017, 7:58 p.m. UTC | #9
> On Jan 24, 2017, at 2:53 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Tue, Jan 24, 2017 at 02:41:40PM -0500, J. Bruce Fields wrote:
>> On Tue, Jan 24, 2017 at 02:31:37PM -0500, Chuck Lever wrote:
>>> Adding a justification is OK with me, and please replace the
>>> list of steps with my updated list above.
>>> 
>>> However, your explanation implies that Solaris is the only server
>>> that might need this fix. Actually _any_ server that supports
>>> transparent state migration needs clients to get this fix. Lock
>>> operations on a file that has moved are not able to update the
>>> sequence ID on the destination server.
>> 
>> Are you sure?  Couldn't an implementation include a server-to-server
>> protocol that allowed the source and destination server to share stateid
>> information?
>> 
>> But even if that's possible, it may be unnecessarily complicated, so I
>> agree I shouldn't be claiming it's a Solaris-specific issue (though it
>> may be worth documenting that's who first hit this).
>> 
>> --b.
>> 
>>> This backwards-compatible change is OK because:
>>> 
>>> - No servers in the wild support migration yet, thus
>>> NFS4ERR_MOVED is never returned by existing servers
> 
> I think you mean "transparent state migration" there?

Yes, though I'm not aware of any public implementations
that support plain migration either.

NFS4ERR_MOVED can also be returned in cases where servers want
to advertise replicas, but we don't expect that status on
seqid-mutating operations like LOCK.


> A server supporting non-transparent state migration could return
> NFS4ERR_MOVED on a LOCK operation, but the client won't be able to use
> that stateid afterwards in that case anyway.
> 
>>> - Clients that do not support migration should never receive
>>> NFS4ERR_MOVED on a state-mutating operation
> 
> I didn't think there was a way for clients to advertise non-support for
> migration?

Not in NFSv4.0, but NFSv4.1 has an EXCHANGE_ID flag that signifies
client support for migration.


> But such clients could never recover from MOVED anyway, so we're not
> making things worse for them.

Exactly.


> --b.
> 
>>> 
>>> In other words, this change is necessary only for clients that
>>> support TSM.
>>> 
>>> Salt to taste.
>>> 
>>> 
>>>> --b.
>>>> 
>>>>> 
>>>>> 
>>>>>> --b.
>>>>>> 
>>>>>>>> 
>>>>>>>> Reported-by: Xuan Qi <xuan.qi@oracle.com>
>>>>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>>>>>> Cc: stable@vger.kernel.org # v3.7+
>>>>>>>> ---
>>>>>>>> include/linux/nfs4.h |    3 ++-
>>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>> 
>>>>>>>> diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
>>>>>>>> index bca5363..1b1ca04 100644
>>>>>>>> --- a/include/linux/nfs4.h
>>>>>>>> +++ b/include/linux/nfs4.h
>>>>>>>> @@ -282,7 +282,7 @@ enum nfsstat4 {
>>>>>>>> 
>>>>>>>> static inline bool seqid_mutating_err(u32 err)
>>>>>>>> {
>>>>>>>> -	/* rfc 3530 section 8.1.5: */
>>>>>>>> +	/* See RFC 7530, section 9.1.7 */
>>>>>>>> 	switch (err) {
>>>>>>>> 	case NFS4ERR_STALE_CLIENTID:
>>>>>>>> 	case NFS4ERR_STALE_STATEID:
>>>>>>>> @@ -291,6 +291,7 @@ static inline bool seqid_mutating_err(u32 err)
>>>>>>>> 	case NFS4ERR_BADXDR:
>>>>>>>> 	case NFS4ERR_RESOURCE:
>>>>>>>> 	case NFS4ERR_NOFILEHANDLE:
>>>>>>>> +	case NFS4ERR_MOVED:
>>>>>>>> 		return false;
>>>>>>>> 	};
>>>>>>>> 	return true;
>>>>>>>> 

--
Chuck Lever



Trond Myklebust Jan. 24, 2017, 8:23 p.m. UTC | #10
On Tue, 2017-01-24 at 14:15 -0500, J. Bruce Fields wrote:
> On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
> > 
> > > On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
> > > 
> > > On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
> > > > 
> > > > > On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.
> > > > > com> wrote:
> > > > > 
> > > > > Xuan Qi reports that the Linux NFSv4 client failed to lock a
> > > > > file
> > > > > that was migrated. The steps he observed on the wire:
> > > > > 
> > > > > 1. The client sent a LOCK request
> > > > > 2. The server replied NFS4ERR_MOVED
> > > > > 3. The client switched to the destination server
> > > > > 4. The client sent the LOCK request again with a bumped
> > > > >  lock sequence ID
> > > > > 5. The server rejected the LOCK request with
> > > > > NFS4ERR_BAD_SEQID
> > > > 
> > > > The list of steps could be more clear:
> > > > 
> > > > 1. The client sent a LOCK request to the source server
> > > > 2. The source server replied NFS4ERR_MOVED
> > > > 3. The client switched to the destination server
> > > > 4. The client sent the same LOCK request to the destination
> > > >   server with a bumped lock sequence ID
> > > > 5. The destination server rejected the LOCK request with
> > > >   NFS4ERR_BAD_SEQID
> > > > 
> > > > 
> > > > > RFC 3530 section 8.1.5 provides a list of NFS errors which do
> > > > > not
> > > > > bump a lock sequence ID.
> > > > > 
> > > > > However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530
> > > > > section
> > > > > 9.1.7, this list has been updated by the addition of
> > > > > NFS4ERR_MOVED.
> > > 
> > > I guess we figured the backwards-incompatible change was OK since
> > > essentially the Solaris server is the first we know of to be
> > > making real
> > > use of NFS4ERR_MOVED?
> > > 
> > > And probably it's required for the their implementation because
> > > the old
> > > server no longer has the ability to update the state once it's
> > > reached
> > > the point of returning ERR_MOVED.
> > > 
> > > OK, makes sense to me, I think.
> > 
> > Hi Bruce-
> > 
> > Does this mean you will take this patch, or should
> > I just add your Reviewed-by: ?
> 
> I can take it if nobody objects.  Mind if I append the above to the
> changelog?  (Just want to document why we think the apparently
> backwards-incompatible change is OK.)
> 
I've already added it to my linux-next branch as a stable patch.


-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
J. Bruce Fields Jan. 24, 2017, 8:31 p.m. UTC | #11
On Tue, Jan 24, 2017 at 08:23:36PM +0000, Trond Myklebust wrote:
> On Tue, 2017-01-24 at 14:15 -0500, J. Bruce Fields wrote:
> > On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
> > > 
> > > > On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
> > > > 
> > > > On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
> > > > > 
> > > > > > On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.
> > > > > > com> wrote:
> > > > > > 
> > > > > > Xuan Qi reports that the Linux NFSv4 client failed to lock a
> > > > > > file
> > > > > > that was migrated. The steps he observed on the wire:
> > > > > > 
> > > > > > 1. The client sent a LOCK request
> > > > > > 2. The server replied NFS4ERR_MOVED
> > > > > > 3. The client switched to the destination server
> > > > > > 4. The client sent the LOCK request again with a bumped
> > > > > >  lock sequence ID
> > > > > > 5. The server rejected the LOCK request with
> > > > > > NFS4ERR_BAD_SEQID
> > > > > 
> > > > > The list of steps could be more clear:
> > > > > 
> > > > > 1. The client sent a LOCK request to the source server
> > > > > 2. The source server replied NFS4ERR_MOVED
> > > > > 3. The client switched to the destination server
> > > > > 4. The client sent the same LOCK request to the destination
> > > > >   server with a bumped lock sequence ID
> > > > > 5. The destination server rejected the LOCK request with
> > > > >   NFS4ERR_BAD_SEQID
> > > > > 
> > > > > 
> > > > > > RFC 3530 section 8.1.5 provides a list of NFS errors which do
> > > > > > not
> > > > > > bump a lock sequence ID.
> > > > > > 
> > > > > > However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530
> > > > > > section
> > > > > > 9.1.7, this list has been updated by the addition of
> > > > > > NFS4ERR_MOVED.
> > > > 
> > > > I guess we figured the backwards-incompatible change was OK since
> > > > essentially the Solaris server is the first we know of to be
> > > > making real
> > > > use of NFS4ERR_MOVED?
> > > > 
> > > > And probably it's required for the their implementation because
> > > > the old
> > > > server no longer has the ability to update the state once it's
> > > > reached
> > > > the point of returning ERR_MOVED.
> > > > 
> > > > OK, makes sense to me, I think.
> > > 
> > > Hi Bruce-
> > > 
> > > Does this mean you will take this patch, or should
> > > I just add your Reviewed-by: ?
> > 
> > I can take it if nobody objects.  Mind if I append the above to the
> > changelog?  (Just want to document why we think the apparently
> > backwards-incompatible change is OK.)
> > 
> I've already added it to my linux-next branch as a stable patch.

OK, fine by me, dropping.

--b.
Chuck Lever Jan. 25, 2017, 7:58 p.m. UTC | #12
> On Jan 24, 2017, at 3:23 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> 
> On Tue, 2017-01-24 at 14:15 -0500, J. Bruce Fields wrote:
>> On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
>>> 
>>>> On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
>>>> 
>>>> On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
>>>>> 
>>>>>> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.
>>>>>> com> wrote:
>>>>>> 
>>>>>> Xuan Qi reports that the Linux NFSv4 client failed to lock a
>>>>>> file
>>>>>> that was migrated. The steps he observed on the wire:
>>>>>> 
>>>>>> 1. The client sent a LOCK request
>>>>>> 2. The server replied NFS4ERR_MOVED
>>>>>> 3. The client switched to the destination server
>>>>>> 4. The client sent the LOCK request again with a bumped
>>>>>>  lock sequence ID
>>>>>> 5. The server rejected the LOCK request with
>>>>>> NFS4ERR_BAD_SEQID
>>>>> 
>>>>> The list of steps could be more clear:
>>>>> 
>>>>> 1. The client sent a LOCK request to the source server
>>>>> 2. The source server replied NFS4ERR_MOVED
>>>>> 3. The client switched to the destination server
>>>>> 4. The client sent the same LOCK request to the destination
>>>>>   server with a bumped lock sequence ID
>>>>> 5. The destination server rejected the LOCK request with
>>>>>   NFS4ERR_BAD_SEQID
>>>>> 
>>>>> 
>>>>>> RFC 3530 section 8.1.5 provides a list of NFS errors which do
>>>>>> not
>>>>>> bump a lock sequence ID.
>>>>>> 
>>>>>> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530
>>>>>> section
>>>>>> 9.1.7, this list has been updated by the addition of
>>>>>> NFS4ERR_MOVED.
>>>> 
>>>> I guess we figured the backwards-incompatible change was OK since
>>>> essentially the Solaris server is the first we know of to be
>>>> making real
>>>> use of NFS4ERR_MOVED?
>>>> 
>>>> And probably it's required for the their implementation because
>>>> the old
>>>> server no longer has the ability to update the state once it's
>>>> reached
>>>> the point of returning ERR_MOVED.
>>>> 
>>>> OK, makes sense to me, I think.
>>> 
>>> Hi Bruce-
>>> 
>>> Does this mean you will take this patch, or should
>>> I just add your Reviewed-by: ?
>> 
>> I can take it if nobody objects.  Mind if I append the above to the
>> changelog?  (Just want to document why we think the apparently
>> backwards-incompatible change is OK.)
>> 
> I've already added it to my linux-next branch as a stable patch.

This patch alone might not be enough.

Our test results show that even with this patch applied, the Linux
client still increments the lock sequence ID after NFS4ERR_MOVED.
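
To illustrate why a bumped sequence ID is rejected by the destination, here is a toy user-space model, not kernel code. The names toy_server, toy_server_lock() and toy_lock_after_migration() are hypothetical, and the server check is simplified to "the next seqid must be last + 1" (replay of the last request is not modelled). It assumes the destination inherited the lock-owner state as it was before the failed LOCK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status codes with the values assigned in RFC 7530. */
enum {
	NFS4_OK           = 0,
	NFS4ERR_MOVED     = 10019,
	NFS4ERR_BAD_SEQID = 10026,
};

/* Hypothetical per-lock-owner state as a toy server might track it. */
struct toy_server {
	bool     migrated;    /* true once the file system has moved away */
	uint32_t last_seqid;  /* last seqid accepted for this lock-owner  */
};

/* Toy LOCK handler: a migrated server answers MOVED and changes nothing. */
static uint32_t toy_server_lock(struct toy_server *srv, uint32_t seqid)
{
	assert(srv != NULL);
	if (srv->migrated)
		return NFS4ERR_MOVED;
	if (seqid != srv->last_seqid + 1)
		return NFS4ERR_BAD_SEQID;
	srv->last_seqid = seqid;
	return NFS4_OK;
}

/*
 * Replay the exchange from the report: LOCK to the source, then the same
 * LOCK to the destination. bump_on_moved selects the client behaviour.
 */
static uint32_t toy_lock_after_migration(bool bump_on_moved)
{
	struct toy_server source      = { .migrated = true,  .last_seqid = 4 };
	struct toy_server destination = { .migrated = false, .last_seqid = 4 };
	uint32_t seqid = 5;

	uint32_t status = toy_server_lock(&source, seqid);
	if (status == NFS4ERR_MOVED && bump_on_moved)
		seqid++;  /* the behaviour under discussion */

	return toy_server_lock(&destination, seqid);
}
```

Calling toy_lock_after_migration(true) models the current client behaviour and yields NFS4ERR_BAD_SEQID; toy_lock_after_migration(false) models a client that leaves the seqid alone after NFS4ERR_MOVED and yields NFS4_OK.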


--
Chuck Lever



Chuck Lever Jan. 25, 2017, 8:08 p.m. UTC | #13
> On Jan 25, 2017, at 2:58 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Jan 24, 2017, at 3:23 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
>> 
>> On Tue, 2017-01-24 at 14:15 -0500, J. Bruce Fields wrote:
>>> On Tue, Jan 24, 2017 at 02:06:16PM -0500, Chuck Lever wrote:
>>>> 
>>>>> On Jan 23, 2017, at 11:49 AM, bfields@fieldses.org wrote:
>>>>> 
>>>>> On Mon, Jan 23, 2017 at 10:01:27AM -0500, Chuck Lever wrote:
>>>>>> 
>>>>>>> On Jan 22, 2017, at 2:04 PM, Chuck Lever <chuck.lever@oracle.
>>>>>>> com> wrote:
>>>>>>> 
>>>>>>> Xuan Qi reports that the Linux NFSv4 client failed to lock a
>>>>>>> file
>>>>>>> that was migrated. The steps he observed on the wire:
>>>>>>> 
>>>>>>> 1. The client sent a LOCK request
>>>>>>> 2. The server replied NFS4ERR_MOVED
>>>>>>> 3. The client switched to the destination server
>>>>>>> 4. The client sent the LOCK request again with a bumped
>>>>>>>  lock sequence ID
>>>>>>> 5. The server rejected the LOCK request with
>>>>>>> NFS4ERR_BAD_SEQID
>>>>>> 
>>>>>> The list of steps could be more clear:
>>>>>> 
>>>>>> 1. The client sent a LOCK request to the source server
>>>>>> 2. The source server replied NFS4ERR_MOVED
>>>>>> 3. The client switched to the destination server
>>>>>> 4. The client sent the same LOCK request to the destination
>>>>>>   server with a bumped lock sequence ID
>>>>>> 5. The destination server rejected the LOCK request with
>>>>>>   NFS4ERR_BAD_SEQID
>>>>>> 
>>>>>> 
>>>>>>> RFC 3530 section 8.1.5 provides a list of NFS errors which do
>>>>>>> not
>>>>>>> bump a lock sequence ID.
>>>>>>> 
>>>>>>> However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530
>>>>>>> section
>>>>>>> 9.1.7, this list has been updated by the addition of
>>>>>>> NFS4ERR_MOVED.
>>>>> 
>>>>> I guess we figured the backwards-incompatible change was OK since
>>>>> essentially the Solaris server is the first we know of to be
>>>>> making real
>>>>> use of NFS4ERR_MOVED?
>>>>> 
>>>>> And probably it's required for the their implementation because
>>>>> the old
>>>>> server no longer has the ability to update the state once it's
>>>>> reached
>>>>> the point of returning ERR_MOVED.
>>>>> 
>>>>> OK, makes sense to me, I think.
>>>> 
>>>> Hi Bruce-
>>>> 
>>>> Does this mean you will take this patch, or should
>>>> I just add your Reviewed-by: ?
>>> 
>>> I can take it if nobody objects.  Mind if I append the above to the
>>> changelog?  (Just want to document why we think the apparently
>>> backwards-incompatible change is OK.)
>>> 
>> I've already added it to my linux-next branch as a stable patch.
> 
> This patch alone might not be enough.
> 
> Our test results show that even with this patch applied, the Linux
> client still increments the lock sequence ID after NFS4ERR_MOVED.

Looks like nfs_increment_seqid() also needs to be updated?
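
As a rough sketch of the idea, assuming the increment path should consult the same list of non-mutating errors as the header helper: the code below is a user-space illustration, not the kernel's nfs_increment_seqid(). struct toy_seqid and toy_increment_seqid() are hypothetical names, and the error values are copied from RFC 7530.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status codes with the values assigned in RFC 7530. */
enum {
	NFS4_OK                = 0,
	NFS4ERR_DENIED         = 10010,
	NFS4ERR_RESOURCE       = 10018,
	NFS4ERR_MOVED          = 10019,
	NFS4ERR_NOFILEHANDLE   = 10020,
	NFS4ERR_STALE_CLIENTID = 10022,
	NFS4ERR_STALE_STATEID  = 10023,
	NFS4ERR_BAD_STATEID    = 10025,
	NFS4ERR_BADXDR         = 10036,
};

/* Same shape as the patched helper: false means "leave the seqid alone". */
static inline bool seqid_mutating_err(uint32_t err)
{
	switch (err) {
	case NFS4ERR_STALE_CLIENTID:
	case NFS4ERR_STALE_STATEID:
	case NFS4ERR_BAD_STATEID:
	case NFS4ERR_BADXDR:
	case NFS4ERR_RESOURCE:
	case NFS4ERR_NOFILEHANDLE:
	case NFS4ERR_MOVED:
		return false;
	}
	return true;
}

/* Hypothetical stand-in for the client's per-owner sequence counter. */
struct toy_seqid {
	uint32_t counter;
};

/* Hypothetical helper: bump the counter unless the status is non-mutating. */
static void toy_increment_seqid(uint32_t status, struct toy_seqid *seq)
{
	assert(seq != NULL);
	if (!seqid_mutating_err(status))
		return;
	seq->counter++;
}
```

Routing the decision through one predicate keeps the list of exempt errors in a single place, so an addition such as NFS4ERR_MOVED cannot be applied to one path and missed in another.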

--
Chuck Lever



Patch

diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index bca5363..1b1ca04 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -282,7 +282,7 @@  enum nfsstat4 {
 
 static inline bool seqid_mutating_err(u32 err)
 {
-	/* rfc 3530 section 8.1.5: */
+	/* See RFC 7530, section 9.1.7 */
 	switch (err) {
 	case NFS4ERR_STALE_CLIENTID:
 	case NFS4ERR_STALE_STATEID:
@@ -291,6 +291,7 @@  static inline bool seqid_mutating_err(u32 err)
 	case NFS4ERR_BADXDR:
 	case NFS4ERR_RESOURCE:
 	case NFS4ERR_NOFILEHANDLE:
+	case NFS4ERR_MOVED:
 		return false;
 	};
 	return true;