diff mbox series

NFSD: Fix 5 seconds delay when doing inter server copy

Message ID 20201124031609.67297-1-dai.ngo@oracle.com (mailing list archive)
State New
Series NFSD: Fix 5 seconds delay when doing inter server copy

Commit Message

Dai Ngo Nov. 24, 2020, 3:16 a.m. UTC
Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers a
5-second delay regardless of the size of the copy. The delay comes from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqid in both nfs4_state and nfs4_stateid is 0.

Fix this by having the source server return the stateid for a
COPY_NOTIFY request with seqid 1 instead of 0. This also conforms to
section 4.8 of RFC 7862.
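
The failure mode described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: the names
model_stateid, model_next_seqid and model_stateid_is_sequential are
hypothetical, and the "next seqid" rule is a simplification of whatever
nfs_stateid_is_sequential actually checks in the client. It is meant to
show why a seqid of 0 on both sides never looks like a sequential update,
while a first seqid of 1 does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a stateid; not the kernel's nfs4_stateid. */
struct model_stateid {
	uint32_t seqid;
};

/*
 * Assumed rule: the seqid after N is N + 1, and the counter skips 0
 * when it wraps, since 0 is reserved.
 */
uint32_t model_next_seqid(uint32_t seqid)
{
	uint32_t next = seqid + 1;

	return next == 0 ? 1 : next;
}

/*
 * Simplified model of a sequential check: the incoming stateid is
 * accepted only if it carries the seqid that follows the current one.
 */
bool model_stateid_is_sequential(const struct model_stateid *cur,
				 const struct model_stateid *incoming)
{
	return incoming->seqid == model_next_seqid(cur->seqid);
}
```

In this model, a current seqid of 0 paired with an incoming seqid of 0 is
rejected, which corresponds to the case where the client ends up waiting.
An incoming seqid of 1 is accepted immediately.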

Here is the relevant paragraph from section 4.8 of RFC 7862:

   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
   copy offload operation, it is inappropriate to indicate "the most
   recent copy offload operation" using a stateid with a seqid of zero
   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
   stateid refers to internal state in the server and there may be
   several asynchronous COPY operations being performed in parallel on
   the same file by the server.  Therefore, a copy offload stateid with
   a seqid of zero MUST be considered invalid.

Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfsd/nfs4state.c | 1 +
 1 file changed, 1 insertion(+)

Comments

J. Bruce Fields Nov. 24, 2020, 8:49 p.m. UTC | #1
On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
> seconds delay regardless of the size of the copy. The delay is from
> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
> 
> Fix by modifying the source server to return the stateid for COPY_NOTIFY
> request with seqid 1 instead of 0. This is also to conform with
> section 4.8 of RFC 7862.
> 
> Here is the relevant paragraph from section 4.8 of RFC 7862:
> 
>    A copy offload stateid's seqid MUST NOT be zero.  In the context of a
>    copy offload operation, it is inappropriate to indicate "the most
>    recent copy offload operation" using a stateid with a seqid of zero
>    (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
>    stateid refers to internal state in the server and there may be
>    several asynchronous COPY operations being performed in parallel on
>    the same file by the server.  Therefore, a copy offload stateid with
>    a seqid of zero MUST be considered invalid.
> 
> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d7f27ed6b794..33ee1a6961e3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>  	refcount_set(&cps->cp_stateid.sc_count, 1);
>  	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>  		goto out_free;
> +	cps->cp_stateid.stid.si_generation = 1;

This affects the stateid returned by COPY_NOTIFY, but not the one
returned by COPY.  I think we want to add this to nfs4_init_cp_state()
and cover both.

--b.
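
The idea of setting the generation in the shared init helper, so that both
kinds of copy stateid get it, might look roughly like the userspace model
below. This is a sketch, not the kernel change: model_copy_stateid,
model_stid_type and model_init_cp_state are hypothetical names, and the
real nfs4_init_cp_state() does more work (and can fail), which is left out
here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stateid kinds, standing in for the COPY and COPY_NOTIFY cases. */
enum model_stid_type {
	MODEL_COPY_STID,
	MODEL_COPYNOTIFY_STID,
};

/* Hypothetical stand-in for a copy stateid; not the kernel structure. */
struct model_copy_stateid {
	enum model_stid_type type;
	uint32_t si_generation;
};

/*
 * Shared init path: every copy stateid, whatever its type, starts with
 * generation 1, so no caller has to remember to set it.
 */
void model_init_cp_state(struct model_copy_stateid *stid,
			 enum model_stid_type type)
{
	memset(stid, 0, sizeof(*stid));
	stid->type = type;
	stid->si_generation = 1;
}
```

Putting the assignment in the common helper means a stateid created for
COPY and one created for COPY_NOTIFY both come out with a nonzero seqid.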

>  	spin_lock(&nn->s2s_cp_lock);
>  	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>  	spin_unlock(&nn->s2s_cp_lock);
> -- 
> 2.9.5
Chuck Lever Nov. 30, 2020, 5:57 p.m. UTC | #2
Hello Dai -

> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>> seconds delay regardless of the size of the copy. The delay is from
>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>> 
>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>> request with seqid 1 instead of 0. This is also to conform with
>> section 4.8 of RFC 7862.
>> 
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>> 
>>   A copy offload stateid's seqid MUST NOT be zero.  In the context of a
>>   copy offload operation, it is inappropriate to indicate "the most
>>   recent copy offload operation" using a stateid with a seqid of zero
>>   (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
>>   stateid refers to internal state in the server and there may be
>>   several asynchronous COPY operations being performed in parallel on
>>   the same file by the server.  Therefore, a copy offload stateid with
>>   a seqid of zero MUST be considered invalid.
>> 
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>> fs/nfsd/nfs4state.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>> 	refcount_set(&cps->cp_stateid.sc_count, 1);
>> 	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>> 		goto out_free;
>> +	cps->cp_stateid.stid.si_generation = 1;
> 
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY.  I think we want to add this to nfs4_init_cp_state()
> and cover both.

Since time is creeping on towards the next merge window, I assume
this particular fix needs to go there, but I don't see the final
version of it (with Bruce's suggested fix) on the list. Did I miss
it?


>> 	spin_lock(&nn->s2s_cp_lock);
>> 	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>> 	spin_unlock(&nn->s2s_cp_lock);
>> -- 
>> 2.9.5

--
Chuck Lever
chucklever@gmail.com
Dai Ngo Nov. 30, 2020, 6:47 p.m. UTC | #3
Hi Chuck,

Sorry for the delay. I will update the patch, test it, and re-submit
it by the end of today.

Thanks,
-Dai

On 11/30/20 9:57 AM, Chuck Lever wrote:
> Hello Dai -
>
>> On Nov 24, 2020, at 3:49 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>>
>> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>>> seconds delay regardless of the size of the copy. The delay is from
>>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>>>
>>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>>> request with seqid 1 instead of 0. This is also to conform with
>>> section 4.8 of RFC 7862.
>>>
>>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>>
>>>    A copy offload stateid's seqid MUST NOT be zero.  In the context of a
>>>    copy offload operation, it is inappropriate to indicate "the most
>>>    recent copy offload operation" using a stateid with a seqid of zero
>>>    (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
>>>    stateid refers to internal state in the server and there may be
>>>    several asynchronous COPY operations being performed in parallel on
>>>    the same file by the server.  Therefore, a copy offload stateid with
>>>    a seqid of zero MUST be considered invalid.
>>>
>>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>>> ---
>>> fs/nfsd/nfs4state.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>> index d7f27ed6b794..33ee1a6961e3 100644
>>> --- a/fs/nfsd/nfs4state.c
>>> +++ b/fs/nfsd/nfs4state.c
>>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>> 	refcount_set(&cps->cp_stateid.sc_count, 1);
>>> 	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>> 		goto out_free;
>>> +	cps->cp_stateid.stid.si_generation = 1;
>> This affects the stateid returned by COPY_NOTIFY, but not the one
>> returned by COPY.  I think we want to add this to nfs4_init_cp_state()
>> and cover both.
> Since time is creeping on towards the next merge window, I assume
> this particular fix needs to go there, but I don't see the final
> version of it (with Bruce's suggested fix) on the list. Did I miss
> it?
>
>
>>> 	spin_lock(&nn->s2s_cp_lock);
>>> 	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>> 	spin_unlock(&nn->s2s_cp_lock);
>>> -- 
>>> 2.9.5
> --
> Chuck Lever
> chucklever@gmail.com
>
>
>
Dai Ngo Nov. 30, 2020, 9:28 p.m. UTC | #4
On 11/24/20 12:49 PM, J. Bruce Fields wrote:
> On Mon, Nov 23, 2020 at 10:16:09PM -0500, Dai Ngo wrote:
>> Since commit b4868b44c5628 ("NFSv4: Wait for stateid updates after
>> CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
>> seconds delay regardless of the size of the copy. The delay is from
>> nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
>> fails because the seqid in both nfs4_state and nfs4_stateid are 0.
>>
>> Fix by modifying the source server to return the stateid for COPY_NOTIFY
>> request with seqid 1 instead of 0. This is also to conform with
>> section 4.8 of RFC 7862.
>>
>> Here is the relevant paragraph from section 4.8 of RFC 7862:
>>
>>     A copy offload stateid's seqid MUST NOT be zero.  In the context of a
>>     copy offload operation, it is inappropriate to indicate "the most
>>     recent copy offload operation" using a stateid with a seqid of zero
>>     (see Section 8.2.2 of [RFC5661]).  It is inappropriate because the
>>     stateid refers to internal state in the server and there may be
>>     several asynchronous COPY operations being performed in parallel on
>>     the same file by the server.  Therefore, a copy offload stateid with
>>     a seqid of zero MUST be considered invalid.
>>
>> Fixes: ce0887ac96d3 ("NFSD add nfs4 inter ssc to nfsd4_copy")
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>   fs/nfsd/nfs4state.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d7f27ed6b794..33ee1a6961e3 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -793,6 +793,7 @@ struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>>   	refcount_set(&cps->cp_stateid.sc_count, 1);
>>   	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
>>   		goto out_free;
>> +	cps->cp_stateid.stid.si_generation = 1;
> This affects the stateid returned by COPY_NOTIFY, but not the one
> returned by COPY.  I think we want to add this to nfs4_init_cp_state()
> and cover both.

Hi Bruce, thank you for your suggestion. The updated patch has been tested and submitted.

-Dai

P.S. Sorry for the delay, I was on leave for the last few days.

>
> --b.
>
>>   	spin_lock(&nn->s2s_cp_lock);
>>   	list_add(&cps->cp_list, &p_stid->sc_cp_list);
>>   	spin_unlock(&nn->s2s_cp_lock);
>> -- 
>> 2.9.5

Patch

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d7f27ed6b794..33ee1a6961e3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -793,6 +793,7 @@  struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
 	refcount_set(&cps->cp_stateid.sc_count, 1);
 	if (!nfs4_init_cp_state(nn, &cps->cp_stateid, NFS4_COPYNOTIFY_STID))
 		goto out_free;
+	cps->cp_stateid.stid.si_generation = 1;
 	spin_lock(&nn->s2s_cp_lock);
 	list_add(&cps->cp_list, &p_stid->sc_cp_list);
 	spin_unlock(&nn->s2s_cp_lock);