[2/6] ceph: hold extra reference to r_parent over life of request

Message ID 20200106153520.307523-3-jlayton@kernel.org
State New
Series
  • ceph: asynchronous unlink support

Commit Message

Jeff Layton Jan. 6, 2020, 3:35 p.m. UTC
Currently, we just assume that the r_parent inode will stick around by
virtue of the submitter's reference, but later patches will allow the
syscall to return early and we can't rely on that reference at that point.

Take an extra reference to the inode when setting r_parent and release
it when releasing the request.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
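
The pairing described in the commit message can be sketched in userspace C.
This is only a toy model under stated assumptions: toy_inode, toy_request,
toy_ihold(), toy_iput(), toy_submit() and toy_release() are hypothetical
stand-ins, not the kernel's types or functions, and CAP_PIN handling and the
asynchronous iput are left out. It shows only that a reference taken at
submit and dropped at release keeps the parent alive after the submitter
drops its own reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct toy_inode {
	int i_count;	/* reference count */
	int freed;	/* set once the last reference is dropped */
};

struct toy_request {
	struct toy_inode *r_parent;
};

/* Models ihold(): take one more reference on an already-live inode. */
static void toy_ihold(struct toy_inode *inode)
{
	assert(inode->i_count > 0);
	inode->i_count++;
}

/* Models iput(): drop a reference and mark the inode freed on the last one. */
static void toy_iput(struct toy_inode *inode)
{
	assert(inode->i_count > 0);
	if (--inode->i_count == 0)
		inode->freed = 1;
}

/* Submit: the request pins r_parent with a reference of its own. */
static void toy_submit(struct toy_request *req, struct toy_inode *dir)
{
	req->r_parent = dir;
	toy_ihold(req->r_parent);
}

/* Release: the request drops the reference it took at submit time. */
static void toy_release(struct toy_request *req)
{
	if (req->r_parent) {
		toy_iput(req->r_parent);
		req->r_parent = NULL;
	}
}
```

In this model, a caller that starts with one reference, calls toy_submit(),
and then drops its own reference with toy_iput() leaves the inode alive until
toy_release() runs.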

Comments

Xiubo Li Jan. 9, 2020, 2:05 a.m. UTC | #1
On 2020/1/6 23:35, Jeff Layton wrote:
> Currently, we just assume that it will stick around by virtue of the
> submitter's reference, but later patches will allow the syscall to
> return early and we can't rely on that reference at that point.
>
> Take an extra reference to the inode when setting r_parent and release
> it when releasing the request.
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/ceph/mds_client.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 94cce2ab92c4..b7122f682678 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
>   		/* avoid calling iput_final() in mds dispatch threads */
>   		ceph_async_iput(req->r_inode);
>   	}
> -	if (req->r_parent)
> +	if (req->r_parent) {
>   		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> +		ceph_async_iput(req->r_parent);
> +	}
>   	ceph_async_iput(req->r_target_inode);
>   	if (req->r_dentry)
>   		dput(req->r_dentry);
> @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
>   	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
>   	if (req->r_inode)
>   		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
> -	if (req->r_parent)
> +	if (req->r_parent) {
>   		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> +		ihold(req->r_parent);
> +	}

This might also fix another issue: when the mdsc request times out and
returns to the VFS, r_parent may then be released by the VFS. If we
reference it again in mdsc handle_reply() -->
ceph_mdsc_release_request(), some unknown issues may happen later?

>   	if (req->r_old_dentry_dir)
>   		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
>   				  CEPH_CAP_PIN);
Jeff Layton Jan. 9, 2020, 11:20 a.m. UTC | #2
On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
> On 2020/1/6 23:35, Jeff Layton wrote:
> > Currently, we just assume that it will stick around by virtue of the
> > submitter's reference, but later patches will allow the syscall to
> > return early and we can't rely on that reference at that point.
> > 
> > Take an extra reference to the inode when setting r_parent and release
> > it when releasing the request.
> > 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/ceph/mds_client.c | 8 ++++++--
> >   1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index 94cce2ab92c4..b7122f682678 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
> >   		/* avoid calling iput_final() in mds dispatch threads */
> >   		ceph_async_iput(req->r_inode);
> >   	}
> > -	if (req->r_parent)
> > +	if (req->r_parent) {
> >   		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > +		ceph_async_iput(req->r_parent);
> > +	}
> >   	ceph_async_iput(req->r_target_inode);
> >   	if (req->r_dentry)
> >   		dput(req->r_dentry);
> > @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
> >   	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
> >   	if (req->r_inode)
> >   		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
> > -	if (req->r_parent)
> > +	if (req->r_parent) {
> >   		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > +		ihold(req->r_parent);
> > +	}
> 
> This might also fix another issue when the mdsc request is timedout and 
> returns to the vfs, then the r_parent maybe released in vfs. And then if 
> we reference it again in mdsc handle_reply() --> 
> ceph_mdsc_release_request(),  some unknown issues may happen later ??
> 

AIUI, when a timeout occurs, the req is unhashed such that handle_reply
can't find it. So, I doubt this affects that one way or another.

> >   	if (req->r_old_dentry_dir)
> >   		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
> >   				  CEPH_CAP_PIN);
> 
>
Xiubo Li Jan. 9, 2020, 1:16 p.m. UTC | #3
On 2020/1/9 19:20, Jeff Layton wrote:
> On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
>> On 2020/1/6 23:35, Jeff Layton wrote:
>>> Currently, we just assume that it will stick around by virtue of the
>>> submitter's reference, but later patches will allow the syscall to
>>> return early and we can't rely on that reference at that point.
>>>
>>> Take an extra reference to the inode when setting r_parent and release
>>> it when releasing the request.
>>>
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>>    fs/ceph/mds_client.c | 8 ++++++--
>>>    1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>>> index 94cce2ab92c4..b7122f682678 100644
>>> --- a/fs/ceph/mds_client.c
>>> +++ b/fs/ceph/mds_client.c
>>> @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
>>>    		/* avoid calling iput_final() in mds dispatch threads */
>>>    		ceph_async_iput(req->r_inode);
>>>    	}
>>> -	if (req->r_parent)
>>> +	if (req->r_parent) {
>>>    		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
>>> +		ceph_async_iput(req->r_parent);
>>> +	}
>>>    	ceph_async_iput(req->r_target_inode);
>>>    	if (req->r_dentry)
>>>    		dput(req->r_dentry);
>>> @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
>>>    	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
>>>    	if (req->r_inode)
>>>    		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
>>> -	if (req->r_parent)
>>> +	if (req->r_parent) {
>>>    		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
>>> +		ihold(req->r_parent);
>>> +	}
>> This might also fix another issue when the mdsc request is timedout and
>> returns to the vfs, then the r_parent maybe released in vfs. And then if
>> we reference it again in mdsc handle_reply() -->
>> ceph_mdsc_release_request(),  some unknown issues may happen later ??
>>
> AIUI, when a timeout occurs, the req is unhashed such that handle_reply
> can't find it. So, I doubt this affects that one way or another.

If my understanding is correct, then for rmdir(), for example, the logic
will be:

req = ceph_mdsc_create_request()      //  ref == 1

ceph_mdsc_do_request(req) -->

         ceph_mdsc_submit_request(req) -->

                 __register_request(req) // ref == 2

         ceph_mdsc_wait_request(req)  // if it times out

ceph_mdsc_put_request(req)  // ref == 1

Then in handle_reply(), only when we get a safe reply will it call
__unregister_request(req), and then the ref could be 0.

Although req->r_unsafe_dir (= req->r_parent) is covered by an
ihold()/ceph_async_iput() pair, that _iput() is called just before we
reference req->r_parent in _release_request(), and the _iput() there may
call iput_final().

BRs
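
The sequence walked through above can be replayed with a small userspace
model. This is a simplified sketch, not the kernel code: toy_dir, toy_req and
the toy_* helpers are hypothetical names, the r_unsafe_dir reference is not
modelled, and the "hold" flag stands in for whether the request takes its own
reference on r_parent at submit time. It only records whether r_parent still
has a reference when the final put on the request runs the release step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parent directory inode. */
struct toy_dir {
	int i_count;
};

/* Hypothetical stand-in for the MDS request. */
struct toy_req {
	int kref;			/* request reference count */
	struct toy_dir *r_parent;
	bool holds_parent;		/* true: request owns a parent ref */
	bool parent_live_at_release;	/* recorded when release runs */
};

static void toy_dir_put(struct toy_dir *dir)
{
	assert(dir->i_count > 0);
	dir->i_count--;
}

/* Runs when the last request reference is dropped. */
static void toy_req_release(struct toy_req *req)
{
	req->parent_live_at_release = req->r_parent->i_count > 0;
	if (req->holds_parent)
		toy_dir_put(req->r_parent);
}

static void toy_req_put(struct toy_req *req)
{
	assert(req->kref > 0);
	if (--req->kref == 0)
		toy_req_release(req);
}

/* Submit: register the request (ref == 2) and optionally pin r_parent. */
static void toy_submit(struct toy_req *req, bool hold)
{
	req->kref++;
	req->holds_parent = hold;
	if (hold)
		req->r_parent->i_count++;
}

/* Replays timeout followed by a late safe reply. */
static bool toy_timeout_then_safe_reply(bool hold)
{
	struct toy_dir dir = { .i_count = 1 };	/* caller's reference */
	struct toy_req req = { .kref = 1, .r_parent = &dir }; /* ref == 1 */

	toy_submit(&req, hold);	/* ref == 2 */
	toy_req_put(&req);	/* wait timed out, caller puts: ref == 1 */
	toy_dir_put(&dir);	/* caller drops its directory reference */
	toy_req_put(&req);	/* safe reply unregisters: ref == 0 */
	return req.parent_live_at_release;
}
```

In this model, toy_timeout_then_safe_reply(false) reports that r_parent had
no remaining reference at release time, while toy_timeout_then_safe_reply(true)
reports that it was still held.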



>>>    	if (req->r_old_dentry_dir)
>>>    		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
>>>    				  CEPH_CAP_PIN);
>>
Jeff Layton Jan. 9, 2020, 1:33 p.m. UTC | #4
On Thu, 2020-01-09 at 21:16 +0800, Xiubo Li wrote:
> On 2020/1/9 19:20, Jeff Layton wrote:
> > On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
> > > On 2020/1/6 23:35, Jeff Layton wrote:
> > > > Currently, we just assume that it will stick around by virtue of the
> > > > submitter's reference, but later patches will allow the syscall to
> > > > return early and we can't rely on that reference at that point.
> > > > 
> > > > Take an extra reference to the inode when setting r_parent and release
> > > > it when releasing the request.
> > > > 
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >    fs/ceph/mds_client.c | 8 ++++++--
> > > >    1 file changed, 6 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > > > index 94cce2ab92c4..b7122f682678 100644
> > > > --- a/fs/ceph/mds_client.c
> > > > +++ b/fs/ceph/mds_client.c
> > > > @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
> > > >    		/* avoid calling iput_final() in mds dispatch threads */
> > > >    		ceph_async_iput(req->r_inode);
> > > >    	}
> > > > -	if (req->r_parent)
> > > > +	if (req->r_parent) {
> > > >    		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > > > +		ceph_async_iput(req->r_parent);
> > > > +	}
> > > >    	ceph_async_iput(req->r_target_inode);
> > > >    	if (req->r_dentry)
> > > >    		dput(req->r_dentry);
> > > > @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
> > > >    	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
> > > >    	if (req->r_inode)
> > > >    		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
> > > > -	if (req->r_parent)
> > > > +	if (req->r_parent) {
> > > >    		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > > > +		ihold(req->r_parent);
> > > > +	}
> > > This might also fix another issue when the mdsc request is timedout and
> > > returns to the vfs, then the r_parent maybe released in vfs. And then if
> > > we reference it again in mdsc handle_reply() -->
> > > ceph_mdsc_release_request(),  some unknown issues may happen later ??
> > > 
> > AIUI, when a timeout occurs, the req is unhashed such that handle_reply
> > can't find it. So, I doubt this affects that one way or another.
> 
> If my understanding is correct, such as for rmdir(), the logic will be :
> 
> req = ceph_mdsc_create_request()      //  ref == 1
> 
> ceph_mdsc_do_request(req) -->
> 
>          ceph_mdsc_submit_request(req) -->
> 
>                  __register_request(req) // ref == 2
> 
>          ceph_mdsc_wait_request(req)  // If timedout
> 
> ceph_mdsc_put_request(req)  // ref == 1
> 
> Then in handled_reply(), only when we get a safe reply it will call 
> __unregister_request(req), then the ref could be 0.
> 
> Though it will ihold()/ceph_async_iput() the req->r_unsafe_dir(= 
> req->r_parent) , but the _iput() will be called just before we reference 
> the req->r_parent in the _relase_request(). And the _iput() here may 
> call the iput_final().
> 

I take it back, I think you're right. This likely would fix that issue
up. I'll plan to add a note about that to the changelog before I merge
it. Should we mark this for stable in light of that?
Xiubo Li Jan. 10, 2020, 1:41 a.m. UTC | #5
On 2020/1/9 21:33, Jeff Layton wrote:
> On Thu, 2020-01-09 at 21:16 +0800, Xiubo Li wrote:
>> On 2020/1/9 19:20, Jeff Layton wrote:
>>> On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
>>>> On 2020/1/6 23:35, Jeff Layton wrote:
>>>>> Currently, we just assume that it will stick around by virtue of the
>>>>> submitter's reference, but later patches will allow the syscall to
>>>>> return early and we can't rely on that reference at that point.
>>>>>
>>>>> Take an extra reference to the inode when setting r_parent and release
>>>>> it when releasing the request.
>>>>>
>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>> ---
>>>>>     fs/ceph/mds_client.c | 8 ++++++--
>>>>>     1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
>>>>> index 94cce2ab92c4..b7122f682678 100644
>>>>> --- a/fs/ceph/mds_client.c
>>>>> +++ b/fs/ceph/mds_client.c
>>>>> @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
>>>>>     		/* avoid calling iput_final() in mds dispatch threads */
>>>>>     		ceph_async_iput(req->r_inode);
>>>>>     	}
>>>>> -	if (req->r_parent)
>>>>> +	if (req->r_parent) {
>>>>>     		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
>>>>> +		ceph_async_iput(req->r_parent);
>>>>> +	}
>>>>>     	ceph_async_iput(req->r_target_inode);
>>>>>     	if (req->r_dentry)
>>>>>     		dput(req->r_dentry);
>>>>> @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
>>>>>     	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
>>>>>     	if (req->r_inode)
>>>>>     		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
>>>>> -	if (req->r_parent)
>>>>> +	if (req->r_parent) {
>>>>>     		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
>>>>> +		ihold(req->r_parent);
>>>>> +	}
>>>> This might also fix another issue when the mdsc request is timedout and
>>>> returns to the vfs, then the r_parent maybe released in vfs. And then if
>>>> we reference it again in mdsc handle_reply() -->
>>>> ceph_mdsc_release_request(),  some unknown issues may happen later ??
>>>>
>>> AIUI, when a timeout occurs, the req is unhashed such that handle_reply
>>> can't find it. So, I doubt this affects that one way or another.
>> If my understanding is correct, such as for rmdir(), the logic will be :
>>
>> req = ceph_mdsc_create_request()      //  ref == 1
>>
>> ceph_mdsc_do_request(req) -->
>>
>>           ceph_mdsc_submit_request(req) -->
>>
>>                   __register_request(req) // ref == 2
>>
>>           ceph_mdsc_wait_request(req)  // If timedout
>>
>> ceph_mdsc_put_request(req)  // ref == 1
>>
>> Then in handled_reply(), only when we get a safe reply it will call
>> __unregister_request(req), then the ref could be 0.
>>
>> Though it will ihold()/ceph_async_iput() the req->r_unsafe_dir(=
>> req->r_parent) , but the _iput() will be called just before we reference
>> the req->r_parent in the _relase_request(). And the _iput() here may
>> call the iput_final().
>>
> I take it back, I think you're right. This likely would fix that issue
> up. I'll plan to add a note about that to the changelog before I merge
> it. Should we mark this for stable in light of that?

Yeah, right :-)

Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 94cce2ab92c4..b7122f682678 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -708,8 +708,10 @@  void ceph_mdsc_release_request(struct kref *kref)
 		/* avoid calling iput_final() in mds dispatch threads */
 		ceph_async_iput(req->r_inode);
 	}
-	if (req->r_parent)
+	if (req->r_parent) {
 		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
+		ceph_async_iput(req->r_parent);
+	}
 	ceph_async_iput(req->r_target_inode);
 	if (req->r_dentry)
 		dput(req->r_dentry);
@@ -2706,8 +2708,10 @@  int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
 	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
 	if (req->r_inode)
 		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
-	if (req->r_parent)
+	if (req->r_parent) {
 		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
+		ihold(req->r_parent);
+	}
 	if (req->r_old_dentry_dir)
 		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
 				  CEPH_CAP_PIN);