Message ID | 20200106153520.307523-3-jlayton@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | ceph: asynchronous unlink support |
On 2020/1/6 23:35, Jeff Layton wrote:
> Currently, we just assume that it will stick around by virtue of the
> submitter's reference, but later patches will allow the syscall to
> return early and we can't rely on that reference at that point.
>
> Take an extra reference to the inode when setting r_parent and release
> it when releasing the request.
>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ceph/mds_client.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 94cce2ab92c4..b7122f682678 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
>  		/* avoid calling iput_final() in mds dispatch threads */
>  		ceph_async_iput(req->r_inode);
>  	}
> -	if (req->r_parent)
> +	if (req->r_parent) {
>  		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> +		ceph_async_iput(req->r_parent);
> +	}
>  	ceph_async_iput(req->r_target_inode);
>  	if (req->r_dentry)
>  		dput(req->r_dentry);
> @@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
>  	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
>  	if (req->r_inode)
>  		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
> -	if (req->r_parent)
> +	if (req->r_parent) {
>  		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> +		ihold(req->r_parent);
> +	}

This might also fix another issue: when the mdsc request times out and
returns to the vfs, the r_parent may then be released in the vfs. If we
reference it again in mdsc handle_reply() -->
ceph_mdsc_release_request(), some unknown issues may happen later?

>  	if (req->r_old_dentry_dir)
>  		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
>  				  CEPH_CAP_PIN);
On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
> On 2020/1/6 23:35, Jeff Layton wrote:
> > [...]
> > -	if (req->r_parent)
> > +	if (req->r_parent) {
> >  		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
> > +		ihold(req->r_parent);
> > +	}
>
> This might also fix another issue: when the mdsc request times out and
> returns to the vfs, the r_parent may then be released in the vfs. If we
> reference it again in mdsc handle_reply() -->
> ceph_mdsc_release_request(), some unknown issues may happen later?
>

AIUI, when a timeout occurs, the req is unhashed such that handle_reply
can't find it. So, I doubt this affects that one way or another.
On 2020/1/9 19:20, Jeff Layton wrote:
> On Thu, 2020-01-09 at 10:05 +0800, Xiubo Li wrote:
>> On 2020/1/6 23:35, Jeff Layton wrote:
>>> [...]
>>> +		ihold(req->r_parent);
>>> +	}
>> This might also fix another issue: when the mdsc request times out and
>> returns to the vfs, the r_parent may then be released in the vfs. If we
>> reference it again in mdsc handle_reply() -->
>> ceph_mdsc_release_request(), some unknown issues may happen later?
>>
> AIUI, when a timeout occurs, the req is unhashed such that handle_reply
> can't find it. So, I doubt this affects that one way or another.

If my understanding is correct, for rmdir() for example, the logic will be:

req = ceph_mdsc_create_request()   // ref == 1
ceph_mdsc_do_request(req) -->
ceph_mdsc_submit_request(req) -->
__register_request(req)            // ref == 2
ceph_mdsc_wait_request(req)        // if timed out
ceph_mdsc_put_request(req)         // ref == 1

Then in handle_reply(), only when we get a safe reply will it call
__unregister_request(req), and then the ref could drop to 0.

It does ihold()/ceph_async_iput() the req->r_unsafe_dir (=
req->r_parent), but that _iput() will be called just before we reference
req->r_parent in _release_request(). And the _iput() there may call
iput_final().

BRs
On Thu, 2020-01-09 at 21:16 +0800, Xiubo Li wrote:
> On 2020/1/9 19:20, Jeff Layton wrote:
> > [...]
> > AIUI, when a timeout occurs, the req is unhashed such that handle_reply
> > can't find it. So, I doubt this affects that one way or another.
>
> If my understanding is correct, for rmdir() for example, the logic will be:
>
> req = ceph_mdsc_create_request()   // ref == 1
> ceph_mdsc_do_request(req) -->
> ceph_mdsc_submit_request(req) -->
> __register_request(req)            // ref == 2
> ceph_mdsc_wait_request(req)        // if timed out
> ceph_mdsc_put_request(req)         // ref == 1
>
> Then in handle_reply(), only when we get a safe reply will it call
> __unregister_request(req), and then the ref could drop to 0.
>
> It does ihold()/ceph_async_iput() the req->r_unsafe_dir (=
> req->r_parent), but that _iput() will be called just before we reference
> req->r_parent in _release_request(). And the _iput() there may call
> iput_final().
>

I take it back, I think you're right. This likely would fix that issue
up. I'll plan to add a note about that to the changelog before I merge
it. Should we mark this for stable in light of that?
On 2020/1/9 21:33, Jeff Layton wrote:
> On Thu, 2020-01-09 at 21:16 +0800, Xiubo Li wrote:
>> [...]
> I take it back, I think you're right. This likely would fix that issue
> up. I'll plan to add a note about that to the changelog before I merge
> it. Should we mark this for stable in light of that?

Yeah, right :-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 94cce2ab92c4..b7122f682678 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -708,8 +708,10 @@ void ceph_mdsc_release_request(struct kref *kref)
 		/* avoid calling iput_final() in mds dispatch threads */
 		ceph_async_iput(req->r_inode);
 	}
-	if (req->r_parent)
+	if (req->r_parent) {
 		ceph_put_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
+		ceph_async_iput(req->r_parent);
+	}
 	ceph_async_iput(req->r_target_inode);
 	if (req->r_dentry)
 		dput(req->r_dentry);
@@ -2706,8 +2708,10 @@ int ceph_mdsc_submit_request(struct ceph_mds_client *mdsc, struct inode *dir,
 	/* take CAP_PIN refs for r_inode, r_parent, r_old_dentry */
 	if (req->r_inode)
 		ceph_get_cap_refs(ceph_inode(req->r_inode), CEPH_CAP_PIN);
-	if (req->r_parent)
+	if (req->r_parent) {
 		ceph_get_cap_refs(ceph_inode(req->r_parent), CEPH_CAP_PIN);
+		ihold(req->r_parent);
+	}
 	if (req->r_old_dentry_dir)
 		ceph_get_cap_refs(ceph_inode(req->r_old_dentry_dir),
 				  CEPH_CAP_PIN);
Currently, we just assume that it will stick around by virtue of the
submitter's reference, but later patches will allow the syscall to
return early and we can't rely on that reference at that point.

Take an extra reference to the inode when setting r_parent and release
it when releasing the request.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)