Message ID | 20210302194659.98563-1-dai.ngo@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | NFSv4.2: destination file needs to be released after inter-server copy is done. | expand |
On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote: > > This patch is to fix the resource leak problem of the source file > when doing inter-server copy. The fix is to close and release the > file in __nfs42_ssc_close after the copy is done. > > Signed-off-by: Dai Ngo <dai.ngo@oracle.com> > --- > fs/nfs/nfs4file.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c > index 57b3821d975a..20163fe702a7 100644 > --- a/fs/nfs/nfs4file.c > +++ b/fs/nfs/nfs4file.c > @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep) > struct nfs_open_context *ctx = nfs_file_open_context(filep); > > ctx->state->flags = 0; > + > + if (!filep) > + return; > + get_file(filep); > + filp_close(filep, NULL); > + fput(filep); > } I don't understand this logic. There is no reason to call filp_close()? All this would be done by doing a fput(). Also fput() would drop a reference on the mount point. So we are doing this then we can't call that extra disconnect that was added by another patch. Anyway I don't see why there is any reason to call anything but the fput(), I'm not sure why __nfs42_ssc_close() is a better function and doesn't lead to the "use_after_free". > > static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = { > -- > 2.9.5 >
Thanks Olga for reviewing the patch, reply inline below: On 3/8/21 10:35 AM, Olga Kornievskaia wrote: > On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote: >> This patch is to fix the resource leak problem of the source file >> when doing inter-server copy. The fix is to close and release the >> file in __nfs42_ssc_close after the copy is done. >> >> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> >> --- >> fs/nfs/nfs4file.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c >> index 57b3821d975a..20163fe702a7 100644 >> --- a/fs/nfs/nfs4file.c >> +++ b/fs/nfs/nfs4file.c >> @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep) >> struct nfs_open_context *ctx = nfs_file_open_context(filep); >> >> ctx->state->flags = 0; >> + >> + if (!filep) >> + return; >> + get_file(filep); >> + filp_close(filep, NULL); >> + fput(filep); >> } > I don't understand this logic. There is no reason to call > filp_close()? This is to follow the steps done in nfsd_file_put/.../nfsd_file_free. However since this is the source file the flush is probably not needed, just there to be safe. I will remove it. > All this would be done by doing a fput(). Also fput() > would drop a reference on the mount point. So we are doing this then > we can't call that extra disconnect that was added by another patch. nfsd4_interssc_disconnect does not need to access the source file. I tested both patches together and did not see any problem. If there is use-after_free condition the code detects it and there would be warning messages in /var/log/messages. > Anyway I don't see why there is any reason to call anything but the > fput(), I'm not sure why __nfs42_ssc_close() is a better function and > doesn't lead to the "use_after_free". Since __nfs42_ssc_open was called open the file, I think __nfs42_ssc_close is an appropriate place to close the file. -Dai > >> static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = { >> -- >> 2.9.5 >>
On Mon, Mar 8, 2021 at 3:10 PM <dai.ngo@oracle.com> wrote: > > Thanks Olga for reviewing the patch, reply inline below: > > On 3/8/21 10:35 AM, Olga Kornievskaia wrote: > > On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote: > >> This patch is to fix the resource leak problem of the source file > >> when doing inter-server copy. The fix is to close and release the > >> file in __nfs42_ssc_close after the copy is done. > >> > >> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> > >> --- > >> fs/nfs/nfs4file.c | 6 ++++++ > >> 1 file changed, 6 insertions(+) > >> > >> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c > >> index 57b3821d975a..20163fe702a7 100644 > >> --- a/fs/nfs/nfs4file.c > >> +++ b/fs/nfs/nfs4file.c > >> @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep) > >> struct nfs_open_context *ctx = nfs_file_open_context(filep); > >> > >> ctx->state->flags = 0; > >> + > >> + if (!filep) > >> + return; > >> + get_file(filep); > >> + filp_close(filep, NULL); > >> + fput(filep); > >> } > > I don't understand this logic. There is no reason to call > > filp_close()? > > This is to follow the steps done in nfsd_file_put/.../nfsd_file_free. > However since this is the source file the flush is probably not needed, > just there to be safe. I will remove it. As I said before the only thing that's needed is the fput() which was originally in the code (but got incorrectly changed to nfsd_put()). I prefer to keep it there because this deals with cleaning up the SSC state. I don't see any significant reason to move it out. > > All this would be done by doing a fput(). Also fput() > > would drop a reference on the mount point. So we are doing this then > > we can't call that extra disconnect that was added by another patch. > > nfsd4_interssc_disconnect does not need to access the source file. > I tested both patches together and did not see any problem. If there > is use-after_free condition the code detects it and there would be > warning messages in /var/log/messages. We don't need and don't want to do the nfsd4_interssc_disconnect in the non-error path. All the ref-counting on the superblock is accomplished already. The other patch is not needed and neither is correct, it makes the incorrect refcounts in failure cases. I nack both patches. The only patch which I will send that's needed is this: diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 8d6d2678abad..3581ce737e85 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1304,7 +1304,7 @@ nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct nfsd_file *src, struct nfsd_file *dst) { nfs42_ssc_close(src->nf_file); - /* 'src' is freed by nfsd4_do_async_copy */ + fput(src->nf_file); nfsd_file_put(dst); mntput(ss_mnt); } > > Anyway I don't see why there is any reason to call anything but the > > fput(), I'm not sure why __nfs42_ssc_close() is a better function and > > doesn't lead to the "use_after_free". > > Since __nfs42_ssc_open was called open the file, I think __nfs42_ssc_close > is an appropriate place to close the file. Again let's keep the cleaning of the server's SSC state in one place. > -Dai > > > > >> static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = { > >> -- > >> 2.9.5 > >>
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c index 57b3821d975a..20163fe702a7 100644 --- a/fs/nfs/nfs4file.c +++ b/fs/nfs/nfs4file.c @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep) struct nfs_open_context *ctx = nfs_file_open_context(filep); ctx->state->flags = 0; + + if (!filep) + return; + get_file(filep); + filp_close(filep, NULL); + fput(filep); } static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = {
This patch is to fix the resource leak problem of the source file when doing inter-server copy. The fix is to close and release the file in __nfs42_ssc_close after the copy is done. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> --- fs/nfs/nfs4file.c | 6 ++++++ 1 file changed, 6 insertions(+)