diff mbox series

NFSv4.2: destination file needs to be released after inter-server copy is done.

Message ID 20210302194659.98563-1-dai.ngo@oracle.com (mailing list archive)
State New, archived
Headers show
Series NFSv4.2: destination file needs to be released after inter-server copy is done. | expand

Commit Message

Dai Ngo March 2, 2021, 7:46 p.m. UTC
This patch is to fix the resource leak problem of the source file
when doing inter-server copy. The fix is to close and release the
file in __nfs42_ssc_close after the copy is done.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
 fs/nfs/nfs4file.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Olga Kornievskaia March 8, 2021, 6:35 p.m. UTC | #1
On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote:
>
> This patch is to fix the resource leak problem of the source file
> when doing inter-server copy. The fix is to close and release the
> file in __nfs42_ssc_close after the copy is done.
>
> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> ---
>  fs/nfs/nfs4file.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> index 57b3821d975a..20163fe702a7 100644
> --- a/fs/nfs/nfs4file.c
> +++ b/fs/nfs/nfs4file.c
> @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep)
>         struct nfs_open_context *ctx = nfs_file_open_context(filep);
>
>         ctx->state->flags = 0;
> +
> +       if (!filep)
> +               return;
> +       get_file(filep);
> +       filp_close(filep, NULL);
> +       fput(filep);
>  }

I don't understand this logic. There is no reason to call
filp_close()? All this would be done by doing a fput(). Also fput()
would drop a reference on the mount point. So we are doing this then
we can't call that extra disconnect that was added by another patch.
Anyway I don't see why there is any reason to call anything but the
fput(), I'm not sure why __nfs42_ssc_close() is a better function and
doesn't lead to the "use_after_free".

>
>  static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = {
> --
> 2.9.5
>
Dai Ngo March 8, 2021, 8:10 p.m. UTC | #2
Thanks Olga for reviewing the patch, reply inline below:

On 3/8/21 10:35 AM, Olga Kornievskaia wrote:
> On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote:
>> This patch is to fix the resource leak problem of the source file
>> when doing inter-server copy. The fix is to close and release the
>> file in __nfs42_ssc_close after the copy is done.
>>
>> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
>> ---
>>   fs/nfs/nfs4file.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
>> index 57b3821d975a..20163fe702a7 100644
>> --- a/fs/nfs/nfs4file.c
>> +++ b/fs/nfs/nfs4file.c
>> @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep)
>>          struct nfs_open_context *ctx = nfs_file_open_context(filep);
>>
>>          ctx->state->flags = 0;
>> +
>> +       if (!filep)
>> +               return;
>> +       get_file(filep);
>> +       filp_close(filep, NULL);
>> +       fput(filep);
>>   }
> I don't understand this logic. There is no reason to call
> filp_close()?

This is to follow the steps done in nfsd_file_put/.../nfsd_file_free.
However since this is the source file the flush is probably not needed,
just there to be safe. I will remove it.

> All this would be done by doing a fput(). Also fput()
> would drop a reference on the mount point. So we are doing this then
> we can't call that extra disconnect that was added by another patch.

nfsd4_interssc_disconnect does not need to access the source file.
I tested both patches together and did not see any problem. If there
is use-after_free condition the code detects it and there would be
warning messages in /var/log/messages.

> Anyway I don't see why there is any reason to call anything but the
> fput(), I'm not sure why __nfs42_ssc_close() is a better function and
> doesn't lead to the "use_after_free".

Since __nfs42_ssc_open was called open the file, I think __nfs42_ssc_close
is an appropriate place to close the file.

-Dai

>
>>   static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = {
>> --
>> 2.9.5
>>
Olga Kornievskaia March 9, 2021, 1:57 p.m. UTC | #3
On Mon, Mar 8, 2021 at 3:10 PM <dai.ngo@oracle.com> wrote:
>
> Thanks Olga for reviewing the patch, reply inline below:
>
> On 3/8/21 10:35 AM, Olga Kornievskaia wrote:
> > On Tue, Mar 2, 2021 at 2:47 PM Dai Ngo <dai.ngo@oracle.com> wrote:
> >> This patch is to fix the resource leak problem of the source file
> >> when doing inter-server copy. The fix is to close and release the
> >> file in __nfs42_ssc_close after the copy is done.
> >>
> >> Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
> >> ---
> >>   fs/nfs/nfs4file.c | 6 ++++++
> >>   1 file changed, 6 insertions(+)
> >>
> >> diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
> >> index 57b3821d975a..20163fe702a7 100644
> >> --- a/fs/nfs/nfs4file.c
> >> +++ b/fs/nfs/nfs4file.c
> >> @@ -405,6 +405,12 @@ static void __nfs42_ssc_close(struct file *filep)
> >>          struct nfs_open_context *ctx = nfs_file_open_context(filep);
> >>
> >>          ctx->state->flags = 0;
> >> +
> >> +       if (!filep)
> >> +               return;
> >> +       get_file(filep);
> >> +       filp_close(filep, NULL);
> >> +       fput(filep);
> >>   }
> > I don't understand this logic. There is no reason to call
> > filp_close()?
>
> This is to follow the steps done in nfsd_file_put/.../nfsd_file_free.
> However since this is the source file the flush is probably not needed,
> just there to be safe. I will remove it.

As I said before the only thing that's needed is the fput() which was
originally in the code (but got incorrectly changed to nfsd_put()). I
prefer to keep it there because this deals with cleaning up the SSC
state. I don't see any significant reason to move it out.

> > All this would be done by doing a fput(). Also fput()
> > would drop a reference on the mount point. So we are doing this then
> > we can't call that extra disconnect that was added by another patch.
>
> nfsd4_interssc_disconnect does not need to access the source file.
> I tested both patches together and did not see any problem. If there
> is use-after_free condition the code detects it and there would be
> warning messages in /var/log/messages.

We don't need and don't want to do the nfsd4_interssc_disconnect in
the non-error path. All the ref-counting on the superblock is
accomplished already. The other patch is not needed and neither is
correct, it makes the incorrect refcounts in failure cases.

I nack both patches. The only patch which I will send that's needed is this:

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 8d6d2678abad..3581ce737e85 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1304,7 +1304,7 @@ nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt,
struct nfsd_file *src,
                        struct nfsd_file *dst)
 {
        nfs42_ssc_close(src->nf_file);
-       /* 'src' is freed by nfsd4_do_async_copy */
+       fput(src->nf_file);
        nfsd_file_put(dst);
        mntput(ss_mnt);
 }

> > Anyway I don't see why there is any reason to call anything but the
> > fput(), I'm not sure why __nfs42_ssc_close() is a better function and
> > doesn't lead to the "use_after_free".
>
> Since __nfs42_ssc_open was called open the file, I think __nfs42_ssc_close
> is an appropriate place to close the file.

Again let's keep the cleaning of the server's SSC state in one place.



> -Dai
>
> >
> >>   static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = {
> >> --
> >> 2.9.5
> >>
diff mbox series

Patch

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 57b3821d975a..20163fe702a7 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -405,6 +405,12 @@  static void __nfs42_ssc_close(struct file *filep)
 	struct nfs_open_context *ctx = nfs_file_open_context(filep);
 
 	ctx->state->flags = 0;
+
+	if (!filep)
+		return;
+	get_file(filep);
+	filp_close(filep, NULL);
+	fput(filep);
 }
 
 static const struct nfs4_ssc_client_ops nfs4_ssc_clnt_ops_tbl = {