[1/5] NFSv4: Ensure we reference the inode for return-on-close in delegreturn

Message ID 1423175831-54558-1-git-send-email-trond.myklebust@primarydata.com (mailing list archive)
State New, archived
Commit Message

Trond Myklebust Feb. 5, 2015, 10:37 p.m. UTC
If we have to do a return-on-close in the delegreturn code, then
we must ensure that the inode and super block remain referenced.

Cc: Peng Tao <tao.peng@primarydata.com>
Cc: stable@vger.kernel.org # 3.17.x
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfs/nfs4proc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)
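
The reference handling described in the commit message can be modelled in userspace. The following is only a minimal sketch, not kernel code: every `model_*` name is hypothetical, plain integers stand in for the real reference counts, and `being_freed` is an assumed stand-in for the case where `igrab()` returns NULL. It shows the two points the patch relies on: the superblock is pinned only when the inode reference was actually obtained, and the release path reads the superblock pointer before dropping the inode reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel objects; not kernel API. */
struct model_sb {
	int active;                 /* models the superblock's active count */
};

struct model_inode {
	int refcount;               /* models the inode reference count */
	bool being_freed;           /* models an inode that is already being evicted */
	struct model_sb *sb;
};

struct model_calldata {
	struct model_inode *inode;  /* NULL if no reference could be taken */
	bool roc;
};

/* Like igrab(): hand back NULL instead of a reference if the inode is going away. */
static struct model_inode *model_igrab(struct model_inode *inode)
{
	if (inode->being_freed)
		return NULL;
	inode->refcount++;
	return inode;
}

static void model_iput(struct model_inode *inode)
{
	inode->refcount--;
}

/* Setup: pin the inode first, and pin the superblock only if that succeeded. */
static struct model_calldata *model_setup(struct model_inode *inode, bool want_roc)
{
	struct model_calldata *data = calloc(1, sizeof(*data));

	if (!data)
		return NULL;
	data->inode = model_igrab(inode);
	if (data->inode) {
		inode->sb->active++;
		data->roc = want_roc;
	}
	return data;
}

/* Release: fetch sb before the put, because the inode may be gone afterwards. */
static void model_release(struct model_calldata *data)
{
	struct model_inode *inode = data->inode;

	if (inode) {
		struct model_sb *sb = inode->sb;

		model_iput(inode);
		sb->active--;
	}
	free(data);
}
```

A caller would pair one `model_setup()` with one `model_release()`. When the grab fails, `data->inode` stays NULL and the release path touches neither counter.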

Comments

Peng Tao Feb. 6, 2015, 1:45 a.m. UTC | #1
On Fri, Feb 6, 2015 at 6:37 AM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> If we have to do a return-on-close in the delegreturn code, then
> we must ensure that the inode and super block remain referenced.
>
Looks good. One nit: it might be better to reuse the two helpers
in your 2nd patch.

Reviewed-by: Peng Tao <tao.peng@primarydata.com>
> Cc: Peng Tao <tao.peng@primarydata.com>
> Cc: stable@vger.kernel.org # 3.17.x
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> ---
>  fs/nfs/nfs4proc.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd4295d84d54..b803c1d363e7 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5175,9 +5175,16 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
>  static void nfs4_delegreturn_release(void *calldata)
>  {
>         struct nfs4_delegreturndata *data = calldata;
> +       struct inode *inode = data->inode;
> +
> +       if (inode) {
> +               struct super_block *sb = inode->i_sb;
>
> -       if (data->roc)
> -               pnfs_roc_release(data->inode);
> +               if (data->roc)
> +                       pnfs_roc_release(inode);
> +               iput(inode);
> +               nfs_sb_deactive(sb);
> +       }
>         kfree(calldata);
>  }
>
> @@ -5234,9 +5241,11 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
>         nfs_fattr_init(data->res.fattr);
>         data->timestamp = jiffies;
>         data->rpc_status = 0;
> -       data->inode = inode;
> -       data->roc = list_empty(&NFS_I(inode)->open_files) ?
> -                   pnfs_roc(inode) : false;
> +       data->inode = igrab(inode);
> +       if (data->inode) {
> +               nfs_sb_active(inode->i_sb);
> +               data->roc = nfs4_roc(inode);
> +       }
>
>         task_setup_data.callback_data = data;
>         msg.rpc_argp = &data->args;
> --
> 2.1.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peng Tao Feb. 6, 2015, 1:57 a.m. UTC | #2
On Fri, Feb 6, 2015 at 9:45 AM, Peng Tao <tao.peng@primarydata.com> wrote:
> On Fri, Feb 6, 2015 at 6:37 AM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> If we have to do a return-on-close in the delegreturn code, then
>> we must ensure that the inode and super block remain referenced.
>>
ah, a second thought. I looked for call sites of nfs_sb_active() and
it gets called at five places in current tree:
alloc_nfs_open_context, nfs4_opendata_alloc, nfs4_do_close,
nfs_do_call_unlink, nfs_do_call_unlink

So it appears that the sb is kept active while any file remains open and
between unlink calls. Does that mean we are allowed to keep
delegations after the sb is released? Maybe the best way to fix the sb
reference part is to pin the sb when getting the first delegation.

Cheers,
Tao

Trond Myklebust Feb. 6, 2015, 2:53 a.m. UTC | #3
On Thu, Feb 5, 2015 at 8:57 PM, Peng Tao <tao.peng@primarydata.com> wrote:
>
> On Fri, Feb 6, 2015 at 9:45 AM, Peng Tao <tao.peng@primarydata.com> wrote:
> > On Fri, Feb 6, 2015 at 6:37 AM, Trond Myklebust
> > <trond.myklebust@primarydata.com> wrote:
> >> If we have to do a return-on-close in the delegreturn code, then
> >> we must ensure that the inode and super block remain referenced.
> >>
> ah, a second thought. I looked for call sites of nfs_sb_active() and
> it gets called at five places in current tree:
> alloc_nfs_open_context, nfs4_opendata_alloc, nfs4_do_close,
> nfs_do_call_unlink, nfs_do_call_unlink
>
> So it appears that sb is activated while any file keeps opened and
> between unlink calls. Then it looks that we are allowed to keep
> delegations after sb is released? Maybe the best way to fix the sb
> reference part is to pin sb when getting the first delegation.

The superblock reference is only there in order to allow us to perform
asynchronous delegreturns without any danger. The problem here is that
we'd end up pinning the superblock even after umount if there are
still unreturned delegations.
That said, I do see that there is a problem with calling
nfs_sb_active() when sb->s_active is zero, so I think I'd like to fix
that up.
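
The hazard with taking a reference when `sb->s_active` is already zero can be illustrated with a "get unless zero" helper. This is a userspace sketch using C11 atomics, not the fix that was eventually applied; `ref_get_unless_zero()` and `ref_put()` are hypothetical names. The kernel provides `atomic_inc_not_zero()` for this kind of check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper: take a reference only while the count is non-zero.
 * A count of zero means teardown has started, so reviving it would be a bug.
 */
static bool ref_get_unless_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously obtained with ref_get_unless_zero(). */
static void ref_put(atomic_int *count)
{
	atomic_fetch_sub(count, 1);
}
```

With a helper of this shape, a caller that fails to get the reference can skip the asynchronous work rather than resurrecting an object that is already being torn down.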
Peng Tao Feb. 6, 2015, 2:54 a.m. UTC | #4
On Fri, Feb 6, 2015 at 9:57 AM, Peng Tao <tao.peng@primarydata.com> wrote:
> On Fri, Feb 6, 2015 at 9:45 AM, Peng Tao <tao.peng@primarydata.com> wrote:
>> On Fri, Feb 6, 2015 at 6:37 AM, Trond Myklebust
>> <trond.myklebust@primarydata.com> wrote:
>>> If we have to do a return-on-close in the delegreturn code, then
>>> we must ensure that the inode and super block remain referenced.
>>>
> ah, a second thought. I looked for call sites of nfs_sb_active() and
> it gets called at five places in current tree:
> alloc_nfs_open_context, nfs4_opendata_alloc, nfs4_do_close,
> nfs_do_call_unlink, nfs_do_call_unlink
>
> So it appears that sb is activated while any file keeps opened and
> between unlink calls. Then it looks that we are allowed to keep
> delegations after sb is released? Maybe the best way to fix the sb
> reference part is to pin sb when getting the first delegation.
>
Err, that cannot work at all... By pinning the super block, we
prevent umount from happening, and we return delegations while
shutting down the nfs_server and evicting inodes. I see your point that the
patch is actually intended to deal with the race between an async delegation
return and nfs_server shutdown plus inode eviction.

Oops! Sorry for the noise... your patch is definitely the way we want to go.

Cheers,
Tao

Peng Tao Feb. 6, 2015, 3:05 a.m. UTC | #5
On Fri, Feb 6, 2015 at 10:53 AM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> On Thu, Feb 5, 2015 at 8:57 PM, Peng Tao <tao.peng@primarydata.com> wrote:
>>
>> On Fri, Feb 6, 2015 at 9:45 AM, Peng Tao <tao.peng@primarydata.com> wrote:
>> > On Fri, Feb 6, 2015 at 6:37 AM, Trond Myklebust
>> > <trond.myklebust@primarydata.com> wrote:
>> >> If we have to do a return-on-close in the delegreturn code, then
>> >> we must ensure that the inode and super block remain referenced.
>> >>
>> ah, a second thought. I looked for call sites of nfs_sb_active() and
>> it gets called at five places in current tree:
>> alloc_nfs_open_context, nfs4_opendata_alloc, nfs4_do_close,
>> nfs_do_call_unlink, nfs_do_call_unlink
>>
>> So it appears that sb is activated while any file keeps opened and
>> between unlink calls. Then it looks that we are allowed to keep
>> delegations after sb is released? Maybe the best way to fix the sb
>> reference part is to pin sb when getting the first delegation.
>
> The superblock reference is only there in order to allow us to perform
> asynchronous delegreturns without any danger. The problem here is that
> we'd end up pinning the superblock even after umount if there are
> still unreturned delegations.
> That said, I do see that there is a problem with calling
> nfs_sb_active() when sb->s_active is zero, so I think I'd like to fix
> that up.
yeah, I see your point. Thanks for the explanation.

Cheers,
Tao

>
> --
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> trond.myklebust@primarydata.com
Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index cd4295d84d54..b803c1d363e7 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5175,9 +5175,16 @@  static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
 static void nfs4_delegreturn_release(void *calldata)
 {
 	struct nfs4_delegreturndata *data = calldata;
+	struct inode *inode = data->inode;
+
+	if (inode) {
+		struct super_block *sb = inode->i_sb;
 
-	if (data->roc)
-		pnfs_roc_release(data->inode);
+		if (data->roc)
+			pnfs_roc_release(inode);
+		iput(inode);
+		nfs_sb_deactive(sb);
+	}
 	kfree(calldata);
 }
 
@@ -5234,9 +5241,11 @@  static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
 	nfs_fattr_init(data->res.fattr);
 	data->timestamp = jiffies;
 	data->rpc_status = 0;
-	data->inode = inode;
-	data->roc = list_empty(&NFS_I(inode)->open_files) ?
-		    pnfs_roc(inode) : false;
+	data->inode = igrab(inode);
+	if (data->inode) {
+		nfs_sb_active(inode->i_sb);
+		data->roc = nfs4_roc(inode);
+	}
 
 	task_setup_data.callback_data = data;
 	msg.rpc_argp = &data->args;