lockd: fix "list_add double add" caused by legacy signal interface

Message ID 75f4472c-b9e2-6353-3af0-c4939ecfca41@virtuozzo.com (mailing list archive)
State New, archived

Commit Message

Vasily Averin Nov. 13, 2017, 4:25 a.m. UTC
restart_grace() uses a hardcoded init_net.
This can cause a "list_add double add" in the following scenario:

1) nfsd and lockd were started in several net namespaces
2) nfsd in init_net was stopped (lockd was not stopped because
 it still has users in other net namespaces)
3) lockd got a signal, called restart_grace() -> set_grace_period()
 and enabled the lock_manager in the hardcoded init_net.
4) nfsd in init_net is started again;
 its lockd_up() calls set_grace_period() and tries to add
 the lock_manager to init_net a second time.

Jeff Layton suggests:
"Make it safe to call locks_start_grace multiple times on the same
lock_manager. If it's already on the global grace_list, then don't try
to add it again.

With this change, we also need to ensure that the nfsd4 lock manager
initializes the list before we call locks_start_grace. While we're at
it, move the rest of the nfsd_net initialization into
nfs4_state_create_net. I see no reason to have it spread over two
functions like it is today."

The suggested patch was updated to generate a warning in the described situation.

Suggested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
 fs/nfs_common/grace.c | 6 +++++-
 fs/nfsd/nfs4state.c   | 7 ++++---
 2 files changed, 9 insertions(+), 4 deletions(-)
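
The guarded add described in the commit message can be tried out in userspace. What follows is only a sketch under stated assumptions: the list helpers are reduced re-implementations of the kernel's list API, struct lock_manager is cut down to its list linkage, start_grace(), grace_list_len() and replay_scenario() are hypothetical names, and the grace_lock spinlock and per-net lookup are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* True when the node points at itself, i.e. it is not linked anywhere. */
static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Link entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical, reduced lock manager: only the list linkage is modelled. */
struct lock_manager {
	struct list_head list;
};

/*
 * Guarded add. Returns true if lm was linked, false if it was already
 * on a list (the case where the patch calls WARN). No locking is done
 * here; the kernel code holds grace_lock around this.
 */
static bool start_grace(struct list_head *grace_list, struct lock_manager *lm)
{
	if (!list_empty(&lm->list)) {
		fprintf(stderr, "double list_add attempt detected\n");
		return false;
	}
	list_add(&lm->list, grace_list);
	return true;
}

/* Count the entries currently linked on the grace list. */
static int grace_list_len(const struct list_head *grace_list)
{
	int n = 0;
	const struct list_head *p;

	for (p = grace_list->next; p != grace_list; p = p->next)
		n++;
	return n;
}

/* Replays steps 3 and 4 of the scenario; returns the final list length. */
static int replay_scenario(void)
{
	struct list_head grace_list;
	struct lock_manager lockd;
	bool first, second;

	init_list_head(&grace_list);
	init_list_head(&lockd.list);

	first = start_grace(&grace_list, &lockd);   /* step 3: signal path */
	second = start_grace(&grace_list, &lockd);  /* step 4: lockd_up() path */
	assert(first && !second);

	return grace_list_len(&grace_list);
}
```

Calling replay_scenario() from a main() of your own leaves exactly one entry on the list: the second request is reported and skipped instead of corrupting the list.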

Comments

Jeffrey Layton Nov. 13, 2017, 11:49 a.m. UTC | #1
On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote:
> restart_grace() uses hardcoded init_net.
> It can cause to "list_add double add" in following scenario:
> 
> 1) nfsd and lockd was started in several net namespaces
> 2) nfsd in init_net was stopped (lockd was not stopped because
>  it have users from another net namespaces)
> 3) lockd got signal, called restart_grace() -> set_grace_period()
>  and enabled lock_manager in hardcoded init_net.
> 4) nfsd in init_net is started again,
>  its lockd_up() calls set_grace_period() and tries to add
>  lock_manager into init_net 2nd time.
> 
> Jeff Layton suggest:
> "Make it safe to call locks_start_grace multiple times on the same
> lock_manager. If it's already on the global grace_list, then don't try
> to add it again.
> 
> With this change, we also need to ensure that the nfsd4 lock manager
> initializes the list before we call locks_start_grace. While we're at
> it, move the rest of the nfsd_net initialization into
> nfs4_state_create_net. I see no reason to have it spread over two
> functions like it is today."
> 
> Suggested patch was updated to generate warning in described situation.
> 
> Suggested-by: Jeff Layton <jlayton@redhat.com>
> Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
> ---
>  fs/nfs_common/grace.c | 6 +++++-
>  fs/nfsd/nfs4state.c   | 7 ++++---
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c
> index bd3e2d3..5be08f0 100644
> --- a/fs/nfs_common/grace.c
> +++ b/fs/nfs_common/grace.c
> @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm)
>  	struct list_head *grace_list = net_generic(net, grace_net_id);
>  
>  	spin_lock(&grace_lock);
> -	list_add(&lm->list, grace_list);
> +	if (list_empty(&lm->list))
> +		list_add(&lm->list, grace_list);
> +	else
> +		WARN(1, "double list_add attempt detected in net %x %s\n",
> +		     net->ns.inum, (net == &init_net) ? "(init_net)" : "");
>  	spin_unlock(&grace_lock);
>  }

I'm not sure that warning really means much.

It's not _really_ a bug to request that a new grace period start while
it's already in one. In general, it's ok to request a new grace period
while it's currently enforcing one. That should just have the effect of
extending the existing grace period.

>  EXPORT_SYMBOL_GPL(locks_start_grace);
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7345143..b29b5a1 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -7103,6 +7103,10 @@ static int nfs4_state_create_net(struct net *net)
>  		INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
>  	nn->conf_name_tree = RB_ROOT;
>  	nn->unconf_name_tree = RB_ROOT;
> +	nn->boot_time = get_seconds();
> +	nn->grace_ended = false;
> +	nn->nfsd4_manager.block_opens = true;
> +	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
>  	INIT_LIST_HEAD(&nn->client_lru);
>  	INIT_LIST_HEAD(&nn->close_lru);
>  	INIT_LIST_HEAD(&nn->del_recall_lru);
> @@ -7160,9 +7164,6 @@ nfs4_state_start_net(struct net *net)
>  	ret = nfs4_state_create_net(net);
>  	if (ret)
>  		return ret;
> -	nn->boot_time = get_seconds();
> -	nn->grace_ended = false;
> -	nn->nfsd4_manager.block_opens = true;
>  	locks_start_grace(net, &nn->nfsd4_manager);
>  	nfsd4_client_tracking_init(net);
>  	printk(KERN_INFO "NFSD: starting %ld-second grace period (net %x)\n",

Vasily Averin Nov. 13, 2017, 2:57 p.m. UTC | #2
On 2017-11-13 14:49, Jeff Layton wrote:
> On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote:
>> --- a/fs/nfs_common/grace.c
>> +++ b/fs/nfs_common/grace.c
>> @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm)
>>  	struct list_head *grace_list = net_generic(net, grace_net_id);
>>  
>>  	spin_lock(&grace_lock);
>> -	list_add(&lm->list, grace_list);
>> +	if (list_empty(&lm->list))
>> +		list_add(&lm->list, grace_list);
>> +	else
>> +		WARN(1, "double list_add attempt detected in net %x %s\n",
>> +		     net->ns.inum, (net == &init_net) ? "(init_net)" : "");
>>  	spin_unlock(&grace_lock);
>>  }
> 
> I'm not sure that warning really means much.
> 
> It's not _really_ a bug to request that a new grace period start while
> it's already in one. In general, it's ok to request a new grace period
> while it's currently enforcing one. That should just have the effect of
> extending the existing grace period.

"double list_add" can happen in init_net when legacy signal in lockd was used.
It should not happen when an existing grace period is extended in the usual way,
because restart_grace() calls locks_end_grace() before set_grace_period(),
but it can race with the start of lockd_up_net() in init_net.
I agree: there is no bug in this scenario, and everything should work correctly.

However, I would like to keep the WARN so that a lost locks_end_grace()/
cancel_delayed_work() is detected properly.

If you are worried about a real false positive and not about abstract
future trouble in init_net, I can move the WARN under a (net != &init_net) check.

However, I would like to keep this warning here.

On the other hand, if you disagree and still believe that the WARN is not required here,
I'm ready to accept your original version of the patch.
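
The ordering argument above (end the grace period first, then start it again) can be illustrated with a small userspace model. This is a sketch, not the lockd code: it assumes that ending a grace period unlinks the manager with list_del_init() semantics, so the entry is self-linked again afterwards. The names start_grace(), end_grace(), count_warnings() and the warnings counter are hypothetical, and locking and the delayed work are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and leave it self-linked, so list_empty(entry) is true again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

static int warnings;	/* counts the places where the patch would WARN */

static void start_grace(struct list_head *grace_list, struct list_head *lm)
{
	if (list_empty(lm))
		list_add(lm, grace_list);
	else
		warnings++;
}

static void end_grace(struct list_head *lm)
{
	list_del_init(lm);
}

/*
 * Start a grace period, optionally end it, then start it again.
 * Returns how many times the warning path was taken.
 */
static int count_warnings(bool skip_end)
{
	struct list_head grace_list, lm;

	warnings = 0;
	init_list_head(&grace_list);
	init_list_head(&lm);

	start_grace(&grace_list, &lm);
	if (!skip_end) {
		end_grace(&lm);
		assert(list_empty(&lm));	/* unlinked and self-linked again */
	}
	start_grace(&grace_list, &lm);

	return warnings;
}
```

In this model count_warnings(false) stays silent, while count_warnings(true) takes the warning path once. That is the "lost locks_end_grace()" case the WARN is meant to catch.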
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jeffrey Layton Nov. 13, 2017, 8:06 p.m. UTC | #3
On Mon, 2017-11-13 at 17:57 +0300, Vasily Averin wrote:
> On 2017-11-13 14:49, Jeff Layton wrote:
> > On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote:
> > > --- a/fs/nfs_common/grace.c
> > > +++ b/fs/nfs_common/grace.c
> > > @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm)
> > >  	struct list_head *grace_list = net_generic(net, grace_net_id);
> > >  
> > >  	spin_lock(&grace_lock);
> > > -	list_add(&lm->list, grace_list);
> > > +	if (list_empty(&lm->list))
> > > +		list_add(&lm->list, grace_list);
> > > +	else
> > > +		WARN(1, "double list_add attempt detected in net %x %s\n",
> > > +		     net->ns.inum, (net == &init_net) ? "(init_net)" : "");
> > >  	spin_unlock(&grace_lock);
> > >  }
> > 
> > I'm not sure that warning really means much.
> > 
> > It's not _really_ a bug to request that a new grace period start while
> > it's already in one. In general, it's ok to request a new grace period
> > while it's currently enforcing one. That should just have the effect of
> > extending the existing grace period.
> 
> "double list_add" can happen in init_net when legacy signal in lockd was used.
> It should not happen during usual extending of existing grace period,
> because restart_grace() calls locks_end_grace() before set_grace_period()
> but it can race with start of lockd_up_net() in init_net.
> I'm agree: we do not have any bugs in this scenario, all should work correctly.
> 
> However I would like to keep WARN to properly detect lost locks_end_grace()/
> cancel_delayed_work().
> 
> If you worry about real false positive and do not worry about abstract
> future troubles in init_net, I can move WARN under (net != &init_net) check.
> 
> However I would like to keep this warning here.
> 
> On the other hand if you disagree and still believe that WARN is not required here
> I'm ready to agree with your original patch version.

Fair enough. I don't feel strongly about it. I just have been doing some
investigation lately into clustered grace period management, so it's a
little on my mind. [1]

For now though, you're certainly correct that we'll never attempt to set
the grace period while we're already in it. If we ever want to do more
complex grace period handling in the kernel, we may need to drop that
WARN, however.

[1]: https://jtlayton.wordpress.com/2017/11/07/active-active-nfs-over-cephfs/

J. Bruce Fields Nov. 14, 2017, 12:46 a.m. UTC | #4
On Mon, Nov 13, 2017 at 03:06:17PM -0500, Jeff Layton wrote:
> On Mon, 2017-11-13 at 17:57 +0300, Vasily Averin wrote:
> > On 2017-11-13 14:49, Jeff Layton wrote:
> > > On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote:
> > > > --- a/fs/nfs_common/grace.c
> > > > +++ b/fs/nfs_common/grace.c
> > > > @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm)
> > > >  	struct list_head *grace_list = net_generic(net, grace_net_id);
> > > >  
> > > >  	spin_lock(&grace_lock);
> > > > -	list_add(&lm->list, grace_list);
> > > > +	if (list_empty(&lm->list))
> > > > +		list_add(&lm->list, grace_list);
> > > > +	else
> > > > +		WARN(1, "double list_add attempt detected in net %x %s\n",
> > > > +		     net->ns.inum, (net == &init_net) ? "(init_net)" : "");
> > > >  	spin_unlock(&grace_lock);
> > > >  }
> > > 
> > > I'm not sure that warning really means much.
> > > 
> > > It's not _really_ a bug to request that a new grace period start while
> > > it's already in one. In general, it's ok to request a new grace period
> > > while it's currently enforcing one. That should just have the effect of
> > > extending the existing grace period.
> > 
> > "double list_add" can happen in init_net when legacy signal in lockd was used.
> > It should not happen during usual extending of existing grace period,
> > because restart_grace() calls locks_end_grace() before set_grace_period()
> > but it can race with start of lockd_up_net() in init_net.
> > I'm agree: we do not have any bugs in this scenario, all should work correctly.
> > 
> > However I would like to keep WARN to properly detect lost locks_end_grace()/
> > cancel_delayed_work().
> > 
> > If you worry about real false positive and do not worry about abstract
> > future troubles in init_net, I can move WARN under (net != &init_net) check.
> > 
> > However I would like to keep this warning here.
> > 
> > On the other hand if you disagree and still believe that WARN is not required here
> > I'm ready to agree with your original patch version.
> 
> Fair enough. I don't feel strongly about it. I just have been doing some
> investigation lately into clustered grace period management, so it's a
> little on my mind. [1]
> 
> For now though, you're certainly correct that we'll never attempt to set
> the grace period while we're already in it. If we ever want to do more
> complex grace period handling in the kernel, we may need to drop that
> WARN, however.

OK, applied with a minor changelog update.

Vasily, if you see anything missing from nfsd-next at this point, let me
know:

	git://linux-nfs.org/~bfields/linux.git nfsd-next

--b.

Patch

diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c
index bd3e2d3..5be08f0 100644
--- a/fs/nfs_common/grace.c
+++ b/fs/nfs_common/grace.c
@@ -30,7 +30,11 @@  locks_start_grace(struct net *net, struct lock_manager *lm)
 	struct list_head *grace_list = net_generic(net, grace_net_id);
 
 	spin_lock(&grace_lock);
-	list_add(&lm->list, grace_list);
+	if (list_empty(&lm->list))
+		list_add(&lm->list, grace_list);
+	else
+		WARN(1, "double list_add attempt detected in net %x %s\n",
+		     net->ns.inum, (net == &init_net) ? "(init_net)" : "");
 	spin_unlock(&grace_lock);
 }
 EXPORT_SYMBOL_GPL(locks_start_grace);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7345143..b29b5a1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7103,6 +7103,10 @@  static int nfs4_state_create_net(struct net *net)
 		INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
 	nn->conf_name_tree = RB_ROOT;
 	nn->unconf_name_tree = RB_ROOT;
+	nn->boot_time = get_seconds();
+	nn->grace_ended = false;
+	nn->nfsd4_manager.block_opens = true;
+	INIT_LIST_HEAD(&nn->nfsd4_manager.list);
 	INIT_LIST_HEAD(&nn->client_lru);
 	INIT_LIST_HEAD(&nn->close_lru);
 	INIT_LIST_HEAD(&nn->del_recall_lru);
@@ -7160,9 +7164,6 @@  nfs4_state_start_net(struct net *net)
 	ret = nfs4_state_create_net(net);
 	if (ret)
 		return ret;
-	nn->boot_time = get_seconds();
-	nn->grace_ended = false;
-	nn->nfsd4_manager.block_opens = true;
 	locks_start_grace(net, &nn->nfsd4_manager);
 	nfsd4_client_tracking_init(net);
 	printk(KERN_INFO "NFSD: starting %ld-second grace period (net %x)\n",
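
Why the patch adds INIT_LIST_HEAD() for nfsd4_manager.list before locks_start_grace() can be shown in userspace. The sketch below assumes the per-net structure starts out zero-filled, which is modelled here with memset(). The helper names are hypothetical and the list functions are reduced re-implementations, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same test the guarded add relies on: the node must point at itself. */
static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* A zero-filled entry has next == NULL, which is not the entry itself. */
static bool zeroed_entry_passes_guard(void)
{
	struct list_head lm;

	memset(&lm, 0, sizeof(lm));	/* stand-in for a zeroed allocation */
	return list_empty(&lm);
}

/* An explicitly initialised entry is self-linked and passes the guard. */
static bool initialised_entry_passes_guard(void)
{
	struct list_head lm;

	memset(&lm, 0, sizeof(lm));
	init_list_head(&lm);
	return list_empty(&lm);
}

/* Bundles both checks so they can be run with a single call. */
static void check_init_requirement(void)
{
	assert(!zeroed_entry_passes_guard());
	assert(initialised_entry_passes_guard());
}
```

Under that assumption, a manager whose list was never initialised would never be added by the guarded locks_start_grace() and would trigger the warning on its first use.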