
[v2] fs/dcache.c: avoid soft-lockup in dput()

Message ID 1466564475-30417-1-git-send-email-fangwei1@huawei.com (mailing list archive)
State New, archived

Commit Message

fangwei June 22, 2016, 3:01 a.m. UTC
We triggered a soft lockup under a stress test that
opens/accesses/writes/closes one file concurrently on more than
five different CPUs:

WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
...
[<ffffffc0003986f8>] dput+0x100/0x298
[<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
[<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
[<ffffffc00038f780>] filename_lookup+0x38/0xf0
[<ffffffc000391180>] user_path_at_empty+0x78/0xd0
[<ffffffc0003911f4>] user_path_at+0x1c/0x28
[<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230

The ->d_lock trylock may fail many times because of concurrent
operations, so dput() may run for a long time.

Fix this by replacing cpu_relax() with cond_resched().
dput() used to be sleepable, so making it sleepable again
should be safe.

Cc: <stable@vger.kernel.org>
Signed-off-by: Wei Fang <fangwei1@huawei.com>
---
Changes v1->v2:
- add might_sleep() to annotate that dput() can sleep

 fs/dcache.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

Comments

Boqun Feng June 22, 2016, 6:51 a.m. UTC | #1
Hi Wei Fang,

On Wed, Jun 22, 2016 at 11:01:15AM +0800, Wei Fang wrote:
> We triggered a soft lockup under a stress test that
> opens/accesses/writes/closes one file concurrently on more than
> five different CPUs:
> 
> WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
> ...
> [<ffffffc0003986f8>] dput+0x100/0x298
> [<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
> [<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
> [<ffffffc00038f780>] filename_lookup+0x38/0xf0
> [<ffffffc000391180>] user_path_at_empty+0x78/0xd0
> [<ffffffc0003911f4>] user_path_at+0x1c/0x28
> [<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230
> 
> The ->d_lock trylock may fail many times because of concurrent
> operations, so dput() may run for a long time.
> 
> Fix this by replacing cpu_relax() with cond_resched().
> dput() used to be sleepable, so making it sleepable again
> should be safe.
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Wei Fang <fangwei1@huawei.com>
> ---
> Changes v1->v2:
> - add might_sleep() to annotate that dput() can sleep
> 
>  fs/dcache.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index d5ecc6e..074fc1c 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)
>  
>  failed:
>  	spin_unlock(&dentry->d_lock);
> -	cpu_relax();
> +	cond_resched();

Would it be better to put the cond_resched() in the caller (i.e. dput()),
right before "goto repeat"? That is obviously a loop, which makes the
purpose of the cond_resched() more straightforward.

Regards,
Boqun

>  	return dentry; /* try again with same dentry */
>  }
>  
> @@ -752,6 +752,8 @@ void dput(struct dentry *dentry)
>  		return;
>  
>  repeat:
> +	might_sleep();
> +
>  	rcu_read_lock();
>  	if (likely(fast_dput(dentry))) {
>  		rcu_read_unlock();
> -- 
> 1.7.1
>
fangwei July 6, 2016, 2:36 a.m. UTC | #2
Hi, Boqun,

>> diff --git a/fs/dcache.c b/fs/dcache.c
>> index d5ecc6e..074fc1c 100644
>> --- a/fs/dcache.c
>> +++ b/fs/dcache.c
>> @@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)
>>  
>>  failed:
>>  	spin_unlock(&dentry->d_lock);
>> -	cpu_relax();
>> +	cond_resched();
> 
> Would it be better to put the cond_resched() in the caller (i.e. dput()),
> right before "goto repeat"? That is obviously a loop, which makes the
> purpose of the cond_resched() more straightforward.

Agreed, that's more reasonable. I'll send v3 soon.

Thanks,
Wei

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vaishali Thakkar Sept. 16, 2016, 7:49 a.m. UTC | #3
On Wednesday 22 June 2016 08:31 AM, Wei Fang wrote:
> We triggered a soft lockup under a stress test that
> opens/accesses/writes/closes one file concurrently on more than
> five different CPUs:
> 
> WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
> ...
> [<ffffffc0003986f8>] dput+0x100/0x298
> [<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
> [<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
> [<ffffffc00038f780>] filename_lookup+0x38/0xf0
> [<ffffffc000391180>] user_path_at_empty+0x78/0xd0
> [<ffffffc0003911f4>] user_path_at+0x1c/0x28
> [<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230
> 
> The ->d_lock trylock may fail many times because of concurrent
> operations, so dput() may run for a long time.
> 
> Fix this by replacing cpu_relax() with cond_resched().
> dput() used to be sleepable, so making it sleepable again
> should be safe.

Hi,

Just a question regarding this change. Since dput() is sleepable
after this change, is it still safe to use it under the
spinlock in the function d_prune_aliases()?

Thanks

> Cc: <stable@vger.kernel.org>
> Signed-off-by: Wei Fang <fangwei1@huawei.com>
> ---
> Changes v1->v2:
> - add might_sleep() to annotate that dput() can sleep
> 
>  fs/dcache.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index d5ecc6e..074fc1c 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -578,7 +578,7 @@ static struct dentry *dentry_kill(struct dentry *dentry)
>  
>  failed:
>  	spin_unlock(&dentry->d_lock);
> -	cpu_relax();
> +	cond_resched();
>  	return dentry; /* try again with same dentry */
>  }
>  
> @@ -752,6 +752,8 @@ void dput(struct dentry *dentry)
>  		return;
>  
>  repeat:
> +	might_sleep();
> +
>  	rcu_read_lock();
>  	if (likely(fast_dput(dentry))) {
>  		rcu_read_unlock();
>
Al Viro Sept. 16, 2016, 12:10 p.m. UTC | #4
On Fri, Sep 16, 2016 at 01:19:19PM +0530, Vaishali Thakkar wrote:

> Hi,
> 
> Just a question regarding this change. Since dput() is sleepable
> after this change, is it still safe to use it under the
> spinlock in the function d_prune_aliases()?

It has always been sleepable and it wouldn't have been safe to use
under spinlocks.  Which d_prune_aliases() does not do - __dentry_kill()
is called with dentry, its parent and its inode (if present) all locked and
it drops all those locks before returning.
Vaishali Thakkar Sept. 16, 2016, 12:50 p.m. UTC | #5
On Friday 16 September 2016 05:40 PM, Al Viro wrote:
> On Fri, Sep 16, 2016 at 01:19:19PM +0530, Vaishali Thakkar wrote:
> 
>> Hi,
>>
>> Just a question regarding this change. Since dput() is sleepable
>> after this change, is it still safe to use it under the
>> spinlock in the function d_prune_aliases()?
> 
> It has always been sleepable and it wouldn't have been safe to use
> under spinlocks.  Which d_prune_aliases() does not do - __dentry_kill()
> is called with dentry, its parent and its inode (if present) all locked and
> it drops all those locks before returning.

Ah, I see. Alright. Thanks for the clarification.
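
The locking contract described in comment #4 (a helper that is entered with
locks held and releases all of them before returning, so that the following
put runs with no lock held) can be illustrated with a small userspace
analogy. This is only a sketch under stated assumptions: it uses pthread
mutexes instead of kernel spinlocks, and struct obj, obj_kill_locked(),
obj_put() and obj_prune() are hypothetical names that do not exist in
fs/dcache.c.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry; not taken from fs/dcache.c. */
struct obj {
	pthread_mutex_t lock;
	struct obj *parent;
	int refcount;
};

/*
 * Contract: called with o->lock and, if there is a parent, the parent's
 * lock held. Releases every lock it was handed before returning.
 */
static void obj_kill_locked(struct obj *o)
{
	struct obj *parent = o->parent;

	o->refcount = -1;	/* mark the object as dead */
	pthread_mutex_unlock(&o->lock);
	if (parent)
		pthread_mutex_unlock(&parent->lock);
}

/*
 * Drops one reference. It may yield the CPU and takes o->lock itself,
 * so it must only be called with no lock held.
 */
static void obj_put(struct obj *o)
{
	sched_yield();
	pthread_mutex_lock(&o->lock);
	o->refcount--;
	pthread_mutex_unlock(&o->lock);
}

/* Caller: takes the locks, kills the child, then puts the parent. */
static void obj_prune(struct obj *child)
{
	struct obj *parent = child->parent;

	if (parent)
		pthread_mutex_lock(&parent->lock);
	pthread_mutex_lock(&child->lock);

	obj_kill_locked(child);	/* returns with both locks released */

	if (parent)
		obj_put(parent);	/* no lock is held at this point */
}
```

If obj_put() were called while the parent's lock was still held, it would
block on that same lock in this analogy, which is why the helper has to
release everything first.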

>

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index d5ecc6e..074fc1c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -578,7 +578,7 @@  static struct dentry *dentry_kill(struct dentry *dentry)
 
 failed:
 	spin_unlock(&dentry->d_lock);
-	cpu_relax();
+	cond_resched();
 	return dentry; /* try again with same dentry */
 }
 
@@ -752,6 +752,8 @@  void dput(struct dentry *dentry)
 		return;
 
 repeat:
+	might_sleep();
+
 	rcu_read_lock();
 	if (likely(fast_dput(dentry))) {
 		rcu_read_unlock();
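
The review suggestion in comment #1 (give up the CPU in the caller's retry
loop rather than in the helper's failure path) can be sketched in userspace.
This is an analogy under stated assumptions, not the v3 patch, which does not
appear in this thread: pthread_mutex_trylock() stands in for the kernel
trylock, sched_yield() stands in for cond_resched(), and struct node,
node_put_once() and node_put() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry: a lock plus a reference count. */
struct node {
	pthread_mutex_t lock;
	int refcount;
};

/*
 * Try once to drop a reference. Returns true on success, or false if the
 * lock was contended and the caller should retry. This loosely mirrors
 * dentry_kill() handing the same dentry back on its "failed" path.
 */
static bool node_put_once(struct node *n)
{
	if (pthread_mutex_trylock(&n->lock) != 0)
		return false;
	n->refcount--;
	pthread_mutex_unlock(&n->lock);
	return true;
}

/*
 * Retry loop with the yield placed in the caller, next to the loop it
 * belongs to. Returns the number of failed attempts.
 */
static unsigned long node_put(struct node *n)
{
	unsigned long retries = 0;

	while (!node_put_once(n)) {
		retries++;
		sched_yield();	/* let the lock holder make progress */
	}
	return retries;
}
```

With no contention node_put() succeeds on the first attempt and returns 0.
Under contention each failed attempt yields the CPU instead of spinning,
which is the behaviour the patch aims for in dput().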