From patchwork Wed Jul 6 03:32:20 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: fangwei X-Patchwork-Id: 9215353 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D13A76048F for ; Wed, 6 Jul 2016 03:26:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C5A802860B for ; Wed, 6 Jul 2016 03:26:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B947C2860D; Wed, 6 Jul 2016 03:26:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4855D2860B for ; Wed, 6 Jul 2016 03:26:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751574AbcGFDZx (ORCPT ); Tue, 5 Jul 2016 23:25:53 -0400 Received: from szxga02-in.huawei.com ([119.145.14.65]:40349 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751271AbcGFDZx (ORCPT ); Tue, 5 Jul 2016 23:25:53 -0400 Received: from 172.24.1.137 (EHLO szxeml427-hub.china.huawei.com) ([172.24.1.137]) by szxrg02-dlp.huawei.com (MOS 4.3.7-GA FastPath queued) with ESMTP id DJV76176; Wed, 06 Jul 2016 11:25:33 +0800 (CST) Received: from localhost.localdomain (10.175.100.166) by szxeml427-hub.china.huawei.com (10.82.67.182) with Microsoft SMTP Server id 14.3.235.1; Wed, 6 Jul 2016 11:25:21 +0800 From: Wei Fang To: CC: , , , , , , Wei Fang , Subject: [PATCH v3] fs/dcache.c: avoid soft-lockup in dput() Date: Wed, 6 Jul 2016 11:32:20 +0800 Message-ID: <1467775940-16711-1-git-send-email-fangwei1@huawei.com> X-Mailer: git-send-email 1.7.1 MIME-Version: 1.0 X-Originating-IP: [10.175.100.166] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A090206.577C7A2F.000D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2013-06-18 04:22:30, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: eaf7b281455160a713c186f6a0696b20 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We triggered soft-lockup under stress test which open/access/write/close one file concurrently on more than five different CPUs: WARN: soft lockup - CPU#0 stuck for 11s! [who:30631] ... [] dput+0x100/0x298 [] terminate_walk+0x4c/0x60 [] path_lookupat+0x5cc/0x7a8 [] filename_lookup+0x38/0xf0 [] user_path_at_empty+0x78/0xd0 [] user_path_at+0x1c/0x28 [] SyS_faccessat+0xb4/0x230 ->d_lock trylock may failed many times because of concurrently operations, and dput() may execute a long time. Fix this by replacing cpu_relax() with cond_resched(). dput() used to be sleepable, so make it sleepable again should be safe. Cc: Signed-off-by: Wei Fang --- Changes v1->v2: - add might_sleep() to annotate that dput() can sleep Changes v2->v3: - put cond_resched() in dput() fs/dcache.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index ad4a542..77f6687 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -589,7 +589,6 @@ static struct dentry *dentry_kill(struct dentry *dentry) failed: spin_unlock(&dentry->d_lock); - cpu_relax(); return dentry; /* try again with same dentry */ } @@ -763,6 +762,8 @@ void dput(struct dentry *dentry) return; repeat: + might_sleep(); + rcu_read_lock(); if (likely(fast_dput(dentry))) { rcu_read_unlock(); @@ -796,8 +797,10 @@ repeat: kill_it: dentry = dentry_kill(dentry); - if (dentry) + if (dentry) { + cond_resched(); goto repeat; + } } EXPORT_SYMBOL(dput);