From patchwork Mon Jul 17 13:39:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 9845093
From: Waiman Long <longman@redhat.com>
To: Alexander Viro, Jonathan Corbet
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, "Paul E. McKenney", Andrew Morton,
    Ingo Molnar, Miklos Szeredi, Waiman Long
Subject: [PATCH 4/4] fs/dcache: Protect negative dentry pruning from racing with umount
Date: Mon, 17 Jul 2017 09:39:33 -0400
Message-Id: <1500298773-7510-5-git-send-email-longman@redhat.com>
In-Reply-To: <1500298773-7510-1-git-send-email-longman@redhat.com>
References: <1500298773-7510-1-git-send-email-longman@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Negative dentry pruning is done on a specific super_block set in the
ndblk.prune_sb variable. If that super_block is also being unmounted
concurrently, its contents may no longer be valid. To protect against
such a race condition, a new lock is added to the ndblk structure to
synchronize negative dentry pruning with the umount operation.
Signed-off-by: Waiman Long <longman@redhat.com>
---
 fs/dcache.c | 41 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 6c7d86f..babfa05 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -139,11 +139,13 @@ struct dentry_stat_t dentry_stat = {
 static long neg_dentry_nfree_init __read_mostly; /* Free pool initial value */
 static struct {
 	raw_spinlock_t nfree_lock;
+	raw_spinlock_t prune_lock;	/* Lock for protecting pruning */
 	long nfree;			/* Negative dentry free pool */
 	struct super_block *prune_sb;	/* Super_block for pruning */
 	int neg_count, prune_count;	/* Pruning counts */
 } ndblk ____cacheline_aligned_in_smp;
 
+static void clear_prune_sb_for_umount(struct super_block *sb);
 static void prune_negative_dentry(struct work_struct *work);
 static DECLARE_DELAYED_WORK(prune_neg_dentry_work, prune_negative_dentry);
 
@@ -1294,6 +1296,7 @@ void shrink_dcache_sb(struct super_block *sb)
 {
 	long freed;
 
+	clear_prune_sb_for_umount(sb);
 	do {
 		LIST_HEAD(dispose);
 
@@ -1324,7 +1327,8 @@ static enum lru_status dentry_negative_lru_isolate(struct list_head *item,
 	 * list.
 	 */
 	if ((ndblk.neg_count >= NEG_DENTRY_BATCH) ||
-	    (ndblk.prune_count >= NEG_DENTRY_BATCH)) {
+	    (ndblk.prune_count >= NEG_DENTRY_BATCH) ||
+	    !READ_ONCE(ndblk.prune_sb)) {
 		ndblk.prune_count = 0;
 		return LRU_STOP;
 	}
@@ -1375,15 +1379,24 @@ static enum lru_status dentry_negative_lru_isolate(struct list_head *item,
 static void prune_negative_dentry(struct work_struct *work)
 {
 	int freed;
-	struct super_block *sb = READ_ONCE(ndblk.prune_sb);
+	struct super_block *sb;
 	LIST_HEAD(dispose);
 
-	if (!sb)
+	/*
+	 * The prune_lock is used to protect negative dentry pruning from
+	 * racing with concurrent umount operation.
+	 */
+	raw_spin_lock(&ndblk.prune_lock);
+	sb = READ_ONCE(ndblk.prune_sb);
+	if (!sb) {
+		raw_spin_unlock(&ndblk.prune_lock);
 		return;
+	}
 
 	ndblk.neg_count = ndblk.prune_count = 0;
 	freed = list_lru_walk(&sb->s_dentry_lru, dentry_negative_lru_isolate,
 			      &dispose, NEG_DENTRY_BATCH);
+	raw_spin_unlock(&ndblk.prune_lock);
 
 	if (freed)
 		shrink_dentry_list(&dispose);
@@ -1400,6 +1413,27 @@ static void prune_negative_dentry(struct work_struct *work)
 	WRITE_ONCE(ndblk.prune_sb, NULL);
 }
 
+/*
+ * This is called before an umount to clear ndblk.prune_sb if it
+ * matches the given super_block.
+ */
+static void clear_prune_sb_for_umount(struct super_block *sb)
+{
+	if (likely(READ_ONCE(ndblk.prune_sb) != sb))
+		return;
+	WRITE_ONCE(ndblk.prune_sb, NULL);
+	/*
+	 * Need to wait until an ongoing pruning operation, if present,
+	 * is completed.
+	 *
+	 * Clearing ndblk.prune_sb will hasten the completion of pruning.
+	 * In the unlikely event that ndblk.prune_sb is set to another
+	 * super_block, the waiting will last the complete pruning operation
+	 * which shouldn't be that long either.
+	 */
+	raw_spin_unlock_wait(&ndblk.prune_lock);
+}
+
 /**
  * enum d_walk_ret - action to talke during tree walk
  * @D_WALK_CONTINUE: contrinue walk
@@ -1722,6 +1756,7 @@ void shrink_dcache_for_umount(struct super_block *sb)
 	WARN(down_read_trylock(&sb->s_umount),
 	     "s_umount should've been locked");
 
+	clear_prune_sb_for_umount(sb);
 	dentry = sb->s_root;
 	sb->s_root = NULL;
 	do_one_tree(dentry);