From patchwork Mon Aug 1 05:28:08 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiaoguang Wang X-Patchwork-Id: 9253747 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 066236077C for ; Mon, 1 Aug 2016 05:31:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E851C2848B for ; Mon, 1 Aug 2016 05:31:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DBEA828473; Mon, 1 Aug 2016 05:31:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 576B328473 for ; Mon, 1 Aug 2016 05:31:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751005AbcHAFbc (ORCPT ); Mon, 1 Aug 2016 01:31:32 -0400 Received: from cn.fujitsu.com ([222.73.24.84]:4196 "EHLO song.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1750897AbcHAFba (ORCPT ); Mon, 1 Aug 2016 01:31:30 -0400 X-IronPort-AV: E=Sophos;i="5.20,367,1444665600"; d="scan'208";a="758434" Received: from unknown (HELO cn.fujitsu.com) ([10.167.250.3]) by song.cn.fujitsu.com with ESMTP; 01 Aug 2016 13:31:15 +0800 Received: from localhost.localdomain (unknown [10.167.226.107]) by cn.fujitsu.com (Postfix) with ESMTP id 74E00405640A; Mon, 1 Aug 2016 13:31:13 +0800 (CST) From: Wang Xiaoguang To: linux-btrfs@vger.kernel.org Cc: dsterba@suse.cz Subject: [PATCH v4] btrfs: fix fsfreeze hang caused by delayed iputs deal Date: Mon, 1 Aug 2016 13:28:08 +0800 Message-Id: <20160801052808.7331-1-wangxg.fnst@cn.fujitsu.com> X-Mailer: git-send-email 2.9.0 MIME-Version: 1.0 X-yoursite-MailScanner-ID: 74E00405640A.A19A0 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: wangxg.fnst@cn.fujitsu.com Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When running fstests generic/068, sometimes we got below deadlock: xfs_io D ffff8800331dbb20 0 6697 6693 0x00000080 ffff8800331dbb20 ffff88007acfc140 ffff880034d895c0 ffff8800331dc000 ffff880032d243e8 fffffffeffffffff ffff880032d24400 0000000000000001 ffff8800331dbb38 ffffffff816a9045 ffff880034d895c0 ffff8800331dbba8 Call Trace: [] schedule+0x35/0x80 [] rwsem_down_read_failed+0xf2/0x140 [] ? __filemap_fdatawrite_range+0xd1/0x100 [] call_rwsem_down_read_failed+0x18/0x30 [] ? btrfs_alloc_block_rsv+0x2c/0xb0 [btrfs] [] percpu_down_read+0x35/0x50 [] __sb_start_write+0x2c/0x40 [] start_transaction+0x2a5/0x4d0 [btrfs] [] btrfs_join_transaction+0x17/0x20 [btrfs] [] btrfs_evict_inode+0x3c4/0x5d0 [btrfs] [] evict+0xba/0x1a0 [] iput+0x196/0x200 [] btrfs_run_delayed_iputs+0x70/0xc0 [btrfs] [] btrfs_commit_transaction+0x928/0xa80 [btrfs] [] btrfs_freeze+0x30/0x40 [btrfs] [] freeze_super+0xf0/0x190 [] do_vfs_ioctl+0x4a5/0x5c0 [] ? do_audit_syscall_entry+0x66/0x70 [] ? syscall_trace_enter_phase1+0x11f/0x140 [] SyS_ioctl+0x79/0x90 [] do_syscall_64+0x62/0x110 [] entry_SYSCALL64_slow_path+0x25/0x25 From this warning, freeze_super() already holds SB_FREEZE_FS, but btrfs_freeze() will call btrfs_commit_transaction() again, if btrfs_commit_transaction() finds that it has delayed iputs to handle, it'll start_transaction(), which will try to get SB_FREEZE_FS lock again, then deadlock occurs. The root cause is that in btrfs, sync_filesystem(sb) does not make sure all metadata is updated. There still maybe some codes adding delayed iputs, see below sample race window: CPU1 | CPU2 |-> freeze_super() | |-> sync_filesystem(sb); | | |-> cleaner_kthread() | | |-> btrfs_delete_unused_bgs() | | |-> btrfs_remove_chunk() | | |-> btrfs_remove_block_group() | | |-> btrfs_add_delayed_iput() | | |-> sb->s_writers.frozen = SB_FREEZE_FS; | |-> sb_wait_write(sb, SB_FREEZE_FS); | | acquire SB_FREEZE_FS lock. | | | |-> btrfs_freeze() | |-> btrfs_commit_transaction() | |-> btrfs_run_delayed_iputs() | | will handle delayed iputs, | | that means start_transaction() | | will be called, which will try | | to get SB_FREEZE_FS lock. | To fix this issue, introduce a "int fs_frozen" to record internally whether fs has been frozen. If fs has been frozen, we can not handle delayed iputs. Signed-off-by: Wang Xiaoguang Reviewed-by: David Sterba --- v3: we introduce a atomic_t fs_frozen, and if fs_frozen is 1, we can not handle delayed iputs. v4: change atomic_t fs_frozen to be int. --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/disk-io.c | 1 + fs/btrfs/super.c | 10 ++++++++++ fs/btrfs/transaction.c | 7 ++++++- 4 files changed, 19 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 90041a2..3f241d5 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1091,6 +1091,8 @@ struct btrfs_fs_info { struct list_head pinned_chunks; int creating_free_space_tree; + /* Used to record internally whether fs has been frozen */ + int fs_frozen; }; struct btrfs_subvolume_writers { diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 86cad9a..1d26a51 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2621,6 +2621,7 @@ int open_ctree(struct super_block *sb, atomic_set(&fs_info->qgroup_op_seq, 0); atomic_set(&fs_info->reada_works_cnt, 0); atomic64_set(&fs_info->tree_mod_seq, 0); + fs_info->fs_frozen = 0; fs_info->sb = sb; fs_info->max_inline = BTRFS_DEFAULT_MAX_INLINE; fs_info->metadata_ratio = 0; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 60e7179..6b73cad 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2216,6 +2216,7 @@ static int btrfs_freeze(struct super_block *sb) struct btrfs_trans_handle *trans; struct btrfs_root *root = btrfs_sb(sb)->tree_root; + root->fs_info->fs_frozen = 1; trans = btrfs_attach_transaction_barrier(root); if (IS_ERR(trans)) { /* no transaction, don't bother */ @@ -2226,6 +2227,14 @@ static int btrfs_freeze(struct super_block *sb) return btrfs_commit_transaction(trans, root); } +static int btrfs_unfreeze(struct super_block *sb) +{ + struct btrfs_root *root = btrfs_sb(sb)->tree_root; + + root->fs_info->fs_frozen = 0; + return 0; +} + static int btrfs_show_devname(struct seq_file *m, struct dentry *root) { struct btrfs_fs_info *fs_info = btrfs_sb(root->d_sb); @@ -2274,6 +2283,7 @@ static const struct super_operations btrfs_super_ops = { .statfs = btrfs_statfs, .remount_fs = btrfs_remount, .freeze_fs = btrfs_freeze, + .unfreeze_fs = btrfs_unfreeze, }; static const struct file_operations btrfs_ctl_fops = { diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 948aa18..477325d 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -2277,8 +2277,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans, kmem_cache_free(btrfs_trans_handle_cachep, trans); + /* + * If fs has been frozen, we can not handle delayed iputs, otherwise + * it'll result in deadlock about SB_FREEZE_FS. + */ if (current != root->fs_info->transaction_kthread && - current != root->fs_info->cleaner_kthread) + current != root->fs_info->cleaner_kthread && + !root->fs_info->fs_frozen) btrfs_run_delayed_iputs(root); return ret;