From patchwork Fri Oct 2 17:43:28 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 7318011 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 5A6F49F302 for ; Fri, 2 Oct 2015 17:43:51 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 546762081B for ; Fri, 2 Oct 2015 17:43:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 264C620804 for ; Fri, 2 Oct 2015 17:43:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753368AbbJBRnl (ORCPT ); Fri, 2 Oct 2015 13:43:41 -0400 Received: from mail.kernel.org ([198.145.29.136]:39823 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751679AbbJBRni (ORCPT ); Fri, 2 Oct 2015 13:43:38 -0400 Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id A281E20804; Fri, 2 Oct 2015 17:43:37 +0000 (UTC) Received: from debian3.lan (bl8-199-62.dsl.telepac.pt [85.241.199.62]) (using TLSv1.2 with cipher AES128-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 205042081C; Fri, 2 Oct 2015 17:43:35 +0000 (UTC) From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: jbacik@db.com, Filipe Manana Subject: [PATCH] Btrfs: fix deadlock when finalizing block group creation Date: Fri, 2 Oct 2015 18:43:28 +0100 Message-Id: <1443807808-26424-1-git-send-email-fdmanana@kernel.org> X-Mailer: git-send-email 2.1.3 X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Filipe Manana Josef ran into a deadlock while a transaction handle was finalizing the creation of its block groups, which produced the following trace: [260445.593112] fio D ffff88022a9df468 0 8924 4518 0x00000084 [260445.593119] ffff88022a9df468 ffffffff81c134c0 ffff880429693c00 ffff88022a9df488 [260445.593126] ffff88022a9e0000 ffff8803490d7b00 ffff8803490d7b18 ffff88022a9df4b0 [260445.593132] ffff8803490d7af8 ffff88022a9df488 ffffffff8175a437 ffff8803490d7b00 [260445.593137] Call Trace: [260445.593145] [] schedule+0x37/0x80 [260445.593189] [] btrfs_tree_lock+0xa7/0x1f0 [btrfs] [260445.593197] [] ? prepare_to_wait_event+0xf0/0xf0 [260445.593225] [] btrfs_lock_root_node+0x34/0x50 [btrfs] [260445.593253] [] btrfs_search_slot+0x88b/0xa00 [btrfs] [260445.593295] [] ? free_extent_buffer+0x4f/0x90 [btrfs] [260445.593324] [] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [260445.593351] [] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [260445.593394] [] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs] [260445.593427] [] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs] [260445.593459] [] do_chunk_alloc+0x2a4/0x2e0 [btrfs] [260445.593491] [] find_free_extent+0xa55/0xd90 [btrfs] [260445.593524] [] btrfs_reserve_extent+0xd2/0x220 [btrfs] [260445.593532] [] ? account_page_dirtied+0xdd/0x170 [260445.593564] [] btrfs_alloc_tree_block+0x108/0x4a0 [btrfs] [260445.593597] [] ? btree_set_page_dirty+0xe/0x10 [btrfs] [260445.593626] [] __btrfs_cow_block+0x12d/0x5b0 [btrfs] [260445.593654] [] btrfs_cow_block+0x11f/0x1c0 [btrfs] [260445.593682] [] btrfs_search_slot+0x1e7/0xa00 [btrfs] [260445.593724] [] ? free_extent_buffer+0x4f/0x90 [btrfs] [260445.593752] [] btrfs_insert_empty_items+0x66/0xc0 [btrfs] [260445.593830] [] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [260445.593905] [] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs] [260445.593946] [] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs] [260445.593990] [] btrfs_commit_transaction+0xa8/0xb40 [btrfs] [260445.594042] [] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs] [260445.594089] [] btrfs_sync_file+0x294/0x350 [btrfs] [260445.594115] [] vfs_fsync_range+0x3b/0xa0 [260445.594133] [] ? syscall_trace_enter_phase1+0x131/0x180 [260445.594149] [] do_fsync+0x3d/0x70 [260445.594169] [] ? syscall_trace_leave+0xb8/0x110 [260445.594187] [] SyS_fsync+0x10/0x20 [260445.594204] [] entry_SYSCALL_64_fastpath+0x12/0x71 This happened because the same transaction handle created a large number of block groups and while finalizing their creation (inserting new items and updating existing items in the chunk and device trees) a new metadata extent had to be allocated and no free space was found in the current metadata block groups, which made find_free_extent() attempt to allocate a new block group via do_chunk_alloc(). However at do_chunk_alloc() we ended up allocating a new system chunk too and exceeded the threshold of 2Mb of reserved chunk bytes, which makes do_chunk_alloc() enter the final part of block group creation again (at btrfs_create_pending_block_groups()) and attempt to lock again the root of the chunk tree when it's already write locked by the same task. Fix this by never recursing into the finalization phase of block group creation. Reported-by: Josef Bacik Fixes: 00d80e342c0f ("Btrfs: fix quick exhaustion of the system array in the superblock") Signed-off-by: Filipe Manana --- fs/btrfs/extent-tree.c | 5 ++++- fs/btrfs/transaction.c | 1 + fs/btrfs/transaction.h | 1 + 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 9f96042..358453d 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4306,7 +4306,8 @@ out: * the block groups that were made dirty during the lifetime of the * transaction. */ - if (trans->chunk_bytes_reserved >= (2 * 1024 * 1024ull)) { + if (trans->chunk_bytes_reserved >= (2 * 1024 * 1024ull) && + !trans->creating_pending_bgs) { btrfs_create_pending_block_groups(trans, trans->root); btrfs_trans_release_chunk_metadata(trans); } @@ -9561,6 +9562,7 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans, struct btrfs_key key; int ret = 0; + trans->creating_pending_bgs = true; list_for_each_entry_safe(block_group, tmp, &trans->new_bgs, bg_list) { if (ret) goto next; @@ -9581,6 +9583,7 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans, next: list_del_init(&block_group->bg_list); } + trans->creating_pending_bgs = false; } int btrfs_make_block_group(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index a2d6f7b..60544d9 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -557,6 +557,7 @@ again: h->delayed_ref_elem.seq = 0; h->type = type; h->allocating_chunk = false; + h->creating_pending_bgs = false; h->reloc_reserved = false; h->sync = false; INIT_LIST_HEAD(&h->qgroup_ref_list); diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h index 87964bf..ce86bb0 100644 --- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -118,6 +118,7 @@ struct btrfs_trans_handle { short aborted; short adding_csums; bool allocating_chunk; + bool creating_pending_bgs; bool reloc_reserved; bool sync; unsigned int type;