From patchwork Wed May 20 13:01:55 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 6445681 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id AD7C59F1C1 for ; Wed, 20 May 2015 13:04:53 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 8DD5F20373 for ; Wed, 20 May 2015 13:04:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5563920295 for ; Wed, 20 May 2015 13:04:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753362AbbETNEq (ORCPT ); Wed, 20 May 2015 09:04:46 -0400 Received: from victor.provo.novell.com ([137.65.250.26]:56690 "EHLO prv3-mh.provo.novell.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751976AbbETNEn (ORCPT ); Wed, 20 May 2015 09:04:43 -0400 Received: from debian3.lan (prv-ext-foundry1int.gns.novell.com [137.65.251.240]) by prv3-mh.provo.novell.com with ESMTP (NOT encrypted); Wed, 20 May 2015 07:04:35 -0600 From: Filipe Manana To: linux-btrfs@vger.kernel.org Cc: Filipe Manana Subject: [PATCH 2/2] Btrfs: fix -ENOSPC on block group removal Date: Wed, 20 May 2015 14:01:55 +0100 Message-Id: <1432126915-30733-2-git-send-email-fdmanana@suse.com> X-Mailer: git-send-email 2.1.3 In-Reply-To: <1432126915-30733-1-git-send-email-fdmanana@suse.com> References: <1432126915-30733-1-git-send-email-fdmanana@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Unlike when attempting to allocate a new block group, where we check that we have enough space in the system space_info to update the device items and insert a new chunk item in the chunk tree, we were not checking if the system space_info had enough space for updating the device items and deleting the chunk item in the chunk tree. This often lead to -ENOSPC error when attempting to allocate blocks for the chunk tree (during btree node/leaf COW operations) while updating the device items or deleting the chunk item, which resulted in the current transaction being aborted and turning the filesystem into read-only mode. While running fstests generic/038, which stresses allocation of block groups and removal of unused block groups, with a large scratch device (750Gb) this happened often, despite more than enough unallocated space, and resulted in the following trace: [68663.586604] WARNING: CPU: 3 PID: 1521 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x52/0x114 [btrfs]() [68663.600407] BTRFS: Transaction aborted (error -28) (...) [68663.730829] Call Trace: [68663.732585] [] dump_stack+0x4f/0x7b [68663.734334] [] ? console_unlock+0x361/0x3ad [68663.739980] [] warn_slowpath_common+0xa1/0xbb [68663.757153] [] ? __btrfs_abort_transaction+0x52/0x114 [btrfs] [68663.760925] [] warn_slowpath_fmt+0x46/0x48 [68663.762854] [] ? btrfs_update_device+0x15a/0x16c [btrfs] [68663.764073] [] __btrfs_abort_transaction+0x52/0x114 [btrfs] [68663.765130] [] btrfs_remove_chunk+0x597/0x5ee [btrfs] [68663.765998] [] ? btrfs_delete_unused_bgs+0x245/0x296 [btrfs] [68663.767068] [] btrfs_delete_unused_bgs+0x258/0x296 [btrfs] [68663.768227] [] ? _raw_spin_unlock_irq+0x2d/0x4c [68663.769081] [] cleaner_kthread+0x13d/0x16c [btrfs] [68663.799485] [] ? btrfs_alloc_root+0x28/0x28 [btrfs] [68663.809208] [] kthread+0xef/0xf7 [68663.828795] [] ? time_hardirqs_on+0x15/0x28 [68663.844942] [] ? __kthread_parkme+0xad/0xad [68663.846486] [] ret_from_fork+0x58/0x90 [68663.847760] [] ? __kthread_parkme+0xad/0xad [68663.849503] ---[ end trace 798477c6d6dbaad6 ]--- [68663.850525] BTRFS: error (device sdc) in btrfs_remove_chunk:2652: errno=-28 No space left So fix this by verifying that enough space exists in system space_info, and reserving the space in the chunk block reserve, before attempting to delete the block group and allocate a new system chunk if we don't have enough space to perform the necessary updates and delete in the chunk tree. Like for the block group creation case, we don't error our if we fail to allocate a new system chunk, since we might end up not needing it (no node/leaf splits happen during the COW operations and/or we end up not needing to COW any btree nodes or leafs because they were already COWed in the current transaction and their writeback didn't start yet). Signed-off-by: Filipe Manana --- Applies on top of the recent patch titled: "Btrfs: fix racy system chunk allocation when setting block group ro" fs/btrfs/ctree.h | 4 ++++ fs/btrfs/extent-tree.c | 31 +++++++++++++++++++++++-------- fs/btrfs/volumes.c | 3 +++ 3 files changed, 30 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 5ecafcd..70b79d0 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3516,6 +3516,10 @@ int btrfs_delayed_refs_qgroup_accounting(struct btrfs_trans_handle *trans, int __get_raid_index(u64 flags); int btrfs_start_write_no_snapshoting(struct btrfs_root *root); void btrfs_end_write_no_snapshoting(struct btrfs_root *root); +void check_system_chunk(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + const u64 type, + const bool is_allocation); /* ctree.c */ int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key, int level, int *slot); diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 3d07b4d..b76fd95 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4092,7 +4092,7 @@ static int should_alloc_chunk(struct btrfs_root *root, return 1; } -static u64 get_system_chunk_thresh(struct btrfs_root *root, u64 type) +static u64 get_profile_num_devs(struct btrfs_root *root, u64 type) { u64 num_dev; @@ -4106,17 +4106,24 @@ static u64 get_system_chunk_thresh(struct btrfs_root *root, u64 type) else num_dev = 1; /* DUP or single */ - /* metadata for updaing devices and chunk tree */ - return btrfs_calc_trans_metadata_size(root, num_dev + 1); + return num_dev; } -static void check_system_chunk(struct btrfs_trans_handle *trans, - struct btrfs_root *root, u64 type) +/* + * If @is_allocation is true, reserve space in the system space info necessary + * for allocating a chunk, otherwise if it's false, reserve space necessary for + * removing a chunk. + */ +void check_system_chunk(struct btrfs_trans_handle *trans, + struct btrfs_root *root, + u64 type, + const bool is_allocation) { struct btrfs_space_info *info; u64 left; u64 thresh; int ret = 0; + u64 num_devs; /* * Needed because we can end up allocating a system chunk and for an @@ -4131,7 +4138,15 @@ static void check_system_chunk(struct btrfs_trans_handle *trans, info->bytes_may_use; spin_unlock(&info->lock); - thresh = get_system_chunk_thresh(root, type); + num_devs = get_profile_num_devs(root, type); + + /* num_devs device items to update and 1 chunk item to add or remove */ + if (is_allocation) + thresh = btrfs_calc_trans_metadata_size(root, num_devs + 1); + else + thresh = btrfs_calc_trans_metadata_size(root, num_devs) + + btrfs_calc_trunc_metadata_size(root, 1); + if (left < thresh && btrfs_test_opt(root, ENOSPC_DEBUG)) { btrfs_info(root->fs_info, "left=%llu, need=%llu, flags=%llu", left, thresh, type); @@ -4243,7 +4258,7 @@ again: * Check if we have enough space in SYSTEM chunk because we may need * to update devices. */ - check_system_chunk(trans, extent_root, flags); + check_system_chunk(trans, extent_root, flags, true); ret = btrfs_alloc_chunk(trans, extent_root, flags); trans->allocating_chunk = false; @@ -8887,7 +8902,7 @@ out: if (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM) { alloc_flags = update_block_group_flags(root, cache->flags); lock_chunks(root->fs_info->chunk_root); - check_system_chunk(trans, root, alloc_flags); + check_system_chunk(trans, root, alloc_flags, true); unlock_chunks(root->fs_info->chunk_root); } mutex_unlock(&root->fs_info->ro_block_group_mutex); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index d2b276c..dbea12e 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2625,6 +2625,9 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans, return -EINVAL; } map = (struct map_lookup *)em->bdev; + lock_chunks(root->fs_info->chunk_root); + check_system_chunk(trans, extent_root, map->type, false); + unlock_chunks(root->fs_info->chunk_root); for (i = 0; i < map->num_stripes; i++) { struct btrfs_device *device = map->stripes[i].dev;