From patchwork Wed Aug 1 02:37:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551613 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EE0F314E0 for ; Wed, 1 Aug 2018 02:37:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE6F12B40F for ; Wed, 1 Aug 2018 02:37:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D2F132B57B; Wed, 1 Aug 2018 02:37:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 55F192B40F for ; Wed, 1 Aug 2018 02:37:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726149AbeHAEUt (ORCPT ); Wed, 1 Aug 2018 00:20:49 -0400 Received: from mx2.suse.de ([195.135.220.15]:51250 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEUs (ORCPT ); Wed, 1 Aug 2018 00:20:48 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A5FE1ACD7 for ; Wed, 1 Aug 2018 02:37:32 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 1/6] btrfs: Check each block group has corresponding chunk at mount time Date: Wed, 1 Aug 2018 10:37:16 +0800 Message-Id: <20180801023721.32143-2-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A crafted btrfs with incorrect chunk<->block group mapping, it could leads to a lot of unexpected behavior. Although the crafted image can be catched by block group item checker added in "[PATCH] btrfs: tree-checker: Verify block_group_item", if one crafted a valid enough block group item which can pass above check but still mismatch with existing chunk, it could cause a lot of undefined behavior. This patch will add extra block group -> chunk mapping check, to ensure we have a completely matching (start, len, flags) chunk for each block group at mount time. Here we reuse the original find_first_block_group(), which is already doing basic bg -> chunk check, adding more check on start/len and type flags. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199837 Reported-by: Xu Wen Signed-off-by: Qu Wenruo Reviewed-by: Su Yue --- fs/btrfs/extent-tree.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 3d9fe58c0080..63a6b5d36ac1 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9717,6 +9717,8 @@ static int find_first_block_group(struct btrfs_fs_info *fs_info, int ret = 0; struct btrfs_key found_key; struct extent_buffer *leaf; + struct btrfs_block_group_item bg; + u64 flags; int slot; ret = btrfs_search_slot(NULL, root, key, path, 0, 0); @@ -9751,8 +9753,33 @@ static int find_first_block_group(struct btrfs_fs_info *fs_info, "logical %llu len %llu found bg but no related chunk", found_key.objectid, found_key.offset); ret = -ENOENT; + } else if (em->start != found_key.objectid || + em->len != found_key.offset) { + btrfs_err(fs_info, + "block group %llu len %llu mismatch with chunk %llu len %llu", + found_key.objectid, found_key.offset, + em->start, em->len); + ret = -EUCLEAN; } else { - ret = 0; + read_extent_buffer(leaf, &bg, + btrfs_item_ptr_offset(leaf, slot), + sizeof(bg)); + flags = btrfs_block_group_flags(&bg) & + BTRFS_BLOCK_GROUP_TYPE_MASK; + + if (flags != (em->map_lookup->type & + BTRFS_BLOCK_GROUP_TYPE_MASK)) { + btrfs_err(fs_info, +"block group %llu len %llu type flags 0x%llx mismatch with chunk type flags 0x%llx", + found_key.objectid, + found_key.offset, + flags, + (BTRFS_BLOCK_GROUP_TYPE_MASK & + em->map_lookup->type)); + ret = -EUCLEAN; + } else { + ret = 0; + } } free_extent_map(em); goto out; From patchwork Wed Aug 1 02:37:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551615 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 21B3A14E0 for ; Wed, 1 Aug 2018 02:37:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1121B2B40F for ; Wed, 1 Aug 2018 02:37:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 022282B57B; Wed, 1 Aug 2018 02:37:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 848722B40F for ; Wed, 1 Aug 2018 02:37:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726204AbeHAEUz (ORCPT ); Wed, 1 Aug 2018 00:20:55 -0400 Received: from mx2.suse.de ([195.135.220.15]:51258 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEUz (ORCPT ); Wed, 1 Aug 2018 00:20:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id E4847ACD7 for ; Wed, 1 Aug 2018 02:37:38 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 2/6] btrfs: Verify every chunk has corresponding block group at mount time Date: Wed, 1 Aug 2018 10:37:17 +0800 Message-Id: <20180801023721.32143-3-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If a crafted btrfs has missing block group items, it could cause unexpected behavior and breaks our expectation on 1:1 chunk<->block group mapping. Although we added block group -> chunk mapping check, we still need chunk -> block group mapping check. This patch will do extra check to ensure each chunk has its corresponding block group. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199847 Reported-by: Xu Wen Signed-off-by: Qu Wenruo Reviewed-by: Gu Jinxiang --- fs/btrfs/extent-tree.c | 57 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 63a6b5d36ac1..3578fa5b30ef 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -10030,6 +10030,61 @@ btrfs_create_block_group_cache(struct btrfs_fs_info *fs_info, return cache; } + +/* + * Iterate all chunks and verify each of them has corresponding block group + */ +static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info) +{ + struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree; + struct extent_map *em; + struct btrfs_block_group_cache *bg; + u64 start = 0; + int ret = 0; + + while (1) { + read_lock(&map_tree->map_tree.lock); + /* + * lookup_extent_mapping will return the first extent map + * intersects the range, so set @len to 1 is enough to get + * the first chunk. + */ + em = lookup_extent_mapping(&map_tree->map_tree, start, 1); + read_unlock(&map_tree->map_tree.lock); + if (!em) + break; + + bg = btrfs_lookup_block_group(fs_info, em->start); + if (!bg) { + btrfs_err(fs_info, + "chunk start=%llu len=%llu doesn't have corresponding block group", + em->start, em->len); + ret = -EUCLEAN; + free_extent_map(em); + break; + } + if (bg->key.objectid != em->start || + bg->key.offset != em->len || + (bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK) != + (em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK)) { + btrfs_err(fs_info, +"chunk start=%llu len=%llu flags=0x%llx doesn't match with block group start=%llu len=%llu flags=0x%llx", + em->start, em->len, + em->map_lookup->type & BTRFS_BLOCK_GROUP_TYPE_MASK, + bg->key.objectid, bg->key.offset, + bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK); + ret = -EUCLEAN; + free_extent_map(em); + btrfs_put_block_group(bg); + break; + } + start = em->start + em->len; + free_extent_map(em); + btrfs_put_block_group(bg); + } + return ret; +} + int btrfs_read_block_groups(struct btrfs_fs_info *info) { struct btrfs_path *path; @@ -10203,7 +10258,7 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info) btrfs_add_raid_kobjects(info); init_global_block_rsv(info); - ret = 0; + ret = check_chunk_block_group_mappings(info); error: btrfs_free_path(path); return ret; From patchwork Wed Aug 1 02:37:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551617 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 662541708 for ; Wed, 1 Aug 2018 02:37:46 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 53173286F3 for ; Wed, 1 Aug 2018 02:37:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4535228DB1; Wed, 1 Aug 2018 02:37:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC73428595 for ; Wed, 1 Aug 2018 02:37:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726275AbeHAEU7 (ORCPT ); Wed, 1 Aug 2018 00:20:59 -0400 Received: from mx2.suse.de ([195.135.220.15]:51268 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEU7 (ORCPT ); Wed, 1 Aug 2018 00:20:59 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id E6DE7AE2E for ; Wed, 1 Aug 2018 02:37:42 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 3/6] btrfs: Remove unused function btrfs_account_dev_extents_size() Date: Wed, 1 Aug 2018 10:37:18 +0800 Message-Id: <20180801023721.32143-4-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This function is never used by any one in kernel, just remove it. Signed-off-by: Qu Wenruo --- fs/btrfs/volumes.c | 85 ---------------------------------------------- fs/btrfs/volumes.h | 2 -- 2 files changed, 87 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b33bf29130b6..e6a8e4aabc66 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1259,91 +1259,6 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder, return ret; } -/* helper to account the used device space in the range */ -int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start, - u64 end, u64 *length) -{ - struct btrfs_key key; - struct btrfs_root *root = device->fs_info->dev_root; - struct btrfs_dev_extent *dev_extent; - struct btrfs_path *path; - u64 extent_end; - int ret; - int slot; - struct extent_buffer *l; - - *length = 0; - - if (start >= device->total_bytes || - test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) - return 0; - - path = btrfs_alloc_path(); - if (!path) - return -ENOMEM; - path->reada = READA_FORWARD; - - key.objectid = device->devid; - key.offset = start; - key.type = BTRFS_DEV_EXTENT_KEY; - - ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); - if (ret < 0) - goto out; - if (ret > 0) { - ret = btrfs_previous_item(root, path, key.objectid, key.type); - if (ret < 0) - goto out; - } - - while (1) { - l = path->nodes[0]; - slot = path->slots[0]; - if (slot >= btrfs_header_nritems(l)) { - ret = btrfs_next_leaf(root, path); - if (ret == 0) - continue; - if (ret < 0) - goto out; - - break; - } - btrfs_item_key_to_cpu(l, &key, slot); - - if (key.objectid < device->devid) - goto next; - - if (key.objectid > device->devid) - break; - - if (key.type != BTRFS_DEV_EXTENT_KEY) - goto next; - - dev_extent = btrfs_item_ptr(l, slot, struct btrfs_dev_extent); - extent_end = key.offset + btrfs_dev_extent_length(l, - dev_extent); - if (key.offset <= start && extent_end > end) { - *length = end - start + 1; - break; - } else if (key.offset <= start && extent_end > start) - *length += extent_end - start; - else if (key.offset > start && extent_end <= end) - *length += extent_end - key.offset; - else if (key.offset > start && key.offset <= end) { - *length += end - key.offset + 1; - break; - } else if (key.offset > end) - break; - -next: - path->slots[0]++; - } - ret = 0; -out: - btrfs_free_path(path); - return ret; -} - static int contains_pending_extent(struct btrfs_transaction *transaction, struct btrfs_device *device, u64 *start, u64 len) diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 77e6004b6cb9..6d4f38ad9f5c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -384,8 +384,6 @@ static inline enum btrfs_map_op btrfs_op(struct bio *bio) } } -int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start, - u64 end, u64 *length); void btrfs_get_bbio(struct btrfs_bio *bbio); void btrfs_put_bbio(struct btrfs_bio *bbio); int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, From patchwork Wed Aug 1 02:37:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551619 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B319F1708 for ; Wed, 1 Aug 2018 02:37:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DA7F52A4A7 for ; Wed, 1 Aug 2018 02:37:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CE9372A49E; Wed, 1 Aug 2018 02:37:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E8F32A49E for ; Wed, 1 Aug 2018 02:37:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726447AbeHAEVF (ORCPT ); Wed, 1 Aug 2018 00:21:05 -0400 Received: from mx2.suse.de ([195.135.220.15]:51282 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEVE (ORCPT ); Wed, 1 Aug 2018 00:21:04 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 7F24BACD7 for ; Wed, 1 Aug 2018 02:37:47 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 4/6] btrfs: Introduce mount time chunk <-> dev extent mapping check Date: Wed, 1 Aug 2018 10:37:19 +0800 Message-Id: <20180801023721.32143-5-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch will introduce chunk <-> dev extent mapping check, to protect us against invalid dev extents or chunks. Since chunk mapping is the fundamental infrastructure of btrfs, extra check at mount time could prevent a lot of unexpected behavior (BUG_ON). Reported-by: Xu Wen Link: https://bugzilla.kernel.org/show_bug.cgi?id=200403 Link: https://bugzilla.kernel.org/show_bug.cgi?id=200407 Signed-off-by: Qu Wenruo Reviewed-by: Su Yue --- fs/btrfs/disk-io.c | 7 ++ fs/btrfs/volumes.c | 183 +++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/volumes.h | 2 + 3 files changed, 192 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 205092dc9390..068ca7498e94 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3075,6 +3075,13 @@ int open_ctree(struct super_block *sb, fs_info->generation = generation; fs_info->last_trans_committed = generation; + ret = btrfs_verify_dev_extents(fs_info); + if (ret) { + btrfs_err(fs_info, + "failed to verify dev extents against chunks: %d", + ret); + goto fail_block_groups; + } ret = btrfs_recover_balance(fs_info); if (ret) { btrfs_err(fs_info, "failed to recover balance: %d", ret); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index e6a8e4aabc66..467a589854fa 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6440,6 +6440,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key, map->stripe_len = btrfs_chunk_stripe_len(leaf, chunk); map->type = btrfs_chunk_type(leaf, chunk); map->sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk); + map->verified_stripes = 0; for (i = 0; i < num_stripes; i++) { map->stripes[i].physical = btrfs_stripe_offset_nr(leaf, chunk, i); @@ -7295,3 +7296,185 @@ void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info) fs_devices = fs_devices->seed; } } + +static u64 calc_stripe_length(u64 type, u64 chunk_len, int num_stripes) +{ + int index = btrfs_bg_flags_to_raid_index(type); + int ncopies = btrfs_raid_array[index].ncopies; + int data_stripes; + + switch (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { + case BTRFS_BLOCK_GROUP_RAID5: + data_stripes = num_stripes - 1; + break; + case BTRFS_BLOCK_GROUP_RAID6: + data_stripes = num_stripes - 2; + break; + default: + data_stripes = num_stripes / ncopies; + break; + } + return div_u64(chunk_len, data_stripes); +} +static int verify_one_dev_extent(struct btrfs_fs_info *fs_info, + u64 chunk_offset, u64 devid, + u64 physical_offset, u64 physical_len) +{ + struct extent_map_tree *em_tree = &fs_info->mapping_tree.map_tree; + struct extent_map *em; + struct map_lookup *map; + u64 stripe_len; + bool found = false; + int ret = 0; + int i; + + read_lock(&em_tree->lock); + em = lookup_extent_mapping(em_tree, chunk_offset, 1); + read_unlock(&em_tree->lock); + + if (!em) { + ret = -EUCLEAN; + btrfs_err(fs_info, + "dev extent (%llu, %llu) doesn't have corresponding chunk", + devid, physical_offset); + goto out; + } + + map = em->map_lookup; + stripe_len = calc_stripe_length(map->type, em->len, map->num_stripes); + if (physical_len != stripe_len) { + btrfs_err(fs_info, +"dev extent (%llu, %llu) length doesn't match with chunk %llu, have %llu expect %llu", + devid, physical_offset, em->start, physical_len, + stripe_len); + ret = -EUCLEAN; + goto out; + } + + for (i = 0; i < map->num_stripes; i++) { + if (map->stripes[i].dev->devid == devid && + map->stripes[i].physical == physical_offset) { + found = true; + if (map->verified_stripes >= map->num_stripes) { + btrfs_err(fs_info, + "too many dev extent for chunk %llu is detected", + em->start); + ret = -EUCLEAN; + goto out; + } + map->verified_stripes++; + break; + } + } + if (!found) { + ret = -EUCLEAN; + btrfs_err(fs_info, + "dev extent (%llu, %llu) has no corresponding chunk", + devid, physical_offset); + } +out: + free_extent_map(em); + return ret; +} + +static int verify_chunk_dev_extent_mapping(struct btrfs_fs_info *fs_info) +{ + struct extent_map_tree *em_tree = &fs_info->mapping_tree.map_tree; + struct extent_map *em; + struct rb_node *node; + int ret = 0; + + read_lock(&em_tree->lock); + for (node = rb_first(&em_tree->map); node; node = rb_next(node)) { + em = rb_entry(node, struct extent_map, rb_node); + if (em->map_lookup->num_stripes != + em->map_lookup->verified_stripes) { + btrfs_err(fs_info, + "chunk %llu has missing dev extent, have %d expect %d", + em->start, em->map_lookup->verified_stripes, + em->map_lookup->num_stripes); + ret = -EUCLEAN; + goto out; + } + } +out: + read_unlock(&em_tree->lock); + return ret; +} + +/* + * Ensure all dev extents are mapped to correct chunk. + * Or later chunk allocation/free would cause unexpected behavior. + * + * NOTE: This will iterate through the whole device tree, which should be + * at the same size level of chunk tree. + * This would increase mount time by a tiny fraction. + */ +int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info) +{ + struct btrfs_path *path; + struct btrfs_root *root = fs_info->dev_root; + struct btrfs_key key; + int ret = 0; + + key.objectid = 1; + key.type = BTRFS_DEV_EXTENT_KEY; + key.offset = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + path->reada = READA_FORWARD; + ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); + if (ret < 0) + goto out; + + if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) { + ret = btrfs_next_item(root, path); + if (ret < 0) + goto out; + /* No dev extents at all? Not good */ + if (ret > 0) { + ret = -EUCLEAN; + goto out; + } + } + while (1) { + struct extent_buffer *leaf = path->nodes[0]; + struct btrfs_dev_extent *dext; + int slot = path->slots[0]; + u64 chunk_offset; + u64 physical_offset; + u64 physical_len; + u64 devid; + + btrfs_item_key_to_cpu(leaf, &key, slot); + if (key.type != BTRFS_DEV_EXTENT_KEY) + break; + devid = key.objectid; + physical_offset = key.offset; + + dext = btrfs_item_ptr(leaf, slot, struct btrfs_dev_extent); + chunk_offset = btrfs_dev_extent_chunk_offset(leaf, dext); + physical_len = btrfs_dev_extent_length(leaf, dext); + + ret = verify_one_dev_extent(fs_info, chunk_offset, devid, + physical_offset, physical_len); + if (ret < 0) + goto out; + ret = btrfs_next_item(root, path); + if (ret < 0) + goto out; + if (ret > 0) { + ret = 0; + break; + } + } + + /* Ensure all chunks have corresponding dev extents */ + ret = verify_chunk_dev_extent_mapping(fs_info); +out: + btrfs_free_path(path); + return ret; +} diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 6d4f38ad9f5c..4301bf2d0534 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -345,6 +345,7 @@ struct map_lookup { u64 stripe_len; int num_stripes; int sub_stripes; + int verified_stripes; /* For mount time dev extent verification */ struct btrfs_bio_stripe stripes[]; }; @@ -559,5 +560,6 @@ void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info); void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info); bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info, struct btrfs_device *failing_dev); +int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info); #endif From patchwork Wed Aug 1 02:37:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551621 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4CCC414E0 for ; Wed, 1 Aug 2018 02:37:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A44E52A49E for ; Wed, 1 Aug 2018 02:37:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 98E5D2A529; Wed, 1 Aug 2018 02:37:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4C0DD2A49E for ; Wed, 1 Aug 2018 02:37:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726567AbeHAEVH (ORCPT ); Wed, 1 Aug 2018 00:21:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:51292 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEVH (ORCPT ); Wed, 1 Aug 2018 00:21:07 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id BA07FACD7 for ; Wed, 1 Aug 2018 02:37:50 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 5/6] btrfs: Exit gracefully when failed to add chunk map Date: Wed, 1 Aug 2018 10:37:20 +0800 Message-Id: <20180801023721.32143-6-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP It's completely possible that a crafted btrfs image contains overlapping chunks. Although we can't detect such problem by tree-checker, but it's not a catastrophic problem, current extent map can already detect such problem and return -EEXIST. We just only need to exit gracefully so btrfs can refuse to mount the fs. Reported-by: Xu Wen Link: https://bugzilla.kernel.org/show_bug.cgi?id=200409 Signed-off-by: Qu Wenruo --- fs/btrfs/volumes.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 467a589854fa..8c281c1e7f36 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6477,10 +6477,14 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key, write_lock(&map_tree->map_tree.lock); ret = add_extent_mapping(&map_tree->map_tree, em, 0); write_unlock(&map_tree->map_tree.lock); - BUG_ON(ret); /* Tree corruption */ + if (ret < 0) { + btrfs_err(fs_info, + "failed to add chunk map, start=%llu len=%llu: %d", + em->start, em->len, ret); + } free_extent_map(em); - return 0; + return ret; } static void fill_device_from_item(struct extent_buffer *leaf, From patchwork Wed Aug 1 02:37:21 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 10551623 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC8591708 for ; Wed, 1 Aug 2018 02:37:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BB0B12A6D8 for ; Wed, 1 Aug 2018 02:37:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AE2492A716; Wed, 1 Aug 2018 02:37:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B1BAB2A6D8 for ; Wed, 1 Aug 2018 02:37:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726628AbeHAEVM (ORCPT ); Wed, 1 Aug 2018 00:21:12 -0400 Received: from mx2.suse.de ([195.135.220.15]:51304 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725933AbeHAEVM (ORCPT ); Wed, 1 Aug 2018 00:21:12 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 346ACACD7 for ; Wed, 1 Aug 2018 02:37:55 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 6/6] btrfs: locking: Allow btrfs_tree_lock() to return error to avoid deadlock Date: Wed, 1 Aug 2018 10:37:21 +0800 Message-Id: <20180801023721.32143-7-wqu@suse.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20180801023721.32143-1-wqu@suse.com> References: <20180801023721.32143-1-wqu@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP [BUG] For certains crafted image, whose csum root leaf has missing backref, if we try to trigger write with data csum, it could cause deadlock with the following kernel WARN_ON(): ------ WARNING: CPU: 1 PID: 41 at fs/btrfs/locking.c:230 btrfs_tree_lock+0x3e2/0x400 CPU: 1 PID: 41 Comm: kworker/u4:1 Not tainted 4.18.0-rc1+ #8 Workqueue: btrfs-endio-write btrfs_endio_write_helper RIP: 0010:btrfs_tree_lock+0x3e2/0x400 Call Trace: btrfs_alloc_tree_block+0x39f/0x770 __btrfs_cow_block+0x285/0x9e0 btrfs_cow_block+0x191/0x2e0 btrfs_search_slot+0x492/0x1160 btrfs_lookup_csum+0xec/0x280 btrfs_csum_file_blocks+0x2be/0xa60 add_pending_csums+0xaf/0xf0 btrfs_finish_ordered_io+0x74b/0xc90 finish_ordered_fn+0x15/0x20 normal_work_helper+0xf6/0x500 btrfs_endio_write_helper+0x12/0x20 process_one_work+0x302/0x770 worker_thread+0x81/0x6d0 kthread+0x180/0x1d0 ret_from_fork+0x35/0x40 ---[ end trace 2e85051acb5f6dc1 ]--- ------ [CAUSE] That crafted image has missing backref for csum tree root leaf. And when we try to allocate new tree block, since there is no EXTENT/METADATA_ITEM for csum tree root, btrfs consider it's free slot and use it. However we have already locked csum tree root leaf by ourselves, thus we will ourselves to release read/write lock, and causing deadlock. [FIX] This patch will allow btrfs_tree_lock() to return error number, so most callers can exit gracefully. (Some caller doesn't really expect btrfs_tree_lock() to return error, and in that case, we use BUG_ON()) Since such problem can only happen in crafted image, we will still trigger kernel warning, but with a little more meaningful warning message. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200405 Reported-by: Xu Wen Signed-off-by: Qu Wenruo Reviewed-by: Su Yue --- fs/btrfs/ctree.c | 57 +++++++++++++++++++++++++++++++------- fs/btrfs/extent-tree.c | 28 +++++++++++++++---- fs/btrfs/extent_io.c | 8 ++++-- fs/btrfs/free-space-tree.c | 4 ++- fs/btrfs/locking.c | 13 +++++++-- fs/btrfs/locking.h | 2 +- fs/btrfs/qgroup.c | 4 ++- fs/btrfs/relocation.c | 13 +++++++-- fs/btrfs/tree-log.c | 14 ++++++++-- 9 files changed, 115 insertions(+), 28 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 4bc326df472e..6e695f4cd931 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -161,10 +161,12 @@ struct extent_buffer *btrfs_root_node(struct btrfs_root *root) struct extent_buffer *btrfs_lock_root_node(struct btrfs_root *root) { struct extent_buffer *eb; + int ret; while (1) { eb = btrfs_root_node(root); - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + BUG_ON(ret < 0); if (eb == root->node) break; btrfs_tree_unlock(eb); @@ -1584,6 +1586,7 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, for (i = start_slot; i <= end_slot; i++) { struct btrfs_key first_key; int close = 1; + int ret; btrfs_node_key(parent, &disk_key, i); if (!progress_passed && comp_keys(&disk_key, progress) < 0) @@ -1637,7 +1640,11 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans, if (search_start == 0) search_start = last_block; - btrfs_tree_lock(cur); + ret = btrfs_tree_lock(cur); + if (ret < 0) { + free_extent_buffer(cur); + return ret; + } btrfs_set_lock_blocking(cur); err = __btrfs_cow_block(trans, root, cur, parent, i, &cur, search_start, @@ -1853,7 +1860,11 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto enospc; } - btrfs_tree_lock(child); + ret = btrfs_tree_lock(child); + if (ret < 0) { + free_extent_buffer(child); + goto enospc; + } btrfs_set_lock_blocking(child); ret = btrfs_cow_block(trans, root, child, mid, 0, &child); if (ret) { @@ -1891,7 +1902,12 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, left = NULL; if (left) { - btrfs_tree_lock(left); + ret = btrfs_tree_lock(left); + if (ret < 0) { + free_extent_buffer(left); + left = NULL; + goto enospc; + } btrfs_set_lock_blocking(left); wret = btrfs_cow_block(trans, root, left, parent, pslot - 1, &left); @@ -1906,7 +1922,12 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, right = NULL; if (right) { - btrfs_tree_lock(right); + ret = btrfs_tree_lock(right); + if (ret < 0) { + free_extent_buffer(right); + right = NULL; + goto enospc; + } btrfs_set_lock_blocking(right); wret = btrfs_cow_block(trans, root, right, parent, pslot + 1, &right); @@ -2069,7 +2090,11 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, if (left) { u32 left_nr; - btrfs_tree_lock(left); + ret = btrfs_tree_lock(left); + if (ret < 0) { + free_extent_buffer(left); + return ret; + } btrfs_set_lock_blocking(left); left_nr = btrfs_header_nritems(left); @@ -2124,7 +2149,11 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, if (right) { u32 right_nr; - btrfs_tree_lock(right); + ret = btrfs_tree_lock(right); + if (ret < 0) { + free_extent_buffer(right); + return ret; + } btrfs_set_lock_blocking(right); right_nr = btrfs_header_nritems(right); @@ -2874,7 +2903,9 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root, err = btrfs_try_tree_write_lock(b); if (!err) { btrfs_set_path_blocking(p); - btrfs_tree_lock(b); + ret = btrfs_tree_lock(b); + if (ret < 0) + goto done; btrfs_clear_path_blocking(p, b, BTRFS_WRITE_LOCK); } @@ -3773,7 +3804,9 @@ static int push_leaf_right(struct btrfs_trans_handle *trans, struct btrfs_root if (IS_ERR(right)) return 1; - btrfs_tree_lock(right); + ret = btrfs_tree_lock(right); + if (ret < 0) + goto out_free; btrfs_set_lock_blocking(right); free_space = btrfs_leaf_free_space(fs_info, right); @@ -3811,6 +3844,7 @@ static int push_leaf_right(struct btrfs_trans_handle *trans, struct btrfs_root right, free_space, left_nritems, min_slot); out_unlock: btrfs_tree_unlock(right); +out_free: free_extent_buffer(right); return 1; } @@ -4007,7 +4041,9 @@ static int push_leaf_left(struct btrfs_trans_handle *trans, struct btrfs_root if (IS_ERR(left)) return 1; - btrfs_tree_lock(left); + ret = btrfs_tree_lock(left); + if (ret < 0) + goto out_free; btrfs_set_lock_blocking(left); free_space = btrfs_leaf_free_space(fs_info, left); @@ -4037,6 +4073,7 @@ static int push_leaf_left(struct btrfs_trans_handle *trans, struct btrfs_root max_slot); out: btrfs_tree_unlock(left); +out_free: free_extent_buffer(left); return ret; } diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 3578fa5b30ef..da615ebc072e 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -8297,14 +8297,19 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root, { struct btrfs_fs_info *fs_info = root->fs_info; struct extent_buffer *buf; + int ret; buf = btrfs_find_create_tree_block(fs_info, bytenr); if (IS_ERR(buf)) return buf; + ret = btrfs_tree_lock(buf); + if (ret < 0) { + free_extent_buffer(buf); + return ERR_PTR(ret); + } btrfs_set_header_generation(buf, trans->transid); btrfs_set_buffer_lockdep_class(root->root_key.objectid, buf, level); - btrfs_tree_lock(buf); clean_tree_block(fs_info, buf); clear_bit(EXTENT_BUFFER_STALE, &buf->bflags); @@ -8721,7 +8726,9 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, level - 1); reada = 1; } - btrfs_tree_lock(next); + ret = btrfs_tree_lock(next); + if (ret < 0) + goto out_free; btrfs_set_lock_blocking(next); ret = btrfs_lookup_extent_info(trans, fs_info, bytenr, level - 1, 1, @@ -8781,7 +8788,9 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, free_extent_buffer(next); return -EIO; } - btrfs_tree_lock(next); + ret = btrfs_tree_lock(next); + if (ret < 0) + goto out_free; btrfs_set_lock_blocking(next); } @@ -8839,6 +8848,7 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, out_unlock: btrfs_tree_unlock(next); +out_free: free_extent_buffer(next); return ret; @@ -8887,7 +8897,8 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, */ if (!path->locks[level]) { BUG_ON(level == 0); - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + BUG_ON(ret < 0); btrfs_set_lock_blocking(eb); path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; @@ -8929,7 +8940,8 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, /* make block locked assertion in clean_tree_block happy */ if (!path->locks[level] && btrfs_header_generation(eb) == trans->transid) { - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + BUG_ON(ret < 0); btrfs_set_lock_blocking(eb); path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; } @@ -9107,7 +9119,11 @@ int btrfs_drop_snapshot(struct btrfs_root *root, level = btrfs_header_level(root->node); while (1) { - btrfs_tree_lock(path->nodes[level]); + ret = btrfs_tree_lock(path->nodes[level]); + if (ret < 0) { + err = ret; + goto out_end_trans; + } btrfs_set_lock_blocking(path->nodes[level]); path->locks[level] = BTRFS_WRITE_LOCK_BLOCKING; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index cce6087d6880..3ba02aac6dd5 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3545,7 +3545,9 @@ lock_extent_buffer_for_io(struct extent_buffer *eb, if (!btrfs_try_tree_write_lock(eb)) { flush = 1; flush_write_bio(epd); - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + if (ret < 0) + return ret; } if (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) { @@ -3558,7 +3560,9 @@ lock_extent_buffer_for_io(struct extent_buffer *eb, } while (1) { wait_on_extent_buffer_writeback(eb); - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + if (ret < 0) + return ret; if (!test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) break; btrfs_tree_unlock(eb); diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c index b5950aacd697..948bbc45638a 100644 --- a/fs/btrfs/free-space-tree.c +++ b/fs/btrfs/free-space-tree.c @@ -1242,7 +1242,9 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info) list_del(&free_space_root->dirty_list); - btrfs_tree_lock(free_space_root->node); + ret = btrfs_tree_lock(free_space_root->node); + if (ret < 0) + goto abort; clean_tree_block(fs_info, free_space_root->node); btrfs_tree_unlock(free_space_root->node); btrfs_free_tree_block(trans, free_space_root, free_space_root->node, diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c index 1da768e5ef75..ca151be026bf 100644 --- a/fs/btrfs/locking.c +++ b/fs/btrfs/locking.c @@ -224,10 +224,18 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb) /* * take a spinning write lock. This will wait for both * blocking readers or writers + * + * Return 0 if we successfully locked the tree block + * Return <0 if some selftest failed (mostly due to extent tree corruption) */ -void btrfs_tree_lock(struct extent_buffer *eb) +int btrfs_tree_lock(struct extent_buffer *eb) { - WARN_ON(eb->lock_owner == current->pid); + if (unlikely(eb->lock_owner == current->pid)) { + WARN(1, +"tree block %llu already locked by current (pid=%d), refuse to lock", + eb->start, current->pid); + return -EUCLEAN; + } again: wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0); wait_event(eb->write_lock_wq, atomic_read(&eb->blocking_writers) == 0); @@ -248,6 +256,7 @@ void btrfs_tree_lock(struct extent_buffer *eb) atomic_inc(&eb->spinning_writers); atomic_inc(&eb->write_locks); eb->lock_owner = current->pid; + return 0; } /* diff --git a/fs/btrfs/locking.h b/fs/btrfs/locking.h index 29135def468e..7394d7ba2ff9 100644 --- a/fs/btrfs/locking.h +++ b/fs/btrfs/locking.h @@ -11,7 +11,7 @@ #define BTRFS_WRITE_LOCK_BLOCKING 3 #define BTRFS_READ_LOCK_BLOCKING 4 -void btrfs_tree_lock(struct extent_buffer *eb); +int btrfs_tree_lock(struct extent_buffer *eb); void btrfs_tree_unlock(struct extent_buffer *eb); void btrfs_tree_read_lock(struct extent_buffer *eb); diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 1874a6d2e6f5..ec4351fd7537 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1040,7 +1040,9 @@ int btrfs_quota_disable(struct btrfs_trans_handle *trans, list_del("a_root->dirty_list); - btrfs_tree_lock(quota_root->node); + ret = btrfs_tree_lock(quota_root->node); + if (ret < 0) + goto out; clean_tree_block(fs_info, quota_root->node); btrfs_tree_unlock(quota_root->node); btrfs_free_tree_block(trans, quota_root, quota_root->node, 0, 1); diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index be94c65bb4d2..a2fc0bd83a40 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1877,7 +1877,11 @@ int replace_path(struct btrfs_trans_handle *trans, free_extent_buffer(eb); break; } - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + if (ret < 0) { + free_extent_buffer(eb); + break; + } if (cow) { ret = btrfs_cow_block(trans, dest, eb, parent, slot, &eb); @@ -2788,7 +2792,12 @@ static int do_relocation(struct btrfs_trans_handle *trans, err = -EIO; goto next; } - btrfs_tree_lock(eb); + ret = btrfs_tree_lock(eb); + if (ret < 0) { + free_extent_buffer(eb); + err = ret; + goto next; + } btrfs_set_lock_blocking(eb); if (!node->eb) { diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index f8220ec02036..173bc15a7cd1 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -2588,7 +2588,11 @@ static noinline int walk_down_log_tree(struct btrfs_trans_handle *trans, } if (trans) { - btrfs_tree_lock(next); + ret = btrfs_tree_lock(next); + if (ret < 0) { + free_extent_buffer(next); + return ret; + } btrfs_set_lock_blocking(next); clean_tree_block(fs_info, next); btrfs_wait_tree_block_writeback(next); @@ -2672,7 +2676,9 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans, next = path->nodes[*level]; if (trans) { - btrfs_tree_lock(next); + ret = btrfs_tree_lock(next); + if (ret < 0) + return ret; btrfs_set_lock_blocking(next); clean_tree_block(fs_info, next); btrfs_wait_tree_block_writeback(next); @@ -2754,7 +2760,9 @@ static int walk_log_tree(struct btrfs_trans_handle *trans, next = path->nodes[orig_level]; if (trans) { - btrfs_tree_lock(next); + ret = btrfs_tree_lock(next); + if (ret < 0) + goto out; btrfs_set_lock_blocking(next); clean_tree_block(fs_info, next); btrfs_wait_tree_block_writeback(next);