From patchwork Mon Oct 17 11:55:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13008692
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [RFC v3 01/11] btrfs: add raid stripe tree definitions
Date: Mon, 17 Oct 2022 04:55:19 -0700
Message-Id: <6ca9b49af62a15f7a3482aca3f2566cc10ce8330.1666007330.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.37.3
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Add definitions for the raid stripe tree. This tree will hold information
about the on-disk layout of the stripes in a RAID set.

Each stripe extent has a 1:1 relationship with an on-disk extent item and
performs the logical to per-drive physical address translation for the
extent item in question.

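To illustrate the translation described above, here is a minimal userspace sketch, not kernel code. The names stripe_extent, stripe_item, stripe_item_alloc() and stripe_item_translate() are hypothetical, fields are host-endian rather than __le64, and the sketch assumes each copy is physically contiguous, so the offset inside the extent carries over to every device unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-endian mirror of one per-device stripe extent. */
struct stripe_extent {
	uint64_t devid;    /* device this copy lives on */
	uint64_t physical; /* physical start of the copy on that device */
};

/* Hypothetical in-memory view of one item: key fields plus the array. */
struct stripe_item {
	uint64_t logical;  /* logical start of the extent */
	uint64_t length;   /* length of the extent in bytes */
	int num_stripes;   /* number of entries in extents[] */
	struct stripe_extent extents[];
};

/* Allocate a zeroed item with room for num_stripes entries. */
static struct stripe_item *stripe_item_alloc(uint64_t logical, uint64_t length,
					     int num_stripes)
{
	struct stripe_item *item;

	item = calloc(1, sizeof(*item) +
			 (size_t)num_stripes * sizeof(item->extents[0]));
	if (!item)
		return NULL;

	item->logical = logical;
	item->length = length;
	item->num_stripes = num_stripes;
	return item;
}

/*
 * Translate a logical address to (devid, physical) on copy nr.
 * Assumption: the copy is contiguous, so the in-extent offset is reused.
 * Returns 0 on success, -1 if nr or logical is out of range.
 */
static int stripe_item_translate(const struct stripe_item *item,
				 uint64_t logical, int nr,
				 uint64_t *devid, uint64_t *physical)
{
	if (nr < 0 || nr >= item->num_stripes)
		return -1;
	if (logical < item->logical ||
	    logical >= item->logical + item->length)
		return -1;

	*devid = item->extents[nr].devid;
	*physical = item->extents[nr].physical + (logical - item->logical);
	return 0;
}
```

A caller would fill extents[] once per extent and then call stripe_item_translate() for each device it needs to read from or write to.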
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/ctree.h                | 29 +++++++++++++++++++++++++++++
 include/uapi/linux/btrfs_tree.h | 20 ++++++++++++++++++--
 2 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 87bb4218276b..80ead27299dc 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1992,6 +1992,35 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32);
 
+BTRFS_SETGET_FUNCS(stripe_extent_devid, struct btrfs_stripe_extent, devid, 64);
+BTRFS_SETGET_FUNCS(stripe_extent_physical, struct btrfs_stripe_extent, physical, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_stripe_extent_devid, struct btrfs_stripe_extent, devid, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_stripe_extent_physical, struct btrfs_stripe_extent, physical, 64);
+
+static inline struct btrfs_stripe_extent *btrfs_stripe_extent_nr(
+		struct btrfs_dp_stripe *dps, int nr)
+{
+	unsigned long offset = (unsigned long)dps;
+
+	offset += offsetof(struct btrfs_dp_stripe, extents);
+	offset += nr * sizeof(struct btrfs_stripe_extent);
+	return (struct btrfs_stripe_extent *)offset;
+}
+
+static inline u64 btrfs_stripe_extent_devid_nr(const struct extent_buffer *eb,
+					       struct btrfs_dp_stripe *dps,
+					       int nr)
+{
+	return btrfs_stripe_extent_devid(eb, btrfs_stripe_extent_nr(dps, nr));
+}
+
+static inline u64 btrfs_stripe_extent_physical_nr(const struct extent_buffer *eb,
+						  struct btrfs_dp_stripe *dps,
+						  int nr)
+{
+	return btrfs_stripe_extent_physical(eb, btrfs_stripe_extent_nr(dps, nr));
+}
+
 /* struct btrfs_dev_extent */
 BTRFS_SETGET_FUNCS(dev_extent_chunk_tree, struct btrfs_dev_extent,
 		   chunk_tree, 64);
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 5f32a2a495dc..047e1d0b2ff6 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -4,9 +4,8 @@
 
 #include
 #include
-#ifdef __KERNEL__
 #include
-#else
+#ifndef __KERNEL__
 #include
 #endif
 
@@ -56,6 +55,9 @@
 /* Holds the block group items for extent tree v2. */
 #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
 
+/* tracks RAID stripes in block groups. */
+#define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL
+
 /* device stats in the device tree */
 #define BTRFS_DEV_STATS_OBJECTID 0ULL
 
@@ -264,6 +266,8 @@
  */
 #define BTRFS_QGROUP_RELATION_KEY 246
 
+#define BTRFS_RAID_STRIPE_KEY 247
+
 /*
  * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY.
  */
@@ -488,6 +492,18 @@ struct btrfs_free_space_header {
 	__le64 num_bitmaps;
 } __attribute__ ((__packed__));
 
+struct btrfs_stripe_extent {
+	/* btrfs device-id this raid extent lives on */
+	__le64 devid;
+	/* physical location on disk */
+	__le64 physical;
+} __attribute__ ((__packed__));
+
+struct btrfs_dp_stripe {
+	/* array of stripe extents this stripe is composed of */
+	__DECLARE_FLEX_ARRAY(struct btrfs_stripe_extent, extents);
+} __attribute__ ((__packed__));
+
 #define BTRFS_HEADER_FLAG_WRITTEN	(1ULL << 0)
 #define BTRFS_HEADER_FLAG_RELOC		(1ULL << 1)
 

From patchwork Mon Oct 17 11:55:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13008693
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [RFC v3 02/11] btrfs: read raid-stripe-tree from disk
Date: Mon, 17 Oct 2022 04:55:20 -0700
Message-Id: <578b525f70185756de6cccf4443c95d8dc262e0e.1666007330.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.37.3
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

If we find a raid-stripe-tree on mount, read it from disk.

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/block-rsv.c       |  1 +
 fs/btrfs/ctree.h           |  1 +
 fs/btrfs/disk-io.c         | 12 ++++++++++++
 include/uapi/linux/btrfs.h |  1 +
 4 files changed, 15 insertions(+)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 6ce704d3bdd2..6794443cb0e8 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -425,6 +425,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
 	case BTRFS_EXTENT_TREE_OBJECTID:
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 	case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
+	case BTRFS_RAID_STRIPE_TREE_OBJECTID:
 		root->block_rsv = &fs_info->delayed_refs_rsv;
 		break;
 	case BTRFS_ROOT_TREE_OBJECTID:
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 80ead27299dc..430f224743a9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -680,6 +680,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *uuid_root;
 	struct btrfs_root *data_reloc_root;
 	struct btrfs_root *block_group_root;
+	struct btrfs_root *stripe_root;
 
 	/* the log root tree is a directory of all the other log roots */
 	struct btrfs_root *log_root_tree;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7bb00a010c74..a166b2602c40 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1396,6 +1396,9 @@ static struct btrfs_root *btrfs_get_global_root(struct btrfs_fs_info *fs_info,
 
 		return btrfs_grab_root(root) ? root : ERR_PTR(-ENOENT);
 	}
+	if (objectid == BTRFS_RAID_STRIPE_TREE_OBJECTID)
+		return btrfs_grab_root(fs_info->stripe_root) ?
+			fs_info->stripe_root : ERR_PTR(-ENOENT);
 	return NULL;
 }
 
@@ -1474,6 +1477,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 	btrfs_put_root(fs_info->fs_root);
 	btrfs_put_root(fs_info->data_reloc_root);
 	btrfs_put_root(fs_info->block_group_root);
+	btrfs_put_root(fs_info->stripe_root);
 	btrfs_check_leaked_roots(fs_info);
 	btrfs_extent_buffer_leak_debug_check(fs_info);
 	kfree(fs_info->super_copy);
@@ -2008,6 +2012,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
 	free_root_extent_buffers(info->fs_root);
 	free_root_extent_buffers(info->data_reloc_root);
 	free_root_extent_buffers(info->block_group_root);
+	free_root_extent_buffers(info->stripe_root);
 	if (free_chunk_root)
 		free_root_extent_buffers(info->chunk_root);
 }
@@ -2457,6 +2462,13 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 		fs_info->uuid_root = root;
 	}
 
+	location.objectid = BTRFS_RAID_STRIPE_TREE_OBJECTID;
+	root = btrfs_read_tree_root(tree_root, &location);
+	if (!IS_ERR(root)) {
+		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+		fs_info->stripe_root = root;
+	}
+
 	return 0;
 out:
 	btrfs_warn(fs_info, "failed to read root (objectid=%llu): %d",
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index c413fca6f581..74508dee2878 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -316,6 +316,7 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_INCOMPAT_RAID1C34		(1ULL << 11)
 #define BTRFS_FEATURE_INCOMPAT_ZONED		(1ULL << 12)
 #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2	(1ULL << 13)
+#define BTRFS_FEATURE_INCOMPAT_STRIPE_TREE	(1ULL << 14)
 
 struct btrfs_ioctl_feature_flags {
 	__u64 compat_flags;

From patchwork Mon Oct 17 11:55:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13008695
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [RFC v3 03/11] btrfs: add support for inserting raid stripe extents
Date: Mon, 17 Oct 2022 04:55:21 -0700
Message-Id: <5c8b4c3005d7c02ca4ab76a1802f14137ae47bda.1666007330.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.37.3
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Add support for inserting stripe extents into the raid stripe tree on
completion of every write that needs an extra logical-to-physical
translation when using RAID.

Inserting the stripe extents happens after the data I/O has completed.
This is done to a) support zone-append and b) rule out the possibility of
a RAID-write-hole.

This is done by creating in-memory ordered stripe extents, just like the
in-memory ordered extents, on I/O completion. The on-disk raid stripe
extents get created once we're running the delayed_refs for the extent
item this stripe extent is tied to.

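The in-memory ordered stripes are found by asking which tracked range contains a given logical address. The following userspace sketch shows that range test in isolation. It swaps the kernel rb-tree for a sorted array searched with bsearch(), and ordered_stripe, range_cmp() and lookup_stripe() are hypothetical names; only the three-way comparison mirrors ordered_stripe_cmp() from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct btrfs_ordered_stripe. */
struct ordered_stripe {
	uint64_t logical;   /* logical start of the tracked range */
	uint64_t num_bytes; /* length of the tracked range */
};

/*
 * Same three-way test as ordered_stripe_cmp() in the patch: an address
 * compares equal to a stripe when it falls inside [logical, logical + len).
 */
static int range_cmp(const void *key, const void *elem)
{
	const uint64_t *logical = key;
	const struct ordered_stripe *stripe = elem;

	if (*logical < stripe->logical)
		return -1;
	if (*logical >= stripe->logical + stripe->num_bytes)
		return 1;
	return 0;
}

/*
 * Look up the stripe containing logical. The array must be sorted by
 * logical start and the ranges must not overlap. Returns NULL on a miss.
 */
static const struct ordered_stripe *lookup_stripe(
		const struct ordered_stripe *sorted, size_t n, uint64_t logical)
{
	return bsearch(&logical, sorted, n, sizeof(*sorted), range_cmp);
}
```

Because a hit is defined as containment rather than equality, a lookup with any address inside an extent finds the entry that was added for the whole extent.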
Signed-off-by: Johannes Thumshirn --- fs/btrfs/Makefile | 2 +- fs/btrfs/ctree.h | 3 + fs/btrfs/disk-io.c | 3 + fs/btrfs/extent-tree.c | 48 +++++++++ fs/btrfs/inode.c | 6 ++ fs/btrfs/raid-stripe-tree.c | 189 ++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 49 ++++++++++ fs/btrfs/volumes.c | 35 ++++++- fs/btrfs/volumes.h | 14 +-- fs/btrfs/zoned.c | 4 + 10 files changed, 345 insertions(+), 8 deletions(-) create mode 100644 fs/btrfs/raid-stripe-tree.c create mode 100644 fs/btrfs/raid-stripe-tree.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 99f9995670ea..4484831ac624 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -31,7 +31,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \ - subpage.o tree-mod-log.o + subpage.o tree-mod-log.o raid-stripe-tree.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 430f224743a9..1f75ab8702bb 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1102,6 +1102,9 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_pending_ordered_map; struct lockdep_map btrfs_ordered_extent_map; + struct mutex stripe_update_lock; + struct rb_root stripe_update_tree; + #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; struct rb_root block_tree; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index a166b2602c40..190caabf5fb7 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2973,6 +2973,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) fs_info->bg_reclaim_threshold = BTRFS_DEFAULT_RECLAIM_THRESH; INIT_WORK(&fs_info->reclaim_bgs_work, btrfs_reclaim_bgs_work); + + mutex_init(&fs_info->stripe_update_lock); + fs_info->stripe_update_tree = RB_ROOT; } 
static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block *sb) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index bcd0e72cded3..061296bcdfb4 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -36,6 +36,7 @@ #include "rcu-string.h" #include "zoned.h" #include "dev-replace.h" +#include "raid-stripe-tree.h" #undef SCRAMBLE_DELAYED_REFS @@ -1491,6 +1492,50 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, return ret; } +static int add_stripe_entry_for_delayed_ref(struct btrfs_trans_handle *trans, + struct btrfs_delayed_ref_node *node) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct extent_map *em; + struct map_lookup *map; + int ret; + + if (!fs_info->stripe_root) + return 0; + + em = btrfs_get_chunk_map(fs_info, node->bytenr, node->num_bytes); + if (!em) { + btrfs_err(fs_info, + "cannot get chunk map for address %llu", + node->bytenr); + return -EINVAL; + } + + map = em->map_lookup; + + if (btrfs_need_stripe_tree_update(fs_info, map->type)) { + struct btrfs_ordered_stripe *stripe; + + stripe = btrfs_lookup_ordered_stripe(fs_info, node->bytenr); + if (!stripe) { + btrfs_err(fs_info, + "cannot get stripe extent for address %llu (%llu)", + node->bytenr, node->num_bytes); + free_extent_map(em); + return -EINVAL; + } + ASSERT(stripe->logical == node->bytenr); + ret = btrfs_insert_raid_extent(trans, stripe); + /* once for us */ + btrfs_put_ordered_stripe(fs_info, stripe); + /* once for the tree */ + btrfs_put_ordered_stripe(fs_info, stripe); + } + free_extent_map(em); + + return ret; +} + static int run_delayed_data_ref(struct btrfs_trans_handle *trans, struct btrfs_delayed_ref_node *node, struct btrfs_delayed_extent_op *extent_op, @@ -1521,6 +1566,9 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans, flags, ref->objectid, ref->offset, &ins, node->ref_mod); + if (ret) + return ret; + ret = add_stripe_entry_for_delayed_ref(trans, node); } else if (node->action == 
BTRFS_ADD_DELAYED_REF) { ret = __btrfs_inc_extent_ref(trans, node, parent, ref_root, ref->objectid, ref->offset, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5f2d9d4f6d43..5414ba573022 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -55,6 +55,7 @@ #include "zoned.h" #include "subpage.h" #include "inode-item.h" +#include "raid-stripe-tree.h" struct btrfs_iget_args { u64 ino; @@ -9550,6 +9551,11 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( if (qgroup_released < 0) return ERR_PTR(qgroup_released); + ret = btrfs_insert_preallocated_raid_stripe(inode->root->fs_info, + start, len); + if (ret) + goto free_qgroup; + if (trans) { ret = insert_reserved_file_extent(trans, inode, file_offset, &stack_fi, diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c new file mode 100644 index 000000000000..d8a69060b54b --- /dev/null +++ b/fs/btrfs/raid-stripe-tree.c @@ -0,0 +1,189 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include + +#include "ctree.h" +#include "transaction.h" +#include "disk-io.h" +#include "raid-stripe-tree.h" +#include "volumes.h" +#include "misc.h" +#include "print-tree.h" + +static int ordered_stripe_cmp(const void *key, const struct rb_node *node) +{ + struct btrfs_ordered_stripe *stripe = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + const u64 *logical = key; + + if (*logical < stripe->logical) + return -1; + if (*logical >= stripe->logical + stripe->num_bytes) + return 1; + return 0; +} + +static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb) +{ + struct btrfs_ordered_stripe *stripe = + rb_entry(rba, struct btrfs_ordered_stripe, rb_node); + return ordered_stripe_cmp(&stripe->logical, rbb); +} + +int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) +{ + struct btrfs_fs_info *fs_info = bioc->fs_info; + struct btrfs_ordered_stripe *stripe; + struct btrfs_io_stripe *tmp; + u64 logical = bioc->logical; + u64 length = bioc->size; + struct rb_node *node; + size_t size; 
+ + size = bioc->num_stripes * sizeof(struct btrfs_io_stripe); + stripe = kzalloc(sizeof(struct btrfs_ordered_stripe), GFP_NOFS); + if (!stripe) + return -ENOMEM; + + spin_lock_init(&stripe->lock); + tmp = kmemdup(bioc->stripes, size, GFP_NOFS); + if (!tmp) { + kfree(stripe); + return -ENOMEM; + } + + stripe->logical = logical; + stripe->num_bytes = length; + stripe->num_stripes = bioc->num_stripes; + spin_lock(&stripe->lock); + stripe->stripes = tmp; + spin_unlock(&stripe->lock); + refcount_set(&stripe->ref, 1); + + mutex_lock(&fs_info->stripe_update_lock); + node = rb_find_add(&stripe->rb_node, &fs_info->stripe_update_tree, + ordered_stripe_less); + mutex_unlock(&fs_info->stripe_update_lock); + if (node) { + btrfs_err(fs_info, "logical: %llu, length: %llu already exists", + logical, length); + return -EINVAL; + } + + return 0; +} + +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical) +{ + struct rb_root *root = &fs_info->stripe_update_tree; + struct btrfs_ordered_stripe *stripe = NULL; + struct rb_node *node; + + mutex_lock(&fs_info->stripe_update_lock); + node = rb_find(&logical, root, ordered_stripe_cmp); + if (node) { + stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); + refcount_inc(&stripe->ref); + } + mutex_unlock(&fs_info->stripe_update_lock); + + return stripe; +} + +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe) +{ + mutex_lock(&fs_info->stripe_update_lock); + if (refcount_dec_and_test(&stripe->ref)) { + struct rb_node *node = &stripe->rb_node; + + rb_erase(node, &fs_info->stripe_update_tree); + RB_CLEAR_NODE(node); + + spin_lock(&stripe->lock); + kfree(stripe->stripes); + spin_unlock(&stripe->lock); + kfree(stripe); + } + mutex_unlock(&fs_info->stripe_update_lock); +} + +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len) +{ + struct btrfs_io_context *bioc = NULL; + struct btrfs_ordered_stripe 
*stripe; + u64 map_length = len; + int ret; + + if (!fs_info->stripe_root) + return 0; + + ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, start, &map_length, + &bioc, 0); + if (ret) + return ret; + + bioc->size = len; + + stripe = btrfs_lookup_ordered_stripe(fs_info, start); + if (!stripe) { + ret = btrfs_add_ordered_stripe(bioc); + if (ret) + return ret; + } else { + spin_lock(&stripe->lock); + memcpy(stripe->stripes, bioc->stripes, + bioc->num_stripes * sizeof(struct btrfs_io_stripe)); + spin_unlock(&stripe->lock); + btrfs_put_ordered_stripe(fs_info, stripe); + } + + return 0; +} + +int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe *stripe) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_key stripe_key; + struct btrfs_root *stripe_root = fs_info->stripe_root; + struct btrfs_dp_stripe *raid_stripe; + size_t item_size; + int ret; + + item_size = stripe->num_stripes * sizeof(struct btrfs_stripe_extent); + + raid_stripe = kzalloc(item_size, GFP_NOFS); + if (!raid_stripe) { + btrfs_abort_transaction(trans, -ENOMEM); + btrfs_end_transaction(trans); + return -ENOMEM; + } + + spin_lock(&stripe->lock); + for (int i = 0; i < stripe->num_stripes; i++) { + u64 devid = stripe->stripes[i].dev->devid; + u64 physical = stripe->stripes[i].physical; + struct btrfs_stripe_extent *stripe_extent = + &raid_stripe->extents[i]; + + btrfs_set_stack_stripe_extent_devid(stripe_extent, devid); + btrfs_set_stack_stripe_extent_physical(stripe_extent, physical); + } + spin_unlock(&stripe->lock); + + stripe_key.objectid = stripe->logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = stripe->num_bytes; + + ret = btrfs_insert_item(trans, stripe_root, &stripe_key, raid_stripe, + item_size); + if (ret) + btrfs_abort_transaction(trans, ret); + + kfree(raid_stripe); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h new file mode 100644 index 000000000000..fdcaad405742 --- 
/dev/null +++ b/fs/btrfs/raid-stripe-tree.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef BTRFS_RAID_STRIPE_TREE_H +#define BTRFS_RAID_STRIPE_TREE_H + +struct btrfs_io_context; + +struct btrfs_ordered_stripe { + struct rb_node rb_node; + + u64 logical; + u64 num_bytes; + int num_stripes; + struct btrfs_io_stripe *stripes; + spinlock_t lock; + refcount_t ref; +}; + +int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe *stripe); +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len); +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( + struct btrfs_fs_info *fs_info, + u64 logical); +int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe); + +static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, + u64 map_type) +{ + u64 type = map_type & BTRFS_BLOCK_GROUP_TYPE_MASK; + u64 profile = map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK; + + if (!fs_info->stripe_root) + return false; + + // for now + if (type != BTRFS_BLOCK_GROUP_DATA) + return false; + + if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) + return true; + + return false; +} + +#endif diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index f5be296b863e..261bf6dd17bc 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -33,6 +33,7 @@ #include "block-group.h" #include "discard.h" #include "zoned.h" +#include "raid-stripe-tree.h" static struct bio_set btrfs_bioset; static struct bio_set btrfs_clone_bioset; @@ -45,6 +46,8 @@ struct btrfs_failed_bio { atomic_t repair_count; }; +static void btrfs_raid_stripe_update(struct work_struct *work); + #define BTRFS_BLOCK_GROUP_STRIPE_MASK (BTRFS_BLOCK_GROUP_RAID0 | \ BTRFS_BLOCK_GROUP_RAID10 | \ BTRFS_BLOCK_GROUP_RAID56_MASK) @@ -5887,6 +5890,7 @@ static void sort_parity_stripes(struct btrfs_io_context *bioc, int num_stripes) } static 
struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_info, + u64 logical, int total_stripes, int real_stripes) { @@ -5907,6 +5911,7 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_ refcount_set(&bioc->refs, 1); bioc->fs_info = fs_info; + bioc->logical = logical; bioc->tgtdev_map = (int *)(bioc->stripes + total_stripes); bioc->raid_map = (u64 *)(bioc->tgtdev_map + real_stripes); @@ -6512,7 +6517,8 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, goto out; } - bioc = alloc_btrfs_io_context(fs_info, num_alloc_stripes, tgtdev_indexes); + bioc = alloc_btrfs_io_context(fs_info, logical, num_alloc_stripes, + tgtdev_indexes); if (!bioc) { ret = -ENOMEM; goto out; @@ -6921,6 +6927,21 @@ static void btrfs_raid56_end_io(struct bio *bio) btrfs_put_bioc(bioc); } +static void btrfs_raid_stripe_update(struct work_struct *work) +{ + struct btrfs_bio *bbio = + container_of(work, struct btrfs_bio, raid_stripe_work); + struct btrfs_io_stripe *stripe = bbio->bio.bi_private; + struct btrfs_io_context *bioc = stripe->bioc; + int ret; + + ret = btrfs_add_ordered_stripe(bioc); + if (ret) + bbio->bio.bi_status = errno_to_blk_status(ret); + btrfs_orig_bbio_end_io(bbio); + btrfs_put_bioc(bioc); +} + static void btrfs_orig_write_end_io(struct bio *bio) { struct btrfs_io_stripe *stripe = bio->bi_private; @@ -6943,6 +6964,15 @@ static void btrfs_orig_write_end_io(struct bio *bio) else bio->bi_status = BLK_STS_OK; + if (bio_op(bio) == REQ_OP_ZONE_APPEND) + stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT; + + if (btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) { + INIT_WORK(&bbio->raid_stripe_work, btrfs_raid_stripe_update); + schedule_work(&bbio->raid_stripe_work); + return; + } + btrfs_orig_bbio_end_io(bbio); btrfs_put_bioc(bioc); } @@ -6954,6 +6984,8 @@ static void btrfs_clone_write_end_io(struct bio *bio) if (bio->bi_status) { atomic_inc(&stripe->bioc->error); btrfs_log_dev_io_error(bio, stripe->dev); 
+ } else if (bio_op(bio) == REQ_OP_ZONE_APPEND) { + stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT; } /* Pass on control to the original bio this one was cloned from */ @@ -7013,6 +7045,7 @@ static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr) bio->bi_private = &bioc->stripes[dev_nr]; bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT; bioc->stripes[dev_nr].bioc = bioc; + bioc->size = bio->bi_iter.bi_size; btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio); } diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 3b1fe04ff078..1b4a6e46eec3 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -374,6 +374,8 @@ struct btrfs_bio { atomic_t pending_ios; struct work_struct end_io_work; + struct work_struct raid_stripe_work; + /* * This member must come last, bio_alloc_bioset will allocate enough * bytes for entire btrfs_bio but relies on bio being last. @@ -403,12 +405,10 @@ static inline void btrfs_bio_end_io(struct btrfs_bio *bbio, blk_status_t status) struct btrfs_io_stripe { struct btrfs_device *dev; - union { - /* Block mapping */ - u64 physical; - /* For the endio handler */ - struct btrfs_io_context *bioc; - }; + /* Block mapping */ + u64 physical; + /* For the endio handler */ + struct btrfs_io_context *bioc; }; struct btrfs_discard_stripe { @@ -444,6 +444,8 @@ struct btrfs_io_context { int mirror_num; int num_tgtdevs; int *tgtdev_map; + u64 logical; + u64 size; /* * logical block numbers for the start of each stripe * The last one or two are p/q. These are sorted, diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 5d28479b7da4..286b99f04ae2 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1659,6 +1659,10 @@ void btrfs_rewrite_logical_zoned(struct btrfs_ordered_extent *ordered) u64 *logical = NULL; int nr, stripe_len; + /* Filesystems with a stripe tree have their own l2p mapping */ + if (fs_info->stripe_root) + return; + /* Zoned devices should not have partitions. 
So, we can assume it is 0 */ ASSERT(!bdev_is_partition(ordered->bdev)); if (WARN_ON(!ordered->bdev)) From patchwork Mon Oct 17 11:55:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008694 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0ACDC4332F for ; Mon, 17 Oct 2022 11:55:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230348AbiJQLzo (ORCPT ); Mon, 17 Oct 2022 07:55:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44020 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230326AbiJQLzk (ORCPT ); Mon, 17 Oct 2022 07:55:40 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ADABE4DF23 for ; Mon, 17 Oct 2022 04:55:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007739; x=1697543739; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ka1JTubPu+2SbPMq+18cp2USPIzHB7XUKzYqYyESwdo=; b=Ii4QQ/AnciCXtHX3Lm6qqX+fubYOZ8z0tRW1c2J5A3A0Od2FaqsjFc7N gEQ2fTN7J3/70qrjq8lg06EtB7DoEp1NTPoa10KmDwZ1LsnguVBAennbP dBfwtR99dYVTV8NogOEMSg/QfPXSoIL2yPm4oNyVq5+AqP+Lx3NqaBbsi otMGlAso1mWkktKdnCer3cLWGfs1SkWU8N/XWKzxKxVbmc70WamLBAvTH smM6Mrcqd0nskno/bxFwknkfEUI+XiOA+wy/RzDpofdLE8ycsmaJ58YZd b9vsvQWsZscZeCCNFtTJ2GKjXfhhCZGXIIyUYldOmIu+PMstoX7j/vULI Q==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337159" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:37 +0800 IronPort-SDR: 
gckuMW1Ul9nZ67F7qEd08nl1/Y/loHkqXpgozhcZSKVPGsS+FjtIl7oxfVQK792M8znPTCz8Po /XX0Kdj1xLtLNVoT9bnpVpCyKTcqoAnUDk43C0FEz7pZCbeGPUr3v2QbrctuW4H0XTTdf+YXK6 1Ci3NEEtb7P1JMOKe7WTIJy5IvHZS0uacVcIaavXMZxW9+5537tTTEXEjgUmK0de4kkusQQPDB S6jFR/76PcDrp3xqOUdzhtVYkzVVLil0THA38v2MbRMgXfmKTCHY5Ye8SJH7rBZxmnLYMhvWAe 9HVvUwSxBOFFezhjoCzAzCm9 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:10 -0700 IronPort-SDR: SCtOkIYPYQzIhROkLwXY4zJSmryBRuacLml7ozToHAmPGl2UA4cfCUsJ+yQaJ5DxPKpyK/qOSU SpdKFRRlTg/Nw24bm2qBxJw8CrWs+VUa5CAf689KqnzZmFeG0+jkZf+aQEiU5w5ERZZp5/+2qV i2sK5v6182+qTDECia3ggXuYTxrniMyhnujJs6IQCqKTg+JbJdvwuSa1yVWkxX58qHi3XXRrPu 1/zLe3MK7hwqJDqsiFeWeNewApYAc/SjTrMHpyNoEBdAR0PMJ8TwqbuP5kevhmWwKUsEbjBKy+ O3w= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:37 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 04/11] btrfs: delete stripe extent on extent deletion Date: Mon, 17 Oct 2022 04:55:22 -0700 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org As each stripe extent is tied to an extent item, delete the stripe extent once the corresponding extent item is deleted. 
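The 1:1 relationship means a stripe extent can be addressed by the same (start, length) pair as its extent item. The following is a minimal user-space sketch of that idea, not the kernel code: `struct demo_stripe_key` and `demo_delete_stripe_extent()` are hypothetical names, and a plain array stands in for the stripe tree. It assumes that only an exact key match may be deleted, so a lookup that lands on a neighbouring entry removes nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stripe extent key: (objectid, offset). */
struct demo_stripe_key {
	uint64_t start;		/* logical start, same as the extent item */
	uint64_t length;	/* length, same as the extent item */
};

/*
 * Remove the entry whose key matches (start, length) exactly.
 * Returns 0 when an entry was removed and -ENOENT when no exact match
 * exists; in the latter case the table is left untouched.
 */
int demo_delete_stripe_extent(struct demo_stripe_key *tbl, size_t *nr,
			      uint64_t start, uint64_t length)
{
	assert(tbl != NULL && nr != NULL);

	for (size_t i = 0; i < *nr; i++) {
		if (tbl[i].start != start || tbl[i].length != length)
			continue;

		/* Close the gap left by the removed entry. */
		memmove(&tbl[i], &tbl[i + 1], (*nr - i - 1) * sizeof(*tbl));
		(*nr)--;
		return 0;
	}

	return -ENOENT;
}
```

A caller would invoke this with the same start and length it uses to free the extent item, and treat -ENOENT as "nothing to delete" or as an error depending on whether a stripe extent is expected for that range.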
Signed-off-by: Johannes Thumshirn --- fs/btrfs/extent-tree.c | 8 ++++++++ fs/btrfs/raid-stripe-tree.c | 31 +++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 2 ++ 3 files changed, 41 insertions(+) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 061296bcdfb4..d6f52e101d5a 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3211,6 +3211,14 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, } } + if (is_data) { + ret = btrfs_delete_raid_extent(trans, bytenr, num_bytes); + if (ret) { + btrfs_abort_transaction(trans, ret); + return ret; + } + } + ret = btrfs_del_items(trans, extent_root, path, path->slots[0], num_to_del); if (ret) { diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index d8a69060b54b..5750857c2a75 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -109,6 +109,37 @@ void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, mutex_unlock(&fs_info->stripe_update_lock); } +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_root *stripe_root = fs_info->stripe_root; + struct btrfs_path *path; + struct btrfs_key stripe_key; + int ret; + + if (!stripe_root) + return 0; + + stripe_key.objectid = start; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = length; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(trans, stripe_root, &stripe_key, path, -1, 1); + if (ret < 0) + goto out; + + ret = btrfs_del_item(trans, stripe_root, path); +out: + btrfs_free_path(path); + return ret; + +} + int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, u64 start, u64 len) { diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index fdcaad405742..3456251d0739 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -16,6 +16,8 @@ struct 
btrfs_ordered_stripe { refcount_t ref; }; +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_stripe *stripe); int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, From patchwork Mon Oct 17 11:55:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008697 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F2D5C4332F for ; Mon, 17 Oct 2022 11:55:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230349AbiJQLzr (ORCPT ); Mon, 17 Oct 2022 07:55:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44084 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230307AbiJQLzm (ORCPT ); Mon, 17 Oct 2022 07:55:42 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62CAF4F683 for ; Mon, 17 Oct 2022 04:55:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007740; x=1697543740; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZTS60uHAkYTfgc3cMN+8MQhVDavETbhol8zkM/sUVYk=; b=XzOAN/WplcBWU8KxELmK7h/yIM7ZVJBPF/eXdubQ7BWNxhOxfsMuN2ky WK04jJZSa77NvMlRQD49qJRmclmFa59WNSVwxZUI0X+YnIs+jWhBW4F12 GAT7IgVPQQ/6RLI3oe7Lj/h7NoGYstoj1zzYuAdApsHu53WyTlQE3HfnT KVAyM01aXuaWRV34iqD3LlPe5HAITHc1ZUflpnE97gCfCXTQrUZRWLRR2 gO4DraTb4ophXb9391lsssefjncnPDeRw6GuW5SxpcMnlVPT/wkLu+UUL HejfvALi9qZiE6V/IM45qlOSNtGxqjIrrSTPVNytTLizDWEoK9Ft+DqzA A==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; 
d="scan'208";a="212337161" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:38 +0800 IronPort-SDR: z/FcemTqXPnHMiAEajlqMS6Z4pKIuRm590hPDxAttlkz4JzVZAU4j+haHhStL70alBJ6rYHghV kJUXOPDHqP7Va+bbhVZTkBmg6/LuduvZbDnFpFZE3uG420mq8AUFTfiznJvbVaPTEeZ/mGUgs/ 1CnK0l67tAXo6nUQA+SULFBBG/ZzlPt6rg376+cz4wHOnFqTX2QxVUs2qJvpBbIugAvVbXaQb7 z8vhHpmm3cY8bpvm4p/QVMA+yKhuLgLXVLSFKTot+aXgLhe9Pvyn3CUMyKJCua52/u9AHyNFcd iTLy7nKVwLn0CPE8C4YjFp6h Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:11 -0700 IronPort-SDR: MOCoqbrpbTbO3SoDw7wEr/+OWJWKxPiYpOGJRDZdp01k84Sf1H5mn0vLJSzQ4LQMVoh2Psen49 K/jqGpQbl9Wp4orvJeBB8jRYxTyqJnRgYvpMKrdd/SYjgF3rkoJkWPHVoopWPqjMmOoaV4/szi tF/aycoR4V8NbbGTP6QxCqyEigFIlmeK12FJU1iMA0zM9DMZ++2sDEhHRjAMRaSFWxz3UViIcV 3hF4ziKv7M/mrnvkhxnx3dFGsjrmhkKLXZxadQQUlDjrXDMdGNF3ZbGbZ5Vp0R1CPAvThaJ+6J kaY= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:38 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 05/11] btrfs: lookup physical address from stripe extent Date: Mon, 17 Oct 2022 04:55:23 -0700 Message-Id: <85853887c5f50188e32f879be823c690c33af9d3.1666007330.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Look up the physical address in the raid stripe tree when a read is attempted on a RAID volume formatted with the raid stripe tree. If the requested logical address is not found in the stripe tree, it may still be in the in-memory ordered stripe tree, so fall back to searching the ordered stripe tree in this case.
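The address translation itself is simple arithmetic once the covering stripe extent is known. The sketch below is a simplified user-space illustration under stated assumptions, not the kernel implementation: `struct demo_stripe_extent` and `demo_lookup()` are hypothetical, and a flat array replaces both the on-disk tree and the ordered stripe tree. It shows the two steps the lookup performs: add the offset into the extent to the per-device physical start, and shorten the request length when it would run past the end of the extent that was found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for one per-device stripe extent. */
struct demo_stripe_extent {
	uint64_t logical;	/* start of the logical range */
	uint64_t length;	/* length of the logical range */
	uint64_t devid;		/* device holding this copy */
	uint64_t physical;	/* physical start on that device */
};

/*
 * Translate @logical for device @devid.
 *
 * On success, store the physical address in @physical, clamp @length so
 * that the request does not cross the end of the extent that was found,
 * and return true. Return false when no entry covers @logical on @devid.
 */
bool demo_lookup(const struct demo_stripe_extent *tbl, size_t nr,
		 uint64_t devid, uint64_t logical,
		 uint64_t *length, uint64_t *physical)
{
	assert(length != NULL && physical != NULL);

	for (size_t i = 0; i < nr; i++) {
		const struct demo_stripe_extent *se = &tbl[i];
		uint64_t end = logical + *length;
		uint64_t found_end = se->logical + se->length;

		if (se->devid != devid)
			continue;
		if (logical < se->logical || logical >= found_end)
			continue;

		/* The caller must split the I/O at the end of this extent. */
		if (end > found_end)
			*length -= end - found_end;

		*physical = se->physical + (logical - se->logical);
		return true;
	}

	return false;
}
```

The clamped length matters because two logically adjacent extents may be physically discontiguous, so a read that spans both has to be issued as two separate requests.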
Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.c | 142 ++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 3 + fs/btrfs/volumes.c | 30 ++++++-- 3 files changed, 168 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 5750857c2a75..91e67600e01a 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -218,3 +218,145 @@ int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, return ret; } + +static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, + int num_stripes, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_ordered_stripe *os; + u64 offset; + u64 found_end; + u64 end; + int i; + + os = btrfs_lookup_ordered_stripe(fs_info, logical); + if (!os) + return false; + + end = logical + *length; + found_end = os->logical + os->num_bytes; + if (end > found_end) + *length -= end - found_end; + + for (i = 0; i < num_stripes; i++) { + if (os->stripes[i].dev != stripe->dev) + continue; + + offset = logical - os->logical; + ASSERT(offset >= 0); + stripe->physical = os->stripes[i].physical + offset; + btrfs_put_ordered_stripe(fs_info, os); + break; + } + + return true; +} + +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_root *stripe_root = fs_info->stripe_root; + int num_stripes = btrfs_bg_type_to_factor(map_type); + struct btrfs_dp_stripe *raid_stripe; + struct btrfs_key stripe_key; + struct btrfs_key found_key; + struct btrfs_path *path; + struct extent_buffer *leaf; + u64 offset; + u64 found_logical; + u64 found_length; + u64 end; + u64 found_end; + int slot; + int ret; + int i; + + /* + * If we still have the stripe in the ordered stripe tree get it from + * there + */ + if (btrfs_physical_from_ordered_stripe(fs_info, logical, length, + num_stripes, stripe)) + return 0; + + stripe_key.objectid = 
logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0); + if (ret < 0) + goto out; + if (ret) { + if (path->slots[0] != 0) + path->slots[0]--; + } + + end = logical + *length; + + while (1) { + leaf = path->nodes[0]; + slot = path->slots[0]; + + btrfs_item_key_to_cpu(leaf, &found_key, slot); + found_logical = found_key.objectid; + found_length = found_key.offset; + + if (found_logical > end) + break; + + if (!in_range(logical, found_logical, found_length)) + goto next; + + offset = logical - found_logical; + found_end = found_logical + found_length; + + /* + * If we have a logically contiguous, but physically + * noncontinuous range, we need to split the bio. Record the + * length after which we must split the bio. + */ + if (end > found_end) + *length -= end - found_end; + + raid_stripe = btrfs_item_ptr(leaf, slot, struct btrfs_dp_stripe); + for (i = 0; i < num_stripes; i++) { + if (btrfs_stripe_extent_devid_nr(leaf, raid_stripe, i) != + stripe->dev->devid) + continue; + stripe->physical = btrfs_stripe_extent_physical_nr(leaf, + raid_stripe, i) + offset; + ret = 0; + goto out; + } + + /* + * If we're here, we haven't found the requested devid in the + * stripe. 
+ */ + ret = -ENOENT; + goto out; +next: + ret = btrfs_next_item(stripe_root, path); + if (ret) + break; + } + +out: + if (ret > 0) + ret = -ENOENT; + if (ret) { + btrfs_err(fs_info, + "cannot find raid-stripe for logical [%llu, %llu]", + logical, logical + *length); + btrfs_print_tree(leaf, 1); + } + btrfs_free_path(path); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 3456251d0739..083e754f5239 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -16,6 +16,9 @@ struct btrfs_ordered_stripe { refcount_t ref; }; +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe); int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 261bf6dd17bc..c67d76d93982 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6313,12 +6313,21 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op, return U64_MAX; } -static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map, - u32 stripe_index, u64 stripe_offset, u64 stripe_nr) +static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, + u64 logical, u64 *length, struct btrfs_io_stripe *dst, + struct map_lookup *map, u32 stripe_index, + u64 stripe_offset, u64 stripe_nr) { dst->dev = map->stripes[stripe_index].dev; + + if (fs_info->stripe_root && op == BTRFS_MAP_READ && + btrfs_need_stripe_tree_update(fs_info, map->type)) + return btrfs_get_raid_extent_offset(fs_info, logical, length, + map->type, dst); + dst->physical = map->stripes[stripe_index].physical + stripe_offset + stripe_nr * map->stripe_len; + return 0; } static int __btrfs_map_block(struct btrfs_fs_info *fs_info, @@ -6507,13 +6516,14 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, 
smap->dev = dev_replace->tgtdev; smap->physical = physical_to_patch_in_first_stripe; *mirror_num_ret = map->num_stripes + 1; + ret = 0; } else { - set_io_stripe(smap, map, stripe_index, stripe_offset, - stripe_nr); *mirror_num_ret = mirror_num; + ret = set_io_stripe(fs_info, op, logical, length, smap, + map, stripe_index, stripe_offset, + stripe_nr); } *bioc_ret = NULL; - ret = 0; goto out; } @@ -6525,8 +6535,14 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, } for (i = 0; i < num_stripes; i++) { - set_io_stripe(&bioc->stripes[i], map, stripe_index, stripe_offset, - stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, stripe_index, + stripe_offset, stripe_nr); + if (ret) { + btrfs_put_bioc(bioc); + goto out; + } + stripe_index++; } From patchwork Mon Oct 17 11:55:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008696 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE1B8C433FE for ; Mon, 17 Oct 2022 11:55:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230355AbiJQLzq (ORCPT ); Mon, 17 Oct 2022 07:55:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230206AbiJQLzl (ORCPT ); Mon, 17 Oct 2022 07:55:41 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEAA64E40E for ; Mon, 17 Oct 2022 04:55:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007740; x=1697543740; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=yJk2w/eXaKUKwsaTLp1KsGm2AQ02EBWBtnYP5LkgHqY=; b=O37zJjafEweNC3/E+LuHnQNUjpCMFrCCdYBqDPwktOH6X3lg3zlDhrAc tuCHJJ1AFrS7dozrhgX5wLhezXbW88b3f1O2o7H5tSh25pfct8Lqy3IMl hgNjzJv+TLp/SKQl6p2bpZ6TApZIXYZbH04G42JmL03MaJ3+aP7vdV9Mj dVGpPJIXbM3UELHjxqe4jPk6T/MR210tUeHZDZz7vbZB86pI81jwjnE5X o5OIjTVKikOSHFNTWdkQK2uR0fhTfWqUN0q3zTqp8AhmZ5HvlKJiCZDBg 4vr3agrZ3IJSrCzHFjbHqGP5m/pUM2oGj3ICJXxBp3Jnxi+IfK1EXo8No w==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337162" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:39 +0800 IronPort-SDR: VlgzzNA5uHvNXUGLp/zWYXk/G670gSJMhmpkh/o3Pi8+h+VuJIosQYdfjfOiCjRhcQWkQgiXA/ HL1cdMMvctRMp37vzJIMxuMg8AHTtiDfC5jv2Sz7RFPKXuR+o/eye7fVm1hjeMZgaRnyrXT3Qp ZRq14taa9e26OSDVJg8F28eztQ8FtYE4iqQMEzaMDTBIdiklA7mbr4Gp9k16EDNVw/pxF1cYKU 6YoaU9W6C4WZTYsmFvl9Eadvrey3FlfhlkjiswLpcBEorwwnm9I0tt5ZcOsx6C+XmlN6fbsN36 hkXEeoB+FPWT8366z+jWyUUp Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:12 -0700 IronPort-SDR: csKLicCDmhC03tLyxsX1zw3tS44Bz5fOR73rMCK2fp4O6zga8G9AeYauBHfoVAFZDNcoFN4S9+ 6m61nlmHiXwLvSgxsFyUs3vUNFLnzj2JN0AduF/UY6/fgCMadS68rZxJX7EKfOHHaArfCXuPRP qAYGiSP39eKPMIbgJZun4IPsJnmjP6AbpGukXKbW6ilMGeMwWm3Cv0DotY1h2I94wlk0akkesp 0YT324w1zevScxJudO26uTRC98NPrUfSfr3ov1ftzv+P8M8Nnx8/GeuU+enRlGei9FXujNnJAi bmE= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:39 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 06/11] btrfs: add raid stripe tree pretty printer Date: Mon, 17 Oct 2022 04:55:24 -0700 Message-Id: <5574796b1656045504f2d5c52bab4e85fb9d1b8d.1666007330.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: 
References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Decode raid-stripe-tree entries on btrfs_print_tree(). Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/print-tree.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c index dd8777872143..1c781a30fd60 100644 --- a/fs/btrfs/print-tree.c +++ b/fs/btrfs/print-tree.c @@ -6,6 +6,7 @@ #include "ctree.h" #include "disk-io.h" #include "print-tree.h" +#include "raid-stripe-tree.h" struct root_name_map { u64 id; @@ -25,6 +26,7 @@ static const struct root_name_map root_map[] = { { BTRFS_FREE_SPACE_TREE_OBJECTID, "FREE_SPACE_TREE" }, { BTRFS_BLOCK_GROUP_TREE_OBJECTID, "BLOCK_GROUP_TREE" }, { BTRFS_DATA_RELOC_TREE_OBJECTID, "DATA_RELOC_TREE" }, + { BTRFS_RAID_STRIPE_TREE_OBJECTID, "RAID_STRIPE_TREE" }, }; const char *btrfs_root_name(const struct btrfs_key *key, char *buf) @@ -184,6 +186,20 @@ static void print_uuid_item(struct extent_buffer *l, unsigned long offset, } } +static void print_raid_stripe_key(struct extent_buffer *eb, u32 item_size, + struct btrfs_dp_stripe *stripe) +{ + int num_stripes; + int i; + + num_stripes = item_size / sizeof(struct btrfs_stripe_extent); + + for (i = 0; i < num_stripes; i++) + pr_info("\t\t\tstripe %d devid %llu physical %llu\n", i, + btrfs_stripe_extent_devid_nr(eb, stripe, i), + btrfs_stripe_extent_physical_nr(eb, stripe, i)); +} + /* * Helper to output refs and locking status of extent buffer. Useful to debug * race condition related problems. 
@@ -348,6 +364,11 @@ void btrfs_print_leaf(struct extent_buffer *l) print_uuid_item(l, btrfs_item_ptr_offset(l, i), btrfs_item_size(l, i)); break; + case BTRFS_RAID_STRIPE_KEY: + print_raid_stripe_key(l, btrfs_item_size(l, i), + btrfs_item_ptr(l, i, + struct btrfs_dp_stripe)); + break; } } } From patchwork Mon Oct 17 11:55:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008698 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D640C43217 for ; Mon, 17 Oct 2022 11:55:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230356AbiJQLzs (ORCPT ); Mon, 17 Oct 2022 07:55:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230338AbiJQLzm (ORCPT ); Mon, 17 Oct 2022 07:55:42 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6988A57E12 for ; Mon, 17 Oct 2022 04:55:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007741; x=1697543741; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/KOh8Qx+nBF4bAugvGLl8O8BjRxQzjcCHYVCq7TdJ/0=; b=fUuLAoM551dN2ZwRsXiB49vQIYe7K5vvGXxPCf51st5j7qBSnLZunZZX GeqUow/FtlKn+XLwdSz9DrCdmGDE5k88pLSODuAWQR+L/3cDKv4mjsmrQ hq7sm9mLW1L2UPkumEvnLW/zWhJvowE27zweczkUIdWfBH0Nuu8EPfEVa GCKzVLG7+KHU27kZTolGOnDXc4cuMYD3BaUZ4/1edNHtSrSJ9nq26HSqr gkNJYJneb08CBlLiND6cXT6c9KHttzcAHvDtzgoO2yP5BN0CSG6f4x8gW i1ky48Pa1hMmDP9S4Z4L1bTp0GTnGKoZFlvDuQ+54LA4buWx/k/O0vCiV g==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; 
d="scan'208";a="212337164" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:39 +0800 IronPort-SDR: Dq6VxkwMrD4VueIH7z3qJRvFjVhlffq7EhCJqXiPphGkrj3efqckyl3CngUMMVTYbreFzTj6wE QhWWSlX5JcTm7PdZOHS27/uyRUDAiV6H0O8SgYhK4U4VBJIdq6iyfiq+WrSuDYk6XCUh32uqrO j5rvz7WPkW+BXuUcf3+MYCtjz+8YCtS+SoQ/ocmtsGQk5s5n1V+CfooR4yY0CfCjN2M+l4Ssde w/4Wp5vGJmnZhK62hc4NKbj+wUT/7MtxMY8A69P4Tc19Y3obYkWsF03McfATJcS1/WYfYJf3zK 2Ce4hBuTVcaTwnxvfsQVzwBZ Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:12 -0700 IronPort-SDR: aXBmOl684flRDbV6AKVNi6Z9ZpC7iQhwvGZXWQcGiuGkqBs61JTD4n3rtAQLcQEAJVNpis8/ld Z0CjYTRWLesdvH1lBfePQRNc4BpCuYFAu/LDmHpXAR2MnB8zd3e2z1RbZOs2JT7EiSAwYo+wII CkXg8N10OviuojsIV63iGuZh3tMOJWU7TQ50SNjkMII87yu0/8iln/qmZQz7AmkyxZo5W5ndP8 2YsJZgJajaEs35dOph0vYSmi55liIWi51ipgpVKrNSfqvomi3K6PBBsk/IgFC8RVILbZ9Foh+r IM4= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:40 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 07/11] btrfs: zoned: allow zoned RAID1 Date: Mon, 17 Oct 2022 04:55:25 -0700 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org When we have a raid-stripe-tree, we can do RAID1 on zoned devices for data block-groups. For meta-data block-groups, we don't actually need anything special, as all meta-data I/O is protected by the btrfs_zoned_meta_io_lock() already. 
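The consistency check applied to the zones of a mirrored block group can be sketched as follows. This is an illustrative user-space sketch with assumptions of its own, not the code in this patch: `demo_check_mirrors()` and the `DEMO_WP_*` sentinels are hypothetical, the first usable mirror is taken as the reference, and the capacity is the minimum over all usable mirrors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinels for mirrors without a usable write pointer. */
#define DEMO_WP_MISSING_DEV	((uint64_t)-1)
#define DEMO_WP_CONVENTIONAL	((uint64_t)-2)

/*
 * Check that every mirror with a sequential write pointer reports the same
 * offset. On success, store that offset in @alloc_offset and the smallest
 * capacity among the usable mirrors in @capacity, then return 0.
 * Return -1 on a mismatch or when no mirror is usable.
 */
int demo_check_mirrors(const uint64_t *wp, const uint64_t *caps, size_t nr,
		       uint64_t *alloc_offset, uint64_t *capacity)
{
	bool have_ref = false;
	uint64_t ref = 0;
	uint64_t min_cap = UINT64_MAX;

	assert(wp != NULL && caps != NULL);
	assert(alloc_offset != NULL && capacity != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (wp[i] == DEMO_WP_MISSING_DEV ||
		    wp[i] == DEMO_WP_CONVENTIONAL)
			continue;

		if (!have_ref) {
			ref = wp[i];
			have_ref = true;
		} else if (wp[i] != ref) {
			/* Mirrors disagree about how much was written. */
			return -1;
		}

		if (caps[i] < min_cap)
			min_cap = caps[i];
	}

	if (!have_ref)
		return -1;

	*alloc_offset = ref;
	*capacity = min_cap;
	return 0;
}
```

Mirrors must agree on the write pointer because every copy receives the same data; a mismatch means one copy is behind and the block group cannot be appended to safely.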
Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/zoned.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 286b99f04ae2..f4ce39169468 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1486,6 +1486,45 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) cache->zone_capacity = min(caps[0], caps[1]); break; case BTRFS_BLOCK_GROUP_RAID1: + case BTRFS_BLOCK_GROUP_RAID1C3: + case BTRFS_BLOCK_GROUP_RAID1C4: + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !fs_info->stripe_root) { + btrfs_err(fs_info, + "zoned: data RAID1 needs stripe_root"); + ret = -EIO; + goto out; + + } + + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV || + alloc_offsets[i] == WP_CONVENTIONAL) + continue; + + if (i == 0) + continue; + + if (alloc_offsets[0] != alloc_offsets[i]) { + btrfs_err(fs_info, + "zoned: write pointer offset mismatch of zones in RAID profile"); + ret = -EIO; + goto out; + } + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(cache)) { + ret = -EIO; + goto out; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &cache->runtime_flags); + } + cache->zone_capacity = min(caps[0], caps[i]); + } + cache->alloc_offset = alloc_offsets[0]; + break; case BTRFS_BLOCK_GROUP_RAID0: case BTRFS_BLOCK_GROUP_RAID10: case BTRFS_BLOCK_GROUP_RAID5: From patchwork Mon Oct 17 11:55:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008699 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5BDCC433FE for ; Mon, 17 Oct 2022 11:55:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S230357AbiJQLzt (ORCPT ); Mon, 17 Oct 2022 07:55:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44118 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230339AbiJQLzm (ORCPT ); Mon, 17 Oct 2022 07:55:42 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFEAA580A4 for ; Mon, 17 Oct 2022 04:55:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007741; x=1697543741; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Du6tCc6fBTLlhGjBKl1Mr6W7UOAlDi7W0ad9SYMPw+4=; b=PUKpbEpJB3toF0VFOjQjM7JQ1RvSGXKwOF6f63+gl8YWgQbJXLgS1sfd L80YkKO5Ek6iHBBzrgree07W1DOGTABkT/TUjVJ4wr0k1NtDL1rAJpKNq A1Y/3JSn4tbuYCc5D9c4LpvpfbUtRUzLxD1mZQPOVyd5GrUX0bFpaVbli Jn9mGI4USZB2KECIm8bTrAVLwd6ipmGrFj+Xs5SSB4oza32JzjFBXRFd9 Q0nIZ1j8q4GFhu5leUD7IE9Yobl0LQWintq+8plseNQp+4PusN0LdcLGY q0TJFqMbTwHDgZB2srcsJD2j1evAXRt3tY+kpOSeeLYWXbN+vPA+NnyZ9 A==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337166" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:40 +0800 IronPort-SDR: O9SRt+saAqLgDHShaOas6x+pMikj3ItfX+censO9tUP8h8zgESeB2OjxIwsnMxzKTc2RUAaxlm ivQAJl9WoUIPJ4DzXlhRHAdPWe0gPe6l4P/ksnDUnSgUHRRK+DMMsKD1zSEjRv6AO0UOg4cLsS dJdgLviSYwwsl/iJblkgCVhMKFHLMSMvT67hHzG9pP9avI04rteb8mWNkUrMOwST5u+MpzHWCu VEXTNZejUfw8oSpYkltZesQk92JeiVuiBL4/9tI9q7mu0xbP+kZrWlVGIrAfbomeNeYnrAlLoV qLwiEhCoko3twnc0NLvrZPeO Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:13 -0700 IronPort-SDR: asY75MFp5SRWiOJSL2Bm5MPfAXqwLntg82JigtqibesEhQJb90DnoWNM6FHF/7cyF33npn9CjW 
DzQs86slCGUHKSaU6IqlvLjJpLMr4AZrHrBEXhdjV5MrhgWZDxg3CYDUdZZe2tVjgl8EYqHVlH RiR+IXQUu3dTrpExlx1FUIkR/FoXiFmrOyroIrjX5Ur4Kw9PFC7XbIvQlkp5Mx22NRbMHOHOR0 Xm7VbNxYlDcq7++vDAfVznsJax6Cki46g5LjaRodoqjEhdPjGvHOGMbQbAHom2Pebke464zFfD iKA= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:40 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 08/11] btrfs: allow zoned RAID0 and 10 Date: Mon, 17 Oct 2022 04:55:26 -0700 Message-Id: <0c8a339cfd8d364b0d5c817637ba85ee0302503a.1666007330.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.h | 7 ++++++- fs/btrfs/zoned.c | 4 ++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 083e754f5239..d1885b428cd4 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -41,13 +41,18 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, if (!fs_info->stripe_root) return false; - // for now if (type != BTRFS_BLOCK_GROUP_DATA) return false; if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) return true; + if (profile & BTRFS_BLOCK_GROUP_RAID0) + return true; + + if (profile & BTRFS_BLOCK_GROUP_RAID10) + return true; + return false; } diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index f4ce39169468..3325e7761ef7 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1488,6 +1488,8 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) case BTRFS_BLOCK_GROUP_RAID1: case BTRFS_BLOCK_GROUP_RAID1C3: case BTRFS_BLOCK_GROUP_RAID1C4: + case BTRFS_BLOCK_GROUP_RAID0: + case BTRFS_BLOCK_GROUP_RAID10: if (map->type & BTRFS_BLOCK_GROUP_DATA && 
!fs_info->stripe_root) { btrfs_err(fs_info, @@ -1525,8 +1527,6 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) } cache->alloc_offset = alloc_offsets[0]; break; - case BTRFS_BLOCK_GROUP_RAID0: - case BTRFS_BLOCK_GROUP_RAID10: case BTRFS_BLOCK_GROUP_RAID5: case BTRFS_BLOCK_GROUP_RAID6: /* non-single profiles are not supported yet */ From patchwork Mon Oct 17 11:55:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008700 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD72EC43219 for ; Mon, 17 Oct 2022 11:55:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230361AbiJQLzt (ORCPT ); Mon, 17 Oct 2022 07:55:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230343AbiJQLzn (ORCPT ); Mon, 17 Oct 2022 07:55:43 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33DF6BF59 for ; Mon, 17 Oct 2022 04:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007741; x=1697543741; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RgEc14j4CKBgEtcQOf5bFGrH7RRtc3htMhZpJWlQ6OU=; b=HFfV4DAbbx+X2h8g6S/wROJ0c2CPuZZ2uswlj2/E7kU9R0OK58zkO0ra GY44ihlS68shiWWq2hX4PwRtaV6bo183VmLyRuySigVaemR5zlMH4V7yx x+55BI8lBzx1JSD739jkR5Z7LT2TdC9Ezi0kOETPVGIMtriWSetmJY8PR iug75DlExPv8ouDziZFgOAGf95C5Udx/U2+FrrZDuQwx0UopQGVbXp1+1 xCtMigXpi+9NL7rXlGOAcI3/x3MipSicSF6wnHM6zkRWrz75vzX6Y+eBD 
efIrxpIqJbxI05qfSBUX0CfJUbFetp2xIsUMhfcx1yzQ8iljiC7FCTG5v w==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337169" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:41 +0800 IronPort-SDR: UZxO54scKAF8pz0fkxLfy6Cvv3f63tasvoFaMHTFCncMUrg5T3sQuk/fIp391+Xl6e+9+q/6QP ZVRFseteU8UZ4fGrx8RUGHSFcHjrgmfErCjxdZjJh0PR2f95oXOCgn9+4wEDkmzBxHm2rV3LZK wbnvvs8D0JPDw20ZThDjLbyHFOiqeCm3lFDIObHQmjUGcbbD5pBToN2ZY7wWx0sMRRmROIgf4H 6D/nmohYQtqnIR8dKoaqD7hqFvVLKqr3SLewdmf7ZM4AgNXTlx6Nitza4JMMvQFBMc0OF2Nefz mysVdpmY2SJSboEEOkfy0512 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:14 -0700 IronPort-SDR: x32K2ZfjpgXdeyFPhjVkhpKR/4R8QuR8jTS6a61vFrTzqPsUfRNhT0uY1pd/MWT5VROA/IWr5u nd/KYENml0oGQ8AKxuPgOrYRxMbJlEAKCzcl1W1YWVjlXUCgNledYAqWM3scki4lxnd7GQ3csr i+mdNaP/4H9t43nORqSvOzSfPRBYIHkkTrdWuKaq9WsHFU0peqK07mq/yQkfTogf0CbPihd8ar ceCADyTpgAO6g34Y8fAksnAISUntjP8k1R8v8Gkeak2gCsu//T/xpbNgnagihrD2cf1RrWlUeB HVg= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:41 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 09/11] btrfs: fix striping with RST Date: Mon, 17 Oct 2022 04:55:27 -0700 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Signed-off-by: Johannes Thumshirn --- fs/btrfs/volumes.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index c67d76d93982..5169001b5bba 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6509,6 +6509,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, * I/O context structure. 
*/ if (smap && num_alloc_stripes == 1 && + !((map->type & BTRFS_BLOCK_GROUP_STRIPE_MASK) && op != BTRFS_MAP_READ) && !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1) && (!need_full_stripe(op) || !dev_replace_is_ongoing || !dev_replace->tgtdev)) { From patchwork Mon Oct 17 11:55:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008701 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82C82C4332F for ; Mon, 17 Oct 2022 11:55:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230359AbiJQLzu (ORCPT ); Mon, 17 Oct 2022 07:55:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44206 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230311AbiJQLzo (ORCPT ); Mon, 17 Oct 2022 07:55:44 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BDF227DFA for ; Mon, 17 Oct 2022 04:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007742; x=1697543742; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QwTr2YDafr/vlLux8dvXGGh9kFgdfwy+DSoCtaVbGuw=; b=Qc9ZpCvax6OFYu33FRP4np4HsJDGikPNNT00uvXC7SLbPbCU72BPzVRz PFrpynGQNClmjbVTDz9IVsPMoyGVJkqv5LC1tCINLOFo0OH688v19OMfI VqTEohJA60oXF6qSRLm0yi1/YmcDbvNDHGnA/+KkGBkkD7R/6T2vgHJlc ++Ll69AGEDacQJUo43y+kytZNQnJ2g2SKZhYGl+aWfHuLC/Fqs8L3Zu2s 3nXSyITphUhBOJX6YL0YBVgm5hOKQ2oEIAVRm1cki8nBLgZtHiVtIz3ec dxlsuOdwR0s3PlX3con0nKf6Od5FSOJlqtT7kCNCGr912N0MNiJVadMSB Q==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337171" Received: from 
h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:41 +0800 IronPort-SDR: 9sTjJgvAUL4yXtQosdPrEn12scpf2+SxL0k8dpHkUv3ngKYp/LzItnnSCIvBcPPR4cIer8JSrd +MDxy1QfONfh5mp90DZQt5nfCReCHSiKKeKYRLI1Gc8mP8aQREBYMPL5LPMy/394U9EwvSYmQq TVT5SJSWxuefQXWqeYleLa2GM/kYZyHG7C21D6Pj92mwlY1zfQgtjJKRFYga6Sj2J6PozGdxGt nu60uHwAaKhlsHwSI3xqdYKjBiezBxLBIES6lWKcc/RYdG4iKsBf6OCCldNJlp1hNPCyuGL/UJ KxUsUh+Jd+YvIyK0zVNqzAwV Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:14 -0700 IronPort-SDR: al41+Wb5hXUiajWCNNN0f5WM1iiLGsyVb4l9IzZZ8XmNn4lwIWez7Hz12u/oyXgVWm4Eu0P6nc i8s/4vAmp5jjfTLNSDAHZpAjpz0auB2yiM6X+BQ1NJC6m3XxjiugEolEluImr7CXeMPUEVGmy6 /bsrFNGp1tpA+79PUtG5JNJJX6HOKrsCTXrnkfrv7fZ9+WA7q8Seaf2gnJdrbn6gSu/xxZS7tw UOkbDek7s/dk9+o97OqMotGh9bW5Wtd5OmcfOgAHm4U2h0ajyYSjbmIB8UtCFvaDrGNjsD7cff ecY= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:42 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 10/11] btrfs: check for leaks of ordered stripes on umount Date: Mon, 17 Oct 2022 04:55:28 -0700 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Check whether we are leaking any ordered stripes when unmounting a filesystem with a stripe tree. This check is gated behind CONFIG_BTRFS_DEBUG so that it does not affect production systems.
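For illustration only, the leak check described above can be sketched in userspace C. This is not the kernel code: DEMO_DEBUG stands in for CONFIG_BTRFS_DEBUG, a singly linked list stands in for the rb-tree, locking is omitted, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for CONFIG_BTRFS_DEBUG; remove to compile the check out. */
#define DEMO_DEBUG 1

/* Hypothetical, simplified analogue of an ordered stripe entry. */
struct demo_stripe {
	unsigned long long logical;
	unsigned long long num_bytes;
	int ref;                     /* plain int instead of refcount_t */
	struct demo_stripe *next;    /* list link instead of an rb_node */
};

struct demo_fs {
	struct demo_stripe *head;    /* all stripes still tracked */
};

/* Track a new stripe with an initial reference count of 1. */
static struct demo_stripe *demo_add(struct demo_fs *fs,
				    unsigned long long logical,
				    unsigned long long num_bytes)
{
	struct demo_stripe *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->logical = logical;
	s->num_bytes = num_bytes;
	s->ref = 1;
	s->next = fs->head;
	fs->head = s;
	return s;
}

/* Drop one reference; unlink and free the stripe when it was the last one. */
static void demo_put(struct demo_fs *fs, struct demo_stripe *s)
{
	struct demo_stripe **pp;

	assert(s->ref > 0);
	if (--s->ref > 0)
		return;
	for (pp = &fs->head; *pp; pp = &(*pp)->next) {
		if (*pp == s) {
			*pp = s->next;
			break;
		}
	}
	free(s);
}

/*
 * Report every stripe that is still tracked, then drop all of its
 * references so it gets freed. Returns the number of leaked stripes,
 * or 0 when the check is compiled out.
 */
static int demo_check_leak(struct demo_fs *fs)
{
	int leaked = 0;
#ifdef DEMO_DEBUG
	while (fs->head) {
		struct demo_stripe *s = fs->head;

		fprintf(stderr, "stripe [%llu, %llu] leaked, refcount=%d\n",
			s->logical, s->logical + s->num_bytes, s->ref);
		while (s->ref > 1)
			demo_put(fs, s);
		demo_put(fs, s);     /* final put unlinks and frees s */
		leaked++;
	}
#endif
	return leaked;
}
```

A caller would invoke demo_check_leak() once during teardown, after all regular users have dropped their references; any entry still on the list at that point is reported and released.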
Signed-off-by: Johannes Thumshirn --- fs/btrfs/disk-io.c | 2 ++ fs/btrfs/raid-stripe-tree.c | 29 +++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 1 + 3 files changed, 32 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 190caabf5fb7..e479e9829c3e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -43,6 +43,7 @@ #include "space-info.h" #include "zoned.h" #include "subpage.h" +#include "raid-stripe-tree.h" #define BTRFS_SUPER_FLAG_SUPP (BTRFS_HEADER_FLAG_WRITTEN |\ BTRFS_HEADER_FLAG_RELOC |\ @@ -1480,6 +1481,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) btrfs_put_root(fs_info->stripe_root); btrfs_check_leaked_roots(fs_info); btrfs_extent_buffer_leak_debug_check(fs_info); + btrfs_check_ordered_stripe_leak(fs_info); kfree(fs_info->super_copy); kfree(fs_info->super_for_commit); kfree(fs_info->subpage_info); diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 91e67600e01a..9a913c4cd44e 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -30,6 +30,35 @@ static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb) return ordered_stripe_cmp(&stripe->logical, rbb); } +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info) +{ +#ifdef CONFIG_BTRFS_DEBUG + struct rb_node *node; + + if (!fs_info->stripe_root || + RB_EMPTY_ROOT(&fs_info->stripe_update_tree)) + return; + + mutex_lock(&fs_info->stripe_update_lock); + while ((node = rb_first_postorder(&fs_info->stripe_update_tree)) + != NULL) { + struct btrfs_ordered_stripe *stripe = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + + mutex_unlock(&fs_info->stripe_update_lock); + btrfs_err(fs_info, + "ordered_stripe [%llu, %llu] leaked, refcount=%d", + stripe->logical, stripe->logical + stripe->num_bytes, + refcount_read(&stripe->ref)); + while (refcount_read(&stripe->ref) > 1) + btrfs_put_ordered_stripe(fs_info, stripe); + btrfs_put_ordered_stripe(fs_info, stripe); + 
mutex_lock(&fs_info->stripe_update_lock); + } + mutex_unlock(&fs_info->stripe_update_lock); +#endif +} + int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) { struct btrfs_fs_info *fs_info = bioc->fs_info; diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index d1885b428cd4..5ffb10bf219e 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -31,6 +31,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe); +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info); static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, u64 map_type) From patchwork Mon Oct 17 11:55:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13008702 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54204C433FE for ; Mon, 17 Oct 2022 11:55:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230366AbiJQLzv (ORCPT ); Mon, 17 Oct 2022 07:55:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230312AbiJQLzo (ORCPT ); Mon, 17 Oct 2022 07:55:44 -0400 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BF1957E1D for ; Mon, 17 Oct 2022 04:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1666007742; x=1697543742; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=pDnieQdzQiAlgtdJ8yro6VF2VEXcovI8CAs8SLqFkig=; b=nJm+zivID0KGLu3j3jJIaLHgE+xKV+t//pbboYTFSgGpTC1z7baYOUfo FfFCGMR5SFfQOE4QSNi0vuoetjiqBGD2VcFGp+7tn1JF6jOBjGWHPMbR7 +I+1VphCX1HvJS/bXMzWypiyVFrphcn1D5p5C2x1CBSzjcKsYFY/l5QmJ S8lZIFrKNuWOdvKJ678i4OZmQx3j2Lkk8+c9UgB7wfCYZq+kFH7iGyGzn IJGtz/e60808kDghsWbLypwH8wcSVa+3XPllQMCoZfo/Sr6jYu+IVVcHE HeO0Zgn8vNkpLrt4mKQzBZ0SwvLaMPc2jFLkSGKrF57aNRyOlAGBZTSOo A==; X-IronPort-AV: E=Sophos;i="5.95,191,1661788800"; d="scan'208";a="212337173" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2022 19:55:42 +0800 IronPort-SDR: NCRJ+86HFbjd/31TzId74TNsCxYU7dGJJkUHnEtNNJ4vlNrHMZqtQmJGo/Jn+uiDqRPxmJ3Az/ HXg+MwHs3hIHjI9Xy/pwKhMwZcnWcQVg/VEi4bniKH5QzFuwp+jF8+BLJhM2jAxZxqwwdsQALF IdXYBNwkj/fr7D9Pii2hpw6eQWLLqL/wCz6uabXm+81wPpOTq01AKKFxdSdrVWfVQuCmEEw3FB prRFyXbYQkIDs7rLPXUSz+Ecbq5V76H9bMRPdXdfrxhtxE2wFmtGQiWT56urMZ2zhMM/YOofw3 hrTbRuMg2QK8kVTq42mWz/sX Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 17 Oct 2022 04:15:15 -0700 IronPort-SDR: FFzCy0PQ95x3x54P3ZxRvZ3Hx2rq1zLe5n14kE2/+z1rCOgM2l7zVweEbgroweV+ehDUhTQlKj NoXnb1iqjpr1yK8PuRGy2zTBnM/cFXA2VJE4v71AxR5FjcLQtlfE8D41T2oWvV3RqRY9UCutNG iRah0DhreRrqJoVaRJAZu0jdnlX822mKidg994QMwsvv0QtR1NZ/zPqqEM5FR4D5sBTMFo0eYx ggTMnBZHsrVLiZCHBsTaTkmEnN573+PuX+2t6yMU9T/pv432xl/dCIUCcFSpVDdxlyeZR05eGI zTg= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 17 Oct 2022 04:55:42 -0700 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [RFC v3 11/11] btrfs: add tracepoints for ordered stripes Date: Mon, 17 Oct 2022 04:55:29 -0700 Message-Id: X-Mailer: git-send-email 2.37.3 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: 
linux-btrfs@vger.kernel.org Add tracepoints to check the lifetime of btrfs_ordered_stripe entries. Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/raid-stripe-tree.c | 3 +++ fs/btrfs/super.c | 1 + include/trace/events/btrfs.h | 50 ++++++++++++++++++++++++++++++++++++ 3 files changed, 54 insertions(+) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 9a913c4cd44e..0d27b236445d 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -99,6 +99,7 @@ int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) return -EINVAL; } + trace_btrfs_ordered_stripe_add(fs_info, stripe); return 0; } @@ -114,6 +115,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *f if (node) { stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); refcount_inc(&stripe->ref); + trace_btrfs_ordered_stripe_lookup(fs_info, stripe); } mutex_unlock(&fs_info->stripe_update_lock); @@ -124,6 +126,7 @@ void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe) { mutex_lock(&fs_info->stripe_update_lock); + trace_btrfs_ordered_stripe_put(fs_info, stripe); if (refcount_dec_and_test(&stripe->ref)) { struct rb_node *node = &stripe->rb_node; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index eb0ae7e396ef..e071245ef0b4 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -49,6 +49,7 @@ #include "discard.h" #include "qgroup.h" #include "raid56.h" +#include "raid-stripe-tree.h" #define CREATE_TRACE_POINTS #include diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index ed50e81174bf..49510c687977 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -32,6 +32,7 @@ struct prelim_ref; struct btrfs_space_info; struct btrfs_raid_bio; struct raid56_bio_trace_info; +struct btrfs_ordered_stripe; #define show_ref_type(type) \ __print_symbolic(type, \ @@ -2414,6 +2415,55 @@ DEFINE_EVENT(btrfs_raid56_bio, 
raid56_scrub_read_recover, TP_ARGS(rbio, bio, trace_info) ); +DECLARE_EVENT_CLASS(btrfs__ordered_stripe, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe), + + TP_STRUCT__entry_btrfs( + __field( u64, logical ) + __field( u64, num_bytes ) + __field( int, num_stripes ) + __field( int, ref ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->logical = stripe->logical; + __entry->num_bytes = stripe->num_bytes; + __entry->num_stripes = stripe->num_stripes; + __entry->ref = refcount_read(&stripe->ref); + ), + + TP_printk_btrfs("logical=%llu, num_bytes=%llu, num_stripes=%d, ref=%d", + __entry->logical, __entry->num_bytes, + __entry->num_stripes, __entry->ref) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_add, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_lookup, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_put, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */