From patchwork Mon Sep 11 12:52:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380228
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo, Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 01/11] btrfs: add raid stripe tree definitions
Date: Mon, 11 Sep 2023 05:52:02 -0700
Message-ID: <20230911-raid-stripe-tree-v8-1-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Add definitions for the raid stripe tree. This tree will hold information
about the on-disk layout of the stripes in a RAID set.

Each stripe extent has a 1:1 relationship with an on-disk extent item and
is doing the logical to per-drive physical address translation for the
extent item in question.

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/accessors.h            | 10 ++++++++++
 fs/btrfs/locking.c              |  5 +++--
 include/uapi/linux/btrfs_tree.h | 33 +++++++++++++++++++++++++++++++--
 3 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h
index f958eccff477..977ff160a024 100644
--- a/fs/btrfs/accessors.h
+++ b/fs/btrfs/accessors.h
@@ -306,6 +306,16 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32);
 
+BTRFS_SETGET_FUNCS(stripe_extent_encoding, struct btrfs_stripe_extent, encoding, 8);
+BTRFS_SETGET_FUNCS(raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_FUNCS(raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+BTRFS_SETGET_FUNCS(raid_stride_length, struct btrfs_raid_stride, length, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_stripe_extent_encoding,
+			 struct btrfs_stripe_extent, encoding, 8);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_length, struct btrfs_raid_stride, length, 64);
+
 /* struct btrfs_dev_extent */
 BTRFS_SETGET_FUNCS(dev_extent_chunk_tree, struct btrfs_dev_extent, chunk_tree, 64);
 BTRFS_SETGET_FUNCS(dev_extent_chunk_objectid, struct btrfs_dev_extent,
diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index 6ac4fd8cc8dc..e7760d40feab 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -58,8 +58,8 @@
 
 static struct btrfs_lockdep_keyset {
 	u64 id; /* root objectid */
-	/* Longest entry: btrfs-block-group-00 */
-	char names[BTRFS_MAX_LEVEL][24];
+	/* Longest entry: btrfs-raid-stripe-tree-00 */
+	char names[BTRFS_MAX_LEVEL][25];
 	struct lock_class_key keys[BTRFS_MAX_LEVEL];
 } btrfs_lockdep_keysets[] = {
 	{ .id = BTRFS_ROOT_TREE_OBJECTID, DEFINE_NAME("root") },
@@ -74,6 +74,7 @@ static struct btrfs_lockdep_keyset {
 	{ .id = BTRFS_UUID_TREE_OBJECTID, DEFINE_NAME("uuid") },
 	{ .id = BTRFS_FREE_SPACE_TREE_OBJECTID, DEFINE_NAME("free-space") },
 	{ .id = BTRFS_BLOCK_GROUP_TREE_OBJECTID, DEFINE_NAME("block-group") },
+	{ .id = BTRFS_RAID_STRIPE_TREE_OBJECTID,DEFINE_NAME("raid-stripe-tree") },
 	{ .id = 0, DEFINE_NAME("tree") },
 };
 
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index fc3c32186d7e..3fb758ce3ac0 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -4,9 +4,8 @@
 
 #include
 #include
-#ifdef __KERNEL__
 #include
-#else
+#ifndef __KERNEL__
 #include
 #endif
 
@@ -73,6 +72,9 @@
 /* Holds the block group items for extent tree v2. */
 #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
 
+/* tracks RAID stripes in block groups. */
+#define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL
+
 /* device stats in the device tree */
 #define BTRFS_DEV_STATS_OBJECTID 0ULL
 
@@ -285,6 +287,8 @@
  */
 #define BTRFS_QGROUP_RELATION_KEY 246
 
+#define BTRFS_RAID_STRIPE_KEY 247
+
 /*
  * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY.
  */
@@ -719,6 +723,31 @@ struct btrfs_free_space_header {
 	__le64 num_bitmaps;
 } __attribute__ ((__packed__));
 
+struct btrfs_raid_stride {
+	/* btrfs device-id this raid extent lives on */
+	__le64 devid;
+	/* physical location on disk */
+	__le64 physical;
+	/* length of stride on this disk */
+	__le64 length;
+};
+
+#define BTRFS_STRIPE_DUP 0
+#define BTRFS_STRIPE_RAID0 1
+#define BTRFS_STRIPE_RAID1 2
+#define BTRFS_STRIPE_RAID1C3 3
+#define BTRFS_STRIPE_RAID1C4 4
+#define BTRFS_STRIPE_RAID5 5
+#define BTRFS_STRIPE_RAID6 6
+#define BTRFS_STRIPE_RAID10 7
+
+struct btrfs_stripe_extent {
+	__u8 encoding;
+	__u8 reserved[7];
+	/* array of raid strides this stripe is composed of */
+	__DECLARE_FLEX_ARRAY(struct btrfs_raid_stride, strides);
+};
+
 #define BTRFS_HEADER_FLAG_WRITTEN (1ULL << 0)
 #define BTRFS_HEADER_FLAG_RELOC (1ULL << 1)
 

From patchwork Mon Sep 11 12:52:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380220
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo, Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org, Anand Jain
Subject: [PATCH v8 02/11] btrfs: read raid-stripe-tree from disk
Date: Mon, 11 Sep 2023 05:52:03 -0700
Message-ID: <20230911-raid-stripe-tree-v8-2-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

If we find a raid-stripe-tree on mount, read it from disk.

Reviewed-by: Josef Bacik
Reviewed-by: Anand Jain
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/block-rsv.c       |  6 ++++++
 fs/btrfs/disk-io.c         | 18 ++++++++++++++++++
 fs/btrfs/disk-io.h         |  5 +++++
 fs/btrfs/fs.h              |  1 +
 include/uapi/linux/btrfs.h |  1 +
 5 files changed, 31 insertions(+)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 77684c5e0c8b..4e55e5f30f7f 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -354,6 +354,11 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 		min_items++;
 	}
 
+	if (btrfs_fs_incompat(fs_info, RAID_STRIPE_TREE)) {
+		num_bytes += btrfs_root_used(&fs_info->stripe_root->root_item);
+		min_items++;
+	}
+
 	/*
 	 * But we also want to reserve enough space so we can do the fallback
 	 * global reserve for an unlink, which is an additional
@@ -405,6 +410,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
 	case BTRFS_EXTENT_TREE_OBJECTID:
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 	case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
+	case BTRFS_RAID_STRIPE_TREE_OBJECTID:
 		root->block_rsv = &fs_info->delayed_refs_rsv;
 		break;
 	case BTRFS_ROOT_TREE_OBJECTID:
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 4c5d71065ea8..1ecebcfc1c17 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1179,6 +1179,8 @@ static struct btrfs_root *btrfs_get_global_root(struct btrfs_fs_info *fs_info,
 		return btrfs_grab_root(fs_info->block_group_root);
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 		return btrfs_grab_root(btrfs_global_root(fs_info, &key));
+	case BTRFS_RAID_STRIPE_TREE_OBJECTID:
+		return btrfs_grab_root(fs_info->stripe_root);
 	default:
 		return NULL;
 	}
@@ -1259,6 +1261,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 	btrfs_put_root(fs_info->fs_root);
 	btrfs_put_root(fs_info->data_reloc_root);
 	btrfs_put_root(fs_info->block_group_root);
+	btrfs_put_root(fs_info->stripe_root);
 	btrfs_check_leaked_roots(fs_info);
 	btrfs_extent_buffer_leak_debug_check(fs_info);
 	kfree(fs_info->super_copy);
@@ -1804,6 +1807,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
 	free_root_extent_buffers(info->fs_root);
 	free_root_extent_buffers(info->data_reloc_root);
 	free_root_extent_buffers(info->block_group_root);
+	free_root_extent_buffers(info->stripe_root);
 	if (free_chunk_root)
 		free_root_extent_buffers(info->chunk_root);
 }
@@ -2280,6 +2284,20 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 		fs_info->uuid_root = root;
 	}
 
+	if (btrfs_fs_incompat(fs_info, RAID_STRIPE_TREE)) {
+		location.objectid = BTRFS_RAID_STRIPE_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root)) {
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+				ret = PTR_ERR(root);
+				goto out;
+			}
+		} else {
+			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+			fs_info->stripe_root = root;
+		}
+	}
+
 	return 0;
 out:
 	btrfs_warn(fs_info, "failed to read root (objectid=%llu): %d",
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 02b645744a82..8b7f01a01c44 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -103,6 +103,11 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root)
 	return NULL;
 }
 
+static inline struct btrfs_root *btrfs_stripe_tree_root(struct btrfs_fs_info *fs_info)
+{
+	return fs_info->stripe_root;
+}
+
 void btrfs_put_root(struct btrfs_root *root);
 void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
 int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index d84a390336fc..5c7778e8b5ed 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -367,6 +367,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *uuid_root;
 	struct btrfs_root *data_reloc_root;
 	struct btrfs_root *block_group_root;
+	struct btrfs_root *stripe_root;
 
 	/* The log root tree is a directory of all the other log roots */
 	struct btrfs_root *log_root_tree;
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index dbb8b96da50d..b9a1d9af8ae8 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -333,6 +333,7 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_INCOMPAT_RAID1C34 (1ULL << 11)
 #define BTRFS_FEATURE_INCOMPAT_ZONED (1ULL << 12)
 #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 (1ULL << 13)
+#define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE (1ULL << 14)
 
 struct btrfs_ioctl_feature_flags {
 	__u64 compat_flags;

From patchwork Mon Sep 11 12:52:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380229
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo, Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 03/11] btrfs: add support for inserting raid stripe extents
Date: Mon, 11 Sep 2023 05:52:04 -0700
Message-ID: <20230911-raid-stripe-tree-v8-3-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Add support for inserting stripe extents into the raid stripe tree on
completion of every write that needs an extra logical-to-physical
translation when using RAID.

Inserting the stripe extents happens after the data I/O has completed,
this is done to a) support zone-append and b) rule out the possibility of
a RAID-write-hole.
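
For orientation, the item that each completed write adds can be modelled in
userspace. The following is only a sketch, assuming the layout from patch 01
and the key assignment in btrfs_insert_one_raid_extent(); the names
raid_stride, stripe_extent, stripe_key, stripe_extent_item_size(),
make_stripe_key() and build_stripe_extent() are hypothetical stand-ins, not
kernel API, and little-endian conversion of the on-disk fields is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirrors of the structures added in patch 01. */
struct raid_stride {
	uint64_t devid;		/* device this stride lives on */
	uint64_t physical;	/* byte offset on that device */
	uint64_t length;	/* bytes covered on that device */
};

struct stripe_extent {
	uint8_t encoding;
	uint8_t reserved[7];
	struct raid_stride strides[];	/* one entry per device */
};

/* Values copied from the patch: BTRFS_RAID_STRIPE_KEY, BTRFS_STRIPE_RAID1. */
#define RAID_STRIPE_KEY	247
#define STRIPE_RAID1	2

/* Simplified stand-in for struct btrfs_key. */
struct stripe_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Header plus one stride per stripe; no overflow check, unlike struct_size(). */
static size_t stripe_extent_item_size(unsigned int num_stripes)
{
	return sizeof(struct stripe_extent) +
	       (size_t)num_stripes * sizeof(struct raid_stride);
}

/* The patch keys each item by (logical start, BTRFS_RAID_STRIPE_KEY, size). */
static struct stripe_key make_stripe_key(uint64_t logical, uint64_t size)
{
	struct stripe_key key = {
		.objectid = logical,
		.type = RAID_STRIPE_KEY,
		.offset = size,
	};

	return key;
}

/* Build an in-memory item from n strides; the caller frees the result. */
static struct stripe_extent *build_stripe_extent(uint8_t encoding,
						 const struct raid_stride *src,
						 unsigned int n)
{
	struct stripe_extent *se = calloc(1, stripe_extent_item_size(n));

	if (!se)
		return NULL;

	se->encoding = encoding;
	memcpy(se->strides, src, n * sizeof(*src));
	return se;
}
```

With two mirrors (RAID1) this gives an 8 byte header plus two 24 byte
strides. Because the physical addresses are filled in only after the bios
complete, a zone-append write can report the location the device actually
chose before the item is built.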

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/Makefile           |   2 +-
 fs/btrfs/bio.c              |  23 ++++
 fs/btrfs/extent-tree.c      |   1 +
 fs/btrfs/inode.c            |   8 +-
 fs/btrfs/ordered-data.c     |   1 +
 fs/btrfs/ordered-data.h     |   2 +
 fs/btrfs/raid-stripe-tree.c | 266 ++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/raid-stripe-tree.h |  34 ++++++
 fs/btrfs/volumes.c          |   4 +-
 fs/btrfs/volumes.h          |  15 ++-
 10 files changed, 347 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index c57d80729d4f..525af975f61c 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -33,7 +33,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
 	   subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \
-	   lru_cache.o
+	   lru_cache.o raid-stripe-tree.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_REF_VERIFY) += ref-verify.o
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 31ff36990404..ddbe6f8d4ea2 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -14,6 +14,7 @@
 #include "rcu-string.h"
 #include "zoned.h"
 #include "file-item.h"
+#include "raid-stripe-tree.h"
 
 static struct bio_set btrfs_bioset;
 static struct bio_set btrfs_clone_bioset;
@@ -415,6 +416,9 @@ static void btrfs_orig_write_end_io(struct bio *bio)
 	else
 		bio->bi_status = BLK_STS_OK;
 
+	if (bio_op(bio) == REQ_OP_ZONE_APPEND && !bio->bi_status)
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+
 	btrfs_orig_bbio_end_io(bbio);
 	btrfs_put_bioc(bioc);
 }
@@ -426,6 +430,8 @@ static void btrfs_clone_write_end_io(struct bio *bio)
 	if (bio->bi_status) {
 		atomic_inc(&stripe->bioc->error);
 		btrfs_log_dev_io_error(bio, stripe->dev);
+	} else if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
 	}
 
 	/* Pass on control to the original bio this one was cloned from */
@@ -487,6 +493,7 @@ static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr)
 	bio->bi_private = &bioc->stripes[dev_nr];
 	bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT;
 	bioc->stripes[dev_nr].bioc = bioc;
+	bioc->size = bio->bi_iter.bi_size;
 	btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio);
 }
 
@@ -496,6 +503,8 @@ static void __btrfs_submit_bio(struct bio *bio, struct btrfs_io_context *bioc,
 	if (!bioc) {
 		/* Single mirror read/write fast path. */
 		btrfs_bio(bio)->mirror_num = mirror_num;
+		if (bio_op(bio) != REQ_OP_READ)
+			btrfs_bio(bio)->orig_physical = smap->physical;
 		bio->bi_iter.bi_sector = smap->physical >> SECTOR_SHIFT;
 		if (bio_op(bio) != REQ_OP_READ)
 			btrfs_bio(bio)->orig_physical = smap->physical;
@@ -688,6 +697,20 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
 			bio->bi_opf |= REQ_OP_ZONE_APPEND;
 		}
 
+		if (is_data_bbio(bbio) && bioc &&
+		    btrfs_need_stripe_tree_update(bioc->fs_info,
+						  bioc->map_type)) {
+			/*
+			 * No locking for the list update, as we only add to
+			 * the list in the I/O submission path, and list
+			 * iteration only happens in the completion path,
+			 * which can't happen until after the last submission.
+			 */
+			btrfs_get_bioc(bioc);
+			list_add_tail(&bioc->ordered_entry,
+				      &bbio->ordered->bioc_list);
+		}
+
 		/*
 		 * Csum items for reloc roots have already been cloned at this
 		 * point, so they are handled as part of the no-checksum case.
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 6f6838226fe7..2e11a699ab77 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -42,6 +42,7 @@
 #include "file-item.h"
 #include "orphan.h"
 #include "tree-checker.h"
+#include "raid-stripe-tree.h"
 
 #undef SCRAMBLE_DELAYED_REFS
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index bafca05940d7..6f71630248da 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -71,6 +71,7 @@
 #include "super.h"
 #include "orphan.h"
 #include "backref.h"
+#include "raid-stripe-tree.h"
 
 struct btrfs_iget_args {
 	u64 ino;
@@ -3091,6 +3092,10 @@ int btrfs_finish_one_ordered(struct btrfs_ordered_extent *ordered_extent)
 
 	trans->block_rsv = &inode->block_rsv;
 
+	ret = btrfs_insert_raid_extent(trans, ordered_extent);
+	if (ret)
+		goto out;
+
 	if (test_bit(BTRFS_ORDERED_COMPRESSED, &ordered_extent->flags))
 		compress_type = ordered_extent->compress_type;
 	if (test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) {
@@ -3224,7 +3229,8 @@ int btrfs_finish_one_ordered(struct btrfs_ordered_extent *ordered_extent)
 int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered)
 {
 	if (btrfs_is_zoned(btrfs_sb(ordered->inode->i_sb)) &&
-	    !test_bit(BTRFS_ORDERED_IOERR, &ordered->flags))
+	    !test_bit(BTRFS_ORDERED_IOERR, &ordered->flags) &&
+	    list_empty(&ordered->bioc_list))
 		btrfs_finish_ordered_zoned(ordered);
 	return btrfs_finish_one_ordered(ordered);
 }
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 345c449d588c..55c7d5543265 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -191,6 +191,7 @@ static struct btrfs_ordered_extent *alloc_ordered_extent(
 	INIT_LIST_HEAD(&entry->log_list);
 	INIT_LIST_HEAD(&entry->root_extent_list);
 	INIT_LIST_HEAD(&entry->work_list);
+	INIT_LIST_HEAD(&entry->bioc_list);
 	init_completion(&entry->completion);
 
 	/*
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 173bd5c5df26..1c51ac57e5df 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -151,6 +151,8 @@ struct btrfs_ordered_extent {
 	struct completion completion;
 	struct btrfs_work flush_work;
 	struct list_head work_list;
+
+	struct list_head bioc_list;
 };
 
 static inline void
diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
new file mode 100644
index 000000000000..2415698a8fef
--- /dev/null
+++ b/fs/btrfs/raid-stripe-tree.c
@@ -0,0 +1,266 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 Western Digital Corporation or its affiliates.
+ */
+
+#include
+
+#include "ctree.h"
+#include "fs.h"
+#include "accessors.h"
+#include "transaction.h"
+#include "disk-io.h"
+#include "raid-stripe-tree.h"
+#include "volumes.h"
+#include "misc.h"
+#include "print-tree.h"
+
+static u8 btrfs_bg_type_to_raid_encoding(u64 map_type)
+{
+	switch (map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case BTRFS_BLOCK_GROUP_DUP:
+		return BTRFS_STRIPE_DUP;
+	case BTRFS_BLOCK_GROUP_RAID0:
+		return BTRFS_STRIPE_RAID0;
+	case BTRFS_BLOCK_GROUP_RAID1:
+		return BTRFS_STRIPE_RAID1;
+	case BTRFS_BLOCK_GROUP_RAID1C3:
+		return BTRFS_STRIPE_RAID1C3;
+	case BTRFS_BLOCK_GROUP_RAID1C4:
+		return BTRFS_STRIPE_RAID1C4;
+	case BTRFS_BLOCK_GROUP_RAID5:
+		return BTRFS_STRIPE_RAID5;
+	case BTRFS_BLOCK_GROUP_RAID6:
+		return BTRFS_STRIPE_RAID6;
+	case BTRFS_BLOCK_GROUP_RAID10:
+		return BTRFS_STRIPE_RAID10;
+	default:
+		ASSERT(0);
+	}
+}
+
+static int btrfs_insert_one_raid_extent(struct btrfs_trans_handle *trans,
+				 int num_stripes,
+				 struct btrfs_io_context *bioc)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	struct btrfs_key stripe_key;
+	struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info);
+	u8 encoding = btrfs_bg_type_to_raid_encoding(bioc->map_type);
+	struct btrfs_stripe_extent *stripe_extent;
+	size_t item_size;
+	int ret;
+
+	item_size = struct_size(stripe_extent, strides, num_stripes);
+
+	stripe_extent = kzalloc(item_size, GFP_NOFS);
+	if (!stripe_extent) {
+		btrfs_abort_transaction(trans, -ENOMEM);
+		btrfs_end_transaction(trans);
+		return -ENOMEM;
+	}
+
+	btrfs_set_stack_stripe_extent_encoding(stripe_extent, encoding);
+	for (int i = 0; i < num_stripes; i++) {
+		u64 devid = bioc->stripes[i].dev->devid;
+		u64 physical = bioc->stripes[i].physical;
+		u64 length = bioc->stripes[i].length;
+		struct btrfs_raid_stride *raid_stride =
+						&stripe_extent->strides[i];
+
+		if (length == 0)
+			length = bioc->size;
+
+		btrfs_set_stack_raid_stride_devid(raid_stride, devid);
+		btrfs_set_stack_raid_stride_physical(raid_stride, physical);
+		btrfs_set_stack_raid_stride_length(raid_stride, length);
+	}
+
+	stripe_key.objectid = bioc->logical;
+	stripe_key.type = BTRFS_RAID_STRIPE_KEY;
+	stripe_key.offset = bioc->size;
+
+	ret = btrfs_insert_item(trans, stripe_root, &stripe_key, stripe_extent,
+				item_size);
+	if (ret)
+		btrfs_abort_transaction(trans, ret);
+
+	kfree(stripe_extent);
+
+	return ret;
+}
+
+static int btrfs_insert_mirrored_raid_extents(struct btrfs_trans_handle *trans,
+				      struct btrfs_ordered_extent *ordered,
+				      u64 map_type)
+{
+	int num_stripes = btrfs_bg_type_to_factor(map_type);
+	struct btrfs_io_context *bioc;
+	int ret;
+
+	list_for_each_entry(bioc, &ordered->bioc_list, ordered_entry) {
+		ret = btrfs_insert_one_raid_extent(trans, num_stripes, bioc);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int btrfs_insert_striped_mirrored_raid_extents(
+				      struct btrfs_trans_handle *trans,
+				      struct btrfs_ordered_extent *ordered,
+				      u64 map_type)
+{
+	struct btrfs_io_context *bioc;
+	struct btrfs_io_context *rbioc;
+	const int nstripes = list_count_nodes(&ordered->bioc_list);
+	const int index = btrfs_bg_flags_to_raid_index(map_type);
+	const int substripes = btrfs_raid_array[index].sub_stripes;
+	const int max_stripes = trans->fs_info->fs_devices->rw_devices / 2;
+	int left = nstripes;
+	int stripe = 0, j = 0;
+	int i = 0;
+	int ret = 0;
+	u64 stripe_end;
+	u64 prev_end;
+
+	if (nstripes == 1)
+		return btrfs_insert_mirrored_raid_extents(trans, ordered, map_type);
+
+	rbioc = kzalloc(struct_size(rbioc, stripes, nstripes * substripes),
+			GFP_KERNEL);
+	if (!rbioc)
+		return -ENOMEM;
+
+	rbioc->map_type = map_type;
+	rbioc->logical = list_first_entry(&ordered->bioc_list, typeof(*rbioc),
+					   ordered_entry)->logical;
+
+	stripe_end = rbioc->logical;
+	prev_end = stripe_end;
+	list_for_each_entry(bioc, &ordered->bioc_list, ordered_entry) {
+
+		rbioc->size += bioc->size;
+		for (j = 0; j < substripes; j++) {
+			stripe = i + j;
+			rbioc->stripes[stripe].dev = bioc->stripes[j].dev;
+			rbioc->stripes[stripe].physical = bioc->stripes[j].physical;
+			rbioc->stripes[stripe].length = bioc->size;
+		}
+
+		stripe_end += rbioc->size;
+		if (i >= nstripes ||
+		    (stripe_end - prev_end >= max_stripes * BTRFS_STRIPE_LEN)) {
+			ret = btrfs_insert_one_raid_extent(trans,
+							   nstripes * substripes,
+							   rbioc);
+			if (ret)
+				goto out;
+
+			left -= nstripes;
+			i = 0;
+			rbioc->logical += rbioc->size;
+			rbioc->size = 0;
+		} else {
+			i += substripes;
+			prev_end = stripe_end;
+		}
+	}
+
+	if (left) {
+		bioc = list_prev_entry(bioc, ordered_entry);
+		ret = btrfs_insert_one_raid_extent(trans, substripes, bioc);
+	}
+
+out:
+	kfree(rbioc);
+	return ret;
+}
+
+static int btrfs_insert_striped_raid_extents(struct btrfs_trans_handle *trans,
+				     struct btrfs_ordered_extent *ordered,
+				     u64 map_type)
+{
+	struct btrfs_io_context *bioc;
+	struct btrfs_io_context *rbioc;
+	const int nstripes = list_count_nodes(&ordered->bioc_list);
+	int i = 0;
+	int ret = 0;
+
+	rbioc = kzalloc(struct_size(rbioc, stripes, nstripes), GFP_KERNEL);
+	if (!rbioc)
+		return -ENOMEM;
+	rbioc->map_type = map_type;
+	rbioc->logical = list_first_entry(&ordered->bioc_list, typeof(*rbioc),
+					   ordered_entry)->logical;
+
+	list_for_each_entry(bioc, &ordered->bioc_list, ordered_entry) {
+		rbioc->size += bioc->size;
+		rbioc->stripes[i].dev = bioc->stripes[0].dev;
+		rbioc->stripes[i].physical = bioc->stripes[0].physical;
+		rbioc->stripes[i].length = bioc->size;
+
+		if (i == nstripes - 1) {
+			ret = btrfs_insert_one_raid_extent(trans, nstripes, rbioc);
+			if (ret)
+				goto out;
+
+			i = 0;
+			rbioc->logical += rbioc->size;
+			rbioc->size = 0;
+		} else {
+			i++;
+		}
+	}
+
+	if (i && i < nstripes - 1)
+		ret = btrfs_insert_one_raid_extent(trans, i, rbioc);
+
+out:
+	kfree(rbioc);
+	return ret;
+}
+
+int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans,
+			     struct btrfs_ordered_extent *ordered_extent)
+{
+	struct btrfs_io_context *bioc;
+	u64 map_type;
+	int ret;
+
+	if (!trans->fs_info->stripe_root)
+		return 0;
+
+	map_type = list_first_entry(&ordered_extent->bioc_list, typeof(*bioc),
+				    ordered_entry)->map_type;
+
+	switch (map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	case BTRFS_BLOCK_GROUP_DUP:
+	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID1C3:
+	case BTRFS_BLOCK_GROUP_RAID1C4:
+		ret = btrfs_insert_mirrored_raid_extents(trans, ordered_extent,
+							 map_type);
+		break;
+	case BTRFS_BLOCK_GROUP_RAID0:
+		ret = btrfs_insert_striped_raid_extents(trans, ordered_extent,
+							map_type);
+		break;
+	case BTRFS_BLOCK_GROUP_RAID10:
+		ret = btrfs_insert_striped_mirrored_raid_extents(trans, ordered_extent, map_type);
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	while (!list_empty(&ordered_extent->bioc_list)) {
+		bioc = list_first_entry(&ordered_extent->bioc_list,
+					typeof(*bioc), ordered_entry);
+		list_del(&bioc->ordered_entry);
+		btrfs_put_bioc(bioc);
+	}
+
+	return ret;
+}
diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h
new file mode 100644
index 000000000000..f36e4c2d46b0
--- /dev/null
+++ b/fs/btrfs/raid-stripe-tree.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Western Digital Corporation or its affiliates.
+ */
+
+#ifndef BTRFS_RAID_STRIPE_TREE_H
+#define BTRFS_RAID_STRIPE_TREE_H
+
+#include "disk-io.h"
+
+struct btrfs_io_context;
+struct btrfs_io_stripe;
+
+int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans,
+			     struct btrfs_ordered_extent *ordered_extent);
+
+static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info,
+						 u64 map_type)
+{
+	u64 type = map_type & BTRFS_BLOCK_GROUP_TYPE_MASK;
+	u64 profile = map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK;
+
+	if (!btrfs_stripe_tree_root(fs_info))
+		return false;
+
+	if (type != BTRFS_BLOCK_GROUP_DATA)
+		return false;
+
+	if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK)
+		return true;
+
+	return false;
+}
+#endif
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 871a55d36e32..0c0fd4eb4848 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5881,6 +5881,7 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 }
 
 static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_info,
+						       u64 logical,
 						       u16 total_stripes)
 {
 	struct btrfs_io_context *bioc;
@@ -5900,6 +5901,7 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_
 	bioc->fs_info = fs_info;
 	bioc->replace_stripe_src = -1;
 	bioc->full_stripe_logical = (u64)-1;
+	bioc->logical = logical;
 
 	return bioc;
 }
@@ -6434,7 +6436,7 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 		goto out;
 	}
 
-	bioc = alloc_btrfs_io_context(fs_info, num_alloc_stripes);
+	bioc = alloc_btrfs_io_context(fs_info, logical, num_alloc_stripes);
 	if (!bioc) {
 		ret = -ENOMEM;
 		goto out;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 576bfcb5b764..8604bfbbf510 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -390,12 +390,11 @@ struct btrfs_fs_devices {
 
 struct btrfs_io_stripe {
 	struct btrfs_device *dev;
-	union {
-		/* Block mapping */
-		u64 physical;
-		/* For the endio handler */
-		struct btrfs_io_context *bioc;
-	};
+	/* Block mapping */
+	u64 physical;
+	u64 length;
+	/* For
the endio handler */ + struct btrfs_io_context *bioc; }; struct btrfs_discard_stripe { @@ -428,6 +427,10 @@ struct btrfs_io_context { atomic_t error; u16 max_errors; + u64 logical; + u64 size; + struct list_head ordered_entry; + /* * The total number of stripes, including the extra duplicated * stripe for replace. From patchwork Mon Sep 11 12:52:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13380235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A69B5CA0EC3 for ; Mon, 11 Sep 2023 22:07:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1356165AbjIKWDE (ORCPT ); Mon, 11 Sep 2023 18:03:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59626 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237466AbjIKMwh (ORCPT ); Mon, 11 Sep 2023 08:52:37 -0400 Received: from esa5.hgst.iphmx.com (esa5.hgst.iphmx.com [216.71.153.144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 43B7ECEB; Mon, 11 Sep 2023 05:52:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1694436753; x=1725972753; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cE+UPzVtXzF+bHxJkcZdCTF60ml4yxdyGi02OCjQOuI=; b=FE6YfLi5wayFhQzNpoYeNXT+qNi34BY6BgHJHEHbEGcW+nXkX+dArov5 B1rQ7jVbcgPjUmOD0vpwXNQOHK8BiMpi00L6yAMRixtfEziuCih7R9ULb aCw00osomak4AW9zhpHGhgmNWzueYAhxD1bWqCdSGRBAnnD1FPf+Eh4NQ AcafDxiYEHjDTdn8TWyDDZAXDxv7iXxTq8zMmC1XIY+FgibbSvyFeXavQ o9eehrr/RKyK+R82iS5C5zi+8c1BzM1g/3Rn+zdDviUexmsTEYQj20ivE iM2tK+ObJMZeoqvgY7Tl9kAPNIS1QQoXovykoA1Ojn3/hk/3m8zRVmNml Q==; X-CSE-ConnectionGUID: 
QE+c5Mf9S7Ot5r94N5aiGA== X-CSE-MsgGUID: SC5pS0z4Qd+dU5/hOKvhHg== X-IronPort-AV: E=Sophos;i="6.02,244,1688400000"; d="scan'208";a="243594391" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 11 Sep 2023 20:52:33 +0800 IronPort-SDR: V27pBxFFsKDgNgl0RZOKJXHd6hZvW2wTf5KnjPrEuxJaHWKDncyy7V7r6iTUrPZCuDPHIh6gkS AtHw+MtAxQaQfEobLnGu+CoEb8x/p3ncBhGaAvavDqiq0dvf4PxTZlntsoGVghtwBDHBLrUAOW t/VXqRQ+43HsB1Ye4TsiXt+2dcCWalS0f/bTADSEy9BEhL6BrJtqXjQffe8l7onRSf3xNItiGK h9l+gEvwPMlC7W2GdMcu3sJhMZrENzVgbubG/FZGgzsKbdgPO7P0veMKIfy0yI3QMo6MG+ITf5 lXg= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 11 Sep 2023 04:59:38 -0700 IronPort-SDR: b0sxTakIdX+U2kkYMpn6AA3uc/Dl4KC/NlPw48XyyKrSKPXXVE7ibDuQ+n6ABs2mNuR3TS5k+B zVFb5PYiyvZm0IZhrx2sQJKWj/BfUAjhziHBPm/+ud0gBjMHQfQ4BQs4yQj0uVkioHI8TQ/KxU vEgEtI70v9/n8NX5oTZ45SxDRYlIoiblZjfH42WyLjdcsBnaP4b32YBd62vTAUCWTEedasEUqL QWj+fy5wNPv8OppHe3n6bOEHQZi71mte+NdRdyIdDeVqDS5XmBx6Ys/UVQ8FuA/R8SLMFv4die Y3w= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.6]) by uls-op-cesaip02.wdc.com with ESMTP; 11 Sep 2023 05:52:29 -0700 From: Johannes Thumshirn To: Chris Mason , Josef Bacik , David Sterba Cc: Johannes Thumshirn , Christoph Hellwig , Naohiro Aota , Qu Wenruo , Damien Le Moal , linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v8 04/11] btrfs: delete stripe extent on extent deletion Date: Mon, 11 Sep 2023 05:52:05 -0700 Message-ID: <20230911-raid-stripe-tree-v8-4-647676fa852c@wdc.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> MIME-Version: 1.0 X-Mailer: b4 0.12.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1694436627; l=3110; i=johannes.thumshirn@wdc.com; s=20230613; h=from:subject:message-id; 
bh=cE+UPzVtXzF+bHxJkcZdCTF60ml4yxdyGi02OCjQOuI=; b=NvUPztervxuijkZ/xcqpNHmK/91BLMtrbrcV2Mx/zouu8TkB1jznaW3mX5GOfGkGkixB4rF8f 35LIibF5YohBRtd5ZOFKWtpxbtD+4/YIGUHymPSpcklN3yuYLz8MIZ0 X-Developer-Key: i=johannes.thumshirn@wdc.com; a=ed25519; pk=TGmHKs78FdPi+QhrViEvjKIGwReUGCfa+3LEnGoR2KM= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org As each stripe extent is tied to an extent item, delete the stripe extent once the corresponding extent item is deleted. Signed-off-by: Johannes Thumshirn --- fs/btrfs/extent-tree.c | 6 +++++ fs/btrfs/raid-stripe-tree.c | 60 +++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 2 ++ 3 files changed, 68 insertions(+) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 2e11a699ab77..c64dd3fd4463 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2857,6 +2857,12 @@ static int do_free_extent_accounting(struct btrfs_trans_handle *trans, btrfs_abort_transaction(trans, ret); return ret; } + + ret = btrfs_delete_raid_extent(trans, bytenr, num_bytes); + if (ret) { + btrfs_abort_transaction(trans, ret); + return ret; + } } ret = add_to_free_space_tree(trans, bytenr, num_bytes); diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 2415698a8fef..5b12f40877b5 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -15,6 +15,66 @@ #include "misc.h" #include "print-tree.h" +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_path *path; + struct btrfs_key key; + struct extent_buffer *leaf; + u64 found_start; + u64 found_end; + u64 end = start + length; + int slot; + int ret; + + if (!stripe_root) + return 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + while (1) { + + key.objectid = start; + key.type = BTRFS_RAID_STRIPE_KEY; + key.offset = 
length; + + ret = btrfs_search_slot(trans, stripe_root, &key, path, -1, 1); + if (ret < 0) + break; + if (ret > 0) { + ret = 0; + if (path->slots[0] == 0) + break; + path->slots[0]--; + } + + leaf = path->nodes[0]; + slot = path->slots[0]; + btrfs_item_key_to_cpu(leaf, &key, slot); + found_start = key.objectid; + found_end = found_start + key.offset; + + /* That stripe ends before we start, we're done */ + if (found_end <= start) + break; + + ASSERT(found_start >= start && found_end <= end); + ret = btrfs_del_item(trans, stripe_root, path); + if (ret) + break; + + btrfs_release_path(path); + } + + btrfs_free_path(path); + return ret; + +} + static u8 btrfs_bg_type_to_raid_encoding(u64 map_type) { switch (map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index f36e4c2d46b0..7560dc501a65 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -11,6 +11,8 @@ struct btrfs_io_context; struct btrfs_io_stripe; +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_extent *ordered_extent); From patchwork Mon Sep 11 12:52:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13380216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98095CA0EC6 for ; Mon, 11 Sep 2023 22:06:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355596AbjIKWBV (ORCPT ); Mon, 11 Sep 2023 18:01:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237470AbjIKMwj (ORCPT 
); Mon, 11 Sep 2023 08:52:39 -0400 Received: from esa5.hgst.iphmx.com (esa5.hgst.iphmx.com [216.71.153.144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3BDE1E40; Mon, 11 Sep 2023 05:52:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1694436755; x=1725972755; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aRvvWdjtbEYuAkkq1cMd1cz7TCumuqP5svgnqRV+38s=; b=Sh0DWtz5l23SfbrsD6HhkR9TBe7w5scn1fQgDg8TtW1PtqGhBvjRk8sc 0Z/h05SSUwFqCwWNipDmARXFgbxA9kC/yA43L6rOkdHAxDrEz+tl3djoS dWOg0Z/MMHVH18ih/ILxSG9SMz7kLRDqLYYhqCPUDhE7CQctRYs85Am9V HuF/cNh6yUjslUGs0KKGw+xXA4GDKOQmokoEUVLjbSec8vjiSCP0BIalT SYpANv8PsovTWtukNAIiyLFGsI6fqOEOH80IPn7pRXecKNOLbwxwRgPaz hq51YcJD+ghMNN2DMOhEx8OzAFNP5P6D1tcyTTzqGzq+OfbLifS1qYcLV A==; X-CSE-ConnectionGUID: Z3Qz3oS6Q9OdOqaKk+MeMA== X-CSE-MsgGUID: XgUAcItHSIqEe7z66pEuaQ== X-IronPort-AV: E=Sophos;i="6.02,244,1688400000"; d="scan'208";a="243594396" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 11 Sep 2023 20:52:35 +0800 IronPort-SDR: GrBQCx/Jp6BJOeyol65iZ+lhJuyECy1PkNptiuTv5yBk5wAtOdfv6sREDl9nmIMrlXzV4zJOmm Y0al2fUdko7T96lgMOKAT50Fabc+AUW5GnsZ0lgh9Ge3oWCsQJLPHTtmHMRkchl9vSRlGmgBCY xr+cWeWvw97xHczElWuawtJKOt5hfRRydj3SkRuTtwVX/npckgjfa1k2rXsjnvWXifJe0Hq2kR KpU9biygObfp+WNJPNY+ZvSP1pKQK8STnS1GX+NAPA/N8W27hcGl5BID0FHPmafSLU8rwZhHyx Jv8= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 11 Sep 2023 05:05:19 -0700 IronPort-SDR: NiTBUCr4z4SFaBV5dhYJ7XEQXGJyI+GoN2Z9Dxs+OLwGxNu6mA1H3OulWg3/0Dlpti20JpKLYM VhUpS7NIvTy8yI3hbaZWVnoAn6p5iPIcBcrFXdiTDZ6dceZRMBX9cEQ9/G0Fh93III9Ooqe1oX vbFI4I4tUr+PB72Scn22H5Hcs2iQtzysNyNjCk9/9YfHGhc7JuGptiGVBjFuyOACjotGf3LK3N ubnH0man6OyPH5F+8f/0KY+I4Dv8tZVQz6953vG/88CtK9Y1qDkRzRCV6Mf54zjiV4vl7JrkFV oR8= WDCIronportException: Internal Received: from 
unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.6]) by uls-op-cesaip02.wdc.com with ESMTP; 11 Sep 2023 05:52:31 -0700 From: Johannes Thumshirn To: Chris Mason , Josef Bacik , David Sterba Cc: Johannes Thumshirn , Christoph Hellwig , Naohiro Aota , Qu Wenruo , Damien Le Moal , linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v8 05/11] btrfs: lookup physical address from stripe extent Date: Mon, 11 Sep 2023 05:52:06 -0700 Message-ID: <20230911-raid-stripe-tree-v8-5-647676fa852c@wdc.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> MIME-Version: 1.0 X-Mailer: b4 0.12.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1694436627; l=8982; i=johannes.thumshirn@wdc.com; s=20230613; h=from:subject:message-id; bh=aRvvWdjtbEYuAkkq1cMd1cz7TCumuqP5svgnqRV+38s=; b=OmZOx7p6p8oaDM6P9eM436W7XiJ8t7EFfYjke1U2ftBx4v/hl9xBnN/d0Iu4MuZSiJwoAfR3V rhrgeadB5dyDT35rUEj7AN8UchEhngIaXrIbRzdHPYa+F3UL1GFQI6T X-Developer-Key: i=johannes.thumshirn@wdc.com; a=ed25519; pk=TGmHKs78FdPi+QhrViEvjKIGwReUGCfa+3LEnGoR2KM= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Look up the physical address in the raid stripe tree when a read is attempted on a RAID volume formatted with the raid stripe tree. If the requested logical address is not found in the stripe tree, it may still be in the in-memory ordered stripe tree, so fall back to searching the ordered stripe tree in this case.
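For illustration only, the lookup can be sketched in userspace C. This is not the kernel code: `stride_model`, `stripe_extent_model` and `lookup_physical()` are hypothetical names, a plain array stands in for the on-disk tree, and only the mirrored case (every stride covers the whole extent) is modelled. The sketch finds the extent covering a logical address, trims the request length at the extent's end, and returns the physical address on the requested device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one stride: where a copy lives on one device. */
struct stride_model {
	uint64_t devid;
	uint64_t physical;
};

/* Hypothetical stand-in for a stripe extent keyed by (logical, length). */
struct stripe_extent_model {
	uint64_t logical;
	uint64_t length;
	size_t num_strides;
	const struct stride_model *strides;
};

/*
 * Find the extent covering @logical, clamp *@length so the request does not
 * cross the extent's end, and store the physical address for @devid.
 * Returns 0 on success, -1 if no extent or no matching device is found.
 */
static int lookup_physical(const struct stripe_extent_model *extents,
			   size_t num_extents, uint64_t logical,
			   uint64_t *length, uint64_t devid,
			   uint64_t *physical)
{
	assert(length && physical);

	for (size_t i = 0; i < num_extents; i++) {
		const struct stripe_extent_model *e = &extents[i];
		uint64_t end = logical + *length;
		uint64_t found_end = e->logical + e->length;

		/* Skip extents that do not contain the start address. */
		if (logical < e->logical || logical >= found_end)
			continue;

		/* The request runs past this extent: the caller must split. */
		if (end > found_end)
			*length -= end - found_end;

		for (size_t j = 0; j < e->num_strides; j++) {
			if (e->strides[j].devid != devid)
				continue;
			*physical = e->strides[j].physical +
				    (logical - e->logical);
			return 0;
		}
		return -1;
	}
	return -1;
}
```

A caller that gets a shortened length back is expected to issue a second lookup for the remainder, which mirrors the "split the bio" comment in the patch.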
Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.c | 159 ++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 11 +++ fs/btrfs/volumes.c | 37 ++++++++--- 3 files changed, 198 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 5b12f40877b5..7ed02e4b79ec 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -324,3 +324,162 @@ int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, return ret; } + +static bool btrfs_check_for_extent(struct btrfs_fs_info *fs_info, u64 logical, + u64 length, struct btrfs_path *path) +{ + struct btrfs_root *extent_root = btrfs_extent_root(fs_info, logical); + struct btrfs_key key; + int ret; + + btrfs_release_path(path); + + key.objectid = logical; + key.type = BTRFS_EXTENT_ITEM_KEY; + key.offset = length; + + ret = btrfs_search_slot(NULL, extent_root, &key, path, 0, 0); + + return ret; +} + +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + u32 stripe_index, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_stripe_extent *stripe_extent; + struct btrfs_key stripe_key; + struct btrfs_key found_key; + struct btrfs_path *path; + struct extent_buffer *leaf; + int num_stripes; + u8 encoding; + u64 offset; + u64 found_logical; + u64 found_length; + u64 end; + u64 found_end; + int slot; + int ret; + int i; + + stripe_key.objectid = logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0); + if (ret < 0) + goto free_path; + if (ret) { + if (path->slots[0] != 0) + path->slots[0]--; + } + + end = logical + *length; + + while (1) { + leaf = path->nodes[0]; + slot = path->slots[0]; + + btrfs_item_key_to_cpu(leaf, &found_key, slot); + found_logical = 
found_key.objectid; + found_length = found_key.offset; + found_end = found_logical + found_length; + + if (found_logical > end) { + ret = -ENOENT; + goto out; + } + + if (in_range(logical, found_logical, found_length)) + break; + + ret = btrfs_next_item(stripe_root, path); + if (ret) + goto out; + } + + offset = logical - found_logical; + + /* + * If we have a logically contiguous, but physically non-contiguous + * range, we need to split the bio. Record the length after which we + * must split the bio. + */ + if (end > found_end) + *length -= end - found_end; + + num_stripes = btrfs_num_raid_stripes(btrfs_item_size(leaf, slot)); + stripe_extent = btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); + encoding = btrfs_stripe_extent_encoding(leaf, stripe_extent); + + if (encoding != btrfs_bg_type_to_raid_encoding(map_type)) { + ret = -ENOENT; + goto out; + } + + for (i = 0; i < num_stripes; i++) { + struct btrfs_raid_stride *stride = &stripe_extent->strides[i]; + u64 devid = btrfs_raid_stride_devid(leaf, stride); + u64 len = btrfs_raid_stride_length(leaf, stride); + u64 physical = btrfs_raid_stride_physical(leaf, stride); + + if (offset >= len) { + offset -= len; + + if (offset >= BTRFS_STRIPE_LEN) + continue; + } + + if (devid != stripe->dev->devid) + continue; + + if ((map_type & BTRFS_BLOCK_GROUP_DUP) && stripe_index != i) + continue; + + stripe->physical = physical + offset; + + ret = 0; + goto free_path; + } + + /* + * If we're here, we haven't found the requested devid in the stripe. + */ + ret = -ENOENT; +out: + if (ret > 0) + ret = -ENOENT; + if (ret && ret != -EIO) { + /* + * Check if the range we're looking for is actually backed by + * an extent. This can happen, e.g. when scrub is running on a + * block-group and the extent it is trying to scrub gets + * deleted in the meantime. Although scrub is setting the + * block-group to read-only, deletion of extents is still + * allowed. If the extent is gone, simply return ENOENT and be + * good. 
+ */ + if (btrfs_check_for_extent(fs_info, logical, *length, path)) { + ret = -ENOENT; + goto free_path; + } + + if (IS_ENABLED(CONFIG_BTRFS_DEBUG)) + btrfs_print_tree(leaf, 1); + btrfs_err(fs_info, + "cannot find raid-stripe for logical [%llu, %llu] devid %llu, profile %s", + logical, logical + *length, stripe->dev->devid, + btrfs_bg_type_to_raid_name(map_type)); + } +free_path: + btrfs_free_path(path); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 7560dc501a65..40aa553ae8aa 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -13,6 +13,10 @@ struct btrfs_io_stripe; int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length); +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + u32 stripe_index, + struct btrfs_io_stripe *stripe); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_extent *ordered_extent); @@ -33,4 +37,11 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, return false; } + +static inline int btrfs_num_raid_stripes(u32 item_size) +{ + return (item_size - offsetof(struct btrfs_stripe_extent, strides)) / + sizeof(struct btrfs_raid_stride); +} + #endif diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 0c0fd4eb4848..7c25f5c77788 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -35,6 +35,7 @@ #include "relocation.h" #include "scrub.h" #include "super.h" +#include "raid-stripe-tree.h" #define BTRFS_BLOCK_GROUP_STRIPE_MASK (BTRFS_BLOCK_GROUP_RAID0 | \ BTRFS_BLOCK_GROUP_RAID10 | \ @@ -6206,12 +6207,22 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op, return U64_MAX; } -static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map, - u32 stripe_index, u64 stripe_offset, u32 stripe_nr) +static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, + u64 
logical, u64 *length, struct btrfs_io_stripe *dst, + struct map_lookup *map, u32 stripe_index, + u64 stripe_offset, u64 stripe_nr) { dst->dev = map->stripes[stripe_index].dev; + + if (op == BTRFS_MAP_READ && + btrfs_need_stripe_tree_update(fs_info, map->type)) + return btrfs_get_raid_extent_offset(fs_info, logical, length, + map->type, stripe_index, + dst); + dst->physical = map->stripes[stripe_index].physical + stripe_offset + btrfs_stripe_nr_to_offset(stripe_nr); + return 0; } /* @@ -6428,11 +6439,11 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, */ if (smap && num_alloc_stripes == 1 && !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1)) { - set_io_stripe(smap, map, stripe_index, stripe_offset, stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, smap, map, + stripe_index, stripe_offset, stripe_nr); if (mirror_num_ret) *mirror_num_ret = mirror_num; *bioc_ret = NULL; - ret = 0; goto out; } @@ -6463,21 +6474,29 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, bioc->full_stripe_logical = em->start + btrfs_stripe_nr_to_offset(stripe_nr * data_stripes); for (i = 0; i < num_stripes; i++) - set_io_stripe(&bioc->stripes[i], map, - (i + stripe_nr) % num_stripes, - stripe_offset, stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, + (i + stripe_nr) % num_stripes, + stripe_offset, stripe_nr); } else { /* * For all other non-RAID56 profiles, just copy the target * stripe into the bioc. 
*/ for (i = 0; i < num_stripes; i++) { - set_io_stripe(&bioc->stripes[i], map, stripe_index, - stripe_offset, stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, stripe_index, + stripe_offset, stripe_nr); stripe_index++; } } + if (ret) { + *bioc_ret = NULL; + btrfs_put_bioc(bioc); + goto out; + } + if (op != BTRFS_MAP_READ) max_errors = btrfs_chunk_max_errors(map); From patchwork Mon Sep 11 12:52:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13380222 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B563CA0EC3 for ; Mon, 11 Sep 2023 22:06:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355772AbjIKWB7 (ORCPT ); Mon, 11 Sep 2023 18:01:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237469AbjIKMwi (ORCPT ); Mon, 11 Sep 2023 08:52:38 -0400 Received: from esa5.hgst.iphmx.com (esa5.hgst.iphmx.com [216.71.153.144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89229CEB; Mon, 11 Sep 2023 05:52:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1694436755; x=1725972755; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SgfsEwHqQ+ql6isG68bwoNb5Engbd+pFiR0bLi/DpuM=; b=a34uboaSmQ34Jz4FyPTUH/HExCoTnow7o/tTdbOvcnyuoTCRtiv4wDgR BiqkaGpkRiTYp+ERkMUH9aRGrWhzRA2DQLV3zm1L+6sBOShE+z0EQirta fJHFmenyDxTckFpmkZ+Ry8sqi1KSkHnMEYrKg9gkiJ9MLP/uiGh33I+4C hsRlwVocG66cQ5hEw7VGKejqdMhVXv3R969nFaS7D0OJU4E0YqdUlVVA3 4NBb6wVl+NbsGutd5IkfqhR+vx4tQ2FAw0J2U4Bx5MT7dC2+fkmLztial 
uyemMoJsOgNE1KobNBQLh/BCNDK3WceBnekl3EwfIYS5wp9Cwx2giOoWS A==; X-CSE-ConnectionGUID: PBcZDTIOSVChKFl5tPSYEQ== X-CSE-MsgGUID: Mvsx96uiQE6WXe74ZhrXLA== X-IronPort-AV: E=Sophos;i="6.02,244,1688400000"; d="scan'208";a="243594394" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 11 Sep 2023 20:52:34 +0800 IronPort-SDR: eSWnZUnb2K0nMsHhNpM/M4iUVV3O1x4EQWt4g+HEZ0lN0BJGlF5B3dYLcp+NjUhXigbPREFJUH eLXbqRMEW7fp8DBiYxLeChM+LMdjoms2ey4EIGGzCUeTKazIdjrNMucM66MobAEscxvSbwSbfL qfuk/cNXJYrGdaAmS4/vyrwC8ec9VH6ujE6oIGzIzSbxqt70gJGq0Wd686ofaZO6NceLJZJxZd i1+G73NSd+E/VgrTeQrTolyPZyNaUa2yKNIPb8XBYH2R4MUTZUfiug2TQ1PiSiky2hSiX21XfM yjU= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 11 Sep 2023 04:59:40 -0700 IronPort-SDR: xPqOOXg2YEhy3QmJe+LzSdhGE/g+wr0Jd/5ZoG5qm/eiqKY78YGZCJmFchqNMwH4GWaim3/IGt x/va9oC5kV9XpewqriTTav2HLtM9N7EQCznHrf1WegrtoSTPjXEq0/PhNSySHsFFvXJExDQ99h Z2lqNnJc8Y3t5N/UyEB710E9NwKEU5M0gxgLM7mtXGrf4AHBYHB5AMBhTOMo3k/hyPSbOeZebN /uWHZUAnNq5G7MeyUXAjlBlT8fgzGuXiR1Ryp4zlM4HFEQNB3STK4KoabaI7jcCS+/ripSvQeA KTg= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.6]) by uls-op-cesaip02.wdc.com with ESMTP; 11 Sep 2023 05:52:33 -0700 From: Johannes Thumshirn To: Chris Mason , Josef Bacik , David Sterba Cc: Johannes Thumshirn , Christoph Hellwig , Naohiro Aota , Qu Wenruo , Damien Le Moal , linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v8 06/11] btrfs: implement RST version of scrub Date: Mon, 11 Sep 2023 05:52:07 -0700 Message-ID: <20230911-raid-stripe-tree-v8-6-647676fa852c@wdc.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com> MIME-Version: 1.0 X-Mailer: b4 0.12.3 X-Developer-Signature: v=1; a=ed25519-sha256; t=1694436627; l=3130; 
i=johannes.thumshirn@wdc.com; s=20230613; h=from:subject:message-id; bh=SgfsEwHqQ+ql6isG68bwoNb5Engbd+pFiR0bLi/DpuM=; b=EKY4ohiJyLZgwMkU/s0o/u+g9U+um3lGuU1U+aGuCCalzHbC/jCRCD+wNk8fNEmhL/gq7E+sX ASfFlegZa0lAKAGShGU1GdP2965g87ksYgNPqakDUoRmm94YyXRm61o X-Developer-Key: i=johannes.thumshirn@wdc.com; a=ed25519; pk=TGmHKs78FdPi+QhrViEvjKIGwReUGCfa+3LEnGoR2KM= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org A filesystem that uses the RAID stripe tree for logical-to-physical address translation can't use the regular scrub path, which reads all stripes and only afterwards checks whether a sector is unused. With the RAID stripe tree, this results in lookup errors, as the stripe tree has no entries for logical addresses that are not backed by an extent. Instead, read only the sectors that are set in the stripe's extent sector bitmap. Signed-off-by: Johannes Thumshirn --- fs/btrfs/scrub.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index f16220ce5fba..5101e0a3f83e 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -23,6 +23,7 @@ #include "accessors.h" #include "file-item.h" #include "scrub.h" +#include "raid-stripe-tree.h" /* * This is only the first step towards a full-features scrub. It reads all @@ -1634,6 +1635,56 @@ static void scrub_reset_stripe(struct scrub_stripe *stripe) } } +static void scrub_submit_extent_sector_read(struct scrub_ctx *sctx, + struct scrub_stripe *stripe) +{ + struct btrfs_fs_info *fs_info = stripe->bg->fs_info; + struct btrfs_bio *bbio = NULL; + int mirror = stripe->mirror_num; + int i; + + atomic_inc(&stripe->pending_io); + + for_each_set_bit(i, &stripe->extent_sector_bitmap, stripe->nr_sectors) { + struct page *page; + int pgoff; + + page = scrub_stripe_get_page(stripe, i); + pgoff = scrub_stripe_get_page_offset(stripe, i); + + /* The current sector cannot be merged, submit the bio. 
		 */
+		if (bbio &&
+		    ((i > 0 && !test_bit(i - 1, &stripe->extent_sector_bitmap)) ||
+		     bbio->bio.bi_iter.bi_size >= BTRFS_STRIPE_LEN)) {
+			ASSERT(bbio->bio.bi_iter.bi_size);
+			atomic_inc(&stripe->pending_io);
+			btrfs_submit_bio(bbio, mirror);
+			bbio = NULL;
+		}
+
+		if (!bbio) {
+			bbio = btrfs_bio_alloc(stripe->nr_sectors, REQ_OP_READ,
+				fs_info, scrub_read_endio, stripe);
+			bbio->bio.bi_iter.bi_sector = (stripe->logical +
+				(i << fs_info->sectorsize_bits)) >> SECTOR_SHIFT;
+		}
+
+		__bio_add_page(&bbio->bio, page, fs_info->sectorsize, pgoff);
+	}
+
+	if (bbio) {
+		ASSERT(bbio->bio.bi_iter.bi_size);
+		atomic_inc(&stripe->pending_io);
+		btrfs_submit_bio(bbio, mirror);
+	}
+
+	if (atomic_dec_and_test(&stripe->pending_io)) {
+		wake_up(&stripe->io_wait);
+		INIT_WORK(&stripe->work, scrub_stripe_read_repair_worker);
+		queue_work(stripe->bg->fs_info->scrub_workers, &stripe->work);
+	}
+}
+
 static void scrub_submit_initial_read(struct scrub_ctx *sctx,
 				      struct scrub_stripe *stripe)
 {
@@ -1645,6 +1696,11 @@ static void scrub_submit_initial_read(struct scrub_ctx *sctx,
 	ASSERT(stripe->mirror_num > 0);
 	ASSERT(test_bit(SCRUB_STRIPE_FLAG_INITIALIZED, &stripe->state));
 
+	if (btrfs_need_stripe_tree_update(fs_info, stripe->bg->flags)) {
+		scrub_submit_extent_sector_read(sctx, stripe);
+		return;
+	}
+
 	bbio = btrfs_bio_alloc(SCRUB_STRIPE_PAGES, REQ_OP_READ, fs_info,
 			       scrub_read_endio, stripe);

From patchwork Mon Sep 11 12:52:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380233
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo,
 Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 07/11] btrfs: zoned: allow zoned RAID
Date: Mon, 11 Sep 2023 05:52:08 -0700
Message-ID: <20230911-raid-stripe-tree-v8-7-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

When we have a raid-stripe-tree, we can do RAID0/1/10 on zoned devices for
data block-groups. For meta-data block-groups, we don't actually need
anything special, as all meta-data I/O is protected by the
btrfs_zoned_meta_io_lock() already.
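The way this patch aggregates per-device zone information for the striped profiles can be sketched in userspace C. This is only a simplified illustration, not the kernel code: `struct bg_zone_summary` and `summarize_striped()` are hypothetical names, and the sketch leaves out the WP_MISSING_DEV/WP_CONVENTIONAL skips and the zone activation handling that the real loop performs. It assumes `sub_stripes` is 1 for RAID0 and the number of copies per stripe for RAID10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-block-group result. */
struct bg_zone_summary {
	uint64_t zone_capacity;
	uint64_t alloc_offset;
};

/*
 * Sum capacity and write-pointer offset over the stripes of a striped
 * profile. Only the first device of each mirror group is counted, which
 * corresponds to the "(i % map->sub_stripes) == 0" check in the RAID10
 * case of the patch. With sub_stripes == 1 every device is counted, as
 * in the RAID0 case.
 */
static struct bg_zone_summary
summarize_striped(const uint64_t *caps, const uint64_t *alloc_offsets,
		  size_t num_stripes, size_t sub_stripes)
{
	struct bg_zone_summary sum = { 0, 0 };

	assert(sub_stripes > 0);

	for (size_t i = 0; i < num_stripes; i++) {
		if (i % sub_stripes != 0)
			continue;
		sum.zone_capacity += caps[i];
		sum.alloc_offset += alloc_offsets[i];
	}
	return sum;
}
```

For four devices, calling the helper with `sub_stripes` set to 1 sums all four zones, while setting it to 2 sums only devices 0 and 2, so mirrored copies are not counted twice.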

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/raid-stripe-tree.h | 7 ++-
 fs/btrfs/volumes.c | 2 +
 fs/btrfs/zoned.c | 113 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 119 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h
index 40aa553ae8aa..30c7d5981890 100644
--- a/fs/btrfs/raid-stripe-tree.h
+++ b/fs/btrfs/raid-stripe-tree.h
@@ -8,6 +8,11 @@
 
 #include "disk-io.h"
 
+#define BTRFS_RST_SUPP_BLOCK_GROUP_MASK		(BTRFS_BLOCK_GROUP_DUP |\
+						 BTRFS_BLOCK_GROUP_RAID1_MASK |\
+						 BTRFS_BLOCK_GROUP_RAID0 |\
+						 BTRFS_BLOCK_GROUP_RAID10)
+
 struct btrfs_io_context;
 struct btrfs_io_stripe;
 
@@ -32,7 +37,7 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info,
 	if (type != BTRFS_BLOCK_GROUP_DATA)
 		return false;
 
-	if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK)
+	if (profile & BTRFS_RST_SUPP_BLOCK_GROUP_MASK)
 		return true;
 
 	return false;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7c25f5c77788..9f17e5f290f4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6438,6 +6438,8 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
 	 * I/O context structure.
 	 */
 	if (smap && num_alloc_stripes == 1 &&
+	    !(btrfs_need_stripe_tree_update(fs_info, map->type) &&
+	      op != BTRFS_MAP_READ) &&
 	    !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1)) {
 		ret = set_io_stripe(fs_info, op, logical, length, smap, map,
 				    stripe_index, stripe_offset, stripe_nr);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index c6eedf4bfba9..4ca36875058c 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1481,8 +1481,9 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 			set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags);
 		break;
 	case BTRFS_BLOCK_GROUP_DUP:
-		if (map->type & BTRFS_BLOCK_GROUP_DATA) {
-			btrfs_err(fs_info, "zoned: profile DUP not yet supported on data bg");
+		if (map->type & BTRFS_BLOCK_GROUP_DATA &&
+		    !btrfs_stripe_tree_root(fs_info)) {
+			btrfs_err(fs_info, "zoned: data DUP profile needs stripe_root");
 			ret = -EINVAL;
 			goto out;
 		}
@@ -1520,8 +1521,116 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 		cache->zone_capacity = min(caps[0], caps[1]);
 		break;
 	case BTRFS_BLOCK_GROUP_RAID1:
+	case BTRFS_BLOCK_GROUP_RAID1C3:
+	case BTRFS_BLOCK_GROUP_RAID1C4:
+		if (map->type & BTRFS_BLOCK_GROUP_DATA &&
+		    !btrfs_stripe_tree_root(fs_info)) {
+			btrfs_err(fs_info,
+				  "zoned: data %s needs stripe_root",
+				  btrfs_bg_type_to_raid_name(map->type));
+			ret = -EIO;
+			goto out;
+
+		}
+
+		for (i = 0; i < map->num_stripes; i++) {
+			if (alloc_offsets[i] == WP_MISSING_DEV ||
+			    alloc_offsets[i] == WP_CONVENTIONAL)
+				continue;
+
+			if ((alloc_offsets[0] != alloc_offsets[i]) &&
+			    !btrfs_test_opt(fs_info, DEGRADED)) {
+				btrfs_err(fs_info,
+					  "zoned: write pointer offset mismatch of zones in %s profile",
+					  btrfs_bg_type_to_raid_name(map->type));
+				ret = -EIO;
+				goto out;
+			}
+			if (test_bit(0, active) != test_bit(i, active)) {
+				if (!btrfs_test_opt(fs_info, DEGRADED) &&
+				    !btrfs_zone_activate(cache)) {
+					ret = -EIO;
+					goto out;
+				}
+			} else {
+				if (test_bit(0, active))
+					set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE,
+						&cache->runtime_flags);
+			}
+			/*
+			 * In case a device is missing we have a cap of 0, so don't
+			 * use it.
+			 */
+			cache->zone_capacity = min_not_zero(caps[0], caps[i]);
+		}
+
+		if (alloc_offsets[0] != WP_MISSING_DEV)
+			cache->alloc_offset = alloc_offsets[0];
+		else
+			cache->alloc_offset = alloc_offsets[i - 1];
+		break;
 	case BTRFS_BLOCK_GROUP_RAID0:
+		if (map->type & BTRFS_BLOCK_GROUP_DATA &&
+		    !btrfs_stripe_tree_root(fs_info)) {
+			btrfs_err(fs_info,
+				  "zoned: data %s needs stripe_root",
+				  btrfs_bg_type_to_raid_name(map->type));
+			ret = -EIO;
+			goto out;
+
+		}
+		for (i = 0; i < map->num_stripes; i++) {
+			if (alloc_offsets[i] == WP_MISSING_DEV ||
+			    alloc_offsets[i] == WP_CONVENTIONAL)
+				continue;
+
+			if (test_bit(0, active) != test_bit(i, active)) {
+				if (!btrfs_zone_activate(cache)) {
+					ret = -EIO;
+					goto out;
+				}
+			} else {
+				if (test_bit(0, active))
+					set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE,
+						&cache->runtime_flags);
+			}
+			cache->zone_capacity += caps[i];
+			cache->alloc_offset += alloc_offsets[i];
+
+		}
+		break;
 	case BTRFS_BLOCK_GROUP_RAID10:
+		if (map->type & BTRFS_BLOCK_GROUP_DATA &&
+		    !btrfs_stripe_tree_root(fs_info)) {
+			btrfs_err(fs_info,
+				  "zoned: data %s needs stripe_root",
+				  btrfs_bg_type_to_raid_name(map->type));
+			ret = -EIO;
+			goto out;
+
+		}
+		for (i = 0; i < map->num_stripes; i++) {
+			if (alloc_offsets[i] == WP_MISSING_DEV ||
+			    alloc_offsets[i] == WP_CONVENTIONAL)
+				continue;
+
+			if (test_bit(0, active) != test_bit(i, active)) {
+				if (!btrfs_zone_activate(cache)) {
+					ret = -EIO;
+					goto out;
+				}
+			} else {
+				if (test_bit(0, active))
+					set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE,
+						&cache->runtime_flags);
+			}
+			if ((i % map->sub_stripes) == 0) {
+				cache->zone_capacity += caps[i];
+				cache->alloc_offset += alloc_offsets[i];
+			}
+
+		}
+		break;
 	case BTRFS_BLOCK_GROUP_RAID5:
 	case BTRFS_BLOCK_GROUP_RAID6:
 		/* non-single profiles are not supported yet */

From patchwork Mon Sep 11 12:52:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380227
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo,
 Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 08/11] btrfs: add raid stripe tree pretty printer
Date: Mon, 11 Sep 2023 05:52:09 -0700
Message-ID: <20230911-raid-stripe-tree-v8-8-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Decode raid-stripe-tree entries on btrfs_print_tree().

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/print-tree.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index 0c93439e929f..f01919e4bb37 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -9,6 +9,7 @@
 #include "print-tree.h"
 #include "accessors.h"
 #include "tree-checker.h"
+#include "raid-stripe-tree.h"
 
 struct root_name_map {
 	u64 id;
@@ -28,6 +29,7 @@ static const struct root_name_map root_map[] = {
 	{ BTRFS_FREE_SPACE_TREE_OBJECTID,	"FREE_SPACE_TREE"	},
 	{ BTRFS_BLOCK_GROUP_TREE_OBJECTID,	"BLOCK_GROUP_TREE"	},
 	{ BTRFS_DATA_RELOC_TREE_OBJECTID,	"DATA_RELOC_TREE"	},
+	{ BTRFS_RAID_STRIPE_TREE_OBJECTID,	"RAID_STRIPE_TREE"	},
 };
 
 const char *btrfs_root_name(const struct btrfs_key *key, char *buf)
@@ -189,6 +191,48 @@ static void print_uuid_item(const struct extent_buffer *l, unsigned long offset,
 	}
 }
 
+struct raid_encoding_map {
+	u8 encoding;
+	char name[16];
+};
+
+static const struct raid_encoding_map raid_map[] = {
+	{ BTRFS_STRIPE_DUP,	"DUP" },
+	{ BTRFS_STRIPE_RAID0,	"RAID0" },
+	{ BTRFS_STRIPE_RAID1,	"RAID1" },
+	{ BTRFS_STRIPE_RAID1C3,	"RAID1C3" },
+	{ BTRFS_STRIPE_RAID1C4,	"RAID1C4" },
+	{ BTRFS_STRIPE_RAID5,	"RAID5" },
+	{ BTRFS_STRIPE_RAID6,	"RAID6" },
+	{ BTRFS_STRIPE_RAID10,	"RAID10" }
+};
+
+static const char *stripe_encoding_name(u8 encoding)
+{
+	for (int i = 0; i < ARRAY_SIZE(raid_map); i++) {
+		if (raid_map[i].encoding == encoding)
+			return raid_map[i].name;
+	}
+
+	return "UNKNOWN";
+}
+
+static void print_raid_stripe_key(const struct extent_buffer *eb, u32 item_size,
+				  struct btrfs_stripe_extent *stripe)
+{
+	int num_stripes = btrfs_num_raid_stripes(item_size);
+	u8 encoding = btrfs_stripe_extent_encoding(eb, stripe);
+	int i;
+
+	pr_info("\t\t\tencoding: %s\n", stripe_encoding_name(encoding));
+
+	for (i = 0; i < num_stripes; i++)
+		pr_info("\t\t\tstride %d devid %llu physical %llu length %llu\n",
+			i, btrfs_raid_stride_devid(eb, &stripe->strides[i]),
+			btrfs_raid_stride_physical(eb, &stripe->strides[i]),
+			btrfs_raid_stride_length(eb, &stripe->strides[i]));
+}
+
 /*
  * Helper to output refs and locking status of extent buffer. Useful to debug
  * race condition related problems.
@@ -349,6 +393,11 @@ void btrfs_print_leaf(const struct extent_buffer *l)
 			print_uuid_item(l, btrfs_item_ptr_offset(l, i),
 					btrfs_item_size(l, i));
 			break;
+		case BTRFS_RAID_STRIPE_KEY:
+			print_raid_stripe_key(l, btrfs_item_size(l, i),
+				btrfs_item_ptr(l, i,
+					       struct btrfs_stripe_extent));
+			break;
 		}
 	}
 }

From patchwork Mon Sep 11 12:52:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380219
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo,
 Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 09/11] btrfs: announce presence of raid-stripe-tree in sysfs
Date: Mon, 11 Sep 2023 05:52:10 -0700
Message-ID: <20230911-raid-stripe-tree-v8-9-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

If a filesystem with a raid-stripe-tree is mounted, show the RST feature
in sysfs.

Reviewed-by: Josef Bacik
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/sysfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index b1d1ac25237b..1bab3d7d251e 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -297,6 +297,8 @@ BTRFS_FEAT_ATTR_INCOMPAT(zoned, ZONED);
 #ifdef CONFIG_BTRFS_DEBUG
 /* Remove once support for extent tree v2 is feature complete */
 BTRFS_FEAT_ATTR_INCOMPAT(extent_tree_v2, EXTENT_TREE_V2);
+/* Remove once support for raid stripe tree is feature complete */
+BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE);
 #endif
 #ifdef CONFIG_FS_VERITY
 BTRFS_FEAT_ATTR_COMPAT_RO(verity, VERITY);
@@ -327,6 +329,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = {
 #endif
 #ifdef CONFIG_BTRFS_DEBUG
 	BTRFS_FEAT_ATTR_PTR(extent_tree_v2),
+	BTRFS_FEAT_ATTR_PTR(raid_stripe_tree),
 #endif
 #ifdef CONFIG_FS_VERITY
 	BTRFS_FEAT_ATTR_PTR(verity),

From patchwork Mon Sep 11 12:52:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380232
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo,
 Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 10/11] btrfs: add trace events for RST
Date: Mon, 11 Sep 2023 05:52:11 -0700
Message-ID: <20230911-raid-stripe-tree-v8-10-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Add trace events for raid-stripe-tree operations.

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/raid-stripe-tree.c | 8 +++++
 include/trace/events/btrfs.h | 75 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+)

diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
index 7ed02e4b79ec..5a9952cf557c 100644
--- a/fs/btrfs/raid-stripe-tree.c
+++ b/fs/btrfs/raid-stripe-tree.c
@@ -62,6 +62,9 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start,
 		if (found_end <= start)
 			break;
 
+		trace_btrfs_raid_extent_delete(fs_info, start, end,
+					       found_start, found_end);
+
 		ASSERT(found_start >= start && found_end <= end);
 		ret = btrfs_del_item(trans, stripe_root, path);
 		if (ret)
@@ -120,6 +123,8 @@ static int btrfs_insert_one_raid_extent(struct btrfs_trans_handle *trans,
 		return -ENOMEM;
 	}
 
+	trace_btrfs_insert_one_raid_extent(fs_info, bioc->logical, bioc->size,
+					   num_stripes);
 	btrfs_set_stack_stripe_extent_encoding(stripe_extent, encoding);
 	for (int i = 0; i < num_stripes; i++) {
 		u64 devid = bioc->stripes[i].dev->devid;
@@ -445,6 +450,9 @@ int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info,
 
 		stripe->physical = physical + offset;
 
+		trace_btrfs_get_raid_extent_offset(fs_info, logical, *length,
+						   stripe->physical, devid);
+
 		ret = 0;
 		goto free_path;
 	}
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index b2db2c2f1c57..e2c6f1199212 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -2497,6 +2497,81 @@ DEFINE_EVENT(btrfs_raid56_bio, raid56_write,
 	TP_ARGS(rbio, bio, trace_info)
 );
 
+TRACE_EVENT(btrfs_insert_one_raid_extent,
+
+	TP_PROTO(struct btrfs_fs_info *fs_info, u64 logical, u64 length,
+		 int num_stripes),
+
+	TP_ARGS(fs_info, logical, length, num_stripes),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	logical		)
+		__field(	u64,	length		)
+		__field(	int,	num_stripes	)
+	),
+
+	TP_fast_assign_btrfs(fs_info,
+		__entry->logical	= logical;
+		__entry->length		= length;
+		__entry->num_stripes	= num_stripes;
+	),
+
+	TP_printk_btrfs("logical=%llu, length=%llu, num_stripes=%d",
+			__entry->logical, __entry->length,
+			__entry->num_stripes)
+);
+
+TRACE_EVENT(btrfs_raid_extent_delete,
+
+	TP_PROTO(struct btrfs_fs_info *fs_info, u64 start, u64 end,
+		 u64 found_start, u64 found_end),
+
+	TP_ARGS(fs_info, start, end, found_start, found_end),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	start		)
+		__field(	u64,	end		)
+		__field(	u64,	found_start	)
+		__field(	u64,	found_end	)
+	),
+
+	TP_fast_assign_btrfs(fs_info,
+		__entry->start		= start;
+		__entry->end		= end;
+		__entry->found_start	= found_start;
+		__entry->found_end	= found_end;
+	),
+
+	TP_printk_btrfs("start=%llu, end=%llu, found_start=%llu, found_end=%llu",
+			__entry->start, __entry->end, __entry->found_start,
+			__entry->found_end)
+);
+
+TRACE_EVENT(btrfs_get_raid_extent_offset,
+
+	TP_PROTO(struct btrfs_fs_info *fs_info, u64 logical, u64 length,
+		 u64 physical, u64 devid),
+
+	TP_ARGS(fs_info, logical, length, physical, devid),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	logical		)
+		__field(	u64,	length		)
+		__field(	u64,	physical	)
+		__field(	u64,	devid		)
+	),
+
+	TP_fast_assign_btrfs(fs_info,
+		__entry->logical	= logical;
+		__entry->length		= length;
+		__entry->physical	= physical;
+		__entry->devid		= devid;
+	),
+
+	TP_printk_btrfs("logical=%llu, length=%llu, physical=%llu, devid=%llu",
+			__entry->logical, __entry->length, __entry->physical,
+			__entry->devid)
+);
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */

From patchwork Mon Sep 11 12:52:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13380217
From: Johannes Thumshirn
To: Chris Mason, Josef Bacik, David Sterba
Cc: Johannes Thumshirn, Christoph Hellwig, Naohiro Aota, Qu Wenruo,
 Damien Le Moal, linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v8 11/11] btrfs: add raid-stripe-tree to features enabled with debug
Date: Mon, 11 Sep 2023 05:52:12 -0700
Message-ID: <20230911-raid-stripe-tree-v8-11-647676fa852c@wdc.com>
In-Reply-To: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
References: <20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Until the RAID stripe tree code is well enough tested and feature
complete, "hide" it behind CONFIG_BTRFS_DEBUG so only people who
want to use it are actually using it.
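The gating idea can be illustrated with a small standalone C sketch. Everything here is hypothetical and exists only for illustration: the `FEAT_*` names, their bit values, the `DEMO_DEBUG` switch, and `features_supported()` are not the kernel's definitions. The sketch assumes the general rule that a filesystem carrying an incompat bit outside the supported mask is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this illustration. */
#define FEAT_STABLE_MASK	0x0fULL
#define FEAT_EXPERIMENTAL_A	(1ULL << 4)
#define FEAT_EXPERIMENTAL_B	(1ULL << 5)

/* Experimental bits must not collide with the stable ones. */
static_assert((FEAT_STABLE_MASK &
	       (FEAT_EXPERIMENTAL_A | FEAT_EXPERIMENTAL_B)) == 0,
	      "experimental bits overlap stable bits");

/* Experimental features join the supported mask only in debug builds. */
#ifdef DEMO_DEBUG
#define FEAT_SUPP \
	(FEAT_STABLE_MASK | FEAT_EXPERIMENTAL_A | FEAT_EXPERIMENTAL_B)
#else
#define FEAT_SUPP	FEAT_STABLE_MASK
#endif

/* Accept a flag set only if it has no bit outside the supported mask. */
static bool features_supported(uint64_t on_disk_flags)
{
	return (on_disk_flags & ~FEAT_SUPP) == 0;
}
```

Built without `-DDEMO_DEBUG`, a flag set containing `FEAT_EXPERIMENTAL_B` is rejected; built with it, the same flag set is accepted.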

Reviewed-by: Josef Bacik
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/fs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 5c7778e8b5ed..0f5894e2bdeb 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -223,7 +223,8 @@ enum {
  */
 #define BTRFS_FEATURE_INCOMPAT_SUPP		\
 	(BTRFS_FEATURE_INCOMPAT_SUPP_STABLE |	\
-	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2)
+	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 |\
+	 BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE)
 
 #else