From patchwork Wed Feb 8 10:57:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13132967
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [PATCH v5 01/13] btrfs: re-add trans parameter to insert_delayed_ref
Date: Wed, 8 Feb 2023 02:57:38 -0800
Message-Id: <19d1a7d054f4127c750b91692835de5abbbf164d.1675853489.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.39.0
X-Mailing-List: linux-btrfs@vger.kernel.org

Re-add the trans parameter to insert_delayed_ref as it is needed again
later in this series.

This reverts commit bccf28752a99 ("btrfs: drop trans parameter of insert_delayed_ref")

Signed-off-by: Johannes Thumshirn
Reviewed-by: Josef Bacik
---
 fs/btrfs/delayed-ref.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 886ffb232eac..7660ac642c81 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -598,7 +598,8 @@ void btrfs_delete_ref_head(struct btrfs_delayed_ref_root *delayed_refs,
  * Return 0 for insert.
  * Return >0 for merge.
  */
-static int insert_delayed_ref(struct btrfs_delayed_ref_root *root,
+static int insert_delayed_ref(struct btrfs_trans_handle *trans,
+			      struct btrfs_delayed_ref_root *root,
 			      struct btrfs_delayed_ref_head *href,
 			      struct btrfs_delayed_ref_node *ref)
 {
@@ -974,7 +975,7 @@ int btrfs_add_delayed_tree_ref(struct btrfs_trans_handle *trans,
 	head_ref = add_delayed_ref_head(trans, head_ref, record,
 					action, &qrecord_inserted);
 
-	ret = insert_delayed_ref(delayed_refs, head_ref, &ref->node);
+	ret = insert_delayed_ref(trans, delayed_refs, head_ref, &ref->node);
 	spin_unlock(&delayed_refs->lock);
 
 	/*
@@ -1066,7 +1067,7 @@ int btrfs_add_delayed_data_ref(struct btrfs_trans_handle *trans,
 	head_ref = add_delayed_ref_head(trans, head_ref, record,
 					action, &qrecord_inserted);
 
-	ret = insert_delayed_ref(delayed_refs, head_ref, &ref->node);
+	ret = insert_delayed_ref(trans, delayed_refs, head_ref, &ref->node);
 	spin_unlock(&delayed_refs->lock);
 
 	/*

From patchwork Wed Feb 8 10:57:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13132969
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [PATCH v5 02/13] btrfs: add raid stripe tree definitions
Date: Wed, 8 Feb 2023 02:57:39 -0800
Message-Id:
X-Mailer: git-send-email 2.39.0
X-Mailing-List: linux-btrfs@vger.kernel.org

Add definitions for the raid stripe tree. This tree will hold information
about the on-disk layout of the stripes in a RAID set.

Each stripe extent has a 1:1 relationship with an on-disk extent item and
is doing the logical to per-drive physical address translation for the
extent item in question.

Signed-off-by: Johannes Thumshirn
Reviewed-by: Josef Bacik
Reviewed-by: Anand Jain
---
 fs/btrfs/accessors.h            | 29 +++++++++++++++++++++++++++++
 include/uapi/linux/btrfs_tree.h | 20 ++++++++++++++++++--
 2 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h
index ceadfc5d6c66..6e753b63faae 100644
--- a/fs/btrfs/accessors.h
+++ b/fs/btrfs/accessors.h
@@ -306,6 +306,35 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64);
 BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32);
 
+BTRFS_SETGET_FUNCS(raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_FUNCS(raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_devid, struct btrfs_raid_stride, devid, 64);
+BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_physical, struct btrfs_raid_stride, physical, 64);
+
+static inline struct btrfs_raid_stride *btrfs_raid_stride_nr(
+						 struct btrfs_stripe_extent *dps, int nr)
+{
+	unsigned long offset = (unsigned long)dps;
+
+	offset += offsetof(struct btrfs_stripe_extent, strides);
+	offset += nr * sizeof(struct btrfs_raid_stride);
+	return (struct btrfs_raid_stride *)offset;
+}
+
+static inline u64 btrfs_raid_stride_devid_nr(const struct extent_buffer *eb,
+					       struct btrfs_stripe_extent *dps,
+					       int nr)
+{
+	return btrfs_raid_stride_devid(eb, btrfs_raid_stride_nr(dps, nr));
+}
+
+static inline u64 btrfs_raid_stride_physical_nr(const struct extent_buffer *eb,
+						struct btrfs_stripe_extent *dps,
+						int nr)
+{
+	return btrfs_raid_stride_physical(eb, btrfs_raid_stride_nr(dps, nr));
+}
+
 /* struct btrfs_dev_extent */
 BTRFS_SETGET_FUNCS(dev_extent_chunk_tree, struct btrfs_dev_extent, chunk_tree, 64);
 BTRFS_SETGET_FUNCS(dev_extent_chunk_objectid, struct btrfs_dev_extent,
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index ab38d0f411fa..64e6bf2a10d8 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -4,9 +4,8 @@
 
 #include
 #include
-#ifdef __KERNEL__
 #include
-#else
+#ifndef __KERNEL__
 #include
 #endif
 
@@ -73,6 +72,9 @@
 /* Holds the block group items for extent tree v2. */
 #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
 
+/* tracks RAID stripes in block groups. */
+#define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL
+
 /* device stats in the device tree */
 #define BTRFS_DEV_STATS_OBJECTID 0ULL
 
@@ -281,6 +283,8 @@
  */
 #define BTRFS_QGROUP_RELATION_KEY	246
 
+#define BTRFS_RAID_STRIPE_KEY	247
+
 /*
  * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY.
  */
@@ -715,6 +719,18 @@ struct btrfs_free_space_header {
 	__le64 num_bitmaps;
 } __attribute__ ((__packed__));
 
+struct btrfs_raid_stride {
+	/* btrfs device-id this raid extent lives on */
+	__le64 devid;
+	/* physical location on disk */
+	__le64 physical;
+};
+
+struct btrfs_stripe_extent {
+	/* array of raid strides this stripe is composed of */
+	__DECLARE_FLEX_ARRAY(struct btrfs_raid_stride, strides);
+};
+
 #define BTRFS_HEADER_FLAG_WRITTEN	(1ULL << 0)
 #define BTRFS_HEADER_FLAG_RELOC		(1ULL << 1)
 

From patchwork Wed Feb 8 10:57:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13132968
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [PATCH v5 03/13] btrfs: read raid-stripe-tree from disk
Date: Wed, 8 Feb 2023 02:57:40 -0800
Message-Id: <5898b85677c784bf6af4e3a18d61bf261af3141a.1675853489.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.39.0
X-Mailing-List: linux-btrfs@vger.kernel.org

If we find a raid-stripe-tree on mount, read it from disk.

Signed-off-by: Johannes Thumshirn
Reviewed-by: Josef Bacik
Reviewed-by: Anand Jain
---
 fs/btrfs/block-rsv.c       |  1 +
 fs/btrfs/disk-io.c         | 19 +++++++++++++++++++
 fs/btrfs/disk-io.h         |  5 +++++
 fs/btrfs/fs.h              |  1 +
 include/uapi/linux/btrfs.h |  1 +
 5 files changed, 27 insertions(+)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 5367a14d44d2..384987343a64 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -402,6 +402,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
 	case BTRFS_EXTENT_TREE_OBJECTID:
 	case BTRFS_FREE_SPACE_TREE_OBJECTID:
 	case BTRFS_BLOCK_GROUP_TREE_OBJECTID:
+	case BTRFS_RAID_STRIPE_TREE_OBJECTID:
 		root->block_rsv = &fs_info->delayed_refs_rsv;
 		break;
 	case BTRFS_ROOT_TREE_OBJECTID:
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0da0bde347e5..ad64a79d052a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1454,6 +1454,9 @@ static struct btrfs_root *btrfs_get_global_root(struct btrfs_fs_info *fs_info,
 
 		return btrfs_grab_root(root) ? root : ERR_PTR(-ENOENT);
 	}
+	if (objectid == BTRFS_RAID_STRIPE_TREE_OBJECTID)
+		return btrfs_grab_root(fs_info->stripe_root) ?
+			fs_info->stripe_root : ERR_PTR(-ENOENT);
 	return NULL;
 }
 
@@ -1532,6 +1535,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info)
 	btrfs_put_root(fs_info->fs_root);
 	btrfs_put_root(fs_info->data_reloc_root);
 	btrfs_put_root(fs_info->block_group_root);
+	btrfs_put_root(fs_info->stripe_root);
 	btrfs_check_leaked_roots(fs_info);
 	btrfs_extent_buffer_leak_debug_check(fs_info);
 	kfree(fs_info->super_copy);
@@ -2067,6 +2071,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root)
 	free_root_extent_buffers(info->fs_root);
 	free_root_extent_buffers(info->data_reloc_root);
 	free_root_extent_buffers(info->block_group_root);
+	free_root_extent_buffers(info->stripe_root);
 	if (free_chunk_root)
 		free_root_extent_buffers(info->chunk_root);
 }
@@ -2519,6 +2524,20 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 		fs_info->uuid_root = root;
 	}
 
+	if (btrfs_fs_incompat(fs_info, RAID_STRIPE_TREE)) {
+		location.objectid = BTRFS_RAID_STRIPE_TREE_OBJECTID;
+		root = btrfs_read_tree_root(tree_root, &location);
+		if (IS_ERR(root)) {
+			if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) {
+				ret = PTR_ERR(root);
+				goto out;
+			}
+		} else {
+			set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
+			fs_info->stripe_root = root;
+		}
+	}
+
 	return 0;
 out:
 	btrfs_warn(fs_info, "failed to read root (objectid=%llu): %d",
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 3b53fc29a858..a85a8922c3fd 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -106,6 +106,11 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root)
 	return NULL;
 }
 
+static inline struct btrfs_root *btrfs_stripe_tree_root(struct btrfs_fs_info *fs_info)
+{
+	return fs_info->stripe_root;
+}
+
 void btrfs_put_root(struct btrfs_root *root);
 void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
 int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 4c477eae6891..93f2499a9c5b 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -367,6 +367,7 @@ struct btrfs_fs_info {
 	struct btrfs_root *uuid_root;
 	struct btrfs_root *data_reloc_root;
 	struct btrfs_root *block_group_root;
+	struct btrfs_root *stripe_root;
 
 	/* The log root tree is a directory of all the other log roots */
 	struct btrfs_root *log_root_tree;
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index b4f0f9531119..593fb7930a37 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -322,6 +322,7 @@ struct btrfs_ioctl_fs_info_args {
 #define BTRFS_FEATURE_INCOMPAT_RAID1C34		(1ULL << 11)
 #define BTRFS_FEATURE_INCOMPAT_ZONED		(1ULL << 12)
 #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2	(1ULL << 13)
+#define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE	(1ULL << 14)
 
 struct btrfs_ioctl_feature_flags {
 	__u64 compat_flags;

From patchwork Wed Feb 8 10:57:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 13132977
From: Johannes Thumshirn
To: linux-btrfs@vger.kernel.org
Cc: Johannes Thumshirn
Subject: [PATCH v5 04/13] btrfs: add support for inserting raid stripe extents
Date: Wed, 8 Feb 2023 02:57:41 -0800
Message-Id: <96f86c817184925f3d1e625d735058373d90e757.1675853489.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.39.0
X-Mailing-List: linux-btrfs@vger.kernel.org

Add support for inserting stripe extents into the raid stripe tree on
completion of every write that needs an extra logical-to-physical
translation when using RAID.

Inserting the stripe extents happens after the data I/O has completed,
this is done to a) support zone-append and b) rule out the possibility of
a RAID-write-hole.

This is done by creating in-memory ordered stripe extents, just like the
in memory ordered extents, on I/O completion and the on-disk raid stripe
extents get created once we're running the delayed_refs for the extent
item this stripe extent is tied to.

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/Makefile           |   2 +-
 fs/btrfs/bio.c              |  29 ++++++
 fs/btrfs/bio.h              |   2 +
 fs/btrfs/delayed-ref.c      |   6 +-
 fs/btrfs/delayed-ref.h      |   2 +
 fs/btrfs/disk-io.c          |   9 +-
 fs/btrfs/extent-tree.c      |  60 +++++++++++
 fs/btrfs/fs.h               |   4 +
 fs/btrfs/inode.c            |  15 ++-
 fs/btrfs/raid-stripe-tree.c | 202 ++++++++++++++++++++++++++++++++++++
 fs/btrfs/raid-stripe-tree.h |  71 +++++++++++++
 fs/btrfs/volumes.c          |   5 +-
 fs/btrfs/volumes.h          |  12 +--
 fs/btrfs/zoned.c            |   4 +
 14 files changed, 410 insertions(+), 13 deletions(-)
 create mode 100644 fs/btrfs/raid-stripe-tree.c
 create mode 100644 fs/btrfs/raid-stripe-tree.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 90d53209755b..3bb869a84e54 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -33,7 +33,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
 	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
 	   subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \
-	   lru_cache.o
+	   lru_cache.o raid-stripe-tree.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index e6fe1b7dbc50..a01c6560ef49 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -15,6 +15,7 @@
 #include "rcu-string.h"
 #include "zoned.h"
 #include "file-item.h"
+#include "raid-stripe-tree.h"
 
 static struct bio_set btrfs_bioset;
 static struct bio_set btrfs_clone_bioset;
@@ -350,6 +351,21 @@ static void btrfs_raid56_end_io(struct bio *bio)
 	btrfs_put_bioc(bioc);
 }
 
+static void btrfs_raid_stripe_update(struct work_struct *work)
+{
+	struct btrfs_bio *bbio =
+		container_of(work, struct btrfs_bio, raid_stripe_work);
+	struct btrfs_io_stripe *stripe = bbio->bio.bi_private;
+	struct btrfs_io_context *bioc = stripe->bioc;
+	int ret;
+
+	ret = btrfs_add_ordered_stripe(bioc);
+	if (ret)
+		bbio->bio.bi_status = errno_to_blk_status(ret);
+	btrfs_orig_bbio_end_io(bbio);
+	btrfs_put_bioc(bioc);
+}
+
 static void btrfs_orig_write_end_io(struct bio *bio)
 {
 	struct btrfs_io_stripe *stripe = bio->bi_private;
@@ -372,6 +388,16 @@ static void btrfs_orig_write_end_io(struct bio *bio)
 	else
 		bio->bi_status = BLK_STS_OK;
 
+	if (bio_op(bio) == REQ_OP_ZONE_APPEND)
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+
+	if (btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) {
+		INIT_WORK(&bbio->raid_stripe_work, btrfs_raid_stripe_update);
+		queue_work(bbio->inode->root->fs_info->raid_stripe_workers,
+			   &bbio->raid_stripe_work);
+		return;
+	}
+
 	btrfs_orig_bbio_end_io(bbio);
 	btrfs_put_bioc(bioc);
 }
@@ -383,6 +409,8 @@ static void btrfs_clone_write_end_io(struct bio *bio)
 	if (bio->bi_status) {
 		atomic_inc(&stripe->bioc->error);
 		btrfs_log_dev_io_error(bio, stripe->dev);
+	} else if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
+		stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
 	}
 
 	/* Pass on control to the original bio this one was cloned from */
@@ -442,6 +470,7 @@ static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr)
 	bio->bi_private = &bioc->stripes[dev_nr];
 	bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT;
 	bioc->stripes[dev_nr].bioc = bioc;
+	bioc->size = bio->bi_iter.bi_size;
 	btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio);
 }
 
diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h
index 20105806c8ac..bf5fbc105148 100644
--- a/fs/btrfs/bio.h
+++ b/fs/btrfs/bio.h
@@ -58,6 +58,8 @@ struct btrfs_bio {
 	atomic_t pending_ios;
 	struct work_struct end_io_work;
 
+	struct work_struct raid_stripe_work;
+
 	/*
 	 * This member must come last, bio_alloc_bioset will allocate enough
 	 * bytes for entire btrfs_bio but relies on bio being last.
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 7660ac642c81..261f52ad8e12 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -14,6 +14,7 @@
 #include "space-info.h"
 #include "tree-mod-log.h"
 #include "fs.h"
+#include "raid-stripe-tree.h"
 
 struct kmem_cache *btrfs_delayed_ref_head_cachep;
 struct kmem_cache *btrfs_delayed_tree_ref_cachep;
@@ -637,8 +638,11 @@ static int insert_delayed_ref(struct btrfs_trans_handle *trans,
 	exist->ref_mod += mod;
 
 	/* remove existing tail if its ref_mod is zero */
-	if (exist->ref_mod == 0)
+	if (exist->ref_mod == 0) {
+		btrfs_drop_ordered_stripe(trans->fs_info, exist->bytenr);
 		drop_delayed_ref(root, href, exist);
+	}
+
 	spin_unlock(&href->lock);
 	return ret;
 inserted:
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index 2eb34abf700f..5096c1a1ed3e 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -51,6 +51,8 @@ struct btrfs_delayed_ref_node {
 	/* is this node still in the rbtree? */
 	unsigned int is_head:1;
 	unsigned int in_tree:1;
+	/* Do we need RAID stripe tree modifications? */
+	unsigned int must_insert_stripe:1;
 };
 
 struct btrfs_delayed_extent_op {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index ad64a79d052a..b130c8dcd8d9 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2022,6 +2022,8 @@ static void btrfs_stop_all_workers(struct btrfs_fs_info *fs_info)
 		destroy_workqueue(fs_info->rmw_workers);
 	if (fs_info->compressed_write_workers)
 		destroy_workqueue(fs_info->compressed_write_workers);
+	if (fs_info->raid_stripe_workers)
+		destroy_workqueue(fs_info->raid_stripe_workers);
 	btrfs_destroy_workqueue(fs_info->endio_write_workers);
 	btrfs_destroy_workqueue(fs_info->endio_freespace_worker);
 	btrfs_destroy_workqueue(fs_info->delayed_workers);
@@ -2240,12 +2242,14 @@ static int btrfs_init_workqueues(struct btrfs_fs_info *fs_info)
 		btrfs_alloc_workqueue(fs_info, "qgroup-rescan", flags, 1, 0);
 	fs_info->discard_ctl.discard_workers =
 		alloc_workqueue("btrfs_discard", WQ_UNBOUND | WQ_FREEZABLE, 1);
+	fs_info->raid_stripe_workers =
+		alloc_workqueue("btrfs-raid-stripe", flags, max_active);
 
 	if (!(fs_info->workers && fs_info->hipri_workers &&
 	      fs_info->delalloc_workers && fs_info->flush_workers &&
 	      fs_info->endio_workers && fs_info->endio_meta_workers &&
 	      fs_info->compressed_write_workers &&
-	      fs_info->endio_write_workers &&
+	      fs_info->endio_write_workers && fs_info->raid_stripe_workers &&
 	      fs_info->endio_freespace_worker && fs_info->rmw_workers &&
 	      fs_info->caching_workers && fs_info->fixup_workers &&
 	      fs_info->delayed_workers && fs_info->qgroup_rescan_workers &&
@@ -3046,6 +3050,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 
 	fs_info->bg_reclaim_threshold = BTRFS_DEFAULT_RECLAIM_THRESH;
 	INIT_WORK(&fs_info->reclaim_bgs_work, btrfs_reclaim_bgs_work);
+
+	rwlock_init(&fs_info->stripe_update_lock);
+	fs_info->stripe_update_tree = RB_ROOT;
 }
 
 static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block *sb)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 688cdf816957..50b3a2c3c0dd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -42,6 +42,7 @@
 #include "file-item.h"
 #include "orphan.h"
 #include "tree-checker.h"
+#include "raid-stripe-tree.h"
 
 #undef SCRAMBLE_DELAYED_REFS
 
@@ -1497,6 +1498,56 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool delayed_ref_needs_rst_update(struct btrfs_fs_info *fs_info,
+					 struct btrfs_delayed_ref_head *head)
+{
+	struct extent_map *em;
+	struct map_lookup *map;
+	bool ret = false;
+
+	if (!btrfs_stripe_tree_root(fs_info))
+		return ret;
+
+	em = btrfs_get_chunk_map(fs_info, head->bytenr, head->num_bytes);
+	if (!em)
+		return ret;
+
+	map = em->map_lookup;
+
+	if (btrfs_need_stripe_tree_update(fs_info, map->type))
+		ret = true;
+
+	free_extent_map(em);
+
+	return ret;
+}
+
+static int add_stripe_entry_for_delayed_ref(struct btrfs_trans_handle *trans,
+					    struct btrfs_delayed_ref_node *node)
+{
+	struct btrfs_fs_info *fs_info = trans->fs_info;
+	struct btrfs_ordered_stripe *stripe;
+	int ret = 0;
+
+	stripe = btrfs_lookup_ordered_stripe(fs_info, node->bytenr);
+	if (!stripe) {
+		btrfs_err(fs_info,
+			  "cannot get stripe extent for address %llu (%llu)",
+			  node->bytenr, node->num_bytes);
+		return -EINVAL;
+	}
+
+	ASSERT(stripe->logical == node->bytenr);
+
+	ret = btrfs_insert_raid_extent(trans, stripe);
+	/* once for us */
+	btrfs_put_ordered_stripe(fs_info, stripe);
+	/* once for the tree */
+	btrfs_put_ordered_stripe(fs_info, stripe);
+
+	return ret;
+}
+
 static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 				struct btrfs_delayed_ref_node *node,
 				struct btrfs_delayed_extent_op *extent_op,
@@ -1527,11 +1578,17 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
 						 flags, ref->objectid,
 						 ref->offset, &ins,
 						 node->ref_mod);
+		if (ret)
+			return ret;
+		if (node->must_insert_stripe)
+			ret = add_stripe_entry_for_delayed_ref(trans, node);
 	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
 		ret = __btrfs_inc_extent_ref(trans, node, parent, ref_root,
 					     ref->objectid, ref->offset,
 					     node->ref_mod, extent_op);
 	} else if (node->action == BTRFS_DROP_DELAYED_REF) {
+		if (node->must_insert_stripe)
+			btrfs_drop_ordered_stripe(trans->fs_info, node->bytenr);
 		ret = __btrfs_free_extent(trans, node, parent,
 					  ref_root, ref->objectid,
 					  ref->offset, node->ref_mod,
@@ -1901,6 +1958,8 @@ static int btrfs_run_delayed_refs_for_head(struct btrfs_trans_handle *trans,
 	struct btrfs_delayed_ref_root *delayed_refs;
 	struct btrfs_delayed_extent_op *extent_op;
 	struct btrfs_delayed_ref_node *ref;
+	const bool need_rst_update =
+		delayed_ref_needs_rst_update(fs_info, locked_ref);
 	int must_insert_reserved = 0;
 	int ret;
 
@@ -1951,6 +2010,7 @@ static int btrfs_run_delayed_refs_for_head(struct btrfs_trans_handle *trans,
 		locked_ref->extent_op = NULL;
 		spin_unlock(&locked_ref->lock);
 
+		ref->must_insert_stripe = need_rst_update;
 		ret = run_one_delayed_ref(trans, ref, extent_op,
 					  must_insert_reserved);
 
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 93f2499a9c5b..bee7ed0304cd 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -551,6 +551,7 @@ struct btrfs_fs_info {
 	struct btrfs_workqueue *endio_write_workers;
 	struct btrfs_workqueue *endio_freespace_worker;
 	struct btrfs_workqueue *caching_workers;
+	struct workqueue_struct *raid_stripe_workers;
 
 	/*
 	 * Fixup workers take dirty pages that didn't properly go through the
@@ -791,6 +792,9 @@ struct btrfs_fs_info {
 	struct lockdep_map btrfs_trans_pending_ordered_map;
 	struct lockdep_map btrfs_ordered_extent_map;
 
+	rwlock_t stripe_update_lock;
+	struct rb_root stripe_update_tree;
+
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	spinlock_t ref_verify_lock;
 	struct rb_root block_tree;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 36ae541ad51b..74c0b486e496 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -70,6 +70,7 @@
 #include "verity.h"
 #include "super.h"
 #include "orphan.h"
+#include "raid-stripe-tree.h"
 
 struct btrfs_iget_args {
 	u64 ino;
@@ -9509,12 +9510,17 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	if (qgroup_released < 0)
 		return ERR_PTR(qgroup_released);
 
+	ret = btrfs_insert_preallocated_raid_stripe(inode->root->fs_info,
+						    start, len);
+	if (ret)
+		goto free_qgroup;
+
 	if (trans) {
 		ret = insert_reserved_file_extent(trans, inode,
 						  file_offset, &stack_fi,
 						  true, qgroup_released);
 		if (ret)
-			goto free_qgroup;
+			goto free_stripe_extent;
 		return trans;
 	}
 
@@ -9532,7 +9538,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 	path = btrfs_alloc_path();
 	if (!path) {
 		ret = -ENOMEM;
-		goto free_qgroup;
+		goto free_stripe_extent;
 	}
 
 	ret = btrfs_replace_file_extents(inode, path, file_offset,
@@ -9540,9 +9546,12 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
 					 &trans);
 	btrfs_free_path(path);
 	if (ret)
-		goto free_qgroup;
+		goto free_stripe_extent;
 	return trans;
 
+free_stripe_extent:
+	btrfs_drop_ordered_stripe(inode->root->fs_info, start);
+
 free_qgroup:
 	/*
 	 * We have released qgroup data range at the beginning of the function,
diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
new file mode 100644
index 000000000000..d184cd9dc96e
--- /dev/null
+++ b/fs/btrfs/raid-stripe-tree.c
@@ -0,0 +1,202 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Western Digital Corporation or its affiliates.
+ */
+
+#include
+
+#include "ctree.h"
+#include "fs.h"
+#include "accessors.h"
+#include "transaction.h"
+#include "disk-io.h"
+#include "raid-stripe-tree.h"
+#include "volumes.h"
+#include "misc.h"
+#include "disk-io.h"
+#include "print-tree.h"
+
+static int ordered_stripe_cmp(const void *key, const struct rb_node *node)
+{
+	struct btrfs_ordered_stripe *stripe =
+		rb_entry(node, struct btrfs_ordered_stripe, rb_node);
+	const u64 *logical = key;
+
+	if (*logical < stripe->logical)
+		return -1;
+	if (*logical >= stripe->logical + stripe->num_bytes)
+		return 1;
+	return 0;
+}
+
+static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb)
+{
+	struct btrfs_ordered_stripe *stripe =
+		rb_entry(rba, struct btrfs_ordered_stripe, rb_node);
+	return ordered_stripe_cmp(&stripe->logical, rbb);
+}
+
+int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc)
+{
+	struct btrfs_fs_info *fs_info = bioc->fs_info;
+	struct btrfs_ordered_stripe *stripe;
+	struct btrfs_io_stripe *tmp;
+	u64 logical = bioc->logical;
+	u64 length = bioc->size;
+	struct rb_node *node;
+	size_t size;
+
+	size = bioc->num_stripes * sizeof(struct btrfs_io_stripe);
+	stripe = kzalloc(sizeof(struct btrfs_ordered_stripe), GFP_NOFS);
+	if (!stripe)
+		return -ENOMEM;
+
+	spin_lock_init(&stripe->lock);
+	tmp = kmemdup(bioc->stripes, size, GFP_NOFS);
+	if (!tmp) {
+		kfree(stripe);
+		return -ENOMEM;
+	}
+
+	stripe->logical = logical;
+	stripe->num_bytes = length;
+	stripe->num_stripes = bioc->num_stripes;
+	spin_lock(&stripe->lock);
+	stripe->stripes = tmp;
+	spin_unlock(&stripe->lock);
+	refcount_set(&stripe->ref, 1);
+
+	write_lock(&fs_info->stripe_update_lock);
+	node = rb_find_add(&stripe->rb_node, &fs_info->stripe_update_tree,
+	       ordered_stripe_less);
+	write_unlock(&fs_info->stripe_update_lock);
+	if (node) {
+		struct btrfs_ordered_stripe *old =
+			rb_entry(node, struct btrfs_ordered_stripe, rb_node);
+
+		btrfs_debug(fs_info, "logical: %llu, length: %llu already exists",
+			    logical,
length); + ASSERT(logical == old->logical); + write_lock(&fs_info->stripe_update_lock); + rb_replace_node(node, &stripe->rb_node, + &fs_info->stripe_update_tree); + write_unlock(&fs_info->stripe_update_lock); + } + + return 0; +} + +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical) +{ + struct rb_root *root = &fs_info->stripe_update_tree; + struct btrfs_ordered_stripe *stripe = NULL; + struct rb_node *node; + + read_lock(&fs_info->stripe_update_lock); + node = rb_find(&logical, root, ordered_stripe_cmp); + if (node) { + stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); + refcount_inc(&stripe->ref); + } + read_unlock(&fs_info->stripe_update_lock); + + return stripe; +} + +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe) +{ + write_lock(&fs_info->stripe_update_lock); + if (refcount_dec_and_test(&stripe->ref)) { + struct rb_node *node = &stripe->rb_node; + + rb_erase(node, &fs_info->stripe_update_tree); + RB_CLEAR_NODE(node); + + spin_lock(&stripe->lock); + kfree(stripe->stripes); + spin_unlock(&stripe->lock); + kfree(stripe); + } + write_unlock(&fs_info->stripe_update_lock); +} + +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len) +{ + struct btrfs_io_context *bioc = NULL; + struct btrfs_ordered_stripe *stripe; + u64 map_length = len; + int ret; + + if (!btrfs_stripe_tree_root(fs_info)) + return 0; + + ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, start, &map_length, + &bioc, 0); + if (ret) + return ret; + + bioc->size = len; + + stripe = btrfs_lookup_ordered_stripe(fs_info, start); + if (!stripe) { + ret = btrfs_add_ordered_stripe(bioc); + if (ret) + return ret; + } else { + spin_lock(&stripe->lock); + memcpy(stripe->stripes, bioc->stripes, + bioc->num_stripes * sizeof(struct btrfs_io_stripe)); + spin_unlock(&stripe->lock); + btrfs_put_ordered_stripe(fs_info, stripe); + } + + return 0; +} + +int 
btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe *stripe) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_key stripe_key; + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_stripe_extent *stripe_extent; + size_t item_size; + int ret; + + item_size = stripe->num_stripes * sizeof(struct btrfs_raid_stride); + + stripe_extent = kzalloc(item_size, GFP_NOFS); + if (!stripe_extent) { + btrfs_abort_transaction(trans, -ENOMEM); + btrfs_end_transaction(trans); + return -ENOMEM; + } + + spin_lock(&stripe->lock); + for (int i = 0; i < stripe->num_stripes; i++) { + u64 devid = stripe->stripes[i].dev->devid; + u64 physical = stripe->stripes[i].physical; + struct btrfs_raid_stride *raid_stride = + &stripe_extent->strides[i]; + + btrfs_set_stack_raid_stride_devid(raid_stride, devid); + btrfs_set_stack_raid_stride_physical(raid_stride, physical); + } + spin_unlock(&stripe->lock); + + stripe_key.objectid = stripe->logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = stripe->num_bytes; + + ret = btrfs_insert_item(trans, stripe_root, &stripe_key, stripe_extent, + item_size); + if (ret) + btrfs_abort_transaction(trans, ret); + + kfree(stripe_extent); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h new file mode 100644 index 000000000000..60d3f8489cc9 --- /dev/null +++ b/fs/btrfs/raid-stripe-tree.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022 Western Digital Corporation or its affiliates. 
+ */ + +#ifndef BTRFS_RAID_STRIPE_TREE_H +#define BTRFS_RAID_STRIPE_TREE_H + +#include "disk-io.h" +#include "messages.h" + +struct btrfs_io_context; + +struct btrfs_ordered_stripe { + struct rb_node rb_node; + + u64 logical; + u64 num_bytes; + int num_stripes; + struct btrfs_io_stripe *stripes; + spinlock_t lock; + refcount_t ref; +}; + +int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe *stripe); +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len); +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( + struct btrfs_fs_info *fs_info, + u64 logical); +int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe); + +static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, + u64 map_type) +{ + u64 type = map_type & BTRFS_BLOCK_GROUP_TYPE_MASK; + u64 profile = map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK; + + if (!btrfs_stripe_tree_root(fs_info)) + return false; + + if (type != BTRFS_BLOCK_GROUP_DATA) + return false; + + if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) + return true; + + return false; +} + +static inline void btrfs_drop_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical) +{ + struct btrfs_ordered_stripe *stripe; + + if (!btrfs_stripe_tree_root(fs_info)) + return; + + stripe = btrfs_lookup_ordered_stripe(fs_info, logical); + if (!stripe) + return; + ASSERT(refcount_read(&stripe->ref) == 2); + /* once for us */ + btrfs_put_ordered_stripe(fs_info, stripe); + /* once for the tree */ + btrfs_put_ordered_stripe(fs_info, stripe); +} +#endif diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 707dd0456cea..e7c0353e5655 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -5885,6 +5885,7 @@ static void sort_parity_stripes(struct btrfs_io_context *bioc, int num_stripes) } static struct btrfs_io_context *alloc_btrfs_io_context(struct 
btrfs_fs_info *fs_info, + u64 logical, int total_stripes, int real_stripes) { @@ -5908,6 +5909,7 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_ refcount_set(&bioc->refs, 1); bioc->fs_info = fs_info; + bioc->logical = logical; bioc->tgtdev_map = (int *)(bioc->stripes + total_stripes); bioc->raid_map = (u64 *)(bioc->tgtdev_map + real_stripes); @@ -6513,7 +6515,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, goto out; } - bioc = alloc_btrfs_io_context(fs_info, num_alloc_stripes, tgtdev_indexes); + bioc = alloc_btrfs_io_context(fs_info, logical, num_alloc_stripes, + tgtdev_indexes); if (!bioc) { ret = -ENOMEM; goto out; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 7e51f2238f72..5d7547b5fa87 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -368,12 +368,10 @@ struct btrfs_fs_devices { struct btrfs_io_stripe { struct btrfs_device *dev; - union { - /* Block mapping */ - u64 physical; - /* For the endio handler */ - struct btrfs_io_context *bioc; - }; + /* Block mapping */ + u64 physical; + /* For the endio handler */ + struct btrfs_io_context *bioc; }; struct btrfs_discard_stripe { @@ -409,6 +407,8 @@ struct btrfs_io_context { int mirror_num; int num_tgtdevs; int *tgtdev_map; + u64 logical; + u64 size; /* * logical block numbers for the start of each stripe * The last one or two are p/q. These are sorted, diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index d862477f79f3..ed49150e6e6f 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1687,6 +1687,10 @@ void btrfs_rewrite_logical_zoned(struct btrfs_ordered_extent *ordered) u64 *logical = NULL; int nr, stripe_len; + /* Filesystems with a stripe tree have their own l2p mapping */ + if (btrfs_stripe_tree_root(fs_info)) + return; + /* Zoned devices should not have partitions. 
So, we can assume it is 0 */ ASSERT(!bdev_is_partition(ordered->bdev)); if (WARN_ON(!ordered->bdev)) From patchwork Wed Feb 8 10:57:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132970
From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 05/13] btrfs: delete stripe extent on extent deletion Date: Wed, 8 Feb 2023 02:57:42 -0800 Message-Id: <28ff4b0398d52b27c93fac93297be8f0e2a18fab.1675853489.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org As each stripe extent is tied to an extent item, delete the stripe extent once the corresponding extent item is deleted.
Signed-off-by: Johannes Thumshirn --- fs/btrfs/extent-tree.c | 8 ++++++++ fs/btrfs/raid-stripe-tree.c | 31 +++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 2 ++ 3 files changed, 41 insertions(+) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 50b3a2c3c0dd..f08ee7d9211c 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3238,6 +3238,14 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, } } + if (is_data) { + ret = btrfs_delete_raid_extent(trans, bytenr, num_bytes); + if (ret) { + btrfs_abort_transaction(trans, ret); + return ret; + } + } + ret = btrfs_del_items(trans, extent_root, path, path->slots[0], num_to_del); if (ret) { diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index d184cd9dc96e..ff5787a19454 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -122,6 +122,37 @@ void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, write_unlock(&fs_info->stripe_update_lock); } +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_path *path; + struct btrfs_key stripe_key; + int ret; + + if (!stripe_root) + return 0; + + stripe_key.objectid = start; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = length; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(trans, stripe_root, &stripe_key, path, -1, 1); + if (ret < 0) + goto out; + + ret = btrfs_del_item(trans, stripe_root, path); +out: + btrfs_free_path(path); + return ret; + +} + int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, u64 start, u64 len) { diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 60d3f8489cc9..12d2f588b22d 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -22,6 +22,8 @@ struct 
btrfs_ordered_stripe { refcount_t ref; }; +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_stripe *stripe); int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, From patchwork Wed Feb 8 10:57:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132972
From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 06/13] btrfs: lookup physical address from stripe extent Date: Wed, 8 Feb 2023 02:57:43 -0800 Message-Id: X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Look up the physical address in the raid stripe tree when a read is attempted on a RAID volume formatted with the raid stripe tree. If the requested logical address is not found in the stripe tree, it may still be in the in-memory ordered stripe tree, so fall back to searching the ordered stripe tree in this case.
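The lookup described above can be modelled in user space. What follows is a minimal sketch, not the kernel implementation: `struct stripe_map`, `lookup_one()` and `lookup_physical()` are hypothetical names, and plain arrays stand in for both the in-memory ordered stripe tree and the on-disk raid stripe tree. Like btrfs_get_raid_extent_offset() in the diff below, the sketch consults the in-memory entries first and then the persistent ones, and it shortens *length when the request runs past the end of the matching extent so the caller knows where to split the I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one logical -> physical mapping on one device. */
struct stripe_map {
	uint64_t logical;	/* start of the logical range */
	uint64_t num_bytes;	/* length of the logical range */
	uint64_t devid;		/* device that holds this copy */
	uint64_t physical;	/* physical start on that device */
};

/* Search one table; on a hit, clamp *length and fill in *physical. */
static bool lookup_one(const struct stripe_map *tbl, size_t n, uint64_t devid,
		       uint64_t logical, uint64_t *length, uint64_t *physical)
{
	assert(length && physical);

	for (size_t i = 0; i < n; i++) {
		const struct stripe_map *m = &tbl[i];
		uint64_t end = logical + *length;
		uint64_t found_end = m->logical + m->num_bytes;

		if (m->devid != devid)
			continue;
		if (logical < m->logical || logical >= found_end)
			continue;

		/* The request crosses the extent end: report the usable part. */
		if (end > found_end)
			*length -= end - found_end;

		*physical = m->physical + (logical - m->logical);
		return true;
	}
	return false;
}

/* Returns 0 on success, -1 if neither table covers the address. */
static int lookup_physical(const struct stripe_map *in_memory, size_t n_mem,
			   const struct stripe_map *on_disk, size_t n_disk,
			   uint64_t devid, uint64_t logical, uint64_t *length,
			   uint64_t *physical)
{
	if (lookup_one(in_memory, n_mem, devid, logical, length, physical))
		return 0;
	if (lookup_one(on_disk, n_disk, devid, logical, length, physical))
		return 0;
	return -1;
}
```

A caller passes the requested length by pointer and must re-check it afterwards: a shortened length means the remainder of the range needs a second lookup.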
Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/raid-stripe-tree.c | 145 ++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 3 + fs/btrfs/volumes.c | 31 ++++++-- 3 files changed, 172 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index ff5787a19454..ba7015a8012c 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -231,3 +231,148 @@ int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, return ret; } + +static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, + int num_stripes, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_ordered_stripe *os; + u64 offset; + u64 found_end; + u64 end; + int i; + + os = btrfs_lookup_ordered_stripe(fs_info, logical); + if (!os) + return false; + + end = logical + *length; + found_end = os->logical + os->num_bytes; + if (end > found_end) + *length -= end - found_end; + + for (i = 0; i < num_stripes; i++) { + if (os->stripes[i].dev != stripe->dev) + continue; + + offset = logical - os->logical; + ASSERT(offset >= 0); + stripe->physical = os->stripes[i].physical + offset; + btrfs_put_ordered_stripe(fs_info, os); + break; + } + + return true; +} + +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + int num_stripes = btrfs_bg_type_to_factor(map_type); + struct btrfs_stripe_extent *stripe_extent; + struct btrfs_key stripe_key; + struct btrfs_key found_key; + struct btrfs_path *path; + struct extent_buffer *leaf; + u64 offset; + u64 found_logical; + u64 found_length; + u64 end; + u64 found_end; + int slot; + int ret; + int i; + + /* + * If we still have the stripe in the ordered stripe tree get it from + * there + */ + if (btrfs_physical_from_ordered_stripe(fs_info, logical, length, + num_stripes, 
stripe)) + return 0; + + stripe_key.objectid = logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0); + if (ret < 0) + goto free_path; + if (ret) { + if (path->slots[0] != 0) + path->slots[0]--; + } + + end = logical + *length; + + while (1) { + leaf = path->nodes[0]; + slot = path->slots[0]; + + btrfs_item_key_to_cpu(leaf, &found_key, slot); + found_logical = found_key.objectid; + found_length = found_key.offset; + + if (found_logical > end) + break; + + if (!in_range(logical, found_logical, found_length)) + goto next; + + offset = logical - found_logical; + found_end = found_logical + found_length; + + /* + * If we have a logically contiguous, but physically + * noncontinuous range, we need to split the bio. Record the + * length after which we must split the bio. + */ + if (end > found_end) + *length -= end - found_end; + + stripe_extent = + btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); + for (i = 0; i < num_stripes; i++) { + if (btrfs_raid_stride_devid_nr(leaf, + stripe_extent, i) != stripe->dev->devid) + continue; + stripe->physical = btrfs_raid_stride_physical_nr(leaf, + stripe_extent, i) + offset; + ret = 0; + goto out; + } + + /* + * If we're here, we haven't found the requested devid in the + * stripe. 
+ */ + ret = -ENOENT; + goto out; +next: + ret = btrfs_next_item(stripe_root, path); + if (ret) + break; + } + +out: + if (ret > 0) + ret = -ENOENT; + if (ret && ret != -EIO) { + btrfs_err(fs_info, + "cannot find raid-stripe for logical [%llu, %llu]", + logical, logical + *length); + btrfs_print_tree(leaf, 1); + } + +free_path: + btrfs_free_path(path); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 12d2f588b22d..9359df0ca3f1 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -22,6 +22,9 @@ struct btrfs_ordered_stripe { refcount_t ref; }; +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe); int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index e7c0353e5655..7a784bb511ed 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -35,6 +35,7 @@ #include "relocation.h" #include "scrub.h" #include "super.h" +#include "raid-stripe-tree.h" #define BTRFS_BLOCK_GROUP_STRIPE_MASK (BTRFS_BLOCK_GROUP_RAID0 | \ BTRFS_BLOCK_GROUP_RAID10 | \ @@ -6311,12 +6312,21 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op, return U64_MAX; } -static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map, - u32 stripe_index, u64 stripe_offset, u64 stripe_nr) +static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, + u64 logical, u64 *length, struct btrfs_io_stripe *dst, + struct map_lookup *map, u32 stripe_index, + u64 stripe_offset, u64 stripe_nr) { dst->dev = map->stripes[stripe_index].dev; + + if (op == BTRFS_MAP_READ && + btrfs_need_stripe_tree_update(fs_info, map->type)) + return btrfs_get_raid_extent_offset(fs_info, logical, length, + map->type, dst); + dst->physical = 
map->stripes[stripe_index].physical + stripe_offset + stripe_nr * map->stripe_len; + return 0; } int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, @@ -6505,13 +6515,14 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, smap->dev = dev_replace->tgtdev; smap->physical = physical_to_patch_in_first_stripe; *mirror_num_ret = map->num_stripes + 1; + ret = 0; } else { - set_io_stripe(smap, map, stripe_index, stripe_offset, - stripe_nr); *mirror_num_ret = mirror_num; + ret = set_io_stripe(fs_info, op, logical, length, smap, + map, stripe_index, stripe_offset, + stripe_nr); } *bioc_ret = NULL; - ret = 0; goto out; } @@ -6523,8 +6534,14 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, } for (i = 0; i < num_stripes; i++) { - set_io_stripe(&bioc->stripes[i], map, stripe_index, stripe_offset, - stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, stripe_index, + stripe_offset, stripe_nr); + if (ret) { + btrfs_put_bioc(bioc); + goto out; + } + stripe_index++; } From patchwork Wed Feb 8 10:57:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132971
From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes
Thumshirn , Josef Bacik Subject: [PATCH v5 07/13] btrfs: add raid stripe tree pretty printer Date: Wed, 8 Feb 2023 02:57:44 -0800 Message-Id: <64a6d9c550f9b1e52e3b3817e6f9d90668c2ae31.1675853489.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Decode raid-stripe-tree entries on btrfs_print_tree(). Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/print-tree.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c index b93c96213304..d9506d54298b 100644 --- a/fs/btrfs/print-tree.c +++ b/fs/btrfs/print-tree.c @@ -9,6 +9,7 @@ #include "print-tree.h" #include "accessors.h" #include "tree-checker.h" +#include "raid-stripe-tree.h" struct root_name_map { u64 id; @@ -28,6 +29,7 @@ static const struct root_name_map root_map[] = { { BTRFS_FREE_SPACE_TREE_OBJECTID, "FREE_SPACE_TREE" }, { BTRFS_BLOCK_GROUP_TREE_OBJECTID, "BLOCK_GROUP_TREE" }, { BTRFS_DATA_RELOC_TREE_OBJECTID, "DATA_RELOC_TREE" }, + { BTRFS_RAID_STRIPE_TREE_OBJECTID, "RAID_STRIPE_TREE" }, }; const char *btrfs_root_name(const struct btrfs_key *key, char *buf) @@ -187,6 +189,20 @@ static void print_uuid_item(struct extent_buffer *l, unsigned long offset, } } +static void print_raid_stripe_key(struct extent_buffer *eb, u32 item_size, + struct btrfs_stripe_extent *stripe) +{ + int num_stripes; + int i; + + num_stripes = item_size / sizeof(struct btrfs_raid_stride); + + for (i = 0; i < num_stripes; i++) + pr_info("\t\t\tstride %d devid %llu physical %llu\n", i, + btrfs_raid_stride_devid_nr(eb, stripe, i), + btrfs_raid_stride_physical_nr(eb, stripe, i)); +} + /* * Helper to output refs and locking status of extent buffer. Useful to debug * race condition related problems. 
@@ -351,6 +367,11 @@ void btrfs_print_leaf(struct extent_buffer *l) print_uuid_item(l, btrfs_item_ptr_offset(l, i), btrfs_item_size(l, i)); break; + case BTRFS_RAID_STRIPE_KEY: + print_raid_stripe_key(l, btrfs_item_size(l, i), + btrfs_item_ptr(l, i, + struct btrfs_stripe_extent)); + break; } } } From patchwork Wed Feb 8 10:57:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132974 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 812B1C05027 for ; Wed, 8 Feb 2023 10:58:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229500AbjBHK6H (ORCPT ); Wed, 8 Feb 2023 05:58:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229807AbjBHK6D (ORCPT ); Wed, 8 Feb 2023 05:58:03 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8753513538 for ; Wed, 8 Feb 2023 02:58:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853882; x=1707389882; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e4jlg4v/7Iq8bHhberMtgSbo8vYGpRLrYipOjTT9yAU=; b=jHSSbHk67dMsmUm4Ni1ZT8i/ZypWtbbaQi/n4XtLQWImzGdVpcYP92VH KBP4zKnr+kx9DMfDo6seUvhgNf6GtDUOjIpv3XyVkE/oyU4W7pznksljl w+XVz7il7RGJJG0QeXKACJQqJjfdcHrNC+ipl+E61K91guG7JX76K85Og jAOiPjWOjDi8lVKtkMElLxmBp7MB/dFT66snmVBUXOdS+DFhGl1XvtEhL 2DhphqP5RvCFDGrJPljybrukRd1ZRVwq/17szFBmBjgPxD17LP2Vx+ycx NS4NEIpnHVaI9UQr5nZbSj5v9dvPNezvbILQyjxymf+/9SiS5iej0aaqD w==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; 
d="scan'208";a="221115638" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:01 +0800 IronPort-SDR: nYOh7JaBQ2MaiynWaGhRP4l4hy6xAh1Ecc3x+JUiKsCd/2GuaTL4sYbrkFfHVdGGqNQVocPl+u g/JO/tL6NOoYFsAWAdF+vVx9Iyy9fWLbkiSWPsGk2JPr6Et48CMvIr1c32pGX5Tv19hvAu4oq1 TtIRZy9FmwrUZOMwdC0MtzATUJw60l3au6VH8R5/hzjqaom76I4q6KfI0AlyTTIXcM5YyG+T8K gchz0742VBn4SEAWzwLgnMWDd3gAKAe4kCkaoE/kYbqjAax1TJalaBDhRIa5PN3/swT/E+Fl9I /wk= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:16 -0800 IronPort-SDR: NWm5MLkJ0KZf6ZQ3WHghNrOfc7rxBw+/QtkbDfbdb1+PMRMN1jpe7PYxB6Be8RmrQNNgsy287m 6JpwbGwPsUQNlxxv94DFnGe2nC3PRi1YKh4HwV+abR2ocrxA/YauKM9LQXE3nDnoYxYkPuDXE8 KDJxd+3uLfFYDAfXV6H2TLa3UNa5DRqCAwsuCZqPgSs60wo5ZrsK6XzVonhkUUoGj3hJJdwICF VAitOQ34S7Rb4btm2rWI70kI2yzjTIYfoJMtU641VMEHSIbYYj38B1K9TOAtkH9HTg07MdiTp2 Aps= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:01 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 08/13] btrfs: zoned: allow zoned RAID Date: Wed, 8 Feb 2023 02:57:45 -0800 Message-Id: <946bf77cc07eba1b536466c6da1ce8c575865e7e.1675853489.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org When we have a raid-stripe-tree, we can do RAID0/1/10 on zoned devices for data block-groups. For meta-data block-groups, we don't actually need anything special, as all meta-data I/O is protected by the btrfs_zoned_meta_io_lock() already. 
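As a rough illustration of the rule described above (not part of the patch itself), the profile check that decides whether a block group goes through the stripe tree can be mirrored in user space. This is only a sketch: the `BG_*` names and `need_stripe_tree_update()` are hypothetical stand-ins, the flag values are assumed to match `include/uapi/linux/btrfs_tree.h`, and the `has_stripe_root` precondition is an assumption about the part of `btrfs_need_stripe_tree_update()` that this diff does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Block-group flag bits; assumed to match include/uapi/linux/btrfs_tree.h. */
#define BG_DATA     (1ULL << 0)
#define BG_SYSTEM   (1ULL << 1)
#define BG_METADATA (1ULL << 2)
#define BG_RAID0    (1ULL << 3)
#define BG_RAID1    (1ULL << 4)
#define BG_DUP      (1ULL << 5)
#define BG_RAID10   (1ULL << 6)
#define BG_RAID1C3  (1ULL << 9)
#define BG_RAID1C4  (1ULL << 10)

#define BG_TYPE_MASK  (BG_DATA | BG_SYSTEM | BG_METADATA)
#define BG_RAID1_MASK (BG_RAID1 | BG_RAID1C3 | BG_RAID1C4)

/* Profiles that this patch routes through the raid-stripe-tree. */
#define BG_RST_PROFILES (BG_DUP | BG_RAID1_MASK | BG_RAID0 | BG_RAID10)

/*
 * Hypothetical user-space mirror of btrfs_need_stripe_tree_update().
 * has_stripe_root stands in for "the filesystem has a stripe tree root"
 * (an assumption; that check is outside the hunk shown in this patch).
 */
static bool need_stripe_tree_update(bool has_stripe_root, uint64_t map_type)
{
	uint64_t type = map_type & BG_TYPE_MASK;

	if (!has_stripe_root)
		return false;

	/* Only pure data block groups use the stripe tree. */
	if (type != BG_DATA)
		return false;

	return (map_type & BG_RST_PROFILES) != 0;
}
```

Under these assumptions, a data RAID0 or DUP block group yields true, while a metadata RAID1 block group or a single-profile data block group yields false.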
Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/raid-stripe-tree.c | 4 ++++ fs/btrfs/raid-stripe-tree.h | 10 +++++++++ fs/btrfs/volumes.c | 5 ++++- fs/btrfs/zoned.c | 45 +++++++++++++++++++++++++++++++++++-- 4 files changed, 61 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index ba7015a8012c..1eaa97378d1c 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -268,10 +268,12 @@ static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info, int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, u64 logical, u64 *length, u64 map_type, + u32 stripe_index, struct btrfs_io_stripe *stripe) { struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); int num_stripes = btrfs_bg_type_to_factor(map_type); + const bool is_dup = map_type & BTRFS_BLOCK_GROUP_DUP; struct btrfs_stripe_extent *stripe_extent; struct btrfs_key stripe_key; struct btrfs_key found_key; @@ -343,6 +345,8 @@ int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, if (btrfs_raid_stride_devid_nr(leaf, stripe_extent, i) != stripe->dev->devid) continue; + if (is_dup && (stripe_index - 1) != i) + continue; stripe->physical = btrfs_raid_stride_physical_nr(leaf, stripe_extent, i) + offset; ret = 0; diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 9359df0ca3f1..c7f6c5377aaa 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -24,6 +24,7 @@ struct btrfs_ordered_stripe { int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, u64 logical, u64 *length, u64 map_type, + u32 stripe_index, struct btrfs_io_stripe *stripe); int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length); @@ -50,9 +51,18 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, if (type != BTRFS_BLOCK_GROUP_DATA) return false; + if (profile & BTRFS_BLOCK_GROUP_DUP) + return true; + if (profile 
& BTRFS_BLOCK_GROUP_RAID1_MASK) return true; + if (profile & BTRFS_BLOCK_GROUP_RAID0) + return true; + + if (profile & BTRFS_BLOCK_GROUP_RAID10) + return true; + return false; } diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 7a784bb511ed..ef626f932af5 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6322,7 +6322,8 @@ static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, if (op == BTRFS_MAP_READ && btrfs_need_stripe_tree_update(fs_info, map->type)) return btrfs_get_raid_extent_offset(fs_info, logical, length, - map->type, dst); + map->type, stripe_index, + dst); dst->physical = map->stripes[stripe_index].physical + stripe_offset + stripe_nr * map->stripe_len; @@ -6508,6 +6509,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, * I/O context structure. */ if (smap && num_alloc_stripes == 1 && + !(btrfs_need_stripe_tree_update(fs_info, map->type) && + op != BTRFS_MAP_READ) && !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1) && (!need_full_stripe(op) || !dev_replace_is_ongoing || !dev_replace->tgtdev)) { diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index ed49150e6e6f..9796f76cffd6 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1476,8 +1476,9 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags); break; case BTRFS_BLOCK_GROUP_DUP: - if (map->type & BTRFS_BLOCK_GROUP_DATA) { - btrfs_err(fs_info, "zoned: profile DUP not yet supported on data bg"); + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, "zoned: data DUP profile needs stripe_root"); ret = -EINVAL; goto out; } @@ -1515,8 +1516,48 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) cache->zone_capacity = min(caps[0], caps[1]); break; case BTRFS_BLOCK_GROUP_RAID1: + case BTRFS_BLOCK_GROUP_RAID1C3: + case BTRFS_BLOCK_GROUP_RAID1C4: case 
BTRFS_BLOCK_GROUP_RAID0: case BTRFS_BLOCK_GROUP_RAID10: + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, + "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + ret = -EIO; + goto out; + + } + + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV || + alloc_offsets[i] == WP_CONVENTIONAL) + continue; + + if (i == 0) + continue; + + if (alloc_offsets[0] != alloc_offsets[i]) { + btrfs_err(fs_info, + "zoned: write pointer offset mismatch of zones in RAID profile"); + ret = -EIO; + goto out; + } + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(cache)) { + ret = -EIO; + goto out; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &cache->runtime_flags); + } + cache->zone_capacity = min(caps[0], caps[i]); + } + cache->alloc_offset = alloc_offsets[0]; + break; case BTRFS_BLOCK_GROUP_RAID5: case BTRFS_BLOCK_GROUP_RAID6: /* non-single profiles are not supported yet */ From patchwork Wed Feb 8 10:57:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D2E5C636CC for ; Wed, 8 Feb 2023 10:58:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229706AbjBHK6G (ORCPT ); Wed, 8 Feb 2023 05:58:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229509AbjBHK6D (ORCPT ); Wed, 8 Feb 2023 05:58:03 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
BA37113DDE for ; Wed, 8 Feb 2023 02:58:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853882; x=1707389882; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zDSC7QZ7gl2/8xr/fpmdU9u2tvW8IXXjHfrdTCN8mlQ=; b=P3X3iqVJtu4mBO7nj10hLvytWxyCjMobp2m8MGEGYLlyoXw1ZjNbApJZ cfsGtp9r8pr7Hruii7juaQA+3En1zHs5MkL7LUi7VPNmRcAc40r0ZA2sK 6RSP9+BLYfF+v0N2tQfW5UTCd6pyuqP7/OHyPeaBA4q7KxTdf89rAet9+ 8tPqXV181QXTgilc/N3PDzjjdsbScLcH0w1+ctYo6qLEOAt7E0Ygwbh6S EExw+jf6mbxzbtFg8GUaS9QWX+TW6l5UDH9R/sUUuV60SLkp+6SdJFF/r xmk+DLcOYNSoBaFxziepG7eDyhBFAnHCJ4+3Swsw+Q73KehWP6iA0ItvA Q==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; d="scan'208";a="221115641" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:01 +0800 IronPort-SDR: 134QBTgmmymvUDKry1Flwy+b8UmPuMS1/t7uC728V8km8ZfB6FKcBWq/1hVhuMix1uXFIBvt0M 8NRd1Fzb68k1Rv5pJBCHopHebPFPd7rBQ8RDqO6dBGW58AOLTSGhFncx4XOOGeq17jMMHudl2t DvYa3ecFkSfKxvArel8pWltCu3n2WBSoAJM/hvloxzrjyoEfEMi+BGK7uaiJKqXZpAX9YeAoJe Q6C/giY45CmoOMmRjRdtVP8jlgX3dQPaEcHLyT8R3O4JgSD41mxqeD918sa7CYFfuTSLzCW1hC xfM= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:17 -0800 IronPort-SDR: zDowUkgHzqoWjadgupYPsJ5fjxFcNmsYoe7BeZX/GKTSAzaTksuiuBanbqyFbrGefdx2021M2a bAtbjkCTjuO5wYldv6KcSJ+GCrnLVf7OslQdAqKm6/1eLwluOglAD8ntxK3yNePLJ7VuiPdULx wJzdtWJCS1G2Yd0k4HvcNRMPlsMR6PlREXB5FjpXikjKl08gY1uCA/zD4MJ8Y8duphZmi4pRwS 59j7vO/2zaKKfdLW80/PYmXgrbXDl3xVM9ZKXPGExjx4+Ne7nIdW/HcB6NGRqL4D7/a0GGhSya 87Y= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:02 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 09/13] btrfs: check 
for leaks of ordered stripes on umount Date: Wed, 8 Feb 2023 02:57:46 -0800 Message-Id: X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Check if we're leaking any ordered stripes when unmounting a filesystem with a stripe tree. This check is gated behind CONFIG_BTRFS_DEBUG so that it does not affect production systems. Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/disk-io.c | 2 ++ fs/btrfs/raid-stripe-tree.c | 30 ++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 1 + 3 files changed, 33 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b130c8dcd8d9..f2de4d3600d6 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -52,6 +52,7 @@ #include "relocation.h" #include "scrub.h" #include "super.h" +#include "raid-stripe-tree.h" #define BTRFS_SUPER_FLAG_SUPP (BTRFS_HEADER_FLAG_WRITTEN |\ BTRFS_HEADER_FLAG_RELOC |\ @@ -1538,6 +1539,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) btrfs_put_root(fs_info->stripe_root); btrfs_check_leaked_roots(fs_info); btrfs_extent_buffer_leak_debug_check(fs_info); + btrfs_check_ordered_stripe_leak(fs_info); kfree(fs_info->super_copy); kfree(fs_info->super_for_commit); kfree(fs_info->subpage_info); diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 1eaa97378d1c..32a428413f5b 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -36,6 +36,36 @@ static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb) return ordered_stripe_cmp(&stripe->logical, rbb); } +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info) +{ +#ifdef CONFIG_BTRFS_DEBUG + struct rb_node *node; + + if (!btrfs_stripe_tree_root(fs_info) || + RB_EMPTY_ROOT(&fs_info->stripe_update_tree)) + return; + + WARN_ON_ONCE(1); + write_lock(&fs_info->stripe_update_lock); + while ((node = rb_first_postorder(&fs_info->stripe_update_tree)) + != 
NULL) { + struct btrfs_ordered_stripe *stripe = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + + write_unlock(&fs_info->stripe_update_lock); + btrfs_err(fs_info, + "ordered_stripe [%llu, %llu] leaked, refcount=%d", + stripe->logical, stripe->logical + stripe->num_bytes, + refcount_read(&stripe->ref)); + while (refcount_read(&stripe->ref) > 1) + btrfs_put_ordered_stripe(fs_info, stripe); + btrfs_put_ordered_stripe(fs_info, stripe); + write_lock(&fs_info->stripe_update_lock); + } + write_unlock(&fs_info->stripe_update_lock); +#endif +} + int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) { struct btrfs_fs_info *fs_info = bioc->fs_info; diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index c7f6c5377aaa..371409351d60 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -38,6 +38,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe); +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info); static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, u64 map_type) From patchwork Wed Feb 8 10:57:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF0F9C64EC5 for ; Wed, 8 Feb 2023 10:58:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230176AbjBHK6J (ORCPT ); Wed, 8 Feb 2023 05:58:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with 
ESMTP id S229888AbjBHK6F (ORCPT ); Wed, 8 Feb 2023 05:58:05 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29A4B1BD0 for ; Wed, 8 Feb 2023 02:58:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853882; x=1707389882; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mkCi1575S3bvKqEmnvqKepM6XD8ejcSOfuyZUT30T/Q=; b=Vn9dbuVtHMFufaK3wJCodRFAsur1YP2KO+l1w8SNtE+SoME2M0mnyaTe PjkfiXrb0vYCBctZYUkNGDQon0DHw/VtpdpvPVUYTUgxGTIhudmPTkuBv QIH4kagu1wK8sbKO3Uv4MkVU2t6lW5CHyBRkwT/pnNi3jNGiFo5ywFPWC HnHX8dWDHPGVvXaXfTvJxn5UpCxxxsvolgcxVWlbaBzu/0BOZJk0f9tp+ PUImYV025iIBBkbkvYz5fLdHr6fBo7zevk53rQ/3d1gmfWevH1OKjzAfu f/GC8xNCVGc2+y4Tl/ieIezfyO1+Q7ppx9qculEqDzEt+gK+oVT9KQ7q3 g==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; d="scan'208";a="221115643" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:02 +0800 IronPort-SDR: d6i0lgBjrV9jJp4DWwpqtp3hsYdU+JUihQErUFZjGRCFUzyZxH9XD8V2fErpci/InhbRrhrbLV PQOLlKtJXHSl6CAe4clUdNw2/Ffu8OYpCa73x72vV99Fn5bI8hAOENt2Au8uHMpWdr3izWR2kz qMmxymUCH8z/0ZM+icUXa664UKoPi/BxvCdfS7jl4S6bSWvDXntElTmALoSJsBQYjEXnsVj80A 8wmaKTFNwyVbyba7MbDFKmEQkogmj2uOwdNg9tebuILh7Bac6K/gEGC4GIi0cGW8NHw/b3TYsF ZPA= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:18 -0800 IronPort-SDR: qrQcgr7BNOhtmqrB6XpopO3fizpjthm5ULRsnngL7GEgHNC6RLlY2BtIlf5T5+xtBBWmkCNiA7 7QaoIHObR0ae+AX/zhrO10WcJgTMf+xF/huIVNIm2Ex5jNeBkeLxU6xslZT8OZG+oEj+gGj4KO oWZPWVb0kJ55dlrU7/wiJiJI3SCtWHqatyQ9vJJ3L5bKTQo1R9gjJNb4s4qCcP/C97cBVLB6pe kO/LkNi7a2ug4+xmgSpkJDZ3eRP7ybSwYhMVtZo9LhEy2XgVIq5HUyIjQEl1Whd76e+L/QUW9J sS4= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) 
([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:02 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn , Josef Bacik Subject: [PATCH v5 10/13] btrfs: add tracepoints for ordered stripes Date: Wed, 8 Feb 2023 02:57:47 -0800 Message-Id: <10aae0632d221643b66784b9f1dccdee1c90fc5b.1675853489.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add tracepoints to check the lifetime of btrfs_ordered_stripe entries. Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.c | 3 +++ fs/btrfs/super.c | 1 + include/trace/events/btrfs.h | 50 ++++++++++++++++++++++++++++++++++++ 3 files changed, 54 insertions(+) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 32a428413f5b..4d4c7870547c 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -113,6 +113,7 @@ int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) write_unlock(&fs_info->stripe_update_lock); } + trace_btrfs_ordered_stripe_add(fs_info, stripe); return 0; } @@ -128,6 +129,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *f if (node) { stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); refcount_inc(&stripe->ref); + trace_btrfs_ordered_stripe_lookup(fs_info, stripe); } read_unlock(&fs_info->stripe_update_lock); @@ -138,6 +140,7 @@ void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe) { write_lock(&fs_info->stripe_update_lock); + trace_btrfs_ordered_stripe_put(fs_info, stripe); if (refcount_dec_and_test(&stripe->ref)) { struct rb_node *node = &stripe->rb_node; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index e5136baef9af..7235106e8d08 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -59,6 +59,7 @@ #include "verity.h" #include "super.h" #include 
"extent-tree.h" +#include "raid-stripe-tree.h" #define CREATE_TRACE_POINTS #include diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 75d7d22c3a27..8efea1445dd9 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -33,6 +33,7 @@ struct btrfs_space_info; struct btrfs_raid_bio; struct raid56_bio_trace_info; struct find_free_extent_ctl; +struct btrfs_ordered_stripe; #define show_ref_type(type) \ __print_symbolic(type, \ @@ -2492,6 +2493,55 @@ DEFINE_EVENT(btrfs_raid56_bio, raid56_scrub_read_recover, TP_ARGS(rbio, bio, trace_info) ); +DECLARE_EVENT_CLASS(btrfs__ordered_stripe, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe), + + TP_STRUCT__entry_btrfs( + __field( u64, logical ) + __field( u64, num_bytes ) + __field( int, num_stripes ) + __field( int, ref ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->logical = stripe->logical; + __entry->num_bytes = stripe->num_bytes; + __entry->num_stripes = stripe->num_stripes; + __entry->ref = refcount_read(&stripe->ref); + ), + + TP_printk_btrfs("logical=%llu, num_bytes=%llu, num_stripes=%d, ref=%d", + __entry->logical, __entry->num_bytes, + __entry->num_stripes, __entry->ref) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_add, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_lookup, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_put, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */ From patchwork Wed Feb 8 10:57:48 2023 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C9ECC636D7 for ; Wed, 8 Feb 2023 10:58:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229688AbjBHK6I (ORCPT ); Wed, 8 Feb 2023 05:58:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230075AbjBHK6E (ORCPT ); Wed, 8 Feb 2023 05:58:04 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFAEB144A3 for ; Wed, 8 Feb 2023 02:58:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853883; x=1707389883; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=I0Gn7m5tulzrtzgvKene6GK3lQmNBfTo7Rai58K13dY=; b=NdLYgc4jfbPuW1flTxpXs1gPoh1FdM1s9io818qeSBzMrY/qjWaZOki0 +NA/nqVEz4fslAPguNEnG0Kv0h0Ypc3EzaKVXwoQyx2q6gyh/Mwbl6DMR NqUXzGOmQdN8R6JCrerI0nkSmDwFGeosRbvbgoyHyaHyb/WH2XNBDOBbj H0Jmitcgc4e1uNZA96YwmVWpEaqI77W51uCjbMk/AjYQw/VdVxV+1ZsZx AhvOuNFoLJQapjcA1K0uVSDEf+Q5ZcvtBF/np7qmD6y+zd7J9aTokJUE4 1Ye3FQJkGETEooSq7XewtCuUvzlajcmVAe+9Lzsil/IYedgBXTFI/E+WI w==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; d="scan'208";a="221115647" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:03 +0800 IronPort-SDR: lpryzls5OmJYtJozy7BIsIOjzAof6GRqT9kAX6OEMXDmitRtF457KlZjiHBfOmwQHcTKZ2Tl7J zbujgkZYojzfLXBseCINfiPWUaPVMafbcb3bGYO3IEpzmOBNSN6TQP96f/NQRWyapKcpSvsfji 
bqdDo+sPrPng1p9o5nObT9b384RnIVlGpoXPGSZuVrkwR3Ns80nYLrq3il1VqyPZuXQ5mUt0g7 GN6RDSxMfWlcAtNoNzDGUTjmGT0JlLcH7qt9T7Lt4bsh3DW7N1pBXfwDdbsD3hCT/04iWhQDyM Am8= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:19 -0800 IronPort-SDR: 9CiCrT+4HAfOy8YeLBz12EkL75j0TeWG/mV6Am4emqaHHe7KjoiYsBH5MVnulamheOsV3NZ5M7 NafwTu0yHIoUpRYSYj1sHx5ldM+O5Rrz+i9B5GgFaCjDU8lgzGm0ToTDGj7uDvVdcVt3aCeibO e3un90zoYWRG/BcUCc/awPUQO/AUfKv/tt6h+H2HsLk9ZTNiY32GaNBYfMhuwKfxVe7iuCMS1Y +NCe6f3QMd4+D6zSknHk5TZumAFUZ/djSTlkzY58tRYkOmfkB4ifgntpVgN4hZUKAIo2KM433p 9tc= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:03 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 11/13] btrfs: announce presence of raid-stripe-tree in sysfs Date: Wed, 8 Feb 2023 02:57:48 -0800 Message-Id: X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org If a filesystem with a raid-stripe-tree is mounted, show the RST feature in sysfs. 
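For illustration only (not part of the patch), a user-space check for the new attribute might look like the sketch below. The helper names `feature_path()` and `feature_visible()` are hypothetical, and the sysfs layout (`/sys/fs/btrfs/features/<name>` for kernel support, `/sys/fs/btrfs/<UUID>/features/<name>` for a mounted filesystem) is an assumption based on how the existing btrfs feature attributes are exposed. Since the attribute is registered under CONFIG_BTRFS_DEBUG, it would presumably be absent on non-debug kernels.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Build "<base>/features/<feature>" into buf.
 * Returns 0 on success, -1 if the result would not fit.
 * Hypothetical helper; the directory layout is an assumption.
 */
static int feature_path(char *buf, size_t len, const char *base,
			const char *feature)
{
	int n = snprintf(buf, len, "%s/features/%s", base, feature);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Return 1 if the feature file exists below base, 0 otherwise.
 * base would be "/sys/fs/btrfs" (kernel support) or
 * "/sys/fs/btrfs/<UUID>" (a specific mounted filesystem).
 */
static int feature_visible(const char *base, const char *feature)
{
	char path[4096];

	if (feature_path(path, sizeof(path), base, feature))
		return 0;

	return access(path, F_OK) == 0;
}
```

A caller could pass `"raid_stripe_tree"` as the feature name, which is the attribute name registered by `BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE)` in this patch.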
Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 108aa3876186..776f9b897642 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -296,6 +296,8 @@ BTRFS_FEAT_ATTR_INCOMPAT(zoned, ZONED); #ifdef CONFIG_BTRFS_DEBUG /* Remove once support for extent tree v2 is feature complete */ BTRFS_FEAT_ATTR_INCOMPAT(extent_tree_v2, EXTENT_TREE_V2); +/* Remove once support for raid stripe tree is feature complete */ +BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE); #endif #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_COMPAT_RO(verity, VERITY); @@ -326,6 +328,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = { #endif #ifdef CONFIG_BTRFS_DEBUG BTRFS_FEAT_ATTR_PTR(extent_tree_v2), + BTRFS_FEAT_ATTR_PTR(raid_stripe_tree), #endif #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_PTR(verity), From patchwork Wed Feb 8 10:57:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132978 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2242CC05027 for ; Wed, 8 Feb 2023 10:58:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229888AbjBHK6L (ORCPT ); Wed, 8 Feb 2023 05:58:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229527AbjBHK6G (ORCPT ); Wed, 8 Feb 2023 05:58:06 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BD931350E for ; Wed, 8 Feb 2023 02:58:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; 
i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853884; x=1707389884; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oh3KNTXYN5CoEVcuSykxoAqYMbfvM/Ya1AGBLL/lGNE=; b=eyFQVgz3/gBQqYIJqHUh/wTWphye+rnYqVn8KhNPfwi87IalcPF/QtT3 i3EmmY0KBdXdS6DaUywgleqBhjWOdkmp0NG1iRBiXHFtXzOgHggjkNNGS aq3dgw9lF1r7RZ+DprYTBzvGefRRqapWCed6YfU+jJCbNR2bKN6iumacM dH1LpTp6ClWDRe4dq0K0CGrv5zbDo1SBuY3IeyWMeo8Mkpsc7U6hWmn5T 7xbqIQpecc6JOf+HKd+S5GXCUW+Ot+o1lGZ9ezv3y4LA6QmGXGnvJt3JE LTHA+98AwUEl5gwL3o/bLf+u+8xHKP5M7MCSvJbc7y8cu70x3ZP0jEkWs Q==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; d="scan'208";a="221115649" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:04 +0800 IronPort-SDR: t2u/Ll4yiTp6szkrMji2SNZXqd86l7rGf5wzKevrzjXzksDCYiPKHc6jMY9Z47IeFGB88RbEXx ShGo4M3fUvJuAQOZEzTRpQlEago1jsh3uGMK9F3A074PUk1DC0aXcI4f3DApvl/bkta6kCiPWu WMqDC8uCbXK66dh3slB6DAuO27UlBqrJJwWDZjfwEh4xGA/QFRem7id0P9nO0me+cKzcTgS6Az HKxtCaqqq8m3TnQ+9XrW5JWt1g2AjjlmHk9WHiZLLgq2dGDxBIy8hd29O7YlAgXSOEotFX2iE4 szs= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:19 -0800 IronPort-SDR: XLgaSVOg5XvCTxAKltyWRm4Biuvl3thLsBhMEzOUhy/3WOssA/SeTV0uFhhY5Ampi2yhDSb3wn r2oxk+or5x3EQaQj7k3uzpSqfLjqkUpxX4S55gica30EkzSPbZln9ChHQFR9aDS1jLaJm7OxsU QRbCkkmdOlmWf+zcl+PsDFRuZBFQrh1Az1Gq8BF/7CjGG+Lp7bjaZ6t0nmjxNPepDcPtEGoEcB Vs/HFXz5x1NjSQR4DOSHvan0djdU9kNOs+AhTlRtubUD8BhBOIYQWxNe5nzemC8skcnpCaZUVr 268= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:04 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 12/13] btrfs: consult raid-stripe-tree when scrubbing Date: Wed, 8 Feb 2023 02:57:49 -0800 Message-Id: X-Mailer: git-send-email 2.39.0 
In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org When scrubbing a filesystem which uses the raid-stripe-tree for logical to physical address translation, consult the RST to perform the address translation instead of relying on fixed block group offsets. Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/scrub.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index a5d026041be4..d456dda8c5b0 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -24,6 +24,7 @@ #include "accessors.h" #include "file-item.h" #include "scrub.h" +#include "raid-stripe-tree.h" /* * This is only the first step towards a full-features scrub. It reads all @@ -2719,6 +2720,21 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map, int ret; u8 csum[BTRFS_CSUM_SIZE]; u32 blocksize; + struct btrfs_io_stripe stripe; + const bool stripe_update = + btrfs_need_stripe_tree_update(sctx->fs_info, map->type); + + if (stripe_update) { + stripe.dev = src_dev; + ret = btrfs_get_raid_extent_offset(sctx->fs_info, logical, + (u64 *)&len, + map->type, mirror_num, + &stripe); + if (ret) + return ret; + + src_physical = stripe.physical; + } if (flags & BTRFS_EXTENT_FLAG_DATA) { if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) @@ -2772,8 +2788,21 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map, return ret; len -= l; logical += l; - physical += l; - src_physical += l; + if (stripe_update && len) { + + ret = btrfs_get_raid_extent_offset(sctx->fs_info, + logical, (u64 *)&len, + map->type, mirror_num, + &stripe); + if (ret) + return ret; + + src_physical = stripe.physical; + physical = stripe.physical; + } else { + physical += l; + src_physical += l; + } } return 0; } From patchwork Wed Feb 8 10:57:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13132979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16D54C636CC for ; Wed, 8 Feb 2023 10:58:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230211AbjBHK6L (ORCPT ); Wed, 8 Feb 2023 05:58:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36018 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230167AbjBHK6G (ORCPT ); Wed, 8 Feb 2023 05:58:06 -0500 Received: from esa4.hgst.iphmx.com (esa4.hgst.iphmx.com [216.71.154.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E22715CA6 for ; Wed, 8 Feb 2023 02:58:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1675853885; x=1707389885; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HoksUmEZ3eyxw4qrFGWYGpsHqq/p34pWEDYoQk8BqRU=; b=Rv41EJcAhOet/9UejZUlhxaOyxAWgtAXwaApQbvs+Y/EafttgWkS9Q5s 4dAh/IikiFz0vimTLkxFxjVUj2t8lPFjVYyYu4i3R/3leNQwQmxHJsqE9 VO32WAjlv9jBZ68OeVBK9fhhfyFsofNwvhaSSeB+YwKE/rKd8eNly9a6K PYxj5mC6bJteekTi/yNVjZ1JfZghEcrWvIT/IY4R6F66TdHVmvOH5NOyZ HVU2sH1uQg3RKSyobM57FkXEdsEFwy2b58EuXwGRS7NYfWtYm+X1MJ6Q+ m9W6cLDAVeraaeq0c7VopeA/ECHaJiLG3odB6iYm8ju71KCJJCfVsqf3r Q==; X-IronPort-AV: E=Sophos;i="5.97,280,1669046400"; d="scan'208";a="221115651" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 08 Feb 2023 18:58:04 +0800 IronPort-SDR: SWfL8jaqcHeZlRFN/iU8S/T9bpWQE9u08tX5+kNx1WJ32Emy4b9Spq6RgTi/0gOB5z411kvjLz 0Y0GMArud/wqGkmQjQjy8SOtqqoMBfO8LPyxEJKMKGG0gPa1/TslcScUKcpCa3S8wNd63doxja GuQFxZYf/T+ASGHid81nFEtHAUXLUJQe9dkCBl+SoQkGgmCMKNZgVBQv4EhDD+dHnKu16eVK6U 
tGbXPAitIj3KI9q0fTJbkvdy8FcUth7NkqeVgv6alXE99QKwQ8iZKrO3HtVoih5kVByh14oWMO FdA= Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 08 Feb 2023 02:15:20 -0800 IronPort-SDR: dhPxIrbEdKWdJm9/pPFogxkHR3KlnStyFExDborFFwQfxf1iXYpGNKdhoMuXj+ARrX00kGy7sU i839BwTwb3l+anwiV6T2W2P+/EJGcQgMeWMR1fTKEjSn+V9sGV9HmS5QuA0gZCUcIrYKOG/Vfx 2dTPawVjifXVhaEoKDPQRKQj7iY5dAWLs3ijSFzmTWurMJkFknwKoZkWmyCoMM9Tr70vVVPc6E SC5dxRI2oAX0QRMBToRlgpTR7FI4FQo2b24KSebC0cWFve6k4xySKoGcJ50wtndzoqwHXSBIX7 6yo= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Feb 2023 02:58:05 -0800 From: Johannes Thumshirn To: linux-btrfs@vger.kernel.org Cc: Johannes Thumshirn Subject: [PATCH v5 13/13] btrfs: add raid-stripe-tree to features enabled with debug Date: Wed, 8 Feb 2023 02:57:50 -0800 Message-Id: X-Mailer: git-send-email 2.39.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Until the RAID stripe tree code is well enough tested and feature complete, "hide" it behind CONFIG_BTRFS_DEBUG so only people who want to use it are actually using it. Signed-off-by: Johannes Thumshirn Reviewed-by: Josef Bacik --- fs/btrfs/fs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index bee7ed0304cd..c0d6dd89e3b0 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -214,7 +214,8 @@ enum { BTRFS_FEATURE_INCOMPAT_METADATA_UUID | \ BTRFS_FEATURE_INCOMPAT_RAID1C34 | \ BTRFS_FEATURE_INCOMPAT_ZONED | \ - BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2) + BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 | \ + BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE) #else #define BTRFS_FEATURE_INCOMPAT_SUPP \ (BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF | \