From patchwork Thu Mar 2 09:45:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156939 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53C0BC678D4 for ; Thu, 2 Mar 2023 09:46:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230000AbjCBJp6 (ORCPT ); Thu, 2 Mar 2023 04:45:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230039AbjCBJp4 (ORCPT ); Thu, 2 Mar 2023 04:45:56 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EE143B875 for ; Thu, 2 Mar 2023 01:45:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750341; x=1709286341; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6/eo7HuIQ7ocuWDBEO4ykBPaYkvBhZ5F6L1dYcDQec0=; b=rhEPno7D9t5bwiP0+wpiDnoQeVYJWJoHvpK5VbDS5eJDM7GJRy+lx9NG EP9oYwlHJoJGXjanL9zXlcJ0fHzLTSpvZuPqUKZQ6zuHLToBJkcwp9FOd BQO/yT1GC27YAAAiWPMXbpyWK5YM8f5MiTURbU5rCe02eFlv9jkQt7POT zJyUgXQU/w3CssrV3czh+0k1xlAh2p81+l4PnmZUenUspH7WUdy+mzhST M3h/x4HYG+jWXK2uOWmBaH6GDzWOLy9rqrdSKxgRe4t9oSwPk0JP+sVAP LARqVek067wH84UEcqnwFgqVDemoIehdFdESC+BBvtRxlEwWKjpQ1e8UD Q==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939171" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:39 +0800 IronPort-SDR: YOoGvDn0oNc2KOBu6AD4ze2YliDAQ31tjUTCwAt5fekiU/+bL7NOyOBPlc6p9V7+k87NXGsbQ9 
gsmx4O+D/fMkEi1VMaOXbluSPcyes6DtQiMJrXch8te2Q+Mm1/lycR7pZiw08KAZNK10g1dr/S 9Nl4dqEBwM9+HStBwYicD7jeIhOsMarb/oQKLwzrkqk8mSLG/2QTI+lLHRNVCTl237Zg2+/bRJ jQOeAJ9fOQB4w+vRf6vqRvCMaai8omjoiv4ekuSR7+P4vRefJ6iLX72k7DKDci/V+pGjpK6TG/ hqM= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:43 -0800 IronPort-SDR: /IphbyYWcls/wcIATun39z8rzmRG2YEhCY4+OLjWBQ3/rIksw1iYfPcsXlbn5xxffwo623YanQ /j40ZF2HW6zpM33vv+ULjVhC74xV37X3rndwdWk2NhsD4VRdDkGXW3TYITly0qtysdVpBYzIje XE0soipKmucDCcpPfxrNJEtCz67nou3scrUUa9KloarFQdtRR4z10/8iTVixRu4B8nFXVmkFsU 6UYhP9nGSiB+h7QLEe7sGo1K+eDYCn5AaAwnn/iHPM9rcaxVFVTAWX370x9gDLhLsyEBcYygiR Gg4= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:39 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 01/13] btrfs: re-add trans parameter to insert_delayed_ref Date: Thu, 2 Mar 2023 01:45:23 -0800 Message-Id: X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Re-add the trans parameter to insert_delayed_ref as it is needed again later in this series. This reverts commit bccf28752a99 ("btrfs: drop trans parameter of insert_delayed_ref") Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn Reviewed-by: Anand Jain --- fs/btrfs/delayed-ref.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 886ffb232eac..7660ac642c81 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -598,7 +598,8 @@ void btrfs_delete_ref_head(struct btrfs_delayed_ref_root *delayed_refs, * Return 0 for insert. * Return >0 for merge. 
*/ -static int insert_delayed_ref(struct btrfs_delayed_ref_root *root, +static int insert_delayed_ref(struct btrfs_trans_handle *trans, + struct btrfs_delayed_ref_root *root, struct btrfs_delayed_ref_head *href, struct btrfs_delayed_ref_node *ref) { @@ -974,7 +975,7 @@ int btrfs_add_delayed_tree_ref(struct btrfs_trans_handle *trans, head_ref = add_delayed_ref_head(trans, head_ref, record, action, &qrecord_inserted); - ret = insert_delayed_ref(delayed_refs, head_ref, &ref->node); + ret = insert_delayed_ref(trans, delayed_refs, head_ref, &ref->node); spin_unlock(&delayed_refs->lock); /* @@ -1066,7 +1067,7 @@ int btrfs_add_delayed_data_ref(struct btrfs_trans_handle *trans, head_ref = add_delayed_ref_head(trans, head_ref, record, action, &qrecord_inserted); - ret = insert_delayed_ref(delayed_refs, head_ref, &ref->node); + ret = insert_delayed_ref(trans, delayed_refs, head_ref, &ref->node); spin_unlock(&delayed_refs->lock); /* From patchwork Thu Mar 2 09:45:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156940 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11956C6FA8E for ; Thu, 2 Mar 2023 09:46:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230086AbjCBJqA (ORCPT ); Thu, 2 Mar 2023 04:46:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53436 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230072AbjCBJp4 (ORCPT ); Thu, 2 Mar 2023 04:45:56 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5C2221F5C0 for ; Thu, 2 Mar 2023 01:45:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; 
d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750342; x=1709286342; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YTeD/W4HwepwYrQ4gN7KD38BpdkOVOpFOk9Fnt+u+xs=; b=e8/xf110xLLIOUUi2qS5ROCPITo8K2BGj31WuqXuM5DM90XzBrxq69jK gcAa24eAXbnsIq2nStKlUHJiqkvhaIfiB+qy+H8fgDot8znXBBFBCFedU X95YahsN/d4TSwK5jwJrhrTDI23aXHLH3bw027o4Vuxe+36x6tJAGlYTz vQ35vlzVcSRLXiwp7Kdfq/GQgBGltkFlWXeQpnjR/8WScuDgurwxkv6h3 v7aUmyfyLdBrI7fzNxElXRdQPZRTQEQrX+HYisWVPMJPluRBmwIFyuP3O k5ru1TdsN1RvkBhV8QN60AfXN5g5Pwhz69oneTF55DTRuSO5EJ7heDXUC Q==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939176" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:40 +0800 IronPort-SDR: CeEK3vJMhiil2pNssZwEoVoXTYGAPSKuKvtqQj2RZ5gKYA1rAEBzBdMEgBdCuD5QunHdetQnK4 nO0USRXmWQu8xyqS+VdODYIH8DOWNsjr93FSbksp7qKuxWVUu1ch5R0kyJgFzvQMgI1Iekjgh0 pLbYB11LGI/TkuqBkkt1qhMV2waPPyUuocJVfXyVTwMfdcPSyXLrvyxeuZaUuVY0X9KG/31Ziw PtLyPum82IWPcf/+4GskvEFtv/Y/B2QgUQn3RhsHlx7ERoGzCNNDFR1N0xckFU7ACij3iCKVeZ ik0= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:44 -0800 IronPort-SDR: 8HkLDbvc9n/Me3ZbtLWYIcwwkeVddTS16ftdU8WFccPmc+3hSkDZRaBMzNM+imNOlV8R6zUGfr fUdDoqn8EMeRgxiLwiT4TToLXmUiIjVrW+hZMic66JbWcjbNzZqIfZw6qJjT7ZO4jn4R+eqyEH fMs0KkVqikjZiMXTKIQ7JhcNjSDrMfTE27X2T97SZHOgNX8/ClL9cUC2/nJJ8CYZo4CjNEONuC YTAR0Xflvpp/mKV90RWyzQPbKoA/VgAZRBVXf8aTR84kLrF3LzyNtNmtJ9pn59lz+JNNZUSfg1 X9g= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:40 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig , Anand Jain Subject: [PATCH v7 02/13] btrfs: add raid stripe tree definitions Date: Thu, 2 Mar 
2023 01:45:24 -0800 Message-Id: <2810f6d796a2270cf8827c98cc64d9c523022e9d.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add definitions for the raid stripe tree. This tree will hold information about the on-disk layout of the stripes in a RAID set. Each stripe extent has a 1:1 relationship with an on-disk extent item and translates logical addresses to per-drive physical addresses for the extent item in question. Reviewed-by: Josef Bacik Reviewed-by: Anand Jain Signed-off-by: Johannes Thumshirn --- fs/btrfs/accessors.h | 29 +++++++++++++++++++++++++++++ include/uapi/linux/btrfs_tree.h | 20 ++++++++++++++++++-- 2 files changed, 47 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h index ceadfc5d6c66..6e753b63faae 100644 --- a/fs/btrfs/accessors.h +++ b/fs/btrfs/accessors.h @@ -306,6 +306,35 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32); BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64); BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32); +BTRFS_SETGET_FUNCS(raid_stride_devid, struct btrfs_raid_stride, devid, 64); +BTRFS_SETGET_FUNCS(raid_stride_physical, struct btrfs_raid_stride, physical, 64); +BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_devid, struct btrfs_raid_stride, devid, 64); +BTRFS_SETGET_STACK_FUNCS(stack_raid_stride_physical, struct btrfs_raid_stride, physical, 64); + +static inline struct btrfs_raid_stride *btrfs_raid_stride_nr( + struct btrfs_stripe_extent *dps, int nr) +{ + unsigned long offset = (unsigned long)dps; + + offset += offsetof(struct btrfs_stripe_extent, strides); + offset += nr * sizeof(struct btrfs_raid_stride); + return (struct btrfs_raid_stride *)offset; +} + +static inline u64 btrfs_raid_stride_devid_nr(const struct extent_buffer *eb, + struct btrfs_stripe_extent *dps, + int nr) +{ + 
return btrfs_raid_stride_devid(eb, btrfs_raid_stride_nr(dps, nr)); +} + +static inline u64 btrfs_raid_stride_physical_nr(const struct extent_buffer *eb, + struct btrfs_stripe_extent *dps, + int nr) +{ + return btrfs_raid_stride_physical(eb, btrfs_raid_stride_nr(dps, nr)); +} + /* struct btrfs_dev_extent */ BTRFS_SETGET_FUNCS(dev_extent_chunk_tree, struct btrfs_dev_extent, chunk_tree, 64); BTRFS_SETGET_FUNCS(dev_extent_chunk_objectid, struct btrfs_dev_extent, diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h index ab38d0f411fa..64e6bf2a10d8 100644 --- a/include/uapi/linux/btrfs_tree.h +++ b/include/uapi/linux/btrfs_tree.h @@ -4,9 +4,8 @@ #include <linux/btrfs.h> #include <linux/types.h> -#ifdef __KERNEL__ #include <linux/stddef.h> -#else +#ifndef __KERNEL__ #include <stddef.h> #endif @@ -73,6 +72,9 @@ /* Holds the block group items for extent tree v2. */ #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL +/* tracks RAID stripes in block groups. */ +#define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL + /* device stats in the device tree */ #define BTRFS_DEV_STATS_OBJECTID 0ULL @@ -281,6 +283,8 @@ */ #define BTRFS_QGROUP_RELATION_KEY 246 +#define BTRFS_RAID_STRIPE_KEY 247 + /* * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY. 
*/ @@ -715,6 +719,18 @@ struct btrfs_free_space_header { __le64 num_bitmaps; } __attribute__ ((__packed__)); +struct btrfs_raid_stride { + /* btrfs device-id this raid extent lives on */ + __le64 devid; + /* physical location on disk */ + __le64 physical; +}; + +struct btrfs_stripe_extent { + /* array of raid strides this stripe is composed of */ + __DECLARE_FLEX_ARRAY(struct btrfs_raid_stride, strides); +}; + #define BTRFS_HEADER_FLAG_WRITTEN (1ULL << 0) #define BTRFS_HEADER_FLAG_RELOC (1ULL << 1) From patchwork Thu Mar 2 09:45:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156941 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F4FCC678D4 for ; Thu, 2 Mar 2023 09:46:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230160AbjCBJqV (ORCPT ); Thu, 2 Mar 2023 04:46:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230130AbjCBJqL (ORCPT ); Thu, 2 Mar 2023 04:46:11 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24B563D084 for ; Thu, 2 Mar 2023 01:45:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750354; x=1709286354; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=033sDHFu6R7zD4a7zJ1WuhelqYgIg3KNs4tJ9+KlXNU=; b=YbxFqEbG0Ujn/CPvzQ2qYd34gOhBFuulj0who7AB8OJUlbXKa9VOBada DWUp0MTF68ea8fSs8/vzC2aEzkCoqabcQxxGxH8QRoJuazuooPEJ5UIM6 TCEs+O5b4Q6HYW0EON6yfxxksjVKxC3VD/3h3rqrc3xtokOpJlUkS10qW 
isRjStzPNishZZi2IZKNpjPLGQISm8CWyURpfiOTKh9sGvuxi1Ne/2RNS cRBrbmXQFokOofbrZ1N2Y8KTYag2WrVhd/xV4oB2FyFA/uQzchcqlnqbd LXqPO3I2T8UaGhtlR44uknjK/uiHJdeY32Cxuve8g4KPoVURx6pbUw42w A==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939179" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:41 +0800 IronPort-SDR: cYjhM8LkeEM4F/OvGDvghXZtnhtJar3+02nX5/OO5kMhYINnivVQsvW+qRpJY8yCq9Ylx4ccah IauOaAx2H0FZyBeSWkREYPiwhTK/6xT/aNiK1/CvRcYZNvoSVYRBzMnyd1NRTSGH+tJe3DOcLM /M8gER95wmPriJj4OBQ+CvMQS/xsQN1Hzppn6sYruI/vlBOFUYmYbMaV6k0yYnMITVZr7OIRgo 7qZSKJX6lhwyn0p+TacIy9wTbVxg6HwOEI4bYKSiOkk/RyLqU8+0bQtoBwdwSFTheXr2mUjCDY XT8= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:46 -0800 IronPort-SDR: i6YxaM4ZxOlRzSBtGd5uQiHXaezK9ERsqi/VWvEjDepfEsAEST0cCOvvNUED7c+VQkPYSzmacg RSEGggsc0r+RDrr8ZCOqJCVG80POuxVty3IAZ8FJgn3EalMu3HqI3jfvaTUg7053TIWeR2KLiW MPIP4is5tufFkMUROYnSIbuR23LM/OEopDWz++X4PcAq3eGm8oNoS4j/Z7+Twkr2LVvBtDX617 hiN9/cSdpOYD0oPGcOsO1UfJm2zmuapNRSER22Zy4Ve0tnCsWLBZsQ9mT40IoKahPHSW6PiSG9 9W8= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:41 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig , Anand Jain Subject: [PATCH v7 03/13] btrfs: read raid-stripe-tree from disk Date: Thu, 2 Mar 2023 01:45:25 -0800 Message-Id: <0934ca7511d7e3fbfc90bb8174e298cb42b008f2.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org If we find a raid-stripe-tree on mount, read it from disk. 
Reviewed-by: Josef Bacik Reviewed-by: Anand Jain Signed-off-by: Johannes Thumshirn --- fs/btrfs/block-rsv.c | 1 + fs/btrfs/disk-io.c | 22 ++++++++++++++++++++++ fs/btrfs/disk-io.h | 5 +++++ fs/btrfs/fs.h | 4 ++++ include/uapi/linux/btrfs.h | 1 + 5 files changed, 33 insertions(+) diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c index 5367a14d44d2..384987343a64 100644 --- a/fs/btrfs/block-rsv.c +++ b/fs/btrfs/block-rsv.c @@ -402,6 +402,7 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root) case BTRFS_EXTENT_TREE_OBJECTID: case BTRFS_FREE_SPACE_TREE_OBJECTID: case BTRFS_BLOCK_GROUP_TREE_OBJECTID: + case BTRFS_RAID_STRIPE_TREE_OBJECTID: root->block_rsv = &fs_info->delayed_refs_rsv; break; case BTRFS_ROOT_TREE_OBJECTID: diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 0e0c30fe6df6..ac200b367ec8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1438,6 +1438,9 @@ static struct btrfs_root *btrfs_get_global_root(struct btrfs_fs_info *fs_info, return btrfs_grab_root(root) ? root : ERR_PTR(-ENOENT); } + if (objectid == BTRFS_RAID_STRIPE_TREE_OBJECTID) + return btrfs_grab_root(fs_info->stripe_root) ? 
+ fs_info->stripe_root : ERR_PTR(-ENOENT); return NULL; } @@ -1516,6 +1519,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) btrfs_put_root(fs_info->fs_root); btrfs_put_root(fs_info->data_reloc_root); btrfs_put_root(fs_info->block_group_root); + btrfs_put_root(fs_info->stripe_root); btrfs_check_leaked_roots(fs_info); btrfs_extent_buffer_leak_debug_check(fs_info); kfree(fs_info->super_copy); @@ -2051,6 +2055,7 @@ static void free_root_pointers(struct btrfs_fs_info *info, bool free_chunk_root) free_root_extent_buffers(info->fs_root); free_root_extent_buffers(info->data_reloc_root); free_root_extent_buffers(info->block_group_root); + free_root_extent_buffers(info->stripe_root); if (free_chunk_root) free_root_extent_buffers(info->chunk_root); } @@ -2512,6 +2517,20 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info) fs_info->uuid_root = root; } + if (btrfs_fs_incompat(fs_info, RAID_STRIPE_TREE)) { + location.objectid = BTRFS_RAID_STRIPE_TREE_OBJECTID; + root = btrfs_read_tree_root(tree_root, &location); + if (IS_ERR(root)) { + if (!btrfs_test_opt(fs_info, IGNOREBADROOTS)) { + ret = PTR_ERR(root); + goto out; + } + } else { + set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state); + fs_info->stripe_root = root; + } + } + return 0; out: btrfs_warn(fs_info, "failed to read root (objectid=%llu): %d", @@ -3020,6 +3039,9 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info) fs_info->bg_reclaim_threshold = BTRFS_DEFAULT_RECLAIM_THRESH; INIT_WORK(&fs_info->reclaim_bgs_work, btrfs_reclaim_bgs_work); + + rwlock_init(&fs_info->stripe_update_lock); + fs_info->stripe_update_tree = RB_ROOT; } static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block *sb) diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index 4d5772330110..c4de38374b62 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -107,6 +107,11 @@ static inline struct btrfs_root *btrfs_grab_root(struct btrfs_root *root) return NULL; } +static inline struct btrfs_root 
*btrfs_stripe_tree_root(struct btrfs_fs_info *fs_info) +{ + return fs_info->stripe_root; +} + void btrfs_put_root(struct btrfs_root *root); void btrfs_mark_buffer_dirty(struct extent_buffer *buf); int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 4c477eae6891..d0d80540b32b 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -367,6 +367,7 @@ struct btrfs_fs_info { struct btrfs_root *uuid_root; struct btrfs_root *data_reloc_root; struct btrfs_root *block_group_root; + struct btrfs_root *stripe_root; /* The log root tree is a directory of all the other log roots */ struct btrfs_root *log_root_tree; @@ -790,6 +791,9 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_pending_ordered_map; struct lockdep_map btrfs_ordered_extent_map; + rwlock_t stripe_update_lock; + struct rb_root stripe_update_tree; + #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; struct rb_root block_tree; diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index ada0a489bf2b..df7b60483642 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -332,6 +332,7 @@ struct btrfs_ioctl_fs_info_args { #define BTRFS_FEATURE_INCOMPAT_RAID1C34 (1ULL << 11) #define BTRFS_FEATURE_INCOMPAT_ZONED (1ULL << 12) #define BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 (1ULL << 13) +#define BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE (1ULL << 14) struct btrfs_ioctl_feature_flags { __u64 compat_flags; From patchwork Thu Mar 2 09:45:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156945 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DEC97C83005 for ; Thu, 2 Mar 2023 09:46:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S230187AbjCBJqX (ORCPT ); Thu, 2 Mar 2023 04:46:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230061AbjCBJqO (ORCPT ); Thu, 2 Mar 2023 04:46:14 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 780A47687 for ; Thu, 2 Mar 2023 01:45:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750356; x=1709286356; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/ZQUcmnHloCc2sMf7de6wH4EUdjGRwPc5PGQsS4XXrc=; b=cGv6fbX7EF6+v1n36qAAYye1YCs1VRohDz9Su3mXxreH2Y5WCS0WLb17 jX+TsH8TorAdEAZUQvtdiP30gE/fb+G9ZpL9SA8YFCi+CU6q+M4UQ/Bcq YUv5kCSmRzn6UV18K2OkSV4s2us0c4+y8JP3Ek6WS//bSBhIsqlB9/DVu n7GtfbsKOM3rrC1lwID9uMa+q9sgN5JNxjtqZonMrDKQJznWVq4IUgvNL 57UUNQzNFEDmi/iU7UJL7C3dVJlCBLcceB8zVbwOrkQsb4AcRBdsInfnT Ya+3wWvOUz34lTXG+uYO5ECZHmi3OeSthH/BdeFyFL4Z8aOHwuGM1hffq w==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939186" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:42 +0800 IronPort-SDR: FaxsajVByZJYwW81G3YjxLEsfUxhwfkfaA/e8SCe+4vBXZbIiLvrs7xkBMBZ+NXxjplVlxWLQg p/zEjLZxJNddTShj3PdQj2gjmQRy/r3VITD5R/XZQjv6KoAQUrrAN8AyfBRZHORcVtw36W/HIr +H5e1Msltq+twUeT5Ze8qgLvRQ2WVQiyXCmICqealc318N0z3sjejYzwZ365I212RhcPX5TrBv mgf8Yg7nFSXgLzJRmVWVE5WxTG4g8eH9ngFfZXDDFZmy/n9TeG3apJKCpryGatcAkMEBSaIMjS dq4= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:47 -0800 IronPort-SDR: d3GlqcLunCezP+cOGH18cjT6jKuvZjjqifc52FdHJcopVm8oYAR4vQHQKKu1ZyfP5pUsLFkv/p wkFES1XiMQ3UU7/UnzVveb3buDEUOJ9sLwDxHsRdKL/tTikuZnT5zO0Xouv7rrmMEhdGNui19D 
BwQeSf9C2amql7GGUyGR1zW5yoPc8g/ufNcFE+wSBqGL/TsVDvapkNn1VzVWB0D+c4TbtX3oHs d945sRkyupK5RZas/UuVYVpEurhb90gDAwU2ggDgs+dieUyN3KmFab40kEGgpEU58ADU++KOCl Uyg= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:43 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 04/13] btrfs: add support for inserting raid stripe extents Date: Thu, 2 Mar 2023 01:45:26 -0800 Message-Id: <94293952cdc120b46edf82672af874b0877e1e83.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add support for inserting stripe extents into the raid stripe tree on completion of every write that needs an extra logical-to-physical translation when using RAID. The stripe extents are inserted after the data I/O has completed, in order to a) support zone-append and b) rule out the possibility of a RAID write hole. To do this, in-memory ordered stripe extents are created on I/O completion, just like the in-memory ordered extents, and the on-disk raid stripe extents are created when running the delayed_refs for the extent item the stripe extent is tied to. 
Signed-off-by: Johannes Thumshirn --- fs/btrfs/Makefile | 2 +- fs/btrfs/bio.c | 29 +++++ fs/btrfs/delayed-ref.c | 6 +- fs/btrfs/delayed-ref.h | 2 + fs/btrfs/extent-tree.c | 60 +++++++++++ fs/btrfs/inode.c | 15 ++- fs/btrfs/raid-stripe-tree.c | 204 ++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 71 +++++++++++++ fs/btrfs/volumes.c | 4 +- fs/btrfs/volumes.h | 13 +-- fs/btrfs/zoned.c | 3 + 11 files changed, 397 insertions(+), 12 deletions(-) create mode 100644 fs/btrfs/raid-stripe-tree.c create mode 100644 fs/btrfs/raid-stripe-tree.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 90d53209755b..3bb869a84e54 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -33,7 +33,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \ subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \ - lru_cache.o + lru_cache.o raid-stripe-tree.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index 726592868e9c..2b174865d347 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -15,6 +15,7 @@ #include "rcu-string.h" #include "zoned.h" #include "file-item.h" +#include "raid-stripe-tree.h" static struct bio_set btrfs_bioset; static struct bio_set btrfs_clone_bioset; @@ -348,6 +349,21 @@ static void btrfs_raid56_end_io(struct bio *bio) btrfs_put_bioc(bioc); } +static void btrfs_raid_stripe_update(struct work_struct *work) +{ + struct btrfs_bio *bbio = + container_of(work, struct btrfs_bio, end_io_work); + struct btrfs_io_stripe *stripe = bbio->bio.bi_private; + struct btrfs_io_context *bioc = stripe->bioc; + int ret; + + ret = btrfs_add_ordered_stripe(bioc); + if (ret) + bbio->bio.bi_status = errno_to_blk_status(ret); + btrfs_orig_bbio_end_io(bbio); + btrfs_put_bioc(bioc); +} + static void 
btrfs_orig_write_end_io(struct bio *bio) { struct btrfs_io_stripe *stripe = bio->bi_private; @@ -370,6 +386,16 @@ static void btrfs_orig_write_end_io(struct bio *bio) else bio->bi_status = BLK_STS_OK; + if (bio_op(bio) == REQ_OP_ZONE_APPEND) + stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT; + + if (btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) { + INIT_WORK(&bbio->end_io_work, btrfs_raid_stripe_update); + queue_work(btrfs_end_io_wq(bioc->fs_info, bio), + &bbio->end_io_work); + return; + } + btrfs_orig_bbio_end_io(bbio); btrfs_put_bioc(bioc); } @@ -381,6 +407,8 @@ static void btrfs_clone_write_end_io(struct bio *bio) if (bio->bi_status) { atomic_inc(&stripe->bioc->error); btrfs_log_dev_io_error(bio, stripe->dev); + } else if (bio_op(bio) == REQ_OP_ZONE_APPEND) { + stripe->physical = bio->bi_iter.bi_sector << SECTOR_SHIFT; } /* Pass on control to the original bio this one was cloned from */ @@ -440,6 +468,7 @@ static void btrfs_submit_mirrored_bio(struct btrfs_io_context *bioc, int dev_nr) bio->bi_private = &bioc->stripes[dev_nr]; bio->bi_iter.bi_sector = bioc->stripes[dev_nr].physical >> SECTOR_SHIFT; bioc->stripes[dev_nr].bioc = bioc; + bioc->size = bio->bi_iter.bi_size; btrfs_submit_dev_bio(bioc->stripes[dev_nr].dev, bio); } diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 7660ac642c81..261f52ad8e12 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -14,6 +14,7 @@ #include "space-info.h" #include "tree-mod-log.h" #include "fs.h" +#include "raid-stripe-tree.h" struct kmem_cache *btrfs_delayed_ref_head_cachep; struct kmem_cache *btrfs_delayed_tree_ref_cachep; @@ -637,8 +638,11 @@ static int insert_delayed_ref(struct btrfs_trans_handle *trans, exist->ref_mod += mod; /* remove existing tail if its ref_mod is zero */ - if (exist->ref_mod == 0) + if (exist->ref_mod == 0) { + btrfs_drop_ordered_stripe(trans->fs_info, exist->bytenr); drop_delayed_ref(root, href, exist); + } + spin_unlock(&href->lock); return 
ret; inserted: diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h index 2eb34abf700f..5096c1a1ed3e 100644 --- a/fs/btrfs/delayed-ref.h +++ b/fs/btrfs/delayed-ref.h @@ -51,6 +51,8 @@ struct btrfs_delayed_ref_node { /* is this node still in the rbtree? */ unsigned int is_head:1; unsigned int in_tree:1; + /* Do we need RAID stripe tree modifications? */ + unsigned int must_insert_stripe:1; }; struct btrfs_delayed_extent_op { diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 6b6c59e6805c..7441d784fe03 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -42,6 +42,7 @@ #include "file-item.h" #include "orphan.h" #include "tree-checker.h" +#include "raid-stripe-tree.h" #undef SCRAMBLE_DELAYED_REFS @@ -1497,6 +1498,56 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, return ret; } +static bool delayed_ref_needs_rst_update(struct btrfs_fs_info *fs_info, + struct btrfs_delayed_ref_head *head) +{ + struct extent_map *em; + struct map_lookup *map; + bool ret = false; + + if (!btrfs_stripe_tree_root(fs_info)) + return ret; + + em = btrfs_get_chunk_map(fs_info, head->bytenr, head->num_bytes); + if (!em) + return ret; + + map = em->map_lookup; + + if (btrfs_need_stripe_tree_update(fs_info, map->type)) + ret = true; + + free_extent_map(em); + + return ret; +} + +static int add_stripe_entry_for_delayed_ref(struct btrfs_trans_handle *trans, + struct btrfs_delayed_ref_node *node) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_ordered_stripe *stripe; + int ret = 0; + + stripe = btrfs_lookup_ordered_stripe(fs_info, node->bytenr); + if (!stripe) { + btrfs_err(fs_info, + "cannot get stripe extent for address %llu (%llu)", + node->bytenr, node->num_bytes); + return -EINVAL; + } + + ASSERT(stripe->logical == node->bytenr); + + ret = btrfs_insert_raid_extent(trans, stripe); + /* once for us */ + btrfs_put_ordered_stripe(fs_info, stripe); + /* once for the tree */ + btrfs_put_ordered_stripe(fs_info, stripe); 
+ + return ret; +} + static int run_delayed_data_ref(struct btrfs_trans_handle *trans, struct btrfs_delayed_ref_node *node, struct btrfs_delayed_extent_op *extent_op, @@ -1527,11 +1578,17 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans, flags, ref->objectid, ref->offset, &ins, node->ref_mod); + if (ret) + return ret; + if (node->must_insert_stripe) + ret = add_stripe_entry_for_delayed_ref(trans, node); } else if (node->action == BTRFS_ADD_DELAYED_REF) { ret = __btrfs_inc_extent_ref(trans, node, parent, ref_root, ref->objectid, ref->offset, node->ref_mod, extent_op); } else if (node->action == BTRFS_DROP_DELAYED_REF) { + if (node->must_insert_stripe) + btrfs_drop_ordered_stripe(trans->fs_info, node->bytenr); ret = __btrfs_free_extent(trans, node, parent, ref_root, ref->objectid, ref->offset, node->ref_mod, @@ -1901,6 +1958,8 @@ static int btrfs_run_delayed_refs_for_head(struct btrfs_trans_handle *trans, struct btrfs_delayed_ref_root *delayed_refs; struct btrfs_delayed_extent_op *extent_op; struct btrfs_delayed_ref_node *ref; + const bool need_rst_update = + delayed_ref_needs_rst_update(fs_info, locked_ref); int must_insert_reserved = 0; int ret; @@ -1951,6 +2010,7 @@ static int btrfs_run_delayed_refs_for_head(struct btrfs_trans_handle *trans, locked_ref->extent_op = NULL; spin_unlock(&locked_ref->lock); + ref->must_insert_stripe = need_rst_update; ret = run_one_delayed_ref(trans, ref, extent_op, must_insert_reserved); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8f07d59e8193..aaa1db90e58b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -70,6 +70,7 @@ #include "verity.h" #include "super.h" #include "orphan.h" +#include "raid-stripe-tree.h" struct btrfs_iget_args { u64 ino; @@ -9495,12 +9496,17 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( if (qgroup_released < 0) return ERR_PTR(qgroup_released); + ret = btrfs_insert_preallocated_raid_stripe(inode->root->fs_info, + start, len); + if (ret) + goto free_qgroup; + 
if (trans) { ret = insert_reserved_file_extent(trans, inode, file_offset, &stack_fi, true, qgroup_released); if (ret) - goto free_qgroup; + goto free_stripe_extent; return trans; } @@ -9518,7 +9524,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( path = btrfs_alloc_path(); if (!path) { ret = -ENOMEM; - goto free_qgroup; + goto free_stripe_extent; } ret = btrfs_replace_file_extents(inode, path, file_offset, @@ -9526,9 +9532,12 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent( &trans); btrfs_free_path(path); if (ret) - goto free_qgroup; + goto free_stripe_extent; return trans; +free_stripe_extent: + btrfs_drop_ordered_stripe(inode->root->fs_info, start); + free_qgroup: /* * We have released qgroup data range at the beginning of the function, diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c new file mode 100644 index 000000000000..9d3e7bffe6f8 --- /dev/null +++ b/fs/btrfs/raid-stripe-tree.c @@ -0,0 +1,204 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022 Western Digital Corporation or its affiliates. 
+ */ + +#include + +#include "ctree.h" +#include "fs.h" +#include "accessors.h" +#include "transaction.h" +#include "disk-io.h" +#include "raid-stripe-tree.h" +#include "volumes.h" +#include "misc.h" +#include "disk-io.h" +#include "print-tree.h" + +static int ordered_stripe_cmp(const void *key, const struct rb_node *node) +{ + struct btrfs_ordered_stripe *stripe = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + const u64 *logical = key; + + if (*logical < stripe->logical) + return -1; + if (*logical >= stripe->logical + stripe->num_bytes) + return 1; + return 0; +} + +static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb) +{ + struct btrfs_ordered_stripe *stripe = + rb_entry(rba, struct btrfs_ordered_stripe, rb_node); + return ordered_stripe_cmp(&stripe->logical, rbb); +} + +int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) +{ + struct btrfs_fs_info *fs_info = bioc->fs_info; + struct btrfs_ordered_stripe *stripe; + struct btrfs_io_stripe *tmp; + u64 logical = bioc->logical; + u64 length = bioc->size; + struct rb_node *node; + size_t size; + + size = bioc->num_stripes * sizeof(struct btrfs_io_stripe); + stripe = kzalloc(sizeof(struct btrfs_ordered_stripe), GFP_NOFS); + if (!stripe) + return -ENOMEM; + + spin_lock_init(&stripe->lock); + tmp = kmemdup(bioc->stripes, size, GFP_NOFS); + if (!tmp) { + kfree(stripe); + return -ENOMEM; + } + + stripe->logical = logical; + stripe->num_bytes = length; + stripe->num_stripes = bioc->num_stripes; + spin_lock(&stripe->lock); + stripe->stripes = tmp; + spin_unlock(&stripe->lock); + refcount_set(&stripe->ref, 1); + + write_lock(&fs_info->stripe_update_lock); + node = rb_find_add(&stripe->rb_node, &fs_info->stripe_update_tree, + ordered_stripe_less); + if (node) { + struct btrfs_ordered_stripe *old = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + + btrfs_debug(fs_info, "logical: %llu, length: %llu already exists", + logical, length); + ASSERT(logical == old->logical); + + 
rb_replace_node(node, &stripe->rb_node, + &fs_info->stripe_update_tree); + } + write_unlock(&fs_info->stripe_update_lock); + + return 0; +} + +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical) +{ + struct rb_root *root = &fs_info->stripe_update_tree; + struct btrfs_ordered_stripe *stripe = NULL; + struct rb_node *node; + + read_lock(&fs_info->stripe_update_lock); + node = rb_find(&logical, root, ordered_stripe_cmp); + if (node) { + stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); + refcount_inc(&stripe->ref); + } + read_unlock(&fs_info->stripe_update_lock); + + return stripe; +} + +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe) +{ + + if (refcount_dec_and_test(&stripe->ref)) { + struct rb_node *node; + + write_lock(&fs_info->stripe_update_lock); + + node = &stripe->rb_node; + rb_erase(node, &fs_info->stripe_update_tree); + RB_CLEAR_NODE(node); + + spin_lock(&stripe->lock); + kfree(stripe->stripes); + spin_unlock(&stripe->lock); + kfree(stripe); + write_unlock(&fs_info->stripe_update_lock); + } +} + +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len) +{ + struct btrfs_io_context *bioc = NULL; + struct btrfs_ordered_stripe *stripe; + u64 map_length = len; + int ret; + + if (!btrfs_stripe_tree_root(fs_info)) + return 0; + + ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, start, &map_length, + &bioc, 0); + if (ret) + return ret; + + bioc->size = len; + + stripe = btrfs_lookup_ordered_stripe(fs_info, start); + if (!stripe) { + ret = btrfs_add_ordered_stripe(bioc); + if (ret) + return ret; + } else { + spin_lock(&stripe->lock); + memcpy(stripe->stripes, bioc->stripes, + bioc->num_stripes * sizeof(struct btrfs_io_stripe)); + spin_unlock(&stripe->lock); + btrfs_put_ordered_stripe(fs_info, stripe); + } + + return 0; +} + +int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe 
*stripe) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_key stripe_key; + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_stripe_extent *stripe_extent; + size_t item_size; + int ret; + + item_size = stripe->num_stripes * sizeof(struct btrfs_raid_stride); + + stripe_extent = kzalloc(item_size, GFP_NOFS); + if (!stripe_extent) { + btrfs_abort_transaction(trans, -ENOMEM); + btrfs_end_transaction(trans); + return -ENOMEM; + } + + spin_lock(&stripe->lock); + for (int i = 0; i < stripe->num_stripes; i++) { + u64 devid = stripe->stripes[i].dev->devid; + u64 physical = stripe->stripes[i].physical; + struct btrfs_raid_stride *raid_stride = + &stripe_extent->strides[i]; + + btrfs_set_stack_raid_stride_devid(raid_stride, devid); + btrfs_set_stack_raid_stride_physical(raid_stride, physical); + } + spin_unlock(&stripe->lock); + + stripe_key.objectid = stripe->logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = stripe->num_bytes; + + ret = btrfs_insert_item(trans, stripe_root, &stripe_key, stripe_extent, + item_size); + if (ret) + btrfs_abort_transaction(trans, ret); + + kfree(stripe_extent); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h new file mode 100644 index 000000000000..60d3f8489cc9 --- /dev/null +++ b/fs/btrfs/raid-stripe-tree.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2022 Western Digital Corporation or its affiliates. 
+ */ + +#ifndef BTRFS_RAID_STRIPE_TREE_H +#define BTRFS_RAID_STRIPE_TREE_H + +#include "disk-io.h" +#include "messages.h" + +struct btrfs_io_context; + +struct btrfs_ordered_stripe { + struct rb_node rb_node; + + u64 logical; + u64 num_bytes; + int num_stripes; + struct btrfs_io_stripe *stripes; + spinlock_t lock; + refcount_t ref; +}; + +int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, + struct btrfs_ordered_stripe *stripe); +int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, + u64 start, u64 len); +struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( + struct btrfs_fs_info *fs_info, + u64 logical); +int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); +void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, + struct btrfs_ordered_stripe *stripe); + +static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, + u64 map_type) +{ + u64 type = map_type & BTRFS_BLOCK_GROUP_TYPE_MASK; + u64 profile = map_type & BTRFS_BLOCK_GROUP_PROFILE_MASK; + + if (!btrfs_stripe_tree_root(fs_info)) + return false; + + if (type != BTRFS_BLOCK_GROUP_DATA) + return false; + + if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) + return true; + + return false; +} + +static inline void btrfs_drop_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical) +{ + struct btrfs_ordered_stripe *stripe; + + if (!btrfs_stripe_tree_root(fs_info)) + return; + + stripe = btrfs_lookup_ordered_stripe(fs_info, logical); + if (!stripe) + return; + ASSERT(refcount_read(&stripe->ref) == 2); + /* once for us */ + btrfs_put_ordered_stripe(fs_info, stripe); + /* once for the tree */ + btrfs_put_ordered_stripe(fs_info, stripe); +} +#endif diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9d6775c7196f..fee611d1b01d 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -5879,6 +5879,7 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info, } static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info 
*fs_info, + u64 logical, u16 total_stripes) { struct btrfs_io_context *bioc; @@ -5898,6 +5899,7 @@ static struct btrfs_io_context *alloc_btrfs_io_context(struct btrfs_fs_info *fs_ bioc->fs_info = fs_info; bioc->replace_stripe_src = -1; bioc->full_stripe_logical = (u64)-1; + bioc->logical = logical; return bioc; } @@ -6493,7 +6495,7 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, goto out; } - bioc = alloc_btrfs_io_context(fs_info, num_alloc_stripes); + bioc = alloc_btrfs_io_context(fs_info, logical, num_alloc_stripes); if (!bioc) { ret = -ENOMEM; goto out; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 650e131d079e..114c76c81eda 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -372,12 +372,10 @@ struct btrfs_fs_devices { struct btrfs_io_stripe { struct btrfs_device *dev; - union { - /* Block mapping */ - u64 physical; - /* For the endio handler */ - struct btrfs_io_context *bioc; - }; + /* Block mapping */ + u64 physical; + /* For the endio handler */ + struct btrfs_io_context *bioc; }; struct btrfs_discard_stripe { @@ -410,6 +408,9 @@ struct btrfs_io_context { atomic_t error; u16 max_errors; + u64 logical; + u64 size; + /* * The total number of stripes, including the extra duplicated * stripe for replace. 
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index f95b2c94d619..7e6cfc7a2918 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1692,6 +1692,9 @@ void btrfs_rewrite_logical_zoned(struct btrfs_ordered_extent *ordered) u64 chunk_start_phys; u64 logical; + /* Filesystems with a stripe tree have their own l2p mapping */ + ASSERT(!btrfs_stripe_tree_root(fs_info)); + em = btrfs_get_chunk_map(fs_info, orig_logical, 1); if (IS_ERR(em)) return;
From patchwork Thu Mar 2 09:45:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156942
From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 05/13] btrfs: delete stripe extent on extent deletion Date: Thu, 2 Mar 2023 01:45:27 -0800 Message-Id: X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org As each stripe extent is tied to an extent item, delete the stripe extent once the corresponding extent item is deleted.
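The one-to-one tie between an extent item and its stripe extent can be illustrated outside the kernel with a small sketch. It assumes, as the diff in this patch suggests, that the stripe extent key is (start, BTRFS_RAID_STRIPE_KEY, length) and that deletion only succeeds on an exact key match. All demo_* names and the numeric key type are hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder key type value; an assumption for this sketch only. */
#define DEMO_RAID_STRIPE_KEY 230

/* Hypothetical stand-in for a btrfs key: (objectid, type, offset). */
struct demo_key {
	uint64_t objectid;	/* logical start of the extent */
	uint8_t type;		/* item type */
	uint64_t offset;	/* length of the extent */
};

/* Build the key the same way the patch fills stripe_key. */
static struct demo_key demo_stripe_key(uint64_t start, uint64_t length)
{
	struct demo_key key = {
		.objectid = start,
		.type = DEMO_RAID_STRIPE_KEY,
		.offset = length,
	};

	return key;
}

static bool demo_key_equal(const struct demo_key *a, const struct demo_key *b)
{
	return a->objectid == b->objectid && a->type == b->type &&
	       a->offset == b->offset;
}

/*
 * Remove the item whose key matches (start, length) exactly from a flat
 * array that stands in for the stripe tree. Returns true if an item was
 * removed, false if no exact match exists.
 */
static bool demo_delete_stripe_extent(struct demo_key *items, size_t *nr,
				      uint64_t start, uint64_t length)
{
	struct demo_key want = demo_stripe_key(start, length);

	assert(items && nr);
	for (size_t i = 0; i < *nr; i++) {
		if (!demo_key_equal(&items[i], &want))
			continue;
		/* Close the gap left by the removed item. */
		for (size_t j = i + 1; j < *nr; j++)
			items[j - 1] = items[j];
		(*nr)--;
		return true;
	}
	return false;
}
```

Because the length is part of the key, a delete for a range that only partially overlaps a stripe extent finds nothing in this model.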
Signed-off-by: Johannes Thumshirn --- fs/btrfs/extent-tree.c | 8 ++ fs/btrfs/raid-stripe-tree.c | 176 ++++++++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 5 + fs/btrfs/volumes.c | 27 ++++-- 4 files changed, 209 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 7441d784fe03..b08e7b4688e0 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3238,6 +3238,14 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, } } + if (is_data) { + ret = btrfs_delete_raid_extent(trans, bytenr, num_bytes); + if (ret) { + btrfs_abort_transaction(trans, ret); + return ret; + } + } + ret = btrfs_del_items(trans, extent_root, path, path->slots[0], num_to_del); if (ret) { diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 9d3e7bffe6f8..f58b28157a9c 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -124,6 +124,37 @@ void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, } } +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + struct btrfs_path *path; + struct btrfs_key stripe_key; + int ret; + + if (!stripe_root) + return 0; + + stripe_key.objectid = start; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = length; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(trans, stripe_root, &stripe_key, path, -1, 1); + if (ret < 0) + goto out; + + ret = btrfs_del_item(trans, stripe_root, path); +out: + btrfs_free_path(path); + return ret; + +} + int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, u64 start, u64 len) { @@ -202,3 +233,148 @@ int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, return ret; } + +static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info, + u64 logical, u64 
*length, + int num_stripes, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_ordered_stripe *os; + u64 offset; + u64 found_end; + u64 end; + int i; + + os = btrfs_lookup_ordered_stripe(fs_info, logical); + if (!os) + return false; + + end = logical + *length; + found_end = os->logical + os->num_bytes; + if (end > found_end) + *length -= end - found_end; + + for (i = 0; i < num_stripes; i++) { + if (os->stripes[i].dev != stripe->dev) + continue; + + ASSERT(logical >= os->logical); + offset = logical - os->logical; + stripe->physical = os->stripes[i].physical + offset; + btrfs_put_ordered_stripe(fs_info, os); + break; + } + + return true; +} + +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe) +{ + struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); + int num_stripes = btrfs_bg_type_to_factor(map_type); + struct btrfs_stripe_extent *stripe_extent; + struct btrfs_key stripe_key; + struct btrfs_key found_key; + struct btrfs_path *path; + struct extent_buffer *leaf; + u64 offset; + u64 found_logical; + u64 found_length; + u64 end; + u64 found_end; + int slot; + int ret; + int i; + + /* + * If we still have the stripe in the ordered stripe tree get it from + * there + */ + if (btrfs_physical_from_ordered_stripe(fs_info, logical, length, + num_stripes, stripe)) + return 0; + + stripe_key.objectid = logical; + stripe_key.type = BTRFS_RAID_STRIPE_KEY; + stripe_key.offset = 0; + + path = btrfs_alloc_path(); + if (!path) + return -ENOMEM; + + ret = btrfs_search_slot(NULL, stripe_root, &stripe_key, path, 0, 0); + if (ret < 0) + goto free_path; + if (ret) { + if (path->slots[0] != 0) + path->slots[0]--; + } + + end = logical + *length; + + while (1) { + leaf = path->nodes[0]; + slot = path->slots[0]; + + btrfs_item_key_to_cpu(leaf, &found_key, slot); + found_logical = found_key.objectid; + found_length = found_key.offset; + + if (found_logical > end) + break; + + if 
(!in_range(logical, found_logical, found_length)) + goto next; + + offset = logical - found_logical; + found_end = found_logical + found_length; + + /* + * If we have a logically contiguous, but physically + * noncontinuous range, we need to split the bio. Record the + * length after which we must split the bio. + */ + if (end > found_end) + *length -= end - found_end; + + stripe_extent = + btrfs_item_ptr(leaf, slot, struct btrfs_stripe_extent); + for (i = 0; i < num_stripes; i++) { + if (btrfs_raid_stride_devid_nr(leaf, + stripe_extent, i) != stripe->dev->devid) + continue; + stripe->physical = btrfs_raid_stride_physical_nr(leaf, + stripe_extent, i) + offset; + ret = 0; + goto out; + } + + /* + * If we're here, we haven't found the requested devid in the + * stripe. + */ + ret = -ENOENT; + goto out; +next: + ret = btrfs_next_item(stripe_root, path); + if (ret) + break; + } + +out: + if (ret > 0) + ret = -ENOENT; + if (ret && ret != -EIO) { + btrfs_err(fs_info, + "cannot find raid-stripe for logical [%llu, %llu]", + logical, logical + *length); + btrfs_print_tree(leaf, 1); + } + +free_path: + btrfs_free_path(path); + + return ret; +} diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 60d3f8489cc9..9359df0ca3f1 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -22,6 +22,11 @@ struct btrfs_ordered_stripe { refcount_t ref; }; +int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, + u64 logical, u64 *length, u64 map_type, + struct btrfs_io_stripe *stripe); +int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, + u64 length); int btrfs_insert_raid_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_stripe *stripe); int btrfs_insert_preallocated_raid_stripe(struct btrfs_fs_info *fs_info, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index fee611d1b01d..b4b615421643 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -35,6 +35,7 @@ #include "relocation.h" #include 
"scrub.h" #include "super.h" +#include "raid-stripe-tree.h" #define BTRFS_BLOCK_GROUP_STRIPE_MASK (BTRFS_BLOCK_GROUP_RAID0 | \ BTRFS_BLOCK_GROUP_RAID10 | \ @@ -6286,12 +6287,21 @@ static u64 btrfs_max_io_len(struct map_lookup *map, enum btrfs_map_op op, return U64_MAX; } -static void set_io_stripe(struct btrfs_io_stripe *dst, const struct map_lookup *map, - u32 stripe_index, u64 stripe_offset, u32 stripe_nr) +static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, + u64 logical, u64 *length, struct btrfs_io_stripe *dst, + struct map_lookup *map, u32 stripe_index, + u64 stripe_offset, u64 stripe_nr) { dst->dev = map->stripes[stripe_index].dev; + + if (op == BTRFS_MAP_READ && + btrfs_need_stripe_tree_update(fs_info, map->type)) + return btrfs_get_raid_extent_offset(fs_info, logical, length, + map->type, dst); + dst->physical = map->stripes[stripe_index].physical + stripe_offset + (stripe_nr << BTRFS_STRIPE_LEN_SHIFT); + return 0; } int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, @@ -6485,13 +6495,14 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, smap->dev = dev_replace->tgtdev; smap->physical = physical_to_patch_in_first_stripe; *mirror_num_ret = map->num_stripes + 1; + ret = 0; } else { - set_io_stripe(smap, map, stripe_index, stripe_offset, - stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, smap, + map, stripe_index, stripe_offset, + stripe_nr); *mirror_num_ret = mirror_num; } *bioc_ret = NULL; - ret = 0; goto out; } @@ -6522,7 +6533,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, bioc->full_stripe_logical = em->start + ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT); for (i = 0; i < num_stripes; i++) - set_io_stripe(&bioc->stripes[i], map, + set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, (i + stripe_nr) % num_stripes, stripe_offset, stripe_nr); } else { @@ -6531,7 +6543,8 @@ int __btrfs_map_block(struct 
btrfs_fs_info *fs_info, enum btrfs_map_op op, * stripe into the bioc. */ for (i = 0; i < num_stripes; i++) { - set_io_stripe(&bioc->stripes[i], map, stripe_index, + set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, stripe_index, stripe_offset, stripe_nr); stripe_index++; }
From patchwork Thu Mar 2 09:45:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156943
From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 06/13] btrfs: lookup physical address from stripe extent Date: Thu, 2 Mar 2023 01:45:28 -0800 Message-Id: X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Look up the physical address in the raid stripe tree when a read is attempted on a RAID volume formatted with the raid stripe tree. If the requested logical address is not found in the stripe tree, it may still be in the in-memory ordered stripe tree, so fall back to searching the ordered stripe tree in this case.
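The address arithmetic behind this lookup can be sketched in user space. The sketch below only models the translation step that btrfs_get_raid_extent_offset() performs in this series: find the stripe extent that covers the logical address on the requested device, add the offset into that extent, and trim the length so the range does not run past the end of the extent. The flat table and every demo_* name are hypothetical; the kernel code walks a B-tree and an rbtree instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for one stride of a stripe extent: a logical
 * range and the physical start of that range on a single device.
 */
struct demo_stripe_extent {
	uint64_t logical;	/* start of the logical range */
	uint64_t num_bytes;	/* length of the logical range */
	uint64_t devid;		/* device holding this copy */
	uint64_t physical;	/* physical start on that device */
};

/*
 * Translate @logical into a physical address on @devid.
 *
 * On success, *physical is set and *length is shortened if the requested
 * range extends past the end of the covering extent, which is the point
 * where the caller would have to split the I/O. Returns false if no
 * extent on @devid covers @logical.
 */
static bool demo_lookup_physical(const struct demo_stripe_extent *tbl,
				 size_t n, uint64_t devid, uint64_t logical,
				 uint64_t *length, uint64_t *physical)
{
	assert(tbl && length && physical);

	for (size_t i = 0; i < n; i++) {
		const struct demo_stripe_extent *e = &tbl[i];
		uint64_t found_end = e->logical + e->num_bytes;
		uint64_t end = logical + *length;

		if (e->devid != devid)
			continue;
		/* The extent must contain the first requested byte. */
		if (logical < e->logical || logical >= found_end)
			continue;
		/* Trim the part of the request beyond this extent. */
		if (end > found_end)
			*length -= end - found_end;
		*physical = e->physical + (logical - e->logical);
		return true;
	}
	return false;
}
```

For example, with an extent covering logical [1000, 1100) at physical 5000 on device 1, a request for 80 bytes at logical 1040 maps to physical 5040 and is trimmed to 60 bytes.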
Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/volumes.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b4b615421643..80baabdef153 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6533,23 +6533,29 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, bioc->full_stripe_logical = em->start + ((stripe_nr * data_stripes) << BTRFS_STRIPE_LEN_SHIFT); for (i = 0; i < num_stripes; i++) - set_io_stripe(fs_info, op, logical, length, - &bioc->stripes[i], map, - (i + stripe_nr) % num_stripes, - stripe_offset, stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, + (i + stripe_nr) % num_stripes, + stripe_offset, stripe_nr); } else { /* * For all other non-RAID56 profiles, just copy the target * stripe into the bioc. */ for (i = 0; i < num_stripes; i++) { - set_io_stripe(fs_info, op, logical, length, - &bioc->stripes[i], map, stripe_index, - stripe_offset, stripe_nr); + ret = set_io_stripe(fs_info, op, logical, length, + &bioc->stripes[i], map, stripe_index, + stripe_offset, stripe_nr); stripe_index++; } } + if (ret) { + *bioc_ret = NULL; + btrfs_put_bioc(bioc); + goto out; + } + if (need_full_stripe(op)) max_errors = btrfs_chunk_max_errors(map);
From patchwork Thu Mar 2 09:45:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156944
From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 07/13] btrfs: add raid stripe tree pretty printer Date: Thu, 2 Mar 2023 01:45:29 -0800 Message-Id: <1843ab1ff73ccc92a589d5b961a541760b5f339f.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Decode raid-stripe-tree entries on btrfs_print_tree(). Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/print-tree.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c index b93c96213304..d9506d54298b 100644 --- a/fs/btrfs/print-tree.c +++ b/fs/btrfs/print-tree.c @@ -9,6 +9,7 @@ #include "print-tree.h" #include "accessors.h" #include "tree-checker.h" +#include "raid-stripe-tree.h" struct root_name_map { u64 id; @@ -28,6 +29,7 @@ static const struct root_name_map root_map[] = { { BTRFS_FREE_SPACE_TREE_OBJECTID, "FREE_SPACE_TREE" }, { BTRFS_BLOCK_GROUP_TREE_OBJECTID, "BLOCK_GROUP_TREE" }, { BTRFS_DATA_RELOC_TREE_OBJECTID, "DATA_RELOC_TREE" }, + { BTRFS_RAID_STRIPE_TREE_OBJECTID, "RAID_STRIPE_TREE" }, }; const char *btrfs_root_name(const struct btrfs_key *key, char *buf) @@ -187,6 +189,20 @@ static void print_uuid_item(struct extent_buffer *l, unsigned long offset, } } +static void print_raid_stripe_key(struct extent_buffer *eb, u32 item_size, + struct btrfs_stripe_extent *stripe) +{ + int num_stripes; + int i; + + num_stripes = item_size / sizeof(struct btrfs_raid_stride); + + for (i = 0; i < num_stripes; i++) + pr_info("\t\t\tstride %d devid %llu physical %llu\n", i, + btrfs_raid_stride_devid_nr(eb, stripe, i), +
btrfs_raid_stride_physical_nr(eb, stripe, i)); +} + /* * Helper to output refs and locking status of extent buffer. Useful to debug * race condition related problems. @@ -351,6 +367,11 @@ void btrfs_print_leaf(struct extent_buffer *l) print_uuid_item(l, btrfs_item_ptr_offset(l, i), btrfs_item_size(l, i)); break; + case BTRFS_RAID_STRIPE_KEY: + print_raid_stripe_key(l, btrfs_item_size(l, i), + btrfs_item_ptr(l, i, + struct btrfs_stripe_extent)); + break; } } } From patchwork Thu Mar 2 09:45:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156946 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36383C6FA8E for ; Thu, 2 Mar 2023 09:46:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230199AbjCBJqh (ORCPT ); Thu, 2 Mar 2023 04:46:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230161AbjCBJqV (ORCPT ); Thu, 2 Mar 2023 04:46:21 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 47F743E62A for ; Thu, 2 Mar 2023 01:46:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750374; x=1709286374; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dyibukDYjfz1DN8fDxXm7wqjfLtI9+eGKnDdK3jfjgc=; b=i+xNpcGGAMamZjMfR071tadvH5j6AjcKivmpYgOLP3/Ixz6GReCTE/Qa zXUjPRjRpxWNxa1BjQzMp0uJb9fUA3BqoQcgGzTPqj8nvF6GKkbyhqqMb GkXRIs3mLBq60Qrmc44logII7RHStG85JrzuX6fbOrhVwi+Q1bm7lKvCi aFapd4tIkohJd8Wg5nm+abyNo2L9FnANgI+VHRnq5OgQ9+W779MVOzqgB 
WawfTySpsoMpevEIMbo7uWE2j/+kbDiY46yQraw09QuemyLSXO09dPfI7 59l9H1GS2RFsKlxgo2+ZJoSUH3WX/BfOvDzeYOFrke41WfwM9ZOqhiejr g==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939195" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:46 +0800 IronPort-SDR: 5CgrjAf9YVylLDfghheW+QX+4XkOT5A8taAhoIe0FxoCUKEaI76BFCHBzpDZC2tiY/zkNJXpZY qJUpPLY6JYkeThg68H3dIe9B1saZm7Ssf8J0uNnTeBiKgbDBxrCLBYb6qM4TFgOxwJ6FNSP6sH yI+jvM2EvQE8GoD8pQzWECBQ+v9B9ggkc2SeqG5UbXV68HYhRXIcq9ImBjTGbSxI26LSctG1U4 N497Kdi+s8EeodJPp/FJvrzm5C1FLXcBJKVl/H8aO/hE0/mutFwCpUpm541D1YWT0rKrNdO/XW G10= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:51 -0800 IronPort-SDR: bE8d4Lc4HsDRCEVO0A70vhpOcm8EH7BrmIZkZVHAvQrr7CKXt38aLEn9bbULm4cMXaiQhUtYoH xBSmuk+HkqMbztX0G1t4qfCpk1ABElPmTM4IKaSOCLAnZbcNBGNEC/TCvgq4WCCBerVZx74RzU i4Aek3Zi7wT2Fx5ikxeBD88+XS6TB7LYuYJByGOS5g+R3muHZRFJQsHTGWv1cF0z45QtpMLLHp S8VPzFKVRD+HanznZIHWywwZYr4DnH1qW6ALYM8VARDUOE0S10zqy67HY7JxGrlHaHs/ggfB6n J+4= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:47 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 08/13] btrfs: zoned: allow zoned RAID Date: Thu, 2 Mar 2023 01:45:30 -0800 Message-Id: X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org When we have a raid-stripe-tree, we can do RAID0/1/10 on zoned devices for data block-groups. For meta-data block-groups, we don't actually need anything special, as all meta-data I/O is protected by the btrfs_zoned_meta_io_lock() already. 
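For illustration only, here is a rough userspace sketch of how the per-device zone capacity and write pointer offsets could be folded into one block-group value for each profile, loosely following the zoned.c hunk of this patch. All names (sketch_profile, sketch_bg, sketch_aggregate) are hypothetical and not kernel API. Missing or conventional devices and zone activation are left out, and the RAID1 branch keeps a running minimum rather than reproducing the kernel loop exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the block group profile flags. */
enum sketch_profile { SKETCH_RAID0, SKETCH_RAID1, SKETCH_RAID10 };

/* Hypothetical, reduced view of a block group's zone accounting. */
struct sketch_bg {
	uint64_t zone_capacity;
	uint64_t alloc_offset;
};

/* Smaller of two values, ignoring zero (zero meaning "no capacity known"). */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

static void sketch_aggregate(enum sketch_profile profile,
			     const uint64_t *caps, const uint64_t *offsets,
			     size_t num_stripes, size_t sub_stripes,
			     struct sketch_bg *bg)
{
	size_t i;

	assert(num_stripes > 0 && sub_stripes > 0);
	bg->zone_capacity = 0;
	bg->alloc_offset = 0;

	for (i = 0; i < num_stripes; i++) {
		switch (profile) {
		case SKETCH_RAID1:
			/* Mirrors: usable capacity is the smallest non-zero one. */
			bg->zone_capacity = min_not_zero_u64(bg->zone_capacity,
							     caps[i]);
			break;
		case SKETCH_RAID0:
			/* Stripes: every device contributes its own share. */
			bg->zone_capacity += caps[i];
			bg->alloc_offset += offsets[i];
			break;
		case SKETCH_RAID10:
			/* Count only the first device of each mirror set. */
			if (i % sub_stripes == 0) {
				bg->zone_capacity += caps[i];
				bg->alloc_offset += offsets[i];
			}
			break;
		}
	}

	/* Mirrors share one write pointer; take it from the first device. */
	if (profile == SKETCH_RAID1)
		bg->alloc_offset = offsets[0];
}
```

With four equal devices this gives four times the per-device capacity for RAID0, the per-device capacity for RAID1, and twice the per-device capacity for RAID10 with sub_stripes = 2.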
Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.c | 4 ++ fs/btrfs/raid-stripe-tree.h | 10 ++++ fs/btrfs/volumes.c | 5 +- fs/btrfs/zoned.c | 116 +++++++++++++++++++++++++++++++++++- 4 files changed, 132 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index f58b28157a9c..836299fe0ebe 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -270,10 +270,12 @@ static bool btrfs_physical_from_ordered_stripe(struct btrfs_fs_info *fs_info, int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, u64 logical, u64 *length, u64 map_type, + u32 stripe_index, struct btrfs_io_stripe *stripe) { struct btrfs_root *stripe_root = btrfs_stripe_tree_root(fs_info); int num_stripes = btrfs_bg_type_to_factor(map_type); + const bool is_dup = map_type & BTRFS_BLOCK_GROUP_DUP; struct btrfs_stripe_extent *stripe_extent; struct btrfs_key stripe_key; struct btrfs_key found_key; @@ -345,6 +347,8 @@ int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, if (btrfs_raid_stride_devid_nr(leaf, stripe_extent, i) != stripe->dev->devid) continue; + if (is_dup && (stripe_index - 1) != i) + continue; stripe->physical = btrfs_raid_stride_physical_nr(leaf, stripe_extent, i) + offset; ret = 0; diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index 9359df0ca3f1..c7f6c5377aaa 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -24,6 +24,7 @@ struct btrfs_ordered_stripe { int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info, u64 logical, u64 *length, u64 map_type, + u32 stripe_index, struct btrfs_io_stripe *stripe); int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 length); @@ -50,9 +51,18 @@ static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, if (type != BTRFS_BLOCK_GROUP_DATA) return false; + if (profile & BTRFS_BLOCK_GROUP_DUP) + return true; + if (profile & BTRFS_BLOCK_GROUP_RAID1_MASK) 
return true; + if (profile & BTRFS_BLOCK_GROUP_RAID0) + return true; + + if (profile & BTRFS_BLOCK_GROUP_RAID10) + return true; + return false; } diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 80baabdef153..ae92567e1275 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6297,7 +6297,8 @@ static int set_io_stripe(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, if (op == BTRFS_MAP_READ && btrfs_need_stripe_tree_update(fs_info, map->type)) return btrfs_get_raid_extent_offset(fs_info, logical, length, - map->type, dst); + map->type, stripe_index, + dst); dst->physical = map->stripes[stripe_index].physical + stripe_offset + (stripe_nr << BTRFS_STRIPE_LEN_SHIFT); @@ -6488,6 +6489,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, * I/O context structure. */ if (smap && num_alloc_stripes == 1 && + !(btrfs_need_stripe_tree_update(fs_info, map->type) && + op != BTRFS_MAP_READ) && !((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) && mirror_num > 1) && (!need_full_stripe(op) || !dev_replace_is_ongoing || !dev_replace->tgtdev)) { diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 7e6cfc7a2918..5328a600f526 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1476,8 +1476,9 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags); break; case BTRFS_BLOCK_GROUP_DUP: - if (map->type & BTRFS_BLOCK_GROUP_DATA) { - btrfs_err(fs_info, "zoned: profile DUP not yet supported on data bg"); + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, "zoned: data DUP profile needs stripe_root"); ret = -EINVAL; goto out; } @@ -1515,8 +1516,116 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new) cache->zone_capacity = min(caps[0], caps[1]); break; case BTRFS_BLOCK_GROUP_RAID1: + case BTRFS_BLOCK_GROUP_RAID1C3: + case BTRFS_BLOCK_GROUP_RAID1C4: + if (map->type & 
BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, + "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + ret = -EIO; + goto out; + + } + + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV || + alloc_offsets[i] == WP_CONVENTIONAL) + continue; + + if ((alloc_offsets[0] != alloc_offsets[i]) && + !btrfs_test_opt(fs_info, DEGRADED)) { + btrfs_err(fs_info, + "zoned: write pointer offset mismatch of zones in %s profile", + btrfs_bg_type_to_raid_name(map->type)); + ret = -EIO; + goto out; + } + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_test_opt(fs_info, DEGRADED) && + !btrfs_zone_activate(cache)) { + ret = -EIO; + goto out; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &cache->runtime_flags); + } + /* + * In case a device is missing we have a cap of 0, so don't + * use it. + */ + cache->zone_capacity = min_not_zero(caps[0], caps[i]); + } + + if (alloc_offsets[0] != WP_MISSING_DEV) + cache->alloc_offset = alloc_offsets[0]; + else + cache->alloc_offset = alloc_offsets[i - 1]; + break; case BTRFS_BLOCK_GROUP_RAID0: + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, + "zoned: data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + ret = -EIO; + goto out; + + } + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV || + alloc_offsets[i] == WP_CONVENTIONAL) + continue; + + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(cache)) { + ret = -EIO; + goto out; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &cache->runtime_flags); + } + cache->zone_capacity += caps[i]; + cache->alloc_offset += alloc_offsets[i]; + + } + break; case BTRFS_BLOCK_GROUP_RAID10: + if (map->type & BTRFS_BLOCK_GROUP_DATA && + !btrfs_stripe_tree_root(fs_info)) { + btrfs_err(fs_info, + "zoned: 
data %s needs stripe_root", + btrfs_bg_type_to_raid_name(map->type)); + ret = -EIO; + goto out; + + } + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV || + alloc_offsets[i] == WP_CONVENTIONAL) + continue; + + if (test_bit(0, active) != test_bit(i, active)) { + if (!btrfs_zone_activate(cache)) { + ret = -EIO; + goto out; + } + } else { + if (test_bit(0, active)) + set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, + &cache->runtime_flags); + } + if ((i % map->sub_stripes) == 0) { + cache->zone_capacity += caps[i]; + cache->alloc_offset += alloc_offsets[i]; + } + + } + break; case BTRFS_BLOCK_GROUP_RAID5: case BTRFS_BLOCK_GROUP_RAID6: /* non-single profiles are not supported yet */ @@ -1893,6 +2002,9 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group) device = map->stripes[i].dev; physical = map->stripes[i].physical; + if (!device->zone_info) + continue; + if (device->zone_info->max_active_zones == 0) continue; From patchwork Thu Mar 2 09:45:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156947 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4DA9C678D4 for ; Thu, 2 Mar 2023 09:46:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230184AbjCBJql (ORCPT ); Thu, 2 Mar 2023 04:46:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54018 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230188AbjCBJqY (ORCPT ); Thu, 2 Mar 2023 04:46:24 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90F043B201 for ; Thu, 2 Mar 2023 01:46:18 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750378; x=1709286378; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gAMqcQ+s/ZI2meVZMBPOj5EOUE7UGTITqEVlqAuBBYQ=; b=rKkylGJhU0S9/FnA7zYySZl4GAJnR3+acq/S7xL62g6SPdAn/CaA9uv/ NBTg6GGm4CaQXW30Dn6+YmUHEiK3cYZOYXjCl64HYN9ac4Qbf+/m+M5RX hW/yi5r3f2iDUvZZploVN0cbFhK5QM0O2XpZU6kKaFGR2Vevs3bSQvPOq /tCQetfEEhm4OWPksj8ydRPFwdmO28XxuzaqAnbB3SAcSjgY4o5tjWGTz tCvOCQ8xnHf8SeuWceeY7fW3FUHf/ew1/KhcRkgZpID4e87KGyeqO/fGb N1OVOFgWAYTwffQEsP8Pb5F+//Bap3RW8WeIm0uh1DHRt0oKESCcTTMMK Q==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939196" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:47 +0800 IronPort-SDR: 4bsMB0uzPSmU6dopT0vnKum2e0/DxvUa4Cld2TnMyty0QxWYOTjBMGYpYM1QEp7HIl6VjoSeiG Attqaf+WFm9hjcbqDVWoNlW2NEYShG1e6DWwYKfrZ23UVOahoCsCJLDiPmW7DJBLjOPjtrW0q3 Auxk0tZZUfUnKWpNDEI7/v9l30RDxZyySVxaGU4jVhmw/THlJRdePf20qtUa3Ro6klzR97Vi79 jIPkHVg8nqB9YzSp4+OosWoMFcy1d3cuGWEnuw64J24SVWeET3AE4LDPQCnaiAXQvH5svgbU1+ e2k= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:52 -0800 IronPort-SDR: XYibh1ODKBpFIM3McWZkmkINA5p3owhHVKRR9gx2QvXz1P1l56fgKeppgSjKGgyZkvMB2T1tb7 CqParsYH2mzKWOI0Jt8exK3KNNb51cqQVr0mhNlb4t/WJSrc93PNkqgmAO901yIjhArBP2qToB xGaXF+h+K8h5PZj5EAwNwGHCT853l/DM5UP0nzJpcvPJHUzOl3TzjW0ffr71ZuuuIjNuDJvLWF wc4EE/ZVp6ASfER9YGWBs1j3rMJlD0z2sVEx/sVXaRTMwZSuypyqvuq3I4vbtB41C9eVbI+VdB R00= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:48 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 09/13] btrfs: check for leaks of ordered stripes 
on umount Date: Thu, 2 Mar 2023 01:45:31 -0800 Message-Id: <762259752f5ace3e35e40e6160515c3637071a80.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Check if we're leaking any ordered stripes when unmounting a filesystem with a stripe tree. This check is gated behind CONFIG_BTRFS_DEBUG so that it does not affect production systems. Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/disk-io.c | 2 ++ fs/btrfs/raid-stripe-tree.c | 30 ++++++++++++++++++++++++++++++ fs/btrfs/raid-stripe-tree.h | 1 + 3 files changed, 33 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index ac200b367ec8..abbfd71f2cb6 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -52,6 +52,7 @@ #include "relocation.h" #include "scrub.h" #include "super.h" +#include "raid-stripe-tree.h" #define BTRFS_SUPER_FLAG_SUPP (BTRFS_HEADER_FLAG_WRITTEN |\ BTRFS_HEADER_FLAG_RELOC |\ @@ -1522,6 +1523,7 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) btrfs_put_root(fs_info->stripe_root); btrfs_check_leaked_roots(fs_info); btrfs_extent_buffer_leak_debug_check(fs_info); + btrfs_check_ordered_stripe_leak(fs_info); kfree(fs_info->super_copy); kfree(fs_info->super_for_commit); kfree(fs_info->subpage_info); diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 836299fe0ebe..391f69effd90 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -36,6 +36,36 @@ static int ordered_stripe_less(struct rb_node *rba, const struct rb_node *rbb) return ordered_stripe_cmp(&stripe->logical, rbb); } +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info) +{ +#ifdef CONFIG_BTRFS_DEBUG + struct rb_node *node; + + if (!btrfs_stripe_tree_root(fs_info) || + RB_EMPTY_ROOT(&fs_info->stripe_update_tree)) + return; + + WARN_ON_ONCE(1); + write_lock(&fs_info->stripe_update_lock); + while ((node
= rb_first_postorder(&fs_info->stripe_update_tree)) + != NULL) { + struct btrfs_ordered_stripe *stripe = + rb_entry(node, struct btrfs_ordered_stripe, rb_node); + + write_unlock(&fs_info->stripe_update_lock); + btrfs_err(fs_info, + "ordered_stripe [%llu, %llu] leaked, refcount=%d", + stripe->logical, stripe->logical + stripe->num_bytes, + refcount_read(&stripe->ref)); + while (refcount_read(&stripe->ref) > 1) + btrfs_put_ordered_stripe(fs_info, stripe); + btrfs_put_ordered_stripe(fs_info, stripe); + write_lock(&fs_info->stripe_update_lock); + } + write_unlock(&fs_info->stripe_update_lock); +#endif +} + int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) { struct btrfs_fs_info *fs_info = bioc->fs_info; diff --git a/fs/btrfs/raid-stripe-tree.h b/fs/btrfs/raid-stripe-tree.h index c7f6c5377aaa..371409351d60 100644 --- a/fs/btrfs/raid-stripe-tree.h +++ b/fs/btrfs/raid-stripe-tree.h @@ -38,6 +38,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe( int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc); void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe); +void btrfs_check_ordered_stripe_leak(struct btrfs_fs_info *fs_info); static inline bool btrfs_need_stripe_tree_update(struct btrfs_fs_info *fs_info, u64 map_type) From patchwork Thu Mar 2 09:45:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156948 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 098A4C7EE30 for ; Thu, 2 Mar 2023 09:46:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230151AbjCBJqm (ORCPT ); Thu, 2 Mar 2023 04:46:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54090 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230126AbjCBJqY (ORCPT ); Thu, 2 Mar 2023 04:46:24 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3461639CEC for ; Thu, 2 Mar 2023 01:46:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750380; x=1709286380; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mouDJAbVbx8rB75NcLOmREPJ46xZAnothKZPGkp9CNI=; b=UdSycMKfZfxw6uuq0IaQ1EKokwowtA/E808kjU1fPLdKzfLOHcgzPVkJ 1xwO/pkQtUEJZGSLxyk5N8KLR0fokX+HtXfActVmZ9RGAS1q50Wqvoy4t dqVSlPwiKrYzHKcC2uMbj3w4jwJozXrc4rt65EZ+YQ2Y9LunzFCPeNfow C/Nh1F6ZCb6fCYm/fOFzOltpNjqUBxInMxRG6Ueeqhh+3IrwQ7Lqi7KAm W8g3nRNFsEe9ombpZcQ8sGYtsqOLUa8KCGe+pYUp2evhStXIBMANiqKA8 le8XbxWIEoJgrLUAYsPVzoClsUK5FsvLbiFJyYfzIP06eGMd/Ba//PIA9 g==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939202" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:48 +0800 IronPort-SDR: r0m0wou3jox56OyG6qGd/vZfR1r/ssgMwRUwSUsUUI2zFODm46MR4oFwV0IjV8xNx+0e+TdLhP h+G7k0Gs/ZZFEoOvDoouZkg/FuayPU4Fyg4ByrMAJLSYTKJt57K3Xunf7ufGM9BY6d6+5YD+4H NCUiyF1wW4kMIcBalTl/Z67V2kp0OgiDADGTDdwua3OWz38kpq9rEGSYgRm4PwahWyzP0pWU3r 7BI9GX6dAOfonbaauOPl7mKxgggxKvsLqyM4HaiLdvn6Sd/XpWuuKjklt7o7/nviD/KA3l1X3h Qnw= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:53 -0800 IronPort-SDR: rC+Q4I5limTGSk31cMtqZfBE3rx7eUsXaBMJoIbBD9PWQPIhHCndkQzjiZ7SclRB6QGj9w9FZN k2btO8IeBBRmX4s4ZQ0rZrlFT4NYSHh9VTexFshKe2cCrTucfZwuStyabXv7OW/W8mwJo0KdW4 kaxNByk7crR8cESDVds87526wGB2ObT53Thirz9z4NdK3UGKiU0vNKFflTZPpaHfWSzad86mWR SxrGCVLDULi+hKXV79ZoJigVNoRMTdx5u36Yv0Hy2VY2wuyUw2Bf9vcKGG443Dc0ZEz0Tvrnmy P/Q= WDCIronportException: 
Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:49 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 10/13] btrfs: add tracepoints for ordered stripes Date: Thu, 2 Mar 2023 01:45:32 -0800 Message-Id: <793f20f1437eee4ee1faeeb937c0119c625e369d.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add tracepoints to check the lifetime of btrfs_ordered_stripe entries. Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid-stripe-tree.c | 4 ++- fs/btrfs/super.c | 1 + include/trace/events/btrfs.h | 50 ++++++++++++++++++++++++++++++++++++ 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c index 391f69effd90..8799a7abaf38 100644 --- a/fs/btrfs/raid-stripe-tree.c +++ b/fs/btrfs/raid-stripe-tree.c @@ -112,6 +112,7 @@ int btrfs_add_ordered_stripe(struct btrfs_io_context *bioc) } write_unlock(&fs_info->stripe_update_lock); + trace_btrfs_ordered_stripe_add(fs_info, stripe); return 0; } @@ -127,6 +128,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *f if (node) { stripe = rb_entry(node, struct btrfs_ordered_stripe, rb_node); refcount_inc(&stripe->ref); + trace_btrfs_ordered_stripe_lookup(fs_info, stripe); } read_unlock(&fs_info->stripe_update_lock); @@ -136,7 +138,7 @@ struct btrfs_ordered_stripe *btrfs_lookup_ordered_stripe(struct btrfs_fs_info *f void btrfs_put_ordered_stripe(struct btrfs_fs_info *fs_info, struct btrfs_ordered_stripe *stripe) { - + trace_btrfs_ordered_stripe_put(fs_info, stripe); if (refcount_dec_and_test(&stripe->ref)) { struct rb_node *node; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 
d8885966e801..fd49da569a8a 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -59,6 +59,7 @@ #include "verity.h" #include "super.h" #include "extent-tree.h" +#include "raid-stripe-tree.h" #define CREATE_TRACE_POINTS #include diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 8ea9cea9bfeb..7bdc8cc595cc 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -33,6 +33,7 @@ struct btrfs_space_info; struct btrfs_raid_bio; struct raid56_bio_trace_info; struct find_free_extent_ctl; +struct btrfs_ordered_stripe; #define show_ref_type(type) \ __print_symbolic(type, \ @@ -2492,6 +2493,55 @@ DEFINE_EVENT(btrfs_raid56_bio, raid56_scrub_read_recover, TP_ARGS(rbio, bio, trace_info) ); +DECLARE_EVENT_CLASS(btrfs__ordered_stripe, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe), + + TP_STRUCT__entry_btrfs( + __field( u64, logical ) + __field( u64, num_bytes ) + __field( int, num_stripes ) + __field( int, ref ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->logical = stripe->logical; + __entry->num_bytes = stripe->num_bytes; + __entry->num_stripes = stripe->num_stripes; + __entry->ref = refcount_read(&stripe->ref); + ), + + TP_printk_btrfs("logical=%llu, num_bytes=%llu, num_stripes=%d, ref=%d", + __entry->logical, __entry->num_bytes, + __entry->num_stripes, __entry->ref) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_add, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_lookup, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) +); + +DEFINE_EVENT(btrfs__ordered_stripe, btrfs_ordered_stripe_put, + + TP_PROTO(const struct btrfs_fs_info *fs_info, + const struct btrfs_ordered_stripe *stripe), + + TP_ARGS(fs_info, stripe) 
+); #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */ From patchwork Thu Mar 2 09:45:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156949 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 743C7C678D4 for ; Thu, 2 Mar 2023 09:46:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230224AbjCBJqs (ORCPT ); Thu, 2 Mar 2023 04:46:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230139AbjCBJq1 (ORCPT ); Thu, 2 Mar 2023 04:46:27 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DFBF3B65D for ; Thu, 2 Mar 2023 01:46:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750381; x=1709286381; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VuCg9mvYmVdSiFqucgxEiGNf7TNxG6nEoCVr5b7cCKA=; b=n0x7/hUtu2QQJ4XPIAKMYuMZQ0vHdExJcBbzCEi0ZOv33lNs4+L2NxwS wXjCqS+Rurqdy7maEi91iiNKzDskD2aHEM86HwtjBJx6EhfhJ7knfE4dh ihgXWlGDiP/Ts5VAvZgFlZK2TN+Rf/iQ/6WthfqyXeCf0N1vY8ETEIbIB ieX6m5QaKzWXxGwWteH2Tpa+3JXY6ZQ6s/3g7wqspxJcnlspetKtX9NTr Amxfwn/nvigsQP98DpqMgekdTqp9kC/G7/QTRM5wiBiE4Go/9u2BuOpAB fFin6Myblef+rT+LNnDhc+9F07kNr1F7mLUcH1Nx39IpttQcPoKhApX6Q Q==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939204" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:49 +0800 IronPort-SDR: 
e7xoEw62pH4C7UQgizu3wlDMCzrWRU7lRkZi96tKkD/Eolt4KeFqBgvgnocmjADXpLHvP6Piwx gPuJ9wb24NIM5ge86X0FXfpjg5HFqs5E1AoCGYPcQbM/ivwNkAW72vlb74vtZ2mEhBTlBbWBiC Y1wd9SEau+ZqVSM8xXMCAbFrRzk4QmpL54GGq/BVxY1cND5IJfC9xMTq8SEB5ffRUmvN4GGK2D hjLfRMmrDoVuTgEIBe9vZ7v0CEZ/Lf650q4+7Ar104qdQsDjJP/2Sdh1A4oVgtJkD04t8SAUQy D2c= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:54 -0800 IronPort-SDR: ZLqGiNzGQBQiNnbvc27cAKB//GjTZuaEWqF1ceXE231u5BeUlAaA7jal0myJB5jT2+J80Zzg9v teiHbyl7HdVRCI94XPTZmGjv5SytnwEK8ZkFnGgn3fjQF5LHiB/iteYFJgTcagG5QrQkc+3AtX yMHjvWZ1d4mmItduAZqcm4hZ/7bgKoZUndCJfcgj+/LO5ado+kXcXrokPSDQFjNYOfO54mRQR9 AEWWq4ewSrbItcrGkqWIQ2Jpq+tMRKPkDfFMj9+EC0UZTMDI+0oAeYdhLWiBrAYjJWsAV/boSB pHQ= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:50 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 11/13] btrfs: announce presence of raid-stripe-tree in sysfs Date: Thu, 2 Mar 2023 01:45:33 -0800 Message-Id: <5ed728a84c8f2077bdcd3ed76324d3026ed25958.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org If a filesystem with a raid-stripe-tree is mounted, show the RST feature in sysfs. 
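As a rough userspace illustration (not part of the patch), a tool could probe for the feature file as sketched below. It assumes the attribute appears as "raid_stripe_tree" under the per-filesystem features directory in /sys/fs/btrfs, and only on kernels built with CONFIG_BTRFS_DEBUG. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build "/sys/fs/btrfs/<fsid>/features/<name>" into buf.
 * Returns 0 on success, -1 if the path did not fit.
 */
static int sketch_feature_path(char *buf, size_t len, const char *fsid,
			       const char *name)
{
	int n;

	assert(buf && fsid && name);
	n = snprintf(buf, len, "/sys/fs/btrfs/%s/features/%s", fsid, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if the feature file exists for this filesystem, 0 otherwise. */
static int sketch_fs_has_feature(const char *fsid, const char *name)
{
	char path[256];

	if (sketch_feature_path(path, sizeof(path), fsid, name))
		return 0;
	return access(path, F_OK) == 0;
}
```

A caller would pass the filesystem UUID as fsid, for example sketch_fs_has_feature(uuid, "raid_stripe_tree").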
Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 37fc58a7f27e..bf7190e0b17a 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -297,6 +297,8 @@ BTRFS_FEAT_ATTR_INCOMPAT(zoned, ZONED); #ifdef CONFIG_BTRFS_DEBUG /* Remove once support for extent tree v2 is feature complete */ BTRFS_FEAT_ATTR_INCOMPAT(extent_tree_v2, EXTENT_TREE_V2); +/* Remove once support for raid stripe tree is feature complete */ +BTRFS_FEAT_ATTR_INCOMPAT(raid_stripe_tree, RAID_STRIPE_TREE); #endif #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_COMPAT_RO(verity, VERITY); @@ -327,6 +329,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = { #endif #ifdef CONFIG_BTRFS_DEBUG BTRFS_FEAT_ATTR_PTR(extent_tree_v2), + BTRFS_FEAT_ATTR_PTR(raid_stripe_tree), #endif #ifdef CONFIG_FS_VERITY BTRFS_FEAT_ATTR_PTR(verity), From patchwork Thu Mar 2 09:45:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156951 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B197C7EE30 for ; Thu, 2 Mar 2023 09:46:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230118AbjCBJqx (ORCPT ); Thu, 2 Mar 2023 04:46:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230176AbjCBJqj (ORCPT ); Thu, 2 Mar 2023 04:46:39 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3675A3E614 for ; Thu, 2 Mar 2023 01:46:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; 
i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750384; x=1709286384; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=g/cZhscpJ/cvgZUU0wWafl5CJk5PJAiE+H5UNDPMTUk=; b=Xf+vMuh554UDQJg1X+oGaDHgyvh4MkEkC9LVPma2DO13THiUDKTbZJ3Y ex36F+Upb5ugwQSt15HLhrH+7wdTDKB7eSueAYW/AlSgJZpo4fUc191BQ DCiBzb7hXUrMVEDxVzQMIYI2m+3f791Z1nJ/uSMb6PM3rpV8LMCtPkvP9 9XkEfj/nXbJjmRqPq2WnzEwwHaP0HFmWB6Pcs2dteuYD9G5e3069SznY7 gtpv4nPRMYYzOpCrPoF5RxL47Wx7yjFKhz6CBE7xFZdIZauF5O2tpY0nr nuoU+I9CUbks1avCmKBDSdHoM/pCQZ7PU8T+lHh16JvW4o9FjjotxLcvN w==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939205" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:50 +0800 IronPort-SDR: eoIqCjsogaRF0rdqHsrRq7wd+D6ECdjSB2Iqry22moQ8by4Z0mWRTHLF4QDkX7siOsOBzwE/HZ JOuy65qEXtyECpZvgIhBpoFZCzHJVHuyi1Si3LShWkNfLcgMDojQI3pVMTOpqQLZixdQZ9c3ZD brgGapcn+IhqPtyBQSFHWZ7xbdD6ce4dfHu8gvDUi3+i2RrYKO7ZJ/9V6lqlcqJ4/YLs0fBSZh u8OkTLbWCEHuY++Oo0sH9lH2X0nZlxnNmfm11FFnXjeU4HnscLY8n5AGARX0dGyO+cj+ZTP5Kp EmQ= Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 02 Mar 2023 00:56:55 -0800 IronPort-SDR: vca9Cj5M+mxPFNX+lmjz6wofED6pfdDF5mhF98vPhn2H/RQcPERri2KlC3O4Dat0kjoLRrh6IN E4K2hEWmVZ1R2fMOaSIHnLzr5scloRGxSBywKnmt2sY3jDjJkLm8vvccfp+ke0bnqLqw3Ga3rF 2bIzyKMfRXH5zr6R4teGvV9vW2uj9UpMDc4o1/aeBN0Yzb/ojO8S/6EYwirwdvrRIdioTxPq+E 2Nn5Y5fJ1s8jj1VY8M9j9vJbvJkpTFWQ1d8M5fjB3XOSJaLx+/tjv6ZX8iinafVDEeTzbcS55U cLk= WDCIronportException: Internal Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72]) by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:51 -0800 From: Johannes Thumshirn To: David Sterba Cc: Johannes Thumshirn , linux-btrfs@vger.kernel.org, Josef Bacik , Christoph Hellwig Subject: [PATCH v7 12/13] btrfs: consult raid-stripe-tree when scrubbing Date: Thu, 2 Mar 2023 01:45:34 -0800 
Message-Id: <4f1616ea9843b785f498d52a18cd4c8d5690c442.1677750131.git.johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org When scrubbing a filesystem which uses the raid-stripe-tree for logical to physical address translation, consult the RST to perform the address translation instead of relying on fixed block group offsets. Reviewed-by: Josef Bacik Signed-off-by: Johannes Thumshirn --- fs/btrfs/scrub.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index c83ac6b80c2f..86e3374d49b0 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -24,6 +24,7 @@ #include "accessors.h" #include "file-item.h" #include "scrub.h" +#include "raid-stripe-tree.h" /* * This is only the first step towards a full-features scrub. It reads all @@ -2821,6 +2822,21 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map, int ret; u8 csum[BTRFS_CSUM_SIZE]; u32 blocksize; + struct btrfs_io_stripe stripe; + const bool stripe_update = + btrfs_need_stripe_tree_update(sctx->fs_info, map->type); + + if (stripe_update) { + stripe.dev = src_dev; + ret = btrfs_get_raid_extent_offset(sctx->fs_info, logical, + (u64 *)&len, + map->type, mirror_num, + &stripe); + if (ret) + return ret; + + src_physical = stripe.physical; + } if (flags & BTRFS_EXTENT_FLAG_DATA) { if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) @@ -2872,8 +2888,21 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map, return ret; len -= l; logical += l; - physical += l; - src_physical += l; + if (stripe_update && len) { + + ret = btrfs_get_raid_extent_offset(sctx->fs_info, + logical, (u64 *)&len, + map->type, mirror_num, + &stripe); + if (ret) + return ret; + + src_physical = stripe.physical; + physical = stripe.physical; + } else { + physical += l; + src_physical += l; + } } return 0; } From 
patchwork Thu Mar 2 09:45:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 13156950 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66D95C678D4 for ; Thu, 2 Mar 2023 09:46:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230098AbjCBJqw (ORCPT ); Thu, 2 Mar 2023 04:46:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230175AbjCBJqj (ORCPT ); Thu, 2 Mar 2023 04:46:39 -0500 Received: from esa2.hgst.iphmx.com (esa2.hgst.iphmx.com [68.232.143.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6E3B3E637 for ; Thu, 2 Mar 2023 01:46:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1677750384; x=1709286384; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OjQbsF1ne03+nUqwRERuk1yUqBeoGcdCoxumcqJrz8I=; b=gzFLBHw9sbHjK52+LQw5yiSQQ45ei5gkPPVilqtO0dQnV2QPuZivIhCs Mplgcn9jIIsxLQRbaxf6a5/auoCl7j+fgaf1Hq/azvwGDj4sOEW323ca9 FqxAey60jjfQUQoe3O3AnEAvoCgtXgJ7k+9wW6kUAN4m/FzsRsWlsQv6v rgZ6YSdeRAu15MVic8JKnU/rgz7UKUGVVhVx9El1OdZ+TWwNYoXuGpDNg hJurXMm1JEHdL8xonrAod5zHyjFa9o9Hd3bQiEI1P6l7E5xM7CtGqG5N2 rBbzM/OjQ7WFxYRkT94WwX8GEgU0G6rTczmSXFDvcHg7KIyCKZvtNVJPY Q==; X-IronPort-AV: E=Sophos;i="5.98,227,1673884800"; d="scan'208";a="328939206" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 02 Mar 2023 17:45:51 +0800 IronPort-SDR: I515wWw5Yw58bj96Tj+w9wJ4ui69vG/SE75NCB1/cn075Tdew5Oci4VRUT1rUXObCpSvbEtsVZ 
 AVFp/dtvUtwUi24PHa4NY+RgtZ0v5lKEpLljGCkpYd2gCGEAF/aiHHoTBfnH2u+Eshys5czdZl
 bFO3q+Qj0+nFe1tFUiH5ymlspmld7jo3dqAIU9SRc77zy/UUnEKrhIo1raZKMGlh8la2mBSdDc
 z/KPBvoc/YxkZ6zNt84xAH9uWoNMx0qAVOQapPMsIZ/JuZAHOv2HD/jtKoq+rum302pK3Oi3uJ
 muk=
Received: from uls-op-cesaip01.wdc.com ([10.248.3.36])
 by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256;
 02 Mar 2023 00:56:56 -0800
IronPort-SDR: Q/z8FpBbQnGrP15n2BRH3NwOtachSozFAxzJq0jgydB1r7XrCWwiqlsUYUSM3cDwKx3vinzTjv
 2Kcay5E8C+K5oaP1ykY4F+wg6AkS8/kv1AHgTcSCUmoFmITGhYJetrHdw7m+f7EDqdPuJ7mYMD
 lengNA84kiRqHA5AtYVLokK2l+74Esie3sUa8oa4QLTEyrGhpatCO8rK/LVZDCu9/kjuULKLh4
 bkuvCRiUTLP3GeDhJq0HTJcOLqCHTqrS21VxGaXnEPlNSYlbAYHiSx53AfTul8jywh+vezjtcm
 nwA=
WDCIronportException: Internal
Received: from unknown (HELO redsun91.ssa.fujisawa.hgst.com) ([10.149.66.72])
 by uls-op-cesaip01.wdc.com with ESMTP; 02 Mar 2023 01:45:51 -0800
From: Johannes Thumshirn
To: David Sterba
Cc: Johannes Thumshirn, linux-btrfs@vger.kernel.org, Josef Bacik, Christoph Hellwig
Subject: [PATCH v7 13/13] btrfs: add raid-stripe-tree to features enabled with debug
Date: Thu, 2 Mar 2023 01:45:35 -0800
Message-Id: <98922293ba48f88d3a71241ccc8341f5b3f7ca33.1677750131.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Until the RAID stripe tree code is well enough tested and feature
complete, "hide" it behind CONFIG_BTRFS_DEBUG so only people who
want to use it are actually using it.

Reviewed-by: Josef Bacik
Signed-off-by: Johannes Thumshirn
Reviewed-by: Anand Jain
Reviewed-by: Naohiro Aota
---
 fs/btrfs/fs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index d0d80540b32b..dd151538d2b1 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -214,7 +214,8 @@ enum {
 	 BTRFS_FEATURE_INCOMPAT_METADATA_UUID | \
 	 BTRFS_FEATURE_INCOMPAT_RAID1C34 | \
 	 BTRFS_FEATURE_INCOMPAT_ZONED | \
-	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2)
+	 BTRFS_FEATURE_INCOMPAT_EXTENT_TREE_V2 | \
+	 BTRFS_FEATURE_INCOMPAT_RAID_STRIPE_TREE)
 #else
 #define BTRFS_FEATURE_INCOMPAT_SUPP \
 	(BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF | \