From patchwork Mon Jul 31 17:17:10 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335327
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 01/10] btrfs: introduce struct to consolidate extent buffer write context
Date: Tue, 1 Aug 2023 02:17:10 +0900
Message-ID: <1cc8f3a21680d196751171f09ddb77b9c14a5b9a.1690823282.git.naohiro.aota@wdc.com>

Introduce btrfs_eb_write_context to consolidate writeback_control and the
extent buffer context. This will help add a block group context as well.

While at it, move the eb context setting before
btrfs_check_meta_write_pointer(). We can set it there because we need to
skip pages in the same eb anyway, even if that eb is rejected by
btrfs_check_meta_write_pointer().

Suggested-by: Christoph Hellwig
Signed-off-by: Naohiro Aota
Reviewed-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/extent_io.c | 17 ++++++++++-------
 fs/btrfs/extent_io.h |  5 +++++
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 177d65d51447..40633bc15c97 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1784,9 +1784,9 @@ static int submit_eb_subpage(struct page *page, struct writeback_control *wbc)
  * previous call.
  * Return <0 for fatal error.
  */
-static int submit_eb_page(struct page *page, struct writeback_control *wbc,
-			  struct extent_buffer **eb_context)
+static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 {
+	struct writeback_control *wbc = ctx->wbc;
 	struct address_space *mapping = page->mapping;
 	struct btrfs_block_group *cache = NULL;
 	struct extent_buffer *eb;
@@ -1815,7 +1815,7 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 		return 0;
 	}
 
-	if (eb == *eb_context) {
+	if (eb == ctx->eb) {
 		spin_unlock(&mapping->private_lock);
 		return 0;
 	}
@@ -1824,6 +1824,8 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 	if (!ret)
 		return 0;
 
+	ctx->eb = eb;
+
 	if (!btrfs_check_meta_write_pointer(eb->fs_info, eb, &cache)) {
 		/*
 		 * If for_sync, this hole will be filled with
@@ -1837,8 +1839,6 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 		return ret;
 	}
 
-	*eb_context = eb;
-
 	if (!lock_extent_buffer_for_io(eb, wbc)) {
 		btrfs_revert_meta_write_pointer(cache, eb);
 		if (cache)
@@ -1861,7 +1861,10 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc,
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc)
 {
-	struct extent_buffer *eb_context = NULL;
+	struct btrfs_eb_write_context ctx = {
+		.wbc = wbc,
+		.eb = NULL,
+	};
 	struct btrfs_fs_info *fs_info = BTRFS_I(mapping->host)->root->fs_info;
 	int ret = 0;
 	int done = 0;
@@ -1903,7 +1906,7 @@ int btree_write_cache_pages(struct address_space *mapping,
 	for (i = 0; i < nr_folios; i++) {
 		struct folio *folio = fbatch.folios[i];
 
-		ret = submit_eb_page(&folio->page, wbc, &eb_context);
+		ret = submit_eb_page(&folio->page, &ctx);
 		if (ret == 0)
 			continue;
 		if (ret < 0) {
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index adda14c1b763..e243a8eac910 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -93,6 +93,11 @@ struct extent_buffer {
 #endif
 };
 
+struct btrfs_eb_write_context {
+	struct writeback_control *wbc;
+	struct extent_buffer *eb;
+};
+
 /*
  * Get the correct offset inside the page of extent buffer.
  *
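The refactor in the patch above replaces a writeback_control argument plus a **eb_context out-parameter with one context struct that travels through the writeback loop. The following sketch shows the same pattern in miniature; the types and names here (eb_write_context, the two stand-in structs) are simplified stand-ins for illustration, not the btrfs definitions, and the program compiles on its own.

/* Standalone sketch: stand-in types, not the btrfs definitions. */
#include <stdio.h>

struct writeback_control { int sync_mode; };
struct extent_buffer { unsigned long start; };

/* One context object replaces "wbc plus a **eb_context out-parameter". */
struct eb_write_context {
	struct writeback_control *wbc;
	struct extent_buffer *eb;	/* last eb seen, was **eb_context */
};

static int submit_one(struct eb_write_context *ctx, struct extent_buffer *eb)
{
	if (eb == ctx->eb)		/* another page of an eb already handled */
		return 0;
	ctx->eb = eb;			/* record it before any accept/reject decision */
	printf("submitting eb at %lu (sync_mode=%d)\n", eb->start, ctx->wbc->sync_mode);
	return 1;
}

int main(void)
{
	struct writeback_control wbc = { .sync_mode = 1 };
	struct eb_write_context ctx = { .wbc = &wbc, .eb = NULL };
	struct extent_buffer ebs[2] = { { .start = 0 }, { .start = 16384 } };

	for (int i = 0; i < 2; i++) {
		submit_one(&ctx, &ebs[i]);
		submit_one(&ctx, &ebs[i]);	/* skipped: same eb as the previous call */
	}
	return 0;
}

Carrying the context as one object is also what makes the next patches cheap: new per-writeback state (like a cached block group) only needs a new struct member, not another parameter in every helper.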
From patchwork Mon Jul 31 17:17:11 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335328
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 02/10] btrfs: zoned: introduce block_group context to btrfs_eb_write_context
Date: Tue, 1 Aug 2023 02:17:11 +0900
Message-ID: <31cfb11cd71edc3513f0d65d1da6a2b6d3b959e7.1690823282.git.naohiro.aota@wdc.com>

For metadata write-out in zoned mode, we call btrfs_check_meta_write_pointer()
to check if an extent buffer to be written is aligned to the write pointer.

We look up the block group containing the extent buffer for every extent
buffer, which takes unnecessary effort, as the extent buffers being written
are mostly contiguous.

Introduce "block_group" to cache the block group we are working on.
Also, while at it, rename "cache" to "block_group".

Signed-off-by: Naohiro Aota
Reviewed-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/extent_io.c | 16 ++++++++--------
 fs/btrfs/extent_io.h |  1 +
 fs/btrfs/zoned.c     | 35 ++++++++++++++++++++---------------
 fs/btrfs/zoned.h     |  6 ++----
 4 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 40633bc15c97..da8d9478972c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1788,7 +1788,6 @@ static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 {
 	struct writeback_control *wbc = ctx->wbc;
 	struct address_space *mapping = page->mapping;
-	struct btrfs_block_group *cache = NULL;
 	struct extent_buffer *eb;
 	int ret;
 
@@ -1826,7 +1825,7 @@ static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 
 	ctx->eb = eb;
 
-	if (!btrfs_check_meta_write_pointer(eb->fs_info, eb, &cache)) {
+	if (!btrfs_check_meta_write_pointer(eb->fs_info, ctx)) {
 		/*
 		 * If for_sync, this hole will be filled with
 		 * trasnsaction commit.
@@ -1840,18 +1839,15 @@ static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 	}
 
 	if (!lock_extent_buffer_for_io(eb, wbc)) {
-		btrfs_revert_meta_write_pointer(cache, eb);
-		if (cache)
-			btrfs_put_block_group(cache);
+		btrfs_revert_meta_write_pointer(ctx->block_group, eb);
 		free_extent_buffer(eb);
 		return 0;
 	}
-	if (cache) {
+	if (ctx->block_group) {
 		/*
 		 * Implies write in zoned mode. Mark the last eb in a block group.
 		 */
-		btrfs_schedule_zone_finish_bg(cache, eb);
-		btrfs_put_block_group(cache);
+		btrfs_schedule_zone_finish_bg(ctx->block_group, eb);
 	}
 	write_one_eb(eb, wbc);
 	free_extent_buffer(eb);
@@ -1864,6 +1860,7 @@ int btree_write_cache_pages(struct address_space *mapping,
 	struct btrfs_eb_write_context ctx = {
 		.wbc = wbc,
 		.eb = NULL,
+		.block_group = NULL,
 	};
 	struct btrfs_fs_info *fs_info = BTRFS_I(mapping->host)->root->fs_info;
 	int ret = 0;
@@ -1967,6 +1964,9 @@ int btree_write_cache_pages(struct address_space *mapping,
 		ret = 0;
 	if (!ret && BTRFS_FS_ERROR(fs_info))
 		ret = -EROFS;
+
+	if (ctx.block_group)
+		btrfs_put_block_group(ctx.block_group);
 	btrfs_zoned_meta_io_unlock(fs_info);
 	return ret;
 }
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index e243a8eac910..d616d30ed4bd 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -96,6 +96,7 @@ struct extent_buffer {
 struct btrfs_eb_write_context {
 	struct writeback_control *wbc;
 	struct extent_buffer *eb;
+	struct btrfs_block_group *block_group;
 };
 
 /*
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 5e4285ae112c..a6cdd0c4d7b7 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1748,30 +1748,35 @@ void btrfs_finish_ordered_zoned(struct btrfs_ordered_extent *ordered)
 }
 
 bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-				    struct extent_buffer *eb,
-				    struct btrfs_block_group **cache_ret)
+				    struct btrfs_eb_write_context *ctx)
 {
-	struct btrfs_block_group *cache;
-	bool ret = true;
+	const struct extent_buffer *eb = ctx->eb;
+	struct btrfs_block_group *block_group = ctx->block_group;
 
 	if (!btrfs_is_zoned(fs_info))
 		return true;
 
-	cache = btrfs_lookup_block_group(fs_info, eb->start);
-	if (!cache)
-		return true;
+	if (block_group) {
+		if (block_group->start > eb->start ||
+		    block_group->start + block_group->length <= eb->start) {
+			btrfs_put_block_group(block_group);
+			block_group = NULL;
+			ctx->block_group = NULL;
+		}
+	}
 
-	if (cache->meta_write_pointer != eb->start) {
-		btrfs_put_block_group(cache);
-		cache = NULL;
-		ret = false;
-	} else {
-		cache->meta_write_pointer = eb->start + eb->len;
+	if (!block_group) {
+		block_group = btrfs_lookup_block_group(fs_info, eb->start);
+		if (!block_group)
+			return true;
+		ctx->block_group = block_group;
 	}
 
-	*cache_ret = cache;
+	if (block_group->meta_write_pointer != eb->start)
+		return false;
+	block_group->meta_write_pointer = eb->start + eb->len;
 
-	return ret;
+	return true;
 }
 
 void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index 27322b926038..49d5bd87245c 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -59,8 +59,7 @@ void btrfs_redirty_list_add(struct btrfs_transaction *trans,
 bool btrfs_use_zone_append(struct btrfs_bio *bbio);
 void btrfs_record_physical_zoned(struct btrfs_bio *bbio);
 bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-				    struct extent_buffer *eb,
-				    struct btrfs_block_group **cache_ret);
+				    struct btrfs_eb_write_context *ctx);
 void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
 				     struct extent_buffer *eb);
 int btrfs_zoned_issue_zeroout(struct btrfs_device *device, u64 physical, u64 length);
@@ -190,8 +189,7 @@ static inline void btrfs_record_physical_zoned(struct btrfs_bio *bbio)
 }
 
 static inline bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-						  struct extent_buffer *eb,
-						  struct btrfs_block_group **cache_ret)
+						  struct btrfs_eb_write_context *ctx)
 {
 	return true;
 }
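The caching idea in patch 02 is generic: keep the last successful range lookup and reuse it while subsequent queries still fall inside that range. Below is a minimal, self-contained sketch of that pattern with hypothetical names (range, cached_lookup); it is not btrfs code, but the containment check mirrors the one the patch adds for ctx->block_group.

/* Standalone sketch with hypothetical names; not btrfs code. */
#include <stdio.h>

struct range { unsigned long start, length; };

static int slow_lookups;

static struct range *slow_lookup(struct range *table, int n, unsigned long addr)
{
	slow_lookups++;
	for (int i = 0; i < n; i++)
		if (addr >= table[i].start && addr < table[i].start + table[i].length)
			return &table[i];
	return NULL;
}

/* Reuse the cached range only while the address still falls inside it,
 * the same containment check the patch applies to the cached block group. */
static struct range *cached_lookup(struct range **cache, struct range *table,
				   int n, unsigned long addr)
{
	if (*cache && (addr < (*cache)->start ||
		       addr >= (*cache)->start + (*cache)->length))
		*cache = NULL;
	if (!*cache)
		*cache = slow_lookup(table, n, addr);
	return *cache;
}

int main(void)
{
	struct range table[2] = { { 0, 1024 }, { 1024, 1024 } };
	struct range *cache = NULL;

	for (unsigned long addr = 0; addr < 2048; addr += 256)
		cached_lookup(&cache, table, 2, addr);

	printf("8 contiguous queries needed %d slow lookups\n", slow_lookups);
	return 0;
}

Because metadata write-out walks extent buffers in address order, most checks hit the cached entry and the per-buffer block group lookup disappears from the hot path.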
From patchwork Mon Jul 31 17:17:12 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335330
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 03/10] btrfs: zoned: return int from btrfs_check_meta_write_pointer
Date: Tue, 1 Aug 2023 02:17:12 +0900

Now that we have writeback_control passed to btrfs_check_meta_write_pointer(),
we can move the wbc condition in submit_eb_page() to
btrfs_check_meta_write_pointer() and return int.

Reviewed-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
Signed-off-by: Naohiro Aota
---
 fs/btrfs/extent_io.c | 11 +++--------
 fs/btrfs/zoned.c     | 30 ++++++++++++++++++++++--------
 fs/btrfs/zoned.h     | 10 +++++-----
 3 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index da8d9478972c..012f2853b835 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1825,14 +1825,9 @@ static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 
 	ctx->eb = eb;
 
-	if (!btrfs_check_meta_write_pointer(eb->fs_info, ctx)) {
-		/*
-		 * If for_sync, this hole will be filled with
-		 * trasnsaction commit.
-		 */
-		if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
-			ret = -EAGAIN;
-		else
+	ret = btrfs_check_meta_write_pointer(eb->fs_info, ctx);
+	if (ret) {
+		if (ret == -EBUSY)
 			ret = 0;
 		free_extent_buffer(eb);
 		return ret;
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index a6cdd0c4d7b7..0aa32b19adb5 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1747,14 +1747,23 @@ void btrfs_finish_ordered_zoned(struct btrfs_ordered_extent *ordered)
 	}
 }
 
-bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-				    struct btrfs_eb_write_context *ctx)
+/*
+ * Check @ctx->eb is aligned to the write pointer
+ *
+ * Return:
+ *   0:        @ctx->eb is at the write pointer. You can write it.
+ *   -EAGAIN:  There is a hole. The caller should handle the case.
+ *   -EBUSY:   There is a hole, but the caller can just bail out.
+ */
+int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
+				   struct btrfs_eb_write_context *ctx)
 {
+	const struct writeback_control *wbc = ctx->wbc;
 	const struct extent_buffer *eb = ctx->eb;
 	struct btrfs_block_group *block_group = ctx->block_group;
 
 	if (!btrfs_is_zoned(fs_info))
-		return true;
+		return 0;
 
 	if (block_group) {
 		if (block_group->start > eb->start ||
@@ -1768,15 +1777,20 @@ bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 	if (!block_group) {
 		block_group = btrfs_lookup_block_group(fs_info, eb->start);
 		if (!block_group)
-			return true;
+			return 0;
 		ctx->block_group = block_group;
 	}
 
-	if (block_group->meta_write_pointer != eb->start)
-		return false;
-	block_group->meta_write_pointer = eb->start + eb->len;
+	if (block_group->meta_write_pointer == eb->start) {
+		block_group->meta_write_pointer = eb->start + eb->len;
 
-	return true;
+		return 0;
+	}
+
+	/* If for_sync, this hole will be filled with trasnsaction commit. */
+	if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
+		return -EAGAIN;
+	return -EBUSY;
 }
 
 void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index 49d5bd87245c..c0859d8be152 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -58,8 +58,8 @@ void btrfs_redirty_list_add(struct btrfs_transaction *trans,
 			    struct extent_buffer *eb);
 bool btrfs_use_zone_append(struct btrfs_bio *bbio);
 void btrfs_record_physical_zoned(struct btrfs_bio *bbio);
-bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-				    struct btrfs_eb_write_context *ctx);
+int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
+				   struct btrfs_eb_write_context *ctx);
 void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
 				     struct extent_buffer *eb);
 int btrfs_zoned_issue_zeroout(struct btrfs_device *device, u64 physical, u64 length);
@@ -188,10 +188,10 @@ static inline void btrfs_record_physical_zoned(struct btrfs_bio *bbio)
 {
 }
 
-static inline bool btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
-						  struct btrfs_eb_write_context *ctx)
+static inline int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
+						 struct btrfs_eb_write_context *ctx)
 {
-	return true;
+	return 0;
 }
 
 static inline void btrfs_revert_meta_write_pointer(
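The new return convention replaces a bool plus a caller-side wbc check with a single int: 0 means the buffer sits at the write pointer, -EAGAIN means the caller must handle the hole, and -EBUSY means it may silently skip the buffer. A small standalone sketch of that convention follows; the helper names are hypothetical, not the btrfs functions, and the caller collapses -EBUSY to 0 the same way submit_eb_page() does after this patch.

/* Standalone sketch of the 0 / -EAGAIN / -EBUSY convention; hypothetical helpers. */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct wb_mode { bool sync_all, for_sync; };

static int check_write_pointer(const struct wb_mode *wbc,
			       unsigned long write_pointer, unsigned long start)
{
	if (write_pointer == start)
		return 0;		/* aligned, the caller may write */
	/* There is a hole in front of this buffer. */
	if (wbc->sync_all && !wbc->for_sync)
		return -EAGAIN;		/* integrity writeback must handle it */
	return -EBUSY;			/* other writeback can just bail out */
}

int main(void)
{
	struct wb_mode integrity  = { .sync_all = true,  .for_sync = false };
	struct wb_mode background = { .sync_all = false, .for_sync = false };

	int ret = check_write_pointer(&background, 4096, 8192);
	if (ret == -EBUSY)
		ret = 0;		/* the caller collapses -EBUSY to "nothing to do" */

	printf("background: %d, integrity: %d\n",
	       ret, check_write_pointer(&integrity, 4096, 8192));
	return 0;
}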
From patchwork Mon Jul 31 17:17:13 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335329
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 04/10] btrfs: zoned: defer advancing meta_write_pointer
Date: Tue, 1 Aug 2023 02:17:13 +0900

We currently advance the meta_write_pointer in
btrfs_check_meta_write_pointer(). That makes it necessary to revert it when
locking the buffer fails. Instead, we can advance it just before sending the
buffer.

Also, this is necessary for the following commit. That commit needs to
release the zoned_meta_io_lock to allow IOs to come in and wait for them to
fill the currently active block group. If we advance the meta_write_pointer
before locking the extent buffer, the following extent buffer can pass the
meta_write_pointer check, resulting in an unaligned write failure.

Advancing the pointer is still thread-safe as the extent buffer is locked.

Signed-off-by: Naohiro Aota
Reviewed-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/extent_io.c |  8 ++++----
 fs/btrfs/zoned.c     | 15 +--------------
 fs/btrfs/zoned.h     |  8 --------
 3 files changed, 5 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 012f2853b835..5388c2c3c6f4 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1834,15 +1834,15 @@ static int submit_eb_page(struct page *page, struct btrfs_eb_write_context *ctx)
 	}
 
 	if (!lock_extent_buffer_for_io(eb, wbc)) {
-		btrfs_revert_meta_write_pointer(ctx->block_group, eb);
 		free_extent_buffer(eb);
 		return 0;
 	}
 	if (ctx->block_group) {
-		/*
-		 * Implies write in zoned mode. Mark the last eb in a block group.
-		 */
+		/* Implies write in zoned mode. */
+
+		/* Mark the last eb in the block group. */
 		btrfs_schedule_zone_finish_bg(ctx->block_group, eb);
+		ctx->block_group->meta_write_pointer += eb->len;
 	}
 	write_one_eb(eb, wbc);
 	free_extent_buffer(eb);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 0aa32b19adb5..fa595eca39ca 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1781,11 +1781,8 @@ int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 		ctx->block_group = block_group;
 	}
 
-	if (block_group->meta_write_pointer == eb->start) {
-		block_group->meta_write_pointer = eb->start + eb->len;
-
+	if (block_group->meta_write_pointer == eb->start)
 		return 0;
-	}
 
 	/* If for_sync, this hole will be filled with trasnsaction commit. */
 	if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
@@ -1793,16 +1790,6 @@ int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 	return -EBUSY;
 }
 
-void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
-				     struct extent_buffer *eb)
-{
-	if (!btrfs_is_zoned(eb->fs_info) || !cache)
-		return;
-
-	ASSERT(cache->meta_write_pointer == eb->start + eb->len);
-	cache->meta_write_pointer = eb->start;
-}
-
 int btrfs_zoned_issue_zeroout(struct btrfs_device *device, u64 physical, u64 length)
 {
 	if (!btrfs_dev_is_sequential(device, physical))
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index c0859d8be152..74ec37a25808 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -60,8 +60,6 @@ bool btrfs_use_zone_append(struct btrfs_bio *bbio);
 void btrfs_record_physical_zoned(struct btrfs_bio *bbio);
 int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 				   struct btrfs_eb_write_context *ctx);
-void btrfs_revert_meta_write_pointer(struct btrfs_block_group *cache,
-				     struct extent_buffer *eb);
 int btrfs_zoned_issue_zeroout(struct btrfs_device *device, u64 physical, u64 length);
 int btrfs_sync_zone_write_pointer(struct btrfs_device *tgt_dev, u64 logical,
 				  u64 physical_start, u64 physical_pos);
@@ -194,12 +192,6 @@ static inline int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 	return 0;
 }
 
-static inline void btrfs_revert_meta_write_pointer(
-						struct btrfs_block_group *cache,
-						struct extent_buffer *eb)
-{
-}
-
 static inline int btrfs_zoned_issue_zeroout(struct btrfs_device *device,
					    u64 physical, u64 length)
 {
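The point of deferring the advance is that a failed buffer lock no longer needs a rollback: the pointer only moves once a write is actually going to be issued. The toy loop below (hypothetical names, not the btrfs writeback path) only illustrates that check-then-lock-then-advance ordering, not the full retry behaviour of the real code.

/* Standalone toy loop; hypothetical names, not the btrfs writeback path. */
#include <stdbool.h>
#include <stdio.h>

struct zone_bg { unsigned long write_pointer; };

static bool try_lock(int i)
{
	return i != 1;			/* pretend buffer #1 fails to lock */
}

int main(void)
{
	struct zone_bg bg = { .write_pointer = 0 };
	const unsigned long eb_len = 16384;

	for (int i = 0; i < 3; i++) {
		unsigned long start = bg.write_pointer;	/* alignment check happens here */

		if (!try_lock(i))
			continue;	/* nothing to revert: the pointer never moved */

		bg.write_pointer = start + eb_len;	/* advance only after the lock */
		printf("wrote buffer %d at %lu\n", i, start);
	}
	printf("final write pointer: %lu\n", bg.write_pointer);
	return 0;
}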
From patchwork Mon Jul 31 17:17:14 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335331
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota, Christoph Hellwig
Subject: [PATCH v2 05/10] btrfs: zoned: update meta_write_pointer on zone finish
Date: Tue, 1 Aug 2023 02:17:14 +0900
Message-ID: <22b2378b5f33e1b7f244f6a930f1d01821804893.1690823282.git.naohiro.aota@wdc.com>

On finishing a zone, the meta_write_pointer should be set to the end of the
zone to reflect the actual write pointer position.

Reviewed-by: Christoph Hellwig
Signed-off-by: Naohiro Aota
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/zoned.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index fa595eca39ca..3902c16b9188 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2056,6 +2056,9 @@ static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_writ
 
 	clear_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
 	block_group->alloc_offset = block_group->zone_capacity;
+	if (block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM))
+		block_group->meta_write_pointer = block_group->start +
+						  block_group->zone_capacity;
 	block_group->free_space_ctl->free_space = 0;
 	btrfs_clear_treelog_bg(block_group);
 	btrfs_clear_data_reloc_bg(block_group);
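The three-line change above restores a simple invariant after a zone is finished: the cached metadata write pointer equals the block group start plus its zone capacity, matching alloc_offset. A tiny standalone sketch of that invariant follows; the struct and the zone_finish() helper are stand-ins for illustration, not the btrfs code.

/* Standalone sketch: stand-in struct and hypothetical zone_finish() helper. */
#include <assert.h>
#include <stdio.h>

struct zone_bg {
	unsigned long start, zone_capacity, alloc_offset, meta_write_pointer;
};

static void zone_finish(struct zone_bg *bg)
{
	bg->alloc_offset = bg->zone_capacity;
	/* Mirror the allocation state so later write-pointer checks see a full zone. */
	bg->meta_write_pointer = bg->start + bg->zone_capacity;
}

int main(void)
{
	struct zone_bg bg = {
		.start = 1UL << 30,
		.zone_capacity = 256UL << 20,
		.alloc_offset = 64UL << 20,
		.meta_write_pointer = (1UL << 30) + (64UL << 20),
	};

	zone_finish(&bg);
	assert(bg.meta_write_pointer == bg.start + bg.alloc_offset);
	printf("meta_write_pointer now points at the end of the zone capacity\n");
	return 0;
}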
From patchwork Mon Jul 31 17:17:15 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335333
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 06/10] btrfs: zoned: reserve zones for an active metadata/system block group
Date: Tue, 1 Aug 2023 02:17:15 +0900
Message-ID: <790055decdb2cfa7dfaa3a47dd43b0a1f9129814.1690823282.git.naohiro.aota@wdc.com>

Ensure that a metadata and a system block group can be activated at write
time by leaving a certain number of active zones free when trying to
activate a data block group.

When both the metadata and system profiles are SINGLE, we need to reserve
two zones. When both are DUP, we need to reserve four zones. If only one of
them is DUP, we would need to reserve three zones, but handling that case
requires at least two bits to track whether we have seen a DUP profile for
metadata and for system, which is cumbersome. So, just reserve four zones
in that case for now.

Signed-off-by: Naohiro Aota
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/fs.h    |  6 ++++++
 fs/btrfs/zoned.c | 39 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index ef07c6c252d8..2ce391959b6a 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -775,6 +775,12 @@ struct btrfs_fs_info {
 	spinlock_t zone_active_bgs_lock;
 	struct list_head zone_active_bgs;
 
+	/*
+	 * Reserved active zones per-device for one metadata and one system
+	 * block group.
+	 */
+	unsigned int reserved_active_zones;
+
 	/* Updates are not protected by any lock */
 	struct btrfs_commit_stats commit_stats;
 
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 3902c16b9188..9dbcd747ee74 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -525,6 +525,12 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
 		atomic_set(&zone_info->active_zones_left,
 			   max_active_zones - nactive);
 		set_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags);
+		/*
+		 * First, reserve zones for SINGLE metadata and SINGLE system
+		 * profile. The reservation will be increased when seeing DUP
+		 * profile.
+		 */
+		fs_info->reserved_active_zones = 2;
 	}
 
 	/* Validate superblock log */
@@ -1515,6 +1521,22 @@ int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
 		}
 		cache->alloc_offset = alloc_offsets[0];
 		cache->zone_capacity = min(caps[0], caps[1]);
+
+		/*
+		 * DUP profile needs two zones on the same device. Reserve 2
+		 * zones * 2 types (metadata and system) = 4 zones.
+		 *
+		 * Technically, we can have SINGLE metadata and DUP system
+		 * config. And, in that case, we only need 3 zones, wasting one
+		 * active zone. But, to do the precise reservation, we need one
+		 * more variable just to track we already seen a DUP block group
+		 * or not, which is cumbersome.
+		 *
+		 * For now, let's be lazy and just reserve 4 zones.
+		 */
+		if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags) &&
+		    !(cache->flags & BTRFS_BLOCK_GROUP_DATA))
+			fs_info->reserved_active_zones = 4;
 		break;
 	case BTRFS_BLOCK_GROUP_RAID1:
 	case BTRFS_BLOCK_GROUP_RAID0:
@@ -1888,6 +1910,8 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 	struct btrfs_space_info *space_info = block_group->space_info;
 	struct map_lookup *map;
 	struct btrfs_device *device;
+	const unsigned int reserved = (block_group->flags & BTRFS_BLOCK_GROUP_DATA) ?
+		fs_info->reserved_active_zones : 0;
 	u64 physical;
 	bool ret;
 	int i;
@@ -1917,6 +1941,15 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 		if (device->zone_info->max_active_zones == 0)
 			continue;
 
+		/*
+		 * For the data block group, leave active zones for one
+		 * metadata block group and one system block group.
+		 */
+		if (atomic_read(&device->zone_info->active_zones_left) <= reserved) {
+			ret = false;
+			goto out_unlock;
+		}
+
 		if (!btrfs_dev_set_active_zone(device, physical)) {
 			/* Cannot activate the zone */
 			ret = false;
@@ -2111,6 +2144,8 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 {
 	struct btrfs_fs_info *fs_info = fs_devices->fs_info;
 	struct btrfs_device *device;
+	const unsigned int reserved = (flags & BTRFS_BLOCK_GROUP_DATA) ?
+		fs_info->reserved_active_zones : 0;
 	bool ret = false;
 
 	if (!btrfs_is_zoned(fs_info))
@@ -2131,10 +2166,10 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
 
 		switch (flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
 		case 0: /* single */
-			ret = (atomic_read(&zinfo->active_zones_left) >= 1);
+			ret = (atomic_read(&zinfo->active_zones_left) >= (1 + reserved));
 			break;
 		case BTRFS_BLOCK_GROUP_DUP:
-			ret = (atomic_read(&zinfo->active_zones_left) >= 2);
+			ret = (atomic_read(&zinfo->active_zones_left) >= (2 + reserved));
 			break;
 		}
 		if (ret)
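The reservation counts described in the commit message are easy to state as a function: 2 active zones when both metadata and system are SINGLE, 4 once any DUP profile is present (the precise mixed case would be 3, but the patch deliberately rounds up). The following standalone sketch, with a hypothetical helper name, just encodes that arithmetic.

/* Standalone sketch of the reservation arithmetic; hypothetical helper name. */
#include <stdbool.h>
#include <stdio.h>

/* Active zones kept back for one metadata and one system block group. */
static unsigned int reserved_active_zones(bool meta_dup, bool sys_dup)
{
	if (!meta_dup && !sys_dup)
		return 2;	/* SINGLE + SINGLE: one zone each */
	/* DUP needs two zones per block group; the mixed case would need only 3,
	 * but tracking which profile was DUP costs extra state, so reserve 4. */
	return 4;
}

int main(void)
{
	printf("SINGLE/SINGLE: %u\n", reserved_active_zones(false, false));	/* 2 */
	printf("DUP/DUP:       %u\n", reserved_active_zones(true, true));	/* 4 */
	printf("mixed:         %u\n", reserved_active_zones(true, false));	/* 4 */
	return 0;
}

Note that the reservation is only enforced against data block groups: metadata and system activation ignore it, so the zones held back are exactly the ones they will later consume.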
From patchwork Mon Jul 31 17:17:16 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335332
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 07/10] btrfs: zoned: activate metadata block group on write time
Date: Tue, 1 Aug 2023 02:17:16 +0900
Message-ID: <15f3fcb8dee1563c78354c7aee64e3af19a6eb93.1690823282.git.naohiro.aota@wdc.com>

In the current implementation, block groups are activated at reservation
time to ensure that all reserved bytes can be written to an active metadata
block group. However, this approach has proven to be less efficient, as it
activates block groups more frequently than necessary, putting pressure on
the active zone resource and leading to potential issues such as early
ENOSPC or hung_task.

Another drawback of the current method is that it hampers metadata
over-commit, and necessitates additional flush operations and block group
allocations, resulting in decreased overall performance.

To address these issues, this commit introduces write-time activation of
metadata and system block groups. This involves reserving at least one
active block group specifically for a metadata and a system block group.

Since metadata write-out is always allocated sequentially, when we need to
write to a non-active block group, we can wait for the ongoing IOs to
complete, activate a new block group, and then proceed with writing to the
new block group.

Fixes: b09315139136 ("btrfs: zoned: activate metadata block group on flush_space")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota
---
 fs/btrfs/block-group.c | 11 ++++++
 fs/btrfs/fs.h          |  3 ++
 fs/btrfs/zoned.c       | 83 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index a127865f49f9..b0e432c30e1d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -4287,6 +4287,17 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info)
 	struct btrfs_caching_control *caching_ctl;
 	struct rb_node *n;
 
+	if (btrfs_is_zoned(info)) {
+		if (info->active_meta_bg) {
+			btrfs_put_block_group(info->active_meta_bg);
+			info->active_meta_bg = NULL;
+		}
+		if (info->active_system_bg) {
+			btrfs_put_block_group(info->active_system_bg);
+			info->active_system_bg = NULL;
+		}
+	}
+
 	write_lock(&info->block_group_cache_lock);
 	while (!list_empty(&info->caching_block_groups)) {
 		caching_ctl = list_entry(info->caching_block_groups.next,
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 2ce391959b6a..bcb43ba55ef6 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -770,6 +770,9 @@ struct btrfs_fs_info {
 	u64 data_reloc_bg;
 	struct mutex zoned_data_reloc_io_lock;
 
+	struct btrfs_block_group *active_meta_bg;
+	struct btrfs_block_group *active_system_bg;
+
 	u64 nr_global_roots;
 
 	spinlock_t zone_active_bgs_lock;
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 9dbcd747ee74..91eca8b48715 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -65,6 +65,9 @@
 
 #define SUPER_INFO_SECTORS	((u64)BTRFS_SUPER_INFO_SIZE >> SECTOR_SHIFT)
 
+static void wait_eb_writebacks(struct btrfs_block_group *block_group);
+static int do_zone_finish(struct btrfs_block_group *block_group, bool fully_written);
+
 static inline bool sb_zone_is_full(const struct blk_zone *zone)
 {
 	return (zone->cond == BLK_ZONE_COND_FULL) ||
@@ -1769,6 +1772,64 @@ void btrfs_finish_ordered_zoned(struct btrfs_ordered_extent *ordered)
 	}
 }
 
+static bool check_bg_is_active(struct btrfs_eb_write_context *ctx,
+			       struct btrfs_block_group **active_bg)
+{
+	const struct writeback_control *wbc = ctx->wbc;
+	struct btrfs_block_group *block_group = ctx->block_group;
+	struct btrfs_fs_info *fs_info = block_group->fs_info;
+
+	if (test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags))
+		return true;
+
+	if (fs_info->treelog_bg == block_group->start) {
+		if (!btrfs_zone_activate(block_group)) {
+			int ret_fin = btrfs_zone_finish_one_bg(fs_info);
+
+			if (ret_fin != 1 || !btrfs_zone_activate(block_group))
+				return false;
+		}
+	} else if (*active_bg != block_group) {
+		struct btrfs_block_group *tgt = *active_bg;
+
+		/*
+		 * zoned_meta_io_lock protects fs_info->active_{meta,system}_bg.
+		 */
+		lockdep_assert_held(&fs_info->zoned_meta_io_lock);
+
+		if (tgt) {
+			/*
+			 * If there is an unsent IO left in the allocated area,
+			 * we cannot wait for them as it may cause a deadlock.
+			 */
+			if (tgt->meta_write_pointer < tgt->start + tgt->alloc_offset) {
+				if (wbc->sync_mode == WB_SYNC_NONE ||
+				    (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync))
+					return false;
+			}
+
+			/* Pivot active metadata/system block group. */
+			btrfs_zoned_meta_io_unlock(fs_info);
+			wait_eb_writebacks(tgt);
+			do_zone_finish(tgt, true);
+			btrfs_zoned_meta_io_lock(fs_info);
+			if (*active_bg == tgt) {
+				btrfs_put_block_group(tgt);
+				*active_bg = NULL;
+			}
+		}
+		if (!btrfs_zone_activate(block_group))
+			return false;
+		if (*active_bg != block_group) {
+			ASSERT(*active_bg == NULL);
+			*active_bg = block_group;
+			btrfs_get_block_group(block_group);
+		}
+	}
+
+	return true;
+}
+
 /*
  * Check @ctx->eb is aligned to the write pointer
  *
@@ -1803,8 +1864,26 @@ int btrfs_check_meta_write_pointer(struct btrfs_fs_info *fs_info,
 		ctx->block_group = block_group;
 	}
 
-	if (block_group->meta_write_pointer == eb->start)
-		return 0;
+	if (block_group->meta_write_pointer == eb->start) {
+		struct btrfs_block_group **tgt;
+
+		if (!test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags))
+			return 0;
+
+		if (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)
+			tgt = &fs_info->active_system_bg;
+		else
+			tgt = &fs_info->active_meta_bg;
+		if (check_bg_is_active(ctx, tgt))
+			return 0;
+	}
+
+	/*
+	 * Since we may release fs_info->zoned_meta_io_lock, someone can already
+	 * start writing this eb. In that case, we can just bail out.
+	 */
+	if (block_group->meta_write_pointer > eb->start)
+		return -EBUSY;
 
 	/* If for_sync, this hole will be filled with trasnsaction commit. */
 	if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
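The write-time activation flow in this patch boils down to: if the block group we are about to write is not active, finish the previously cached active metadata/system block group (after waiting for its writeback) and activate the new one, keeping a reference in fs_info. The standalone sketch below models only that pivot with a toy active-zone counter; the names (pivot_active, finish, activate) are hypothetical, and the real code additionally drops and retakes zoned_meta_io_lock around the wait.

/* Standalone sketch; hypothetical names and a toy active-zone counter. */
#include <stdbool.h>
#include <stdio.h>

struct group { int id; bool active; };

static int active_zones_left = 1;	/* assume the device allows one more active zone */

static bool activate(struct group *g)
{
	if (g->active)
		return true;
	if (active_zones_left == 0)
		return false;
	active_zones_left--;
	g->active = true;
	return true;
}

static void finish(struct group *g)
{
	if (!g->active)
		return;
	g->active = false;
	active_zones_left++;		/* finishing releases the active zone */
}

/* Switch the cached active group over to @next before writing to it. */
static bool pivot_active(struct group **cached, struct group *next)
{
	if (*cached == next)
		return activate(next);
	if (*cached)
		finish(*cached);	/* the real path first waits for its writeback */
	if (!activate(next))
		return false;
	*cached = next;
	return true;
}

int main(void)
{
	struct group a = { 1, false }, b = { 2, false };
	struct group *active_meta_bg = NULL;

	printf("write to A: %s\n", pivot_active(&active_meta_bg, &a) ? "ok" : "busy");
	printf("write to B: %s\n", pivot_active(&active_meta_bg, &b) ? "ok" : "busy");
	return 0;
}

Because metadata allocation is sequential, only one metadata and one system block group ever need to be active for write-out, which is exactly what the reservation from patch 06 guarantees.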
From patchwork Mon Jul 31 17:17:17 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335334
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 08/10] btrfs: zoned: no longer count fresh BG region as zone unusable
Date: Tue, 1 Aug 2023 02:17:17 +0900
Message-ID: <5ae5510f8616620c037eff05e3a15df6f401c486.1690823282.git.naohiro.aota@wdc.com>

Now that we have switched to write-time activation, we no longer need to
(and must not) count the fresh region as zone unusable. This commit is
similar to reverting commit fc22cf8eba79 ("btrfs: zoned: count fresh BG
region as zone unusable").

Signed-off-by: Naohiro Aota
---
 fs/btrfs/free-space-cache.c |  8 +-------
 fs/btrfs/zoned.c            | 26 +++-----------------------
 2 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index cd5bfda2c259..27fad70451aa 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2704,13 +2704,8 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
 	bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
 
 	spin_lock(&ctl->tree_lock);
-	/* Count initial region as zone_unusable until it gets activated. */
 	if (!used)
 		to_free = size;
-	else if (initial &&
-		 test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) &&
-		 (block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM)))
-		to_free = 0;
 	else if (initial)
 		to_free = block_group->zone_capacity;
 	else if (offset >= block_group->alloc_offset)
@@ -2738,8 +2733,7 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
 	reclaimable_unusable = block_group->zone_unusable -
 			       (block_group->length - block_group->zone_capacity);
 	/* All the region is now unusable. Mark it as unused and reclaim */
-	if (block_group->zone_unusable == block_group->length &&
-	    block_group->alloc_offset) {
+	if (block_group->zone_unusable == block_group->length) {
 		btrfs_mark_bg_unused(block_group);
 	} else if (bg_reclaim_threshold &&
 		   reclaimable_unusable >=
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 91eca8b48715..8c2b88be1480 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1608,19 +1608,9 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache)
 		return;
 
 	WARN_ON(cache->bytes_super != 0);
-
-	/* Check for block groups never get activated */
-	if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) &&
-	    cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) &&
-	    !test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) &&
-	    cache->alloc_offset == 0) {
-		unusable = cache->length;
-		free = 0;
-	} else {
-		unusable = (cache->alloc_offset - cache->used) +
-			   (cache->length - cache->zone_capacity);
-		free = cache->zone_capacity - cache->alloc_offset;
-	}
+	unusable = (cache->alloc_offset - cache->used) +
+		   (cache->length - cache->zone_capacity);
+	free = cache->zone_capacity - cache->alloc_offset;
 
 	/* We only need ->free_space in ALLOC_SEQ block groups */
 	cache->cached = BTRFS_CACHE_FINISHED;
@@ -1986,7 +1976,6 @@ int btrfs_sync_zone_write_pointer(struct btrfs_device *tgt_dev, u64 logical,
 bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 {
 	struct btrfs_fs_info *fs_info = block_group->fs_info;
-	struct btrfs_space_info *space_info = block_group->space_info;
 	struct map_lookup *map;
 	struct btrfs_device *device;
 	const unsigned int reserved = (block_group->flags & BTRFS_BLOCK_GROUP_DATA) ?
@@ -2000,7 +1989,6 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 
 	map = block_group->physical_map;
 
-	spin_lock(&space_info->lock);
 	spin_lock(&block_group->lock);
 	if (test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags)) {
 		ret = true;
@@ -2038,14 +2026,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 
 	/* Successfully activated all the zones */
 	set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags);
-	WARN_ON(block_group->alloc_offset != 0);
-	if (block_group->zone_unusable == block_group->length) {
-		block_group->zone_unusable = block_group->length - block_group->zone_capacity;
-		space_info->bytes_zone_unusable -= block_group->zone_capacity;
-	}
 	spin_unlock(&block_group->lock);
-	btrfs_try_granting_tickets(fs_info, space_info);
-	spin_unlock(&space_info->lock);
 
 	/* For the active block group list */
 	btrfs_get_block_group(block_group);
@@ -2058,7 +2039,6 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
 
 out_unlock:
 	spin_unlock(&block_group->lock);
-	spin_unlock(&space_info->lock);
 
 	return ret;
 }
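With write-time activation, a fresh block group's accounting becomes unconditional: only the gap between zone size and zone capacity counts as unusable, and everything up to the capacity is free. The short standalone sketch below just evaluates the two formulas the patch keeps, for an assumed 256 MiB zone with 192 MiB of capacity; the numbers are illustrative, not tied to any particular device.

/* Standalone sketch: evaluates the two formulas for an assumed zone geometry. */
#include <stdio.h>

int main(void)
{
	unsigned long long length = 256ULL << 20;		/* zone size */
	unsigned long long zone_capacity = 192ULL << 20;	/* usable capacity */
	unsigned long long alloc_offset = 0, used = 0;		/* fresh block group */

	unsigned long long unusable = (alloc_offset - used) +
				      (length - zone_capacity);
	unsigned long long free_space = zone_capacity - alloc_offset;

	printf("unusable = %llu MiB, free = %llu MiB\n",
	       unusable >> 20, free_space >> 20);
	return 0;
}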
From patchwork Mon Jul 31 17:17:18 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335336
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 09/10] btrfs: zoned: don't activate non-DATA BG on allocation
Date: Tue, 1 Aug 2023 02:17:18 +0900
Message-ID: <65989fd4940f6c936237f491fbebe9311ff8d1f4.1690823282.git.naohiro.aota@wdc.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Now that a non-DATA block group is activated at write time, there is no
need to activate it at allocation time anymore.

Signed-off-by: Naohiro Aota
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/block-group.c |  2 +-
 fs/btrfs/extent-tree.c |  8 +++++++-
 fs/btrfs/space-info.c  | 28 ----------------------------
 3 files changed, 8 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index b0e432c30e1d..0cb1dee965a0 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -4089,7 +4089,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
 
         if (IS_ERR(ret_bg)) {
                 ret = PTR_ERR(ret_bg);
-        } else if (from_extent_allocation) {
+        } else if (from_extent_allocation && (flags & BTRFS_BLOCK_GROUP_DATA)) {
                 /*
                  * New block group is likely to be used soon. Try to activate
                  * it now. Failure is OK for now.
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 12bd8dc37385..92eccb0cd487 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3690,7 +3690,9 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
         }
         spin_unlock(&block_group->lock);
 
-        if (!ret && !btrfs_zone_activate(block_group)) {
+        /* Metadata block group is activated on write time. */
+        if (!ret && (block_group->flags & BTRFS_BLOCK_GROUP_DATA) &&
+            !btrfs_zone_activate(block_group)) {
                 ret = 1;
                 /*
                  * May need to clear fs_info->{treelog,data_reloc}_bg.
@@ -3870,6 +3872,10 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
 static int can_allocate_chunk_zoned(struct btrfs_fs_info *fs_info,
                                     struct find_free_extent_ctl *ffe_ctl)
 {
+        /* Block group's activeness is not a requirement for METADATA block groups. */
+        if (!(ffe_ctl->flags & BTRFS_BLOCK_GROUP_DATA))
+                return 0;
+
         /* If we can activate new zone, just allocate a chunk and use it */
         if (btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
                 return 0;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 17c86db7b1b1..356638f54fef 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -761,18 +761,6 @@ static void flush_space(struct btrfs_fs_info *fs_info,
                 break;
         case ALLOC_CHUNK:
         case ALLOC_CHUNK_FORCE:
-                /*
-                 * For metadata space on zoned filesystem, reaching here means we
-                 * don't have enough space left in active_total_bytes. Try to
-                 * activate a block group first, because we may have inactive
-                 * block group already allocated.
-                 */
-                ret = btrfs_zoned_activate_one_bg(fs_info, space_info, false);
-                if (ret < 0)
-                        break;
-                else if (ret == 1)
-                        break;
-
                 trans = btrfs_join_transaction(root);
                 if (IS_ERR(trans)) {
                         ret = PTR_ERR(trans);
@@ -784,22 +772,6 @@ static void flush_space(struct btrfs_fs_info *fs_info,
                                         CHUNK_ALLOC_FORCE);
                 btrfs_end_transaction(trans);
 
-                /*
-                 * For metadata space on zoned filesystem, allocating a new chunk
-                 * is not enough. We still need to activate the block * group.
-                 * Active the newly allocated block group by (maybe) finishing
-                 * a block group.
-                 */
-                if (ret == 1) {
-                        ret = btrfs_zoned_activate_one_bg(fs_info, space_info, true);
-                        /*
-                         * Revert to the original ret regardless we could finish
-                         * one block group or not.
-                         */
-                        if (ret >= 0)
-                                ret = 1;
-                }
-
                 if (ret > 0 || ret == -ENOSPC)
                         ret = 0;
                 break;
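As an illustration of the allocation-time policy after the above patch, here
is a small standalone C sketch, not btrfs code: the SKETCH_BG_* flags and
sketch_zone_activate() are made-up stand-ins. Only DATA block groups are
activated when an extent is allocated; METADATA/SYSTEM block groups are left
for write-time activation.

/*
 * Standalone sketch (not btrfs code) of the allocation-time decision:
 * only DATA block groups are activated during extent allocation.
 * Flag values and the activation stub are illustrative assumptions.
 */
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_BG_DATA     (1ULL << 0)
#define SKETCH_BG_METADATA (1ULL << 1)
#define SKETCH_BG_SYSTEM   (1ULL << 2)

/* Stand-in for btrfs_zone_activate(); pretend activation always succeeds. */
static bool sketch_zone_activate(void)
{
        return true;
}

/* Returns true when the block group should be activated right now. */
static bool activate_on_allocation(unsigned long long bg_flags)
{
        if (!(bg_flags & SKETCH_BG_DATA))
                return false;   /* metadata/system: activated at write time */
        return sketch_zone_activate();
}

int main(void)
{
        printf("DATA: %d\n", activate_on_allocation(SKETCH_BG_DATA));
        printf("METADATA: %d\n", activate_on_allocation(SKETCH_BG_METADATA));
        return 0;
}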
From patchwork Mon Jul 31 17:17:19 2023
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 13335335
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: hch@infradead.org, josef@toxicpanda.com, dsterba@suse.cz, Naohiro Aota
Subject: [PATCH v2 10/10] btrfs: zoned: re-enable metadata over-commit for zoned mode
Date: Tue, 1 Aug 2023 02:17:19 +0900
X-Mailing-List: linux-btrfs@vger.kernel.org

Now we can re-enable metadata over-commit. As we moved the activation from
reservation time to write time, we no longer need to ensure that all the
reserved bytes are properly activated.

Without metadata over-commit, performance suffers because delalloc items
need to be flushed more often and more block groups need to be allocated.
Re-enabling metadata over-commit solves the issue.

Fixes: 79417d040f4f ("btrfs: zoned: disable metadata overcommit for zoned")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/space-info.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 356638f54fef..d7e8cd4f140c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -389,11 +389,7 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
                 return 0;
 
         used = btrfs_space_info_used(space_info, true);
-        if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags) &&
-            (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
-                avail = 0;
-        else
-                avail = calc_available_free_space(fs_info, space_info, flush);
+        avail = calc_available_free_space(fs_info, space_info, flush);
 
         if (used + bytes < space_info->total_bytes + avail)
                 return 1;
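As an illustration of the over-commit rule this patch restores, here is a
small standalone C sketch, not btrfs code, with made-up numbers: a metadata
reservation is accepted as long as used + bytes < total_bytes + avail, where
avail again comes from the estimated allocatable free space even on zoned
filesystems.

/*
 * Standalone sketch (not btrfs code) of the over-commit comparison used
 * by btrfs_can_overcommit() after the patch.  The example sizes below
 * are arbitrary assumptions.
 */
#include <stdbool.h>
#include <stdio.h>

static bool can_overcommit(unsigned long long used,
                           unsigned long long bytes,
                           unsigned long long total_bytes,
                           unsigned long long avail)
{
        /* Allow the reservation while it still fits in total + estimated avail. */
        return used + bytes < total_bytes + avail;
}

int main(void)
{
        unsigned long long used  = 900ULL << 20;  /* 900MiB already reserved/used */
        unsigned long long total = 1024ULL << 20; /* 1GiB of metadata space */
        unsigned long long avail = 512ULL << 20;  /* estimated allocatable space */

        /* A 200MiB reservation over-commits the 1GiB but is still accepted. */
        printf("reserve 200MiB: %d\n", can_overcommit(used, 200ULL << 20, total, avail));
        /* A 700MiB reservation exceeds even the over-commit budget. */
        printf("reserve 700MiB: %d\n", can_overcommit(used, 700ULL << 20, total, avail));
        return 0;
}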