From patchwork Thu Nov 14 08:04:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 13874702 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E2AF163 for ; Thu, 14 Nov 2024 08:04:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.71.153.141 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571486; cv=none; b=DeQwXBbK5k6i0WaYu0Nrw/gLxTZbSkbfI9dZ3QA72LOJWz0/NZ8RhFb0bq+197uz08wpEs7BDHczQvaH4NdwvW6qlPI/whWw3/HVRrldvEA5i+Xfj8xYm/RRjXSZ5prpU+PjacHhQPKA6yiDHwNGYWOVvIcsDYX9nZhMmqx6jMk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571486; c=relaxed/simple; bh=8Ouzb4E8uaeyT8WcHy1CYAR5KmZHhOFOlwP8qUO+sO8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AHKmgOYOApwlA9xJyTOtetwhYzw2A2bqiGRzSqt0bkzRCcXtyWB56ExbBHa4nTTwKapQLMl7es8anFmasi53LpmtQSF4ls38ogUVs8YO2PpM3IZZJEx3RpPWOyOVmZ8SzpYay6FFPdN2WuCh+nR9EozAgbm7ftBoAX1eIy0ywy0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com; spf=pass smtp.mailfrom=wdc.com; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b=MH3sgjUs; arc=none smtp.client-ip=216.71.153.141 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=wdc.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="MH3sgjUs" DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1731571485; x=1763107485; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8Ouzb4E8uaeyT8WcHy1CYAR5KmZHhOFOlwP8qUO+sO8=; b=MH3sgjUsXTtkfk9ERmWcG//z7zVpfsQBIvRGuRcjQvQMbRMqZl00ZfIv jnr13Gj2CI3NFZRhb4SLPwHzs8jkrhyrIQv8mbwU6I4N4x860uwbQwiu4 BCrysrDhhErc37wfXTi9abcmCSr+Sjs3JeGozsBcUz1LtMD4lnbDaF4gL DwfjclpKm1/DbH5qJ57El3+J5+4VQsavQbHXUzhViKbdUqXbymbgfGt1s kcqw6BYcFjdoDQIhbjlAQb7qbcgG7hVirgItyzXbmNqWUcKhceFhzm19Y BCDGxf5c4U6tguExRB2eaIDBoHrdGeKfs5Mrq6Kr8aasw1YpbAQYD5nXf A==; X-CSE-ConnectionGUID: OD8Od+lMQw+NyJBdVvAWQg== X-CSE-MsgGUID: gx1ChyehRNSQe1/FCNAHTA== X-IronPort-AV: E=Sophos;i="6.12,153,1728921600"; d="scan'208";a="31543757" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 14 Nov 2024 16:04:43 +0800 IronPort-SDR: 6735a1d9_bDNtDp3+RqbGwypPcQ/pf1fAIxqspIXQevKZNgaj4oUnc7T N+zAFbfOUvn3QlQ/8fxA8r4CZiI1/Hdlefw6wtw== Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 13 Nov 2024 23:08:10 -0800 WDCIronportException: Internal Received: from unknown (HELO naota-xeon.wdc.com) ([10.225.163.24]) by uls-op-cesaip01.wdc.com with ESMTP; 14 Nov 2024 00:04:42 -0800 From: Naohiro Aota To: linux-btrfs@vger.kernel.org Cc: Naohiro Aota , Johannes Thumshirn Subject: [PATCH v2 1/3] btrfs: introduce btrfs_return_free_space() Date: Thu, 14 Nov 2024 17:04:27 +0900 Message-ID: <8848f10974bbbf4dc2618c606ee71ee0142b2bbe.1731571240.git.naohiro.aota@wdc.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This commit factors out a part of unpin_extent_range() into a function for the next commit. Also, move the "len" variable into the loop to clarify we don't need to carry it beyond an iteration. Signed-off-by: Naohiro Aota Reviewed-by: Johannes Thumshirn --- fs/btrfs/extent-tree.c | 25 ++++--------------------- fs/btrfs/space-info.c | 29 +++++++++++++++++++++++++++++ fs/btrfs/space-info.h | 1 + 3 files changed, 34 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 412e318e4a22..ce7c963dd0a6 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2724,15 +2724,15 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info, { struct btrfs_block_group *cache = NULL; struct btrfs_space_info *space_info; - struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv; struct btrfs_free_cluster *cluster = NULL; - u64 len; u64 total_unpinned = 0; u64 empty_cluster = 0; bool readonly; int ret = 0; while (start <= end) { + u64 len; + readonly = false; if (!cache || start >= cache->start + cache->length) { @@ -2790,25 +2790,8 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info, readonly = true; } spin_unlock(&cache->lock); - if (!readonly && return_free_space && - global_rsv->space_info == space_info) { - spin_lock(&global_rsv->lock); - if (!global_rsv->full) { - u64 to_add = min(len, global_rsv->size - - global_rsv->reserved); - - global_rsv->reserved += to_add; - btrfs_space_info_update_bytes_may_use(fs_info, - space_info, to_add); - if (global_rsv->reserved >= global_rsv->size) - global_rsv->full = 1; - len -= to_add; - } - spin_unlock(&global_rsv->lock); - } - /* Add to any tickets we may have */ - if (!readonly && return_free_space && len) - btrfs_try_granting_tickets(fs_info, space_info); + if (!readonly && return_free_space) + btrfs_return_free_space(space_info, len); spin_unlock(&space_info->lock); } diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 255e85f78313..93818339a9a9 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -2082,3 +2082,32 @@ void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info) do_reclaim_sweep(space_info, raid); } } + +void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len) +{ + struct btrfs_fs_info *fs_info = space_info->fs_info; + struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv; + + lockdep_assert_held(&space_info->lock); + + /* Prioritize the global reservation to receive the freed space. */ + if (global_rsv->space_info != space_info) + goto grant; + + spin_lock(&global_rsv->lock); + if (!global_rsv->full) { + u64 to_add = min(len, global_rsv->size - global_rsv->reserved); + + global_rsv->reserved += to_add; + btrfs_space_info_update_bytes_may_use(fs_info, space_info, to_add); + if (global_rsv->reserved >= global_rsv->size) + global_rsv->full = 1; + len -= to_add; + } + spin_unlock(&global_rsv->lock); + +grant: + /* Add to any tickets we may have */ + if (len) + btrfs_try_granting_tickets(fs_info, space_info); +} diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h index efbecc0c5258..4c9e8aabee51 100644 --- a/fs/btrfs/space-info.h +++ b/fs/btrfs/space-info.h @@ -295,5 +295,6 @@ void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool bool btrfs_should_periodic_reclaim(struct btrfs_space_info *space_info); int btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info); void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info); +void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len); #endif /* BTRFS_SPACE_INFO_H */ From patchwork Thu Nov 14 08:04:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 13874703 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B551E1F5858 for ; Thu, 14 Nov 2024 08:04:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.71.153.141 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571486; cv=none; b=NeMkHM1pA1LvrDb+RfPiUXLhsP8P5ieVCAahnbbOJ1GwhPXSn+ZUiZWHjkVBcQfvT49ND29iQfp8EE1FwpDF1vyLUxICRuy4NstX5ymRhweAMKh9EBfrG7DwH0/qWSVvFAIcUo/w5iSdIFzFzSoFyIgHpSZsqbPly2fty2TDt14= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571486; c=relaxed/simple; bh=Raqc7b08m3fuDKYZISOjIAumeDI39ruqTKEy+n1to7M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OWBLHJEZNIOZFYSABxadWPyXeaEI5HUYuxlyUd0hN/bOP0jeRXhKGgXbWHeHSRtH8xS4AZSknZRcWMYvd7l9hltfeAWxzJ2mRd0+GHzHLcTDBK9nbWYU1q6VQivgFLmBABxkrZ2N8n/OGOGwovlYfaMv33/TyQZ8XLtaINmthxw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com; spf=pass smtp.mailfrom=wdc.com; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b=XandHI7I; arc=none smtp.client-ip=216.71.153.141 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=wdc.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="XandHI7I" DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1731571485; x=1763107485; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Raqc7b08m3fuDKYZISOjIAumeDI39ruqTKEy+n1to7M=; b=XandHI7I0N7fwSs5bE0fBCZ/E7P4lxob8XXOAqIG3hL/+nn4/yY1x5A9 kIcQ3GFp/TXdmsCWyJYYPhkVv9wHUdapze14nEkkA7vCsujE3HC4NdYMP 9EJ0ZGMJzgGvmxiKezfzfhUOOxEmFUrBzD7hTePoWixI1i+g7081nSYC9 +H9ilUB9fFA6L9BSre9ZQTwFTSwDvjanKhuUBAJGfij2gQwkQ5JlUFry+ rsOYEI7BSoUCQXk+zF3fJk+RW1bBtmPijVtkw1WieVGT6R8CRuQH97YLN 3ucnkPEcFJHl1w+CO40OG8N7Fp+uPXeAcgJ9oZJL3etUiH78+1wzX3QPN Q==; X-CSE-ConnectionGUID: kHzXkApMQm2uIdV/qdL0+Q== X-CSE-MsgGUID: SccUFpd6RF2qF4x6beFi4w== X-IronPort-AV: E=Sophos;i="6.12,153,1728921600"; d="scan'208";a="31543761" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 14 Nov 2024 16:04:44 +0800 IronPort-SDR: 6735a1da_Dv+17eXXVy1qHJU/49X5Zl8oBqVy9nYocC3PUUadlKMXv4f KimiVWz5xzSSRCaegglkx11b36pJXADhRLtcV5A== Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 13 Nov 2024 23:08:11 -0800 WDCIronportException: Internal Received: from unknown (HELO naota-xeon.wdc.com) ([10.225.163.24]) by uls-op-cesaip01.wdc.com with ESMTP; 14 Nov 2024 00:04:42 -0800 From: Naohiro Aota To: linux-btrfs@vger.kernel.org Cc: Naohiro Aota , Johannes Thumshirn Subject: [PATCH v2 2/3] btrfs: drop fs_info argument from btrfs_update_space_info_* Date: Thu, 14 Nov 2024 17:04:28 +0900 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Since commit e1e577aafe41 ("btrfs: store fs_info in space_info"), we have the fs_info in a space_info. So, we can drop fs_info argument from btrfs_update_space_info_*. There is no behavior change. Signed-off-by: Naohiro Aota Reviewed-by: Johannes Thumshirn --- fs/btrfs/block-group.c | 16 ++++++---------- fs/btrfs/block-rsv.c | 10 +++------- fs/btrfs/delalloc-space.c | 2 +- fs/btrfs/delayed-ref.c | 5 ++--- fs/btrfs/extent-tree.c | 10 +++------- fs/btrfs/inode.c | 2 +- fs/btrfs/space-info.c | 14 +++++--------- fs/btrfs/space-info.h | 9 ++++----- fs/btrfs/transaction.c | 3 +-- 9 files changed, 26 insertions(+), 45 deletions(-) diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 4427c1b835e8..5be029734cfa 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -1223,7 +1223,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, block_group->space_info->total_bytes -= block_group->length; block_group->space_info->bytes_readonly -= (block_group->length - block_group->zone_unusable); - btrfs_space_info_update_bytes_zone_unusable(fs_info, block_group->space_info, + btrfs_space_info_update_bytes_zone_unusable(block_group->space_info, -block_group->zone_unusable); block_group->space_info->disk_total -= block_group->length * factor; @@ -1396,8 +1396,7 @@ static int inc_block_group_ro(struct btrfs_block_group *cache, int force) if (btrfs_is_zoned(cache->fs_info)) { /* Migrate zone_unusable bytes to readonly */ sinfo->bytes_readonly += cache->zone_unusable; - btrfs_space_info_update_bytes_zone_unusable(cache->fs_info, sinfo, - -cache->zone_unusable); + btrfs_space_info_update_bytes_zone_unusable(sinfo, -cache->zone_unusable); cache->zone_unusable = 0; } cache->ro++; @@ -1645,8 +1644,7 @@ void btrfs_delete_unused_bgs(struct btrfs_fs_info *fs_info) spin_lock(&space_info->lock); spin_lock(&block_group->lock); - btrfs_space_info_update_bytes_pinned(fs_info, space_info, - -block_group->pinned); + btrfs_space_info_update_bytes_pinned(space_info, -block_group->pinned); space_info->bytes_readonly += block_group->pinned; block_group->pinned = 0; @@ -3060,8 +3058,7 @@ void btrfs_dec_block_group_ro(struct btrfs_block_group *cache) (cache->alloc_offset - cache->used - cache->pinned - cache->reserved) + (cache->length - cache->zone_capacity); - btrfs_space_info_update_bytes_zone_unusable(cache->fs_info, sinfo, - cache->zone_unusable); + btrfs_space_info_update_bytes_zone_unusable(sinfo, cache->zone_unusable); sinfo->bytes_readonly -= cache->zone_unusable; } num_bytes = cache->length - cache->reserved - @@ -3699,7 +3696,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, old_val -= num_bytes; cache->used = old_val; cache->pinned += num_bytes; - btrfs_space_info_update_bytes_pinned(info, space_info, num_bytes); + btrfs_space_info_update_bytes_pinned(space_info, num_bytes); space_info->bytes_used -= num_bytes; space_info->disk_used -= num_bytes * factor; if (READ_ONCE(space_info->periodic_reclaim)) @@ -3781,8 +3778,7 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache, space_info->bytes_reserved += num_bytes; trace_btrfs_space_reservation(cache->fs_info, "space_info", space_info->flags, num_bytes, 1); - btrfs_space_info_update_bytes_may_use(cache->fs_info, - space_info, -ram_bytes); + btrfs_space_info_update_bytes_may_use(space_info, -ram_bytes); if (delalloc) cache->delalloc_bytes += num_bytes; diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c index a07b9594dc70..3f3608299c0b 100644 --- a/fs/btrfs/block-rsv.c +++ b/fs/btrfs/block-rsv.c @@ -150,9 +150,7 @@ static u64 block_rsv_release_bytes(struct btrfs_fs_info *fs_info, spin_unlock(&dest->lock); } if (num_bytes) - btrfs_space_info_free_bytes_may_use(fs_info, - space_info, - num_bytes); + btrfs_space_info_free_bytes_may_use(space_info, num_bytes); } if (qgroup_to_release_ret) *qgroup_to_release_ret = qgroup_to_release; @@ -383,13 +381,11 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info) if (block_rsv->reserved < block_rsv->size) { num_bytes = block_rsv->size - block_rsv->reserved; - btrfs_space_info_update_bytes_may_use(fs_info, sinfo, - num_bytes); + btrfs_space_info_update_bytes_may_use(sinfo, num_bytes); block_rsv->reserved = block_rsv->size; } else if (block_rsv->reserved > block_rsv->size) { num_bytes = block_rsv->reserved - block_rsv->size; - btrfs_space_info_update_bytes_may_use(fs_info, sinfo, - -num_bytes); + btrfs_space_info_update_bytes_may_use(sinfo, -num_bytes); block_rsv->reserved = block_rsv->size; btrfs_try_granting_tickets(fs_info, sinfo); } diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c index 7aa8a395d838..88e900e5a43d 100644 --- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -176,7 +176,7 @@ void btrfs_free_reserved_data_space_noquota(struct btrfs_fs_info *fs_info, ASSERT(IS_ALIGNED(len, fs_info->sectorsize)); data_sinfo = fs_info->data_sinfo; - btrfs_space_info_free_bytes_may_use(fs_info, data_sinfo, len); + btrfs_space_info_free_bytes_may_use(data_sinfo, len); } /* diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 4d2ad5b66928..3e2b1121c4ff 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -254,7 +254,7 @@ int btrfs_delayed_refs_rsv_refill(struct btrfs_fs_info *fs_info, spin_unlock(&block_rsv->lock); if (to_free > 0) - btrfs_space_info_free_bytes_may_use(fs_info, space_info, to_free); + btrfs_space_info_free_bytes_may_use(space_info, to_free); if (refilled_bytes > 0) trace_btrfs_space_reservation(fs_info, "delayed_refs_rsv", 0, @@ -1281,8 +1281,7 @@ void btrfs_destroy_delayed_refs(struct btrfs_transaction *trans) spin_lock(&bg->space_info->lock); spin_lock(&bg->lock); bg->pinned += head->num_bytes; - btrfs_space_info_update_bytes_pinned(fs_info, - bg->space_info, + btrfs_space_info_update_bytes_pinned(bg->space_info, head->num_bytes); bg->reserved -= head->num_bytes; bg->space_info->bytes_reserved -= head->num_bytes; diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index ce7c963dd0a6..9bbaa2b943e9 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2571,13 +2571,10 @@ static int pin_down_extent(struct btrfs_trans_handle *trans, struct btrfs_block_group *cache, u64 bytenr, u64 num_bytes, int reserved) { - struct btrfs_fs_info *fs_info = cache->fs_info; - spin_lock(&cache->space_info->lock); spin_lock(&cache->lock); cache->pinned += num_bytes; - btrfs_space_info_update_bytes_pinned(fs_info, cache->space_info, - num_bytes); + btrfs_space_info_update_bytes_pinned(cache->space_info, num_bytes); if (reserved) { cache->reserved -= num_bytes; cache->space_info->bytes_reserved -= num_bytes; @@ -2778,15 +2775,14 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info, spin_lock(&space_info->lock); spin_lock(&cache->lock); cache->pinned -= len; - btrfs_space_info_update_bytes_pinned(fs_info, space_info, -len); + btrfs_space_info_update_bytes_pinned(space_info, -len); space_info->max_extent_size = 0; if (cache->ro) { space_info->bytes_readonly += len; readonly = true; } else if (btrfs_is_zoned(fs_info)) { /* Need reset before reusing in a zoned block group */ - btrfs_space_info_update_bytes_zone_unusable(fs_info, space_info, - len); + btrfs_space_info_update_bytes_zone_unusable(space_info, len); readonly = true; } spin_unlock(&cache->lock); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 22b8e2764619..38716ffbca45 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1809,7 +1809,7 @@ static int fallback_to_cow(struct btrfs_inode *inode, bytes = range_bytes; spin_lock(&sinfo->lock); - btrfs_space_info_update_bytes_may_use(fs_info, sinfo, bytes); + btrfs_space_info_update_bytes_may_use(sinfo, bytes); spin_unlock(&sinfo->lock); if (count > 0) diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 93818339a9a9..7b3c514eb6f9 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -316,7 +316,7 @@ void btrfs_add_bg_to_space_info(struct btrfs_fs_info *info, found->bytes_used += block_group->used; found->disk_used += block_group->used * factor; found->bytes_readonly += block_group->bytes_super; - btrfs_space_info_update_bytes_zone_unusable(info, found, block_group->zone_unusable); + btrfs_space_info_update_bytes_zone_unusable(found, block_group->zone_unusable); if (block_group->length > 0) found->full = 0; btrfs_try_granting_tickets(info, found); @@ -489,9 +489,7 @@ void btrfs_try_granting_tickets(struct btrfs_fs_info *fs_info, if ((used + ticket->bytes <= space_info->total_bytes) || btrfs_can_overcommit(fs_info, space_info, ticket->bytes, flush)) { - btrfs_space_info_update_bytes_may_use(fs_info, - space_info, - ticket->bytes); + btrfs_space_info_update_bytes_may_use(space_info, ticket->bytes); remove_ticket(space_info, ticket); ticket->bytes = 0; space_info->tickets_id++; @@ -1690,8 +1688,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info, if (!pending_tickets && ((used + orig_bytes <= space_info->total_bytes) || btrfs_can_overcommit(fs_info, space_info, orig_bytes, flush))) { - btrfs_space_info_update_bytes_may_use(fs_info, space_info, - orig_bytes); + btrfs_space_info_update_bytes_may_use(space_info, orig_bytes); ret = 0; } @@ -1703,8 +1700,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info, if (ret && unlikely(flush == BTRFS_RESERVE_FLUSH_EMERGENCY)) { used = btrfs_space_info_used(space_info, false); if (used + orig_bytes <= space_info->total_bytes) { - btrfs_space_info_update_bytes_may_use(fs_info, space_info, - orig_bytes); + btrfs_space_info_update_bytes_may_use(space_info, orig_bytes); ret = 0; } } @@ -2099,7 +2095,7 @@ void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len) u64 to_add = min(len, global_rsv->size - global_rsv->reserved); global_rsv->reserved += to_add; - btrfs_space_info_update_bytes_may_use(fs_info, space_info, to_add); + btrfs_space_info_update_bytes_may_use(space_info, to_add); if (global_rsv->reserved >= global_rsv->size) global_rsv->full = 1; len -= to_add; diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h index 4c9e8aabee51..69071afc0d47 100644 --- a/fs/btrfs/space-info.h +++ b/fs/btrfs/space-info.h @@ -229,10 +229,10 @@ static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_i */ #define DECLARE_SPACE_INFO_UPDATE(name, trace_name) \ static inline void \ -btrfs_space_info_update_##name(struct btrfs_fs_info *fs_info, \ - struct btrfs_space_info *sinfo, \ +btrfs_space_info_update_##name(struct btrfs_space_info *sinfo, \ s64 bytes) \ { \ + struct btrfs_fs_info *fs_info = sinfo->fs_info; \ const u64 abs_bytes = (bytes < 0) ? -bytes : bytes; \ lockdep_assert_held(&sinfo->lock); \ trace_update_##name(fs_info, sinfo, sinfo->name, bytes); \ @@ -275,13 +275,12 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info, enum btrfs_reserve_flush_enum flush); static inline void btrfs_space_info_free_bytes_may_use( - struct btrfs_fs_info *fs_info, struct btrfs_space_info *space_info, u64 num_bytes) { spin_lock(&space_info->lock); - btrfs_space_info_update_bytes_may_use(fs_info, space_info, -num_bytes); - btrfs_try_granting_tickets(fs_info, space_info); + btrfs_space_info_update_bytes_may_use(space_info, -num_bytes); + btrfs_try_granting_tickets(space_info->fs_info, space_info); spin_unlock(&space_info->lock); } int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes, diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index dc0b837efd5d..15312013f2a3 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -795,8 +795,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items, if (num_bytes) btrfs_block_rsv_release(fs_info, trans_rsv, num_bytes, NULL); if (delayed_refs_bytes) - btrfs_space_info_free_bytes_may_use(fs_info, trans_rsv->space_info, - delayed_refs_bytes); + btrfs_space_info_free_bytes_may_use(trans_rsv->space_info, delayed_refs_bytes); reserve_fail: btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved); return ERR_PTR(ret); From patchwork Thu Nov 14 08:04:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 13874704 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DA7D1F6686 for ; Thu, 14 Nov 2024 08:04:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.71.153.141 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571488; cv=none; b=s4imYc6b6UrKTGYNddnGhzDhOPhg92f+MV0Yctxd3GHXXtJPiJYHu2mwjClUGQFlV3owCaxCdQ3adXQXSSim2GGYQUsiWmGIgABN/vVqhvHF5rfFXfphYK7tTuK1XM8ZVzze69y+8iwgmPcuDYMkRNY4TVTpCq91HoYn2MxsnNU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731571488; c=relaxed/simple; bh=/qKspSoNrijAjosztYEfSEVwQ6+wchTrIuHCfl+DERk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AcCnUX26u/BjoMccHg7BNcx4gsjEKmq8bA9k811dtYnFCliJBTphpyss9452EW/btsqbki89SSoAVwTFrgafoXW3AdjLLxfm5ckC0KwMtxXx9NSNkqE6HqV3fMMnXmSdkSP/6c3L9xbbNCL74KblM0IBFtZu5D39rPc9e1uBOl0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com; spf=pass smtp.mailfrom=wdc.com; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b=XQuRrZoq; arc=none smtp.client-ip=216.71.153.141 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=wdc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=wdc.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="XQuRrZoq" DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1731571487; x=1763107487; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/qKspSoNrijAjosztYEfSEVwQ6+wchTrIuHCfl+DERk=; b=XQuRrZoq7LsjUJHHjNepMO32bPKLKDWq2F67aVTsluSS7m8yFaptBFxU gObReF24youZ1t2nj/ysv6jcPo2kEpdr2lfxQlBO53Wz2kkVyWeunKDRi kDsBh3Ophi70n5epHs9+CiLfe/loSR0O5fzXJxcZZSdr8NvWrybMTGFU9 uJGbOq7wRzOTviWHZOt65m7mXi7ZCeo5CIijkEKjTbq2hfqYky9CFNd1b 0huruaJbSjEXldgl1pvne2Bm6K8eU7HgT4falsmjhDs76YZVKQ6OFdNks DQAicPZ6BcDrIO1zvU5/Dzgo8tL/HEiXWMj72nz/n+RtO8ThYGlxGBj2o w==; X-CSE-ConnectionGUID: FEDbaTLaTs+Qys0my561tg== X-CSE-MsgGUID: 6jJcHgGSQOWcBBaB5BjHXA== X-IronPort-AV: E=Sophos;i="6.12,153,1728921600"; d="scan'208";a="31543764" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 14 Nov 2024 16:04:45 +0800 IronPort-SDR: 6735a1db_HcLGGzyyHTrta7pcgAFcrRVNcuEBbX5wG75I2kZB5aR6CH6 3i2e8z5lkXcspMiIStA6MF/dr3Vx6W6dAp2K9KQ== Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 13 Nov 2024 23:08:12 -0800 WDCIronportException: Internal Received: from unknown (HELO naota-xeon.wdc.com) ([10.225.163.24]) by uls-op-cesaip01.wdc.com with ESMTP; 14 Nov 2024 00:04:43 -0800 From: Naohiro Aota To: linux-btrfs@vger.kernel.org Cc: Naohiro Aota Subject: [PATCH v2 3/3] btrfs: zoned: reclaim unused zone by zone resetting Date: Thu, 14 Nov 2024 17:04:29 +0900 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 On the zoned mode, once used and freed region is still not reusable after the freeing. The underlying zone needs to be reset before reusing. Btrfs resets a zone when it removes a block group, and then new block group is allocated on the zones to reuse the zones. But, it is sometime too late to catch up with a write side. This commit introduces a new space-info reclaim method ZONE_RESET. That will pick a block group from the unused list and reset its zone to reuse the zone_unusable space. It is faster than removing the block group and re-creating a new block group on the same zones. For the first implementation, the ZONE_RESET is only applied to a block group whose region is fully zone_unusable. Reclaiming partial zone_unusable block group could be implemented later. Signed-off-by: Naohiro Aota --- fs/btrfs/space-info.c | 28 +++++++- fs/btrfs/space-info.h | 5 ++ fs/btrfs/zoned.c | 124 +++++++++++++++++++++++++++++++++++ fs/btrfs/zoned.h | 7 ++ include/trace/events/btrfs.h | 3 +- 5 files changed, 164 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 7b3c514eb6f9..981da9e1b009 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -14,6 +14,7 @@ #include "fs.h" #include "accessors.h" #include "extent-tree.h" +#include "zoned.h" /* * HOW DOES SPACE RESERVATION WORK @@ -127,6 +128,14 @@ * churn a lot and we can avoid making some extent tree modifications if we * are able to delay for as long as possible. * + * RESET_ZONES + * This state works only for the zoned mode. On the zoned mode, we cannot + * reuse once allocated then freed region until we reset the zone, due to + * the sequential write zone requirement. The RESET_ZONES state resets the + * zones of an unused block group and let btrfs reuse the space. The reusing + * is faster than removing the block group and allocating another block + * group on the zones. + * * ALLOC_CHUNK * We will skip this the first time through space reservation, because of * overcommit and we don't want to have a lot of useless metadata space when @@ -832,6 +841,9 @@ static void flush_space(struct btrfs_fs_info *fs_info, */ ret = btrfs_commit_current_transaction(root); break; + case RESET_ZONES: + ret = btrfs_reset_unused_block_groups(space_info, num_bytes); + break; default: ret = -ENOSPC; break; @@ -1084,9 +1096,14 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work) enum btrfs_flush_state flush_state; int commit_cycles = 0; u64 last_tickets_id; + enum btrfs_flush_state final_state; fs_info = container_of(work, struct btrfs_fs_info, async_reclaim_work); space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA); + if (btrfs_is_zoned(fs_info)) + final_state = RESET_ZONES; + else + final_state = COMMIT_TRANS; spin_lock(&space_info->lock); to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info); @@ -1139,7 +1156,7 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work) if (flush_state == ALLOC_CHUNK_FORCE && !commit_cycles) flush_state++; - if (flush_state > COMMIT_TRANS) { + if (flush_state > final_state) { commit_cycles++; if (commit_cycles > 2) { if (maybe_fail_all_tickets(fs_info, space_info)) { @@ -1153,7 +1170,7 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work) } } spin_unlock(&space_info->lock); - } while (flush_state <= COMMIT_TRANS); + } while (flush_state <= final_state); } /* @@ -1284,6 +1301,10 @@ static void btrfs_preempt_reclaim_metadata_space(struct work_struct *work) * This is where we reclaim all of the pinned space generated by running the * iputs * + * RESET_ZONES + * This state works only for the zoned mode. We scan the unused block + * group list and reset the zones and reuse the block group. + * * ALLOC_CHUNK_FORCE * For data we start with alloc chunk force, however we could have been full * before, and then the transaction commit could have freed new block groups, @@ -1293,6 +1314,7 @@ static const enum btrfs_flush_state data_flush_states[] = { FLUSH_DELALLOC_FULL, RUN_DELAYED_IPUTS, COMMIT_TRANS, + RESET_ZONES, ALLOC_CHUNK_FORCE, }; @@ -1384,6 +1406,7 @@ void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info) static const enum btrfs_flush_state priority_flush_states[] = { FLUSH_DELAYED_ITEMS_NR, FLUSH_DELAYED_ITEMS, + RESET_ZONES, ALLOC_CHUNK, }; @@ -1397,6 +1420,7 @@ static const enum btrfs_flush_state evict_flush_states[] = { FLUSH_DELALLOC_FULL, ALLOC_CHUNK, COMMIT_TRANS, + RESET_ZONES, }; static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info, diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h index 69071afc0d47..a96efdb5e681 100644 --- a/fs/btrfs/space-info.h +++ b/fs/btrfs/space-info.h @@ -79,6 +79,10 @@ enum btrfs_reserve_flush_enum { BTRFS_RESERVE_FLUSH_EMERGENCY, }; +/* + * Please be aware that the order of enum values will be the order of the reclaim + * process in btrfs_async_reclaim_metadata_space(). + */ enum btrfs_flush_state { FLUSH_DELAYED_ITEMS_NR = 1, FLUSH_DELAYED_ITEMS = 2, @@ -91,6 +95,7 @@ enum btrfs_flush_state { ALLOC_CHUNK_FORCE = 9, RUN_DELAYED_IPUTS = 10, COMMIT_TRANS = 11, + RESET_ZONES = 12, }; struct btrfs_space_info { diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index cb32966380f5..a294617dc25d 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -2648,3 +2648,127 @@ void btrfs_check_active_zone_reservation(struct btrfs_fs_info *fs_info) } spin_unlock(&fs_info->zone_active_bgs_lock); } + +/* + * Reset the zones of unused block groups from @space_info->bytes_zone_unusable. + * + * This one resets the zones of a block group, so we can reuse the region + * without removing the block group. On the other hand, btrfs_delete_unused_bgs() + * just removes a block group and frees up the underlying zones. So, we still + * need to allocate a new block group to reuse the zones. + * + * Resetting is faster than deleting/recreating a block group. It is similar + * to freeing the logical space on the regular mode. However, we cannot change + * the block group's profile with this operation. + * + * @space_info: the space to work on + * @num_bytes: targeting reclaim bytes + */ +int btrfs_reset_unused_block_groups(struct btrfs_space_info *space_info, u64 num_bytes) +{ + struct btrfs_fs_info *fs_info = space_info->fs_info; + const sector_t zone_size_sectors = fs_info->zone_size >> SECTOR_SHIFT; + + if (!btrfs_is_zoned(fs_info)) + return 0; + + while (num_bytes > 0) { + struct btrfs_chunk_map *map; + struct btrfs_block_group *bg = NULL; + bool found = false; + u64 reclaimed = 0; + + /* + * Here, we choose a fully zone_unusable block group. It's + * technically possible to reset a partly zone_unusable block + * group, which still has some free space left. However, + * handling that needs to cope with the allocation side, which + * makes the logic more complex. So, let's handle the easy case + * for now. + */ + spin_lock(&fs_info->unused_bgs_lock); + list_for_each_entry(bg, &fs_info->unused_bgs, bg_list) { + if ((bg->flags & BTRFS_BLOCK_GROUP_TYPE_MASK) != space_info->flags) + continue; + + /* + * Use trylock to avoid locking order violation. In + * btrfs_reclaim_bgs_work(), the lock order is + * &bg->lock -> &fs_info->unused_bgs_lock. We skips a + * block group, if we cannot take its lock. + */ + if (!spin_trylock(&bg->lock)) + continue; + if (btrfs_is_block_group_used(bg) || bg->zone_unusable < bg->length) { + spin_unlock(&bg->lock); + continue; + } + spin_unlock(&bg->lock); + found = true; + break; + } + if (!found) { + spin_unlock(&fs_info->unused_bgs_lock); + return 0; + } + + list_del_init(&bg->bg_list); + btrfs_put_block_group(bg); + spin_unlock(&fs_info->unused_bgs_lock); + + /* + * Since the block group is fully zone_unusable and we cannot + * allocate anymore from this block group, we don't need to set + * this block group read-only. + */ + + down_read(&fs_info->dev_replace.rwsem); + map = bg->physical_map; + for (int i = 0; i < map->num_stripes; i++) { + struct btrfs_io_stripe *stripe = &map->stripes[i]; + unsigned int nofs_flags; + int ret; + + nofs_flags = memalloc_nofs_save(); + ret = blkdev_zone_mgmt(stripe->dev->bdev, REQ_OP_ZONE_RESET, + stripe->physical >> SECTOR_SHIFT, + zone_size_sectors); + memalloc_nofs_restore(nofs_flags); + + if (ret) { + up_read(&fs_info->dev_replace.rwsem); + return ret; + } + } + up_read(&fs_info->dev_replace.rwsem); + + spin_lock(&space_info->lock); + spin_lock(&bg->lock); + ASSERT(!btrfs_is_block_group_used(bg)); + if (bg->ro) { + spin_unlock(&bg->lock); + spin_unlock(&space_info->lock); + continue; + } + + reclaimed = bg->alloc_offset; + bg->zone_unusable = bg->length - bg->zone_capacity; + bg->alloc_offset = 0; + /* + * This holds because we currently reset fully used then freed + * block group. + */ + ASSERT(reclaimed == bg->zone_capacity); + bg->free_space_ctl->free_space += reclaimed; + space_info->bytes_zone_unusable -= reclaimed; + spin_unlock(&bg->lock); + btrfs_return_free_space(space_info, reclaimed); + spin_unlock(&space_info->lock); + + if (num_bytes <= reclaimed) + break; + num_bytes -= reclaimed; + } + + return 0; +} diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h index 7612e6572605..9672bf4c3335 100644 --- a/fs/btrfs/zoned.h +++ b/fs/btrfs/zoned.h @@ -96,6 +96,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info); int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info, struct btrfs_space_info *space_info, bool do_finish); void btrfs_check_active_zone_reservation(struct btrfs_fs_info *fs_info); +int btrfs_reset_unused_block_groups(struct btrfs_space_info *space_info, u64 num_bytes); #else /* CONFIG_BLK_DEV_ZONED */ static inline int btrfs_get_dev_zone_info_all_devices(struct btrfs_fs_info *fs_info) @@ -265,6 +266,12 @@ static inline int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info, static inline void btrfs_check_active_zone_reservation(struct btrfs_fs_info *fs_info) { } +static inline int btrfs_reset_unused_block_groups(struct btrfs_space_info *space_info, + u64 num_bytes) +{ + return 0; +} + #endif static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos) diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 4df93ca9b7a8..549ab3b41961 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -100,7 +100,8 @@ struct find_free_extent_ctl; EM( ALLOC_CHUNK, "ALLOC_CHUNK") \ EM( ALLOC_CHUNK_FORCE, "ALLOC_CHUNK_FORCE") \ EM( RUN_DELAYED_IPUTS, "RUN_DELAYED_IPUTS") \ - EMe(COMMIT_TRANS, "COMMIT_TRANS") + EM( COMMIT_TRANS, "COMMIT_TRANS") \ + EMe(RESET_ZONES, "RESET_ZONES") /* * First define the enums in the above macros to be exported to userspace via