From patchwork Tue Mar 29 08:56:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 12794570
From: Johannes Thumshirn
To: David Sterba
Cc: Josef Bacik, Naohiro Aota, Pankaj Raghav, "linux-btrfs @ vger . kernel . org", Johannes Thumshirn
Subject: [PATCH v2 1/4] btrfs: make the bg_reclaim_threshold per-space info
Date: Tue, 29 Mar 2022 01:56:06 -0700
Message-Id: <63d4d206dd2e652aa968ab0fa30dd9aab98a666b.1648543951.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.35.1
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Josef Bacik

For !zoned file systems it's useful to have the auto reclaim feature.
However, there are different use cases for !zoned; for example, we may
never want to reclaim metadata chunks, only data chunks. Move this sysfs
flag to per-space_info. This won't affect current users because this
tunable only ever did anything for zoned, and that is currently hidden
behind CONFIG_BTRFS_DEBUG.

Signed-off-by: Josef Bacik
[ jth restore global bg_reclaim_threshold ]
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/free-space-cache.c |  7 +++++--
 fs/btrfs/space-info.c       |  9 +++++++++
 fs/btrfs/space-info.h       |  6 ++++++
 fs/btrfs/sysfs.c            | 37 +++++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h            |  6 +-----
 5 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 01a408db5683..ef84bc5030cd 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2630,16 +2630,19 @@ int __btrfs_add_free_space(struct btrfs_block_group *block_group,
 static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
 					u64 bytenr, u64 size, bool used)
 {
-	struct btrfs_fs_info *fs_info = block_group->fs_info;
+	struct btrfs_space_info *sinfo = block_group->space_info;
 	struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
 	u64 offset = bytenr - block_group->start;
 	u64 to_free, to_unusable;
-	const int bg_reclaim_threshold = READ_ONCE(fs_info->bg_reclaim_threshold);
+	int bg_reclaim_threshold = 0;
 	bool initial = (size == block_group->length);
 	u64 reclaimable_unusable;
 
 	WARN_ON(!initial && offset + size > block_group->zone_capacity);
 
+	if (!initial)
+		bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
+
 	spin_lock(&ctl->tree_lock);
 	if (!used)
 		to_free = size;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b87931a458eb..60d0a58c4644 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -181,6 +181,12 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
 		found->full = 0;
 }
 
+/*
+ * Block groups with more than this value (percents) of unusable space will be
+ * scheduled for background reclaim.
+ */
+#define BTRFS_DEFAULT_ZONED_RECLAIM_THRESH 75
+
 static int create_space_info(struct btrfs_fs_info *info, u64 flags)
 {
 
@@ -203,6 +209,9 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
 	INIT_LIST_HEAD(&space_info->priority_tickets);
 	space_info->clamp = 1;
 
+	if (btrfs_is_zoned(info))
+		space_info->bg_reclaim_threshold = BTRFS_DEFAULT_ZONED_RECLAIM_THRESH;
+
 	ret = btrfs_sysfs_add_space_info_type(info, space_info);
 	if (ret)
 		return ret;
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index d841fed73492..0c45f539e3cf 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -24,6 +24,12 @@ struct btrfs_space_info {
 				   the space info if we had an ENOSPC in the
 				   allocator. */
 
+	/*
+	 * Once a block group drops below this threshold we'll schedule it for
+	 * reclaim.
+	 */
+	int bg_reclaim_threshold;
+
 	int clamp;		/* Used to scale our threshold for preemptive
 				   flushing. The value is >> clamp, so turns
 				   out to be a 2^clamp divisor. */
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 17389a42a3ab..90da1ea0cae0 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -722,6 +722,42 @@ SPACE_INFO_ATTR(bytes_zone_unusable);
 SPACE_INFO_ATTR(disk_used);
 SPACE_INFO_ATTR(disk_total);
 
+static ssize_t btrfs_sinfo_bg_reclaim_threshold_show(struct kobject *kobj,
+						     struct kobj_attribute *a,
+						     char *buf)
+{
+	struct btrfs_space_info *space_info = to_space_info(kobj);
+	ssize_t ret;
+
+	ret = sysfs_emit(buf, "%d\n", READ_ONCE(space_info->bg_reclaim_threshold));
+
+	return ret;
+}
+
+static ssize_t btrfs_sinfo_bg_reclaim_threshold_store(struct kobject *kobj,
+						      struct kobj_attribute *a,
+						      const char *buf, size_t len)
+{
+	struct btrfs_space_info *space_info = to_space_info(kobj);
+	int thresh;
+	int ret;
+
+	ret = kstrtoint(buf, 10, &thresh);
+	if (ret)
+		return ret;
+
+	if (thresh != 0 && (thresh <= 50 || thresh > 100))
+		return -EINVAL;
+
+	WRITE_ONCE(space_info->bg_reclaim_threshold, thresh);
+
+	return len;
+}
+
+BTRFS_ATTR_RW(space_info, bg_reclaim_threshold,
+	      btrfs_sinfo_bg_reclaim_threshold_show,
+	      btrfs_sinfo_bg_reclaim_threshold_store);
+
 /*
  * Allocation information about block group types.
  *
@@ -738,6 +774,7 @@ static struct attribute *space_info_attrs[] = {
 	BTRFS_ATTR_PTR(space_info, bytes_zone_unusable),
 	BTRFS_ATTR_PTR(space_info, disk_used),
 	BTRFS_ATTR_PTR(space_info, disk_total),
+	BTRFS_ATTR_PTR(space_info, bg_reclaim_threshold),
 	NULL,
 };
 ATTRIBUTE_GROUPS(space_info);
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index cbf016a7bb5d..c489c08d7fd5 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -10,11 +10,7 @@
 #include "block-group.h"
 #include "btrfs_inode.h"
 
-/*
- * Block groups with more than this value (percents) of unusable space will be
- * scheduled for background reclaim.
- */
-#define BTRFS_DEFAULT_RECLAIM_THRESH 75
+#define BTRFS_DEFAULT_RECLAIM_THRESH 75
 
 struct btrfs_zoned_device_info {
 	/*

From patchwork Tue Mar 29 08:56:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 12794572
From: Johannes Thumshirn
To: David Sterba
Cc: Josef Bacik, Naohiro Aota, Pankaj Raghav, "linux-btrfs @ vger . kernel . org", Johannes Thumshirn
Subject: [PATCH v2 2/4] btrfs: allow block group background reclaim for !zoned fs'es
Date: Tue, 29 Mar 2022 01:56:07 -0700
Message-Id: <7243b0e2e57f2eb276f9ddbff3c772e6b3cbe956.1648543951.git.johannes.thumshirn@wdc.com>
X-Mailer: git-send-email 2.35.1
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Josef Bacik

We have found this feature invaluable at Facebook due to how our
workload interacts with the allocator. We have been using this in
production for months with only a single problem that has already been
fixed. This will allow us to set a threshold for block groups to be
automatically relocated even if we don't have zoned devices.

Signed-off-by: Josef Bacik
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/block-group.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 59f18a10fd5f..628741ecb97b 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -3208,6 +3208,31 @@ int btrfs_write_dirty_block_groups(struct btrfs_trans_handle *trans)
 	return ret;
 }
 
+static inline bool should_reclaim_block_group(struct btrfs_block_group *block_group,
+					      u64 bytes_freed)
+{
+	const struct btrfs_space_info *space_info = block_group->space_info;
+	const int reclaim_thresh = READ_ONCE(space_info->bg_reclaim_threshold);
+	const u64 new_val = block_group->used;
+	const u64 old_val = new_val + bytes_freed;
+	u64 thresh;
+
+	if (reclaim_thresh == 0)
+		return false;
+
+	thresh = div_factor_fine(block_group->length, reclaim_thresh);
+
+	/*
+	 * If we were below the threshold before don't reclaim, we are likely a
+	 * brand new block group and we don't want to relocate new block groups.
+	 */
+	if (old_val < thresh)
+		return false;
+	if (new_val >= thresh)
+		return false;
+	return true;
+}
+
 int btrfs_update_block_group(struct btrfs_trans_handle *trans,
 			     u64 bytenr, u64 num_bytes, bool alloc)
 {
@@ -3230,6 +3255,8 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
 	spin_unlock(&info->delalloc_root_lock);
 
 	while (total) {
+		bool reclaim;
+
 		cache = btrfs_lookup_block_group(info, bytenr);
 		if (!cache) {
 			ret = -ENOENT;
@@ -3275,6 +3302,8 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
 					cache->space_info, num_bytes);
 			cache->space_info->bytes_used -= num_bytes;
 			cache->space_info->disk_used -= num_bytes * factor;
+
+			reclaim = should_reclaim_block_group(cache, num_bytes);
 			spin_unlock(&cache->lock);
 			spin_unlock(&cache->space_info->lock);
 
@@ -3301,6 +3330,8 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
 		if (!alloc && old_val == 0) {
 			if (!btrfs_test_opt(info, DISCARD_ASYNC))
 				btrfs_mark_bg_unused(cache);
+		} else if (!alloc && reclaim) {
+			btrfs_mark_bg_to_reclaim(cache);
 		}
 
 		btrfs_put_block_group(cache);

From patchwork Tue Mar 29 08:56:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 12794571
From: Johannes Thumshirn
To: David Sterba
Cc: Josef Bacik, Naohiro Aota, Pankaj Raghav, "linux-btrfs @ vger . kernel . org", Johannes Thumshirn
Subject: [PATCH v2 3/4] btrfs: change the bg_reclaim_threshold valid region from 0 to 100
Date: Tue, 29 Mar 2022 01:56:08 -0700
X-Mailer: git-send-email 2.35.1
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Josef Bacik

For the !zoned case we may want to set the threshold for reclaim to
something below 50%. Change the acceptable threshold from 50-100 to
0-100.

Signed-off-by: Josef Bacik
Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 90da1ea0cae0..fdf9bf789528 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -746,7 +746,7 @@ static ssize_t btrfs_sinfo_bg_reclaim_threshold_store(struct kobject *kobj,
 	if (ret)
 		return ret;
 
-	if (thresh != 0 && (thresh <= 50 || thresh > 100))
+	if (thresh < 0 || thresh > 100)
 		return -EINVAL;
 
 	WRITE_ONCE(space_info->bg_reclaim_threshold, thresh);

From patchwork Tue Mar 29 08:56:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 12794573
From: Johannes Thumshirn
To: David Sterba
Cc: Johannes Thumshirn, Naohiro Aota, Josef Bacik, Pankaj Raghav, "linux-btrfs @ vger . kernel . org"
Subject: [PATCH v2 4/4] btrfs: zoned: make auto-reclaim less aggressive
Date: Tue, 29 Mar 2022 01:56:09 -0700
X-Mailer: git-send-email 2.35.1
X-Mailing-List: linux-btrfs@vger.kernel.org

The current auto-reclaim algorithm starts reclaiming all block groups
with a zone_unusable value above a configured threshold. This causes a
lot of reclaim IO even if there are enough free zones on the device.

Instead of only accounting a block group's zone_unusable value, also
take into account the ratio of free to not usable (written as well as
zone_unusable) bytes on the device.

Signed-off-by: Johannes Thumshirn
Tested-by: Pankaj Raghav
---
 fs/btrfs/block-group.c | 10 ++++++++++
 fs/btrfs/zoned.c       | 28 ++++++++++++++++++++++++++++
 fs/btrfs/zoned.h       |  6 ++++++
 3 files changed, 44 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 628741ecb97b..12454304bb85 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1512,6 +1512,13 @@ static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
 	return bg1->used > bg2->used;
 }
 
+static inline bool btrfs_should_reclaim(struct btrfs_fs_info *fs_info)
+{
+	if (btrfs_is_zoned(fs_info))
+		return btrfs_zoned_should_reclaim(fs_info);
+	return true;
+}
+
 void btrfs_reclaim_bgs_work(struct work_struct *work)
 {
 	struct btrfs_fs_info *fs_info =
@@ -1522,6 +1529,9 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
 	if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
 		return;
 
+	if (!btrfs_should_reclaim(fs_info))
+		return;
+
 	sb_start_write(fs_info->sb);
 
 	if (!btrfs_exclop_start(fs_info, BTRFS_EXCLOP_BALANCE)) {
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 1b1b310c3c51..c0c460749b74 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2079,3 +2079,31 @@ void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info)
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 }
+
+bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+	struct btrfs_device *device;
+	u64 used = 0;
+	u64 total = 0;
+	u64 factor;
+
+	ASSERT(btrfs_is_zoned(fs_info));
+
+	if (!fs_info->bg_reclaim_threshold)
+		return false;
+
+	mutex_lock(&fs_devices->device_list_mutex);
+	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+		if (!device->bdev)
+			continue;
+
+		total += device->disk_total_bytes;
+		used += device->bytes_used;
+
+	}
+	mutex_unlock(&fs_devices->device_list_mutex);
+
+	factor = div64_u64(used * 100, total);
+	return factor >= fs_info->bg_reclaim_threshold;
+}
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index c489c08d7fd5..f2d16395087f 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -74,6 +74,7 @@ void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info, u64 logical,
 			     u64 length);
 void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
 void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
+bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info);
 #else /* CONFIG_BLK_DEV_ZONED */
 static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
 				     struct blk_zone *zone)
@@ -232,6 +233,11 @@ static inline void btrfs_zone_finish_endio(struct btrfs_fs_info *fs_info,
 static inline void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg) { }
 
 static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
+
+static inline bool btrfs_zoned_should_reclaim(struct btrfs_fs_info *fs_info)
+{
+	return false;
+}
 #endif
 
 static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)