From patchwork Wed Jan 25 20:50:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13116164
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 1/2] btrfs: fix size class loading logic
Date: Wed, 25 Jan 2023 12:50:32 -0800
X-Mailer: git-send-email 2.38.1
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

The original implementation was completely incorrect. It used
btrfs_search_slot to make an inexact match, which simply returned >0 to
indicate not finding the key.

Change it to use btrfs_search_forward with no transid to actually walk the
leaves looking for extent items.
Some small tweaks to the key space condition checking in the iteration were
also necessary.

Finally, since the sampling lookups are of fixed complexity, move them into
the main, blocking part of caching a block group, not as a best-effort thing
after. This has no effect on total block group caching throughput as there
is only one thread anyway, but makes it simpler and reduces weird races
where we change the size class simultaneously from an allocation and
loading.

Signed-off-by: Boris Burkov
---
 fs/btrfs/block-group.c | 56 ++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 73e1270b3904..45ccb25c5b1f 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -555,7 +555,8 @@ u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end
  * Returns: 0 on success, 1 if the search didn't yield a useful item, negative
  * error code on error.
  */
-static int sample_block_group_extent_item(struct btrfs_block_group *block_group,
+static int sample_block_group_extent_item(struct btrfs_caching_control *caching_ctl,
+					  struct btrfs_block_group *block_group,
 					  int index, int max_index,
 					  struct btrfs_key *key)
 {
@@ -563,17 +564,19 @@ static int sample_block_group_extent_item(struct btrfs_block_group *block_group,
 	struct btrfs_root *extent_root;
 	int ret = 0;
 	u64 search_offset;
+	u64 search_end = block_group->start + block_group->length;
 	struct btrfs_path *path;
 
 	ASSERT(index >= 0);
 	ASSERT(index <= max_index);
 	ASSERT(max_index > 0);
+	lockdep_assert_held(&caching_ctl->mutex);
+	lockdep_assert_held_read(&fs_info->commit_root_sem);
 
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
 
-	down_read(&fs_info->commit_root_sem);
 	extent_root = btrfs_extent_root(fs_info, max_t(u64, block_group->start,
 						       BTRFS_SUPER_INFO_OFFSET));
 
@@ -586,21 +589,36 @@ static int sample_block_group_extent_item(struct btrfs_block_group *block_group,
 	key->type = BTRFS_EXTENT_ITEM_KEY;
 	key->offset = 0;
 
-	ret = btrfs_search_slot(NULL, extent_root, key, path, 0, 0);
-	if (ret != 0)
-		goto out;
-	if (key->objectid < block_group->start ||
-	    key->objectid > block_group->start + block_group->length) {
-		ret = 1;
-		goto out;
-	}
-	if (key->type != BTRFS_EXTENT_ITEM_KEY) {
-		ret = 1;
-		goto out;
+	while (1) {
+		ret = btrfs_search_forward(extent_root, key, path, 0);
+		if (ret != 0)
+			goto out;
+		/* Success; sampled an extent item in the block group */
+		if (key->type == BTRFS_EXTENT_ITEM_KEY &&
+		    key->objectid >= block_group->start &&
+		    key->objectid + key->offset <= search_end)
+			goto out;
+
+		/* We can't possibly find a valid extent item anymore */
+		if (key->objectid >= search_end) {
+			ret = 1;
+			break;
+		}
+		if (key->type < BTRFS_EXTENT_ITEM_KEY)
+			key->type = BTRFS_EXTENT_ITEM_KEY;
+		else
+			key->objectid++;
+		btrfs_release_path(path);
+		up_read(&fs_info->commit_root_sem);
+		mutex_unlock(&caching_ctl->mutex);
+		cond_resched();
+		mutex_lock(&caching_ctl->mutex);
+		down_read(&fs_info->commit_root_sem);
 	}
 out:
+	lockdep_assert_held(&caching_ctl->mutex);
+	lockdep_assert_held_read(&fs_info->commit_root_sem);
 	btrfs_free_path(path);
-	up_read(&fs_info->commit_root_sem);
 	return ret;
 }
 
@@ -638,7 +656,8 @@ static int sample_block_group_extent_item(struct btrfs_block_group *block_group,
  *
  * Returns: 0 on success, negative error code on error.
  */
-static int load_block_group_size_class(struct btrfs_block_group *block_group)
+static int load_block_group_size_class(struct btrfs_caching_control *caching_ctl,
+				       struct btrfs_block_group *block_group)
 {
 	struct btrfs_key key;
 	int i;
@@ -646,11 +665,11 @@ static int load_block_group_size_class(struct btrfs_block_group *block_group)
 	enum btrfs_block_group_size_class size_class = BTRFS_BG_SZ_NONE;
 	int ret;
 
-	if (btrfs_block_group_should_use_size_class(block_group))
+	if (!btrfs_block_group_should_use_size_class(block_group))
 		return 0;
 
 	for (i = 0; i < 5; ++i) {
-		ret = sample_block_group_extent_item(block_group, i, 5, &key);
+		ret = sample_block_group_extent_item(caching_ctl, block_group, i, 5, &key);
 		if (ret < 0)
 			goto out;
 		if (ret > 0)
@@ -812,6 +831,7 @@ static noinline void caching_thread(struct btrfs_work *work)
 	mutex_lock(&caching_ctl->mutex);
 	down_read(&fs_info->commit_root_sem);
 
+	load_block_group_size_class(caching_ctl, block_group);
 	if (btrfs_test_opt(fs_info, SPACE_CACHE)) {
 		ret = load_free_space_cache(block_group);
 		if (ret == 1) {
@@ -867,8 +887,6 @@ static noinline void caching_thread(struct btrfs_work *work)
 
 	wake_up(&caching_ctl->wait);
 
-	load_block_group_size_class(block_group);
-
 	btrfs_put_caching_control(caching_ctl);
 	btrfs_put_block_group(block_group);
 }

From patchwork Wed Jan 25 20:50:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13116163
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH 2/2] btrfs: add size class stats to sysfs
Date: Wed, 25 Jan 2023 12:50:33 -0800
Message-Id: <3e95d7d8a42fa8969f415fc03ad999de3d29a196.1674679476.git.boris@bur.io>
X-Mailer: git-send-email 2.38.1
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Make it possible to see the distribution of size classes for block groups.
Helpful for testing and debugging the allocator w.r.t. size classes.

Signed-off-by: Boris Burkov
---
 fs/btrfs/sysfs.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 108aa3876186..e1ae4d2323d6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "messages.h"
 #include "ctree.h"
@@ -778,6 +779,42 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
 	return len;
 }
 
+static ssize_t btrfs_size_classes_show(struct kobject *kobj,
+				       struct kobj_attribute *a, char *buf)
+{
+	struct btrfs_space_info *sinfo = to_space_info(kobj);
+	struct btrfs_block_group *bg;
+	int none = 0;
+	int small = 0;
+	int medium = 0;
+	int large = 0;
+	int i;
+
+	down_read(&sinfo->groups_sem);
+	for (i = 0; i < BTRFS_NR_RAID_TYPES; ++i) {
+		list_for_each_entry(bg, &sinfo->block_groups[i], list) {
+			if (!btrfs_block_group_should_use_size_class(bg))
+				continue;
+			switch (bg->size_class) {
+			case BTRFS_BG_SZ_NONE:
+				none++;
+				break;
+			case BTRFS_BG_SZ_SMALL:
+				small++;
+				break;
+			case BTRFS_BG_SZ_MEDIUM:
+				medium++;
+				break;
+			case BTRFS_BG_SZ_LARGE:
+				large++;
+				break;
+			}
+		}
+	}
+	up_read(&sinfo->groups_sem);
+	return sysfs_emit(buf, "%d %d %d %d\n", none, small, medium, large);
+}
+
 #ifdef CONFIG_BTRFS_DEBUG
 /*
  * Request chunk allocation with current chunk size.
@@ -835,6 +872,7 @@ SPACE_INFO_ATTR(bytes_zone_unusable);
 SPACE_INFO_ATTR(disk_used);
 SPACE_INFO_ATTR(disk_total);
 BTRFS_ATTR_RW(space_info, chunk_size, btrfs_chunk_size_show, btrfs_chunk_size_store);
+BTRFS_ATTR(space_info, size_classes, btrfs_size_classes_show);
 
 static ssize_t btrfs_sinfo_bg_reclaim_threshold_show(struct kobject *kobj,
 						     struct kobj_attribute *a,
@@ -887,6 +925,7 @@ static struct attribute *space_info_attrs[] = {
 	BTRFS_ATTR_PTR(space_info, disk_total),
 	BTRFS_ATTR_PTR(space_info, bg_reclaim_threshold),
 	BTRFS_ATTR_PTR(space_info, chunk_size),
+	BTRFS_ATTR_PTR(space_info, size_classes),
 #ifdef CONFIG_BTRFS_DEBUG
 	BTRFS_ATTR_PTR(space_info, force_chunk_alloc),
 #endif
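
For testing, the four counters printed by btrfs_size_classes_show can be
read back from userspace. The following is only a sketch and is not part of
the patch: struct size_class_counts, parse_size_classes and
read_size_classes are hypothetical names, and the only thing taken from the
patch is the "%d %d %d %d\n" output format, in the order none, small,
medium, large. The location of the size_classes file is left to the caller
because the patch does not spell out the full sysfs path.

```c
#include <stdio.h>

/* Hypothetical holder for the counters, in the order the patch prints them. */
struct size_class_counts {
	int none;
	int small;
	int medium;
	int large;
};

/*
 * Parse one line in the "%d %d %d %d\n" format emitted by the patch.
 * Returns 0 on success, -1 if the line does not hold four integers.
 */
static int parse_size_classes(const char *line, struct size_class_counts *out)
{
	int n = sscanf(line, "%d %d %d %d",
		       &out->none, &out->small, &out->medium, &out->large);

	return n == 4 ? 0 : -1;
}

/*
 * Read and parse the attribute file at a caller-supplied path.
 * Returns 0 on success, -1 if the file cannot be opened, read, or parsed.
 */
static int read_size_classes(const char *path, struct size_class_counts *out)
{
	char buf[128];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_size_classes(buf, out);
	fclose(f);
	return ret;
}
```

A test would likely call read_size_classes once per space_info directory
(the attribute is registered in space_info_attrs, so one file per space
info is expected) and compare the counts before and after a workload.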