From patchwork Wed Jan 26 20:32:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14475C63684 for ; Wed, 26 Jan 2022 20:32:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230422AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59406 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S229907AbiAZUcV (ORCPT ); Wed, 26 Jan 2022 15:32:21 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYJKH; Wed, 26 Jan 2022 20:32:19 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 1/7] btrfs: add flags to give a hint to the chunk allocator Date: Wed, 26 Jan 2022 21:32:08 +0100 Message-Id: <12942e1615af081413ec256da04bfa4fd0a6c155.1643228177.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229139; bh=VQY6O7Tid/B7P16mtGfsFnkVQ4gQfAUmnQyQiwZnzic=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=xx6TPRxDJzJZalA3KKuZDbIx3KVAxRbbmtPqQ4dY3C70e2zvPj65NpU/gBS6azHgX v5+baR4YSdVmIbhlyMl/7xwZ5/BQ5rfDDC4OBP+xT/RPLMAK8wnJ59tlgLSu/5ikPC uSwJO2o1arD0n1a4gRxuMxHhf5ZEv69bJLtZJvRo= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Add the following flags to give a hint about
which chunk should be allocated in which disk: - BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED preferred for data chunk, but metadata chunk allowed - BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED preferred for metadata chunk, but data chunk allowed - BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY only metadata chunk allowed - BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY only data chunk allowed Signed-off-by: Goffredo Baroncelli --- include/uapi/linux/btrfs_tree.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h index 5416f1f1a77a..02955d5fcd21 100644 --- a/include/uapi/linux/btrfs_tree.h +++ b/include/uapi/linux/btrfs_tree.h @@ -386,6 +386,22 @@ struct btrfs_key { __u64 offset; } __attribute__ ((__packed__)); +/* dev_item.type */ + +/* btrfs chunk allocation hint */ +#define BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT 2 +/* btrfs chunk allocation hint mask */ +#define BTRFS_DEV_ALLOCATION_HINT_MASK \ + ((1 << BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT) - 1) +/* preferred data chunk, but metadata chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED (0ULL) +/* preferred metadata chunk, but data chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED (1ULL) +/* only metadata chunk are allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY (2ULL) +/* only data chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY (3ULL) + struct btrfs_dev_item { /* the internal btrfs device id */ __le64 devid; From patchwork Wed Jan 26 20:32:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2B40C2BA4C for ; Wed, 26 Jan 2022 20:32:23 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230431AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59422 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230389AbiAZUcV (ORCPT ); Wed, 26 Jan 2022 15:32:21 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYKKg; Wed, 26 Jan 2022 20:32:19 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 2/7] btrfs: export the device allocation_hint property in sysfs Date: Wed, 26 Jan 2022 21:32:09 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229139; bh=zzmMd9npqQTxc6FJE7Zbse8rCI3pkdJWRdqejjB3neE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=N0uUWTHSwT9YnOTNfDoHiunshA+Fl7UqZX4iYf6Rp0YapvvKRbrXFT1+9QvU3PXwj waLlKL4jd2jCP/JyP6iOXnig30ORx4t+A/fTJvWk1GojvD0Kkb5YUgS32/Lwxb9BP9 HD0q59u/sJRzVqWignxvrTdlnRaR2PxcIMXfYqnY= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Export the device allocation_hint property via /sys/fs/btrfs//devinfo//allocation_hint Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index beb7f72d50b8..c1c903187e19 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1575,6 +1575,17 @@ static ssize_t btrfs_devinfo_error_stats_show(struct kobject *kobj, } BTRFS_ATTR(devid, error_stats, btrfs_devinfo_error_stats_show); +static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, + struct kobj_attribute *a, char *buf) +{ + struct 
btrfs_device *device = container_of(kobj, struct btrfs_device, + devid_kobj); + + return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", + device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); +} +BTRFS_ATTR(devid, allocation_hint, btrfs_devinfo_allocation_hint_show); + /* * Information about one device. * @@ -1588,6 +1599,7 @@ static struct attribute *devid_attrs[] = { BTRFS_ATTR_PTR(devid, replace_target), BTRFS_ATTR_PTR(devid, scrub_speed_max), BTRFS_ATTR_PTR(devid, writeable), + BTRFS_ATTR_PTR(devid, allocation_hint), NULL }; ATTRIBUTE_GROUPS(devid); From patchwork Wed Jan 26 20:32:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725731 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2CD4C63686 for ; Wed, 26 Jan 2022 20:32:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230428AbiAZUcX (ORCPT ); Wed, 26 Jan 2022 15:32:23 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59436 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230391AbiAZUcV (ORCPT ); Wed, 26 Jan 2022 15:32:21 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYKLA; Wed, 26 Jan 2022 20:32:20 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 3/7] btrfs: change the device allocation_hint property via sysfs Date: Wed, 26 Jan 2022 21:32:10 +0100 Message-Id: <13dc3d2d0a220fcc85533199b289b4a0dfcaf204.1643228177.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo 
Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229140; bh=guhxZ1eAMzUJH7Br7Fc9G1Bkw/gHyGzbbaWcwUQTbBE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=M7uyDSvtVPYpAx5R9k3MLlo+yJeqTq4mP2zmlMZWmyKbGH8AH1GE62v0ZSITMBKaI jpL+v1PYusP20jQVuE1okBHWaStRAOigeyzjflkjhop/h68g066h325/s7JRHXidim oQMODJTyfedxQryU2xbtKbGtYUZyLwrer/wloTfs= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli This patch allows changing the allocation_hint property by writing a numerical value to the file: /sys/fs/btrfs//devinfo//allocation_hint To update this field, the "allocation_hint" property is added to btrfs-progs too. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 62 +++++++++++++++++++++++++++++++++++++++++++++- fs/btrfs/volumes.c | 2 +- fs/btrfs/volumes.h | 2 ++ 3 files changed, 64 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index c1c903187e19..9070d0370343 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1584,7 +1584,67 @@ static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); } -BTRFS_ATTR(devid, allocation_hint, btrfs_devinfo_allocation_hint_show); + +static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, + struct kobj_attribute *a, + const char *buf, size_t len) +{ + struct btrfs_fs_info *fs_info; + struct btrfs_root *root; + struct btrfs_device *device; + int ret; + struct btrfs_trans_handle *trans; + + u64 type, prev_type; + + device = container_of(kobj, struct btrfs_device, devid_kobj); + fs_info = device->fs_info; + if (!fs_info) + return -EPERM; + + root = fs_info->chunk_root; + if (sb_rdonly(fs_info->sb)) + return -EROFS; + + ret = kstrtou64(buf, 0, &type); + if (ret < 0) + return -EINVAL; + + /* for now, allow to touch only the 'allocation hint' bits */ + if (type & 
~BTRFS_DEV_ALLOCATION_HINT_MASK) + return -EINVAL; + + /* check if a change is really needed */ + if ((device->type & BTRFS_DEV_ALLOCATION_HINT_MASK) == type) + return len; + + trans = btrfs_start_transaction(root, 1); + if (IS_ERR(trans)) + return PTR_ERR(trans); + + prev_type = device->type; + device->type = (device->type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | type; + + ret = btrfs_update_device(trans, device); + + if (ret < 0) { + btrfs_abort_transaction(trans, ret); + btrfs_end_transaction(trans); + goto abort; + } + + ret = btrfs_commit_transaction(trans); + if (ret < 0) + goto abort; + + return len; +abort: + device->type = prev_type; + return ret; +} +BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, + btrfs_devinfo_allocation_hint_store); + /* * Information about one device. diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index a91e51b0ca81..c43a8a36ff5b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2841,7 +2841,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path return ret; } -static noinline int btrfs_update_device(struct btrfs_trans_handle *trans, +noinline int btrfs_update_device(struct btrfs_trans_handle *trans, struct btrfs_device *device) { int ret; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index bd297f23d19e..93ac27d8097c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -636,5 +636,7 @@ int btrfs_bg_type_to_factor(u64 flags); const char *btrfs_bg_type_to_raid_name(u64 flags); int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info); bool btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical); +int btrfs_update_device(struct btrfs_trans_handle *trans, + struct btrfs_device *device); #endif From patchwork Wed Jan 26 20:32:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725730 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
(2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C9E2C63697 for ; Wed, 26 Jan 2022 20:32:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230436AbiAZUcY (ORCPT ); Wed, 26 Jan 2022 15:32:24 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59448 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230402AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYLLh; Wed, 26 Jan 2022 20:32:20 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 4/7] btrfs: add allocation_hint mode Date: Wed, 26 Jan 2022 21:32:11 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229140; bh=TgM9KKcCDftLXIXl9o3AKjUXPRT3pI4Dh/a7BG4nGwM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=ZH5zKLQj7gbcPYWgO82bGEGYa01XjnF4ZQuPOJqMHJNa21h8m+CgwBKdaXpF56l/I A329CRuglXc/fsE31riEw+wmW2IQloys5I7cxleo5LSv+Iz5KUyezD/29EygKIZg+k rfQEV0rRZhm96HIacncpYGPRwTa7IGjGPshZCM3k= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli The chunk allocation policy is modified as follows. Each disk may have one of the following tags: - BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED - BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY - BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY - BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED (default) During a *mixed data/metadata* chunk allocation, BTRFS works as usual. 
During a *data* chunk allocation, space is searched first on the disks tagged BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY. If the space found is not enough (e.g. in raid5, only two disks are available), then the disks tagged BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED are considered as well. If the space is still not enough, the disks tagged BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED are also considered. If even then the space is not sufficient, -ENOSPC is returned. A disk tagged BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY is never considered for a data BG allocation. During a *metadata* chunk allocation, the same algorithm applies with _DATA_ and _METADATA_ swapped. By default the disks are tagged BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED, so BTRFS behaves as usual. If the user prefers to store the metadata on the faster disks (e.g. SSDs), they can tag these with BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED: in this case the metadata BGs go to the BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED disks and the data BGs to the other ones. When one set of disks is full, the other set is considered. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/volumes.c | 113 +++++++++++++++++++++++++++++++++++++++++++-- fs/btrfs/volumes.h | 1 + 2 files changed, 111 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index c43a8a36ff5b..666b67f4e07b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -184,6 +184,27 @@ enum btrfs_raid_types __attribute_const__ btrfs_bg_flags_to_raid_index(u64 flags return BTRFS_RAID_SINGLE; /* BTRFS_BLOCK_GROUP_SINGLE */ } +#define BTRFS_DEV_ALLOCATION_HINT_COUNT (1ULL << \ + BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT) + +/* + * The order of BTRFS_DEV_ALLOCATION_HINT_* values are not + * good, because BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED is 0 + * (for backward compatibility reason), and the other + * values are greater (because the field is unsigned). So we + * need a map that rearranges the order giving to _DATA_PREFERRED + * an intermediate priority. 
+ * These values give to METADATA_ONLY the highest priority, and are + * valid for metadata BG allocation. When a data BG is allocated we + * negate these values to reverse the priority, so they must be signed. + */ +static const s8 alloc_hint_map[BTRFS_DEV_ALLOCATION_HINT_COUNT] = { + [BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY] = -1, + [BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED] = 0, + [BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED] = 1, + [BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY] = 2, +}; + const char *btrfs_bg_type_to_raid_name(u64 flags) { const int index = btrfs_bg_flags_to_raid_index(flags); @@ -5019,13 +5040,18 @@ static int btrfs_add_system_chunk(struct btrfs_fs_info *fs_info, } /* - * sort the devices in descending order by max_avail, total_avail + * sort the devices in descending order by alloc_hint, + * max_avail, total_avail */ static int btrfs_cmp_device_info(const void *a, const void *b) { const struct btrfs_device_info *di_a = a; const struct btrfs_device_info *di_b = b; + if (di_a->alloc_hint > di_b->alloc_hint) + return -1; + if (di_a->alloc_hint < di_b->alloc_hint) + return 1; if (di_a->max_avail > di_b->max_avail) return -1; if (di_a->max_avail < di_b->max_avail) @@ -5188,6 +5214,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, int ndevs = 0; u64 max_avail; u64 dev_offset; + int hint; /* * in the first pass through the devices list, we gather information @@ -5240,17 +5267,95 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, devices_info[ndevs].max_avail = max_avail; devices_info[ndevs].total_avail = total_avail; devices_info[ndevs].dev = device; + + if ((ctl->type & BTRFS_BLOCK_GROUP_DATA) && + (ctl->type & BTRFS_BLOCK_GROUP_METADATA)) { + /* + * if mixed bg set all the alloc_hint + * fields to the same value, so the sorting + * is not affected + */ + devices_info[ndevs].alloc_hint = 0; + } else if (ctl->type & BTRFS_BLOCK_GROUP_DATA) { + hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + + /* + * skip BTRFS_DEV_METADATA_ONLY 
disks + */ + if (hint == BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY) + continue; + /* + * if a data chunk must be allocated, + * sort also by hint (data disk + * higher priority) + */ + devices_info[ndevs].alloc_hint = -alloc_hint_map[hint]; + } else { /* BTRFS_BLOCK_GROUP_METADATA */ + hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + + /* + * skip BTRFS_DEV_DATA_ONLY disks + */ + if (hint == BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY) + continue; + /* + * if a metadata chunk must be allocated, + * sort also by hint (metadata hint + * higher priority) + */ + devices_info[ndevs].alloc_hint = alloc_hint_map[hint]; + } + ++ndevs; } ctl->ndevs = ndevs; + return 0; +} + +static void sort_and_reduce_device_info(struct alloc_chunk_ctl *ctl, + struct btrfs_device_info *devices_info) +{ + int ndevs, hint, i; + + ndevs = ctl->ndevs; /* - * now sort the devices by hole size / available space + * now sort the devices by hint / hole size / available space */ sort(devices_info, ndevs, sizeof(struct btrfs_device_info), btrfs_cmp_device_info, NULL); - return 0; + /* + * select the minimum set of disks grouped by hint that + * can host the chunk + */ + ndevs = 0; + while (ndevs < ctl->ndevs) { + hint = devices_info[ndevs++].alloc_hint; + while (ndevs < ctl->ndevs) { + if (devices_info[ndevs].alloc_hint != hint) + break; + ndevs++; + } + if (ndevs >= ctl->devs_min) + break; + } + + ctl->ndevs = ndevs; + + /* + * the next layers require the devices_info ordered by + * max_avail. If we are returning two (or more) different + * group of alloc_hint, this is not always true. So sort + * these again. 
+ */ + + for (i = 0 ; i < ndevs ; i++) + devices_info[i].alloc_hint = 0; + + sort(devices_info, ndevs, sizeof(struct btrfs_device_info), + btrfs_cmp_device_info, NULL); + } static int decide_stripe_size_regular(struct alloc_chunk_ctl *ctl, @@ -5502,6 +5607,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans, goto out; } + sort_and_reduce_device_info(&ctl, devices_info); + ret = decide_stripe_size(fs_devices, &ctl, devices_info); if (ret < 0) { block_group = ERR_PTR(ret); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 93ac27d8097c..b066f9af216a 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -404,6 +404,7 @@ struct btrfs_device_info { u64 dev_offset; u64 max_avail; u64 total_avail; + int alloc_hint; }; struct btrfs_raid_attr { From patchwork Wed Jan 26 20:32:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725732 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA1FFC63684 for ; Wed, 26 Jan 2022 20:32:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230391AbiAZUcZ (ORCPT ); Wed, 26 Jan 2022 15:32:25 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59462 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230406AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYLM5; Wed, 26 Jan 2022 20:32:20 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 5/7] btrfs: rename 
dev_item->type to dev_item->flags Date: Wed, 26 Jan 2022 21:32:12 +0100 Message-Id: <9d6c4b75a0dd1df45a22ae71e0a80e40bc88266e.1643228177.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229140; bh=iuLGWxJzCL4KwEd2kcm2RGwOIFK0+BQhcRh4i03dRfg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=HSs17xTqU2Yv1TPlIdy1jSS5++2jG+H9105zPS09GqRoa/8GB4GTJQNdB4jJgAjSZ lAiPOIFd6AcdbUmR5FSFv3W5bsVRp8crP6A66+JGmdD4tCic/ak1vNrc08ULQ+SZtl ZRAXkORzfPWDrROAGbvapFwHwF+S0cKc7yK2dasw= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Rename the field type of dev_item from 'type' to 'flags' changing the struct btrfs_device and btrfs_dev_item. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/ctree.h | 4 ++-- fs/btrfs/disk-io.c | 2 +- fs/btrfs/sysfs.c | 17 +++++++++-------- fs/btrfs/volumes.c | 10 +++++----- fs/btrfs/volumes.h | 4 ++-- include/uapi/linux/btrfs_tree.h | 4 ++-- 6 files changed, 21 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 8992e0096163..41a7f4be0c13 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1664,7 +1664,7 @@ static inline void btrfs_set_device_total_bytes(const struct extent_buffer *eb, } -BTRFS_SETGET_FUNCS(device_type, struct btrfs_dev_item, type, 64); +BTRFS_SETGET_FUNCS(device_flags, struct btrfs_dev_item, flags, 64); BTRFS_SETGET_FUNCS(device_bytes_used, struct btrfs_dev_item, bytes_used, 64); BTRFS_SETGET_FUNCS(device_io_align, struct btrfs_dev_item, io_align, 32); BTRFS_SETGET_FUNCS(device_io_width, struct btrfs_dev_item, io_width, 32); @@ -1677,7 +1677,7 @@ BTRFS_SETGET_FUNCS(device_seek_speed, struct btrfs_dev_item, seek_speed, 8); BTRFS_SETGET_FUNCS(device_bandwidth, struct btrfs_dev_item, bandwidth, 8); BTRFS_SETGET_FUNCS(device_generation, struct btrfs_dev_item, generation, 64); 
-BTRFS_SETGET_STACK_FUNCS(stack_device_type, struct btrfs_dev_item, type, 64); +BTRFS_SETGET_STACK_FUNCS(stack_device_flags, struct btrfs_dev_item, flags, 64); BTRFS_SETGET_STACK_FUNCS(stack_device_total_bytes, struct btrfs_dev_item, total_bytes, 64); BTRFS_SETGET_STACK_FUNCS(stack_device_bytes_used, struct btrfs_dev_item, diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 87a5addbedf6..5b1f66eeed46 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4326,7 +4326,7 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors) continue; btrfs_set_stack_device_generation(dev_item, 0); - btrfs_set_stack_device_type(dev_item, dev->type); + btrfs_set_stack_device_flags(dev_item, dev->flags); btrfs_set_stack_device_id(dev_item, dev->devid); btrfs_set_stack_device_total_bytes(dev_item, dev->commit_total_bytes); diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 9070d0370343..42921432c9dc 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1582,7 +1582,7 @@ static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, devid_kobj); return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", - device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); + device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK); } static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, @@ -1595,7 +1595,7 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, int ret; struct btrfs_trans_handle *trans; - u64 type, prev_type; + u64 flags, prev_flags; device = container_of(kobj, struct btrfs_device, devid_kobj); fs_info = device->fs_info; @@ -1606,24 +1606,25 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, if (sb_rdonly(fs_info->sb)) return -EROFS; - ret = kstrtou64(buf, 0, &type); + ret = kstrtou64(buf, 0, &flags); if (ret < 0) return -EINVAL; /* for now, allow to touch only the 'allocation hint' bits */ - if (type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) + if (flags & ~BTRFS_DEV_ALLOCATION_HINT_MASK) return -EINVAL; /* 
check if a change is really needed */ - if ((device->type & BTRFS_DEV_ALLOCATION_HINT_MASK) == type) + if ((device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK) == flags) return len; trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) return PTR_ERR(trans); - prev_type = device->type; - device->type = (device->type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | type; + prev_flags = device->flags; + device->flags = (device->flags & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | + flags; ret = btrfs_update_device(trans, device); @@ -1639,7 +1640,7 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, return len; abort: - device->type = prev_type; + device->flags = prev_flags; return ret; } BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 666b67f4e07b..dd2996b0318b 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1871,7 +1871,7 @@ static int btrfs_add_dev_item(struct btrfs_trans_handle *trans, btrfs_set_device_id(leaf, dev_item, device->devid); btrfs_set_device_generation(leaf, dev_item, 0); - btrfs_set_device_type(leaf, dev_item, device->type); + btrfs_set_device_flags(leaf, dev_item, device->flags); btrfs_set_device_io_align(leaf, dev_item, device->io_align); btrfs_set_device_io_width(leaf, dev_item, device->io_width); btrfs_set_device_sector_size(leaf, dev_item, device->sector_size); @@ -2893,7 +2893,7 @@ noinline int btrfs_update_device(struct btrfs_trans_handle *trans, dev_item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dev_item); btrfs_set_device_id(leaf, dev_item, device->devid); - btrfs_set_device_type(leaf, dev_item, device->type); + btrfs_set_device_flags(leaf, dev_item, device->flags); btrfs_set_device_io_align(leaf, dev_item, device->io_align); btrfs_set_device_io_width(leaf, dev_item, device->io_width); btrfs_set_device_sector_size(leaf, dev_item, device->sector_size); @@ -5277,7 +5277,7 @@ static int gather_device_info(struct btrfs_fs_devices 
*fs_devices, */ devices_info[ndevs].alloc_hint = 0; } else if (ctl->type & BTRFS_BLOCK_GROUP_DATA) { - hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + hint = device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK; /* * skip BTRFS_DEV_METADATA_ONLY disks @@ -5291,7 +5291,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, */ devices_info[ndevs].alloc_hint = -alloc_hint_map[hint]; } else { /* BTRFS_BLOCK_GROUP_METADATA */ - hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + hint = device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK; /* * skip BTRFS_DEV_DATA_ONLY disks @@ -7297,7 +7297,7 @@ static void fill_device_from_item(struct extent_buffer *leaf, device->commit_total_bytes = device->disk_total_bytes; device->bytes_used = btrfs_device_bytes_used(leaf, dev_item); device->commit_bytes_used = device->bytes_used; - device->type = btrfs_device_type(leaf, dev_item); + device->flags = btrfs_device_flags(leaf, dev_item); device->io_align = btrfs_device_io_align(leaf, dev_item); device->io_width = btrfs_device_io_width(leaf, dev_item); device->sector_size = btrfs_device_sector_size(leaf, dev_item); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index b066f9af216a..6230d911e7af 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -101,8 +101,8 @@ struct btrfs_device { /* optimal io width for this device */ u32 io_width; - /* type and info about this device */ - u64 type; + /* device flags (e.g. allocation hint) */ + u64 flags; /* minimal io size for this device */ u32 sector_size; diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h index 02955d5fcd21..232c66e7bc43 100644 --- a/include/uapi/linux/btrfs_tree.h +++ b/include/uapi/linux/btrfs_tree.h @@ -421,8 +421,8 @@ struct btrfs_dev_item { /* minimal io size for this device */ __le32 sector_size; - /* type and info about this device */ - __le64 type; + /* device flags (e.g. 
allocation hint) */ + __le64 flags; /* expected generation for this device */ __le64 generation; From patchwork Wed Jan 26 20:32:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E3FFC5DF62 for ; Wed, 26 Jan 2022 20:32:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230441AbiAZUc0 (ORCPT ); Wed, 26 Jan 2022 15:32:26 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59470 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230423AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYLMa; Wed, 26 Jan 2022 20:32:21 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 6/7] btrfs: add major and minor to sysfs Date: Wed, 26 Jan 2022 21:32:13 +0100 Message-Id: <7a8017203cb85da33302c9b9eb85c921251a31f8.1643228177.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229141; bh=UXy9HFCKUTueYPHP7sdmquri+adSS+j82PaE/CGO1+Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=rcrO3IEVx8oyzek9dkjeosVzOOmfG8uAfqO20T324Fhi2OF/7H4T7URRflPIqI2un Vr45j4+6wbA+X828Cs61kRtZ+8N1MZ0ffdNnR6EwxoVk5xmJTdmDcdmk2QDwiTWFnm /4q2bGHJEf8KoBddJMvG3bj3IjGKOKWD1nTdJGq8= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org 
From: Goffredo Baroncelli Add the following property to the btrfs sysfs tree: /sys/fs/btrfs//devinfo//major_minor This helps to figure out which block device belongs to which filesystem. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 42921432c9dc..dee23669a00f 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1647,6 +1647,22 @@ BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, btrfs_devinfo_allocation_hint_store); +static ssize_t btrfs_devinfo_major_minor_show(struct kobject *kobj, + struct kobj_attribute *a, char *buf) +{ + struct btrfs_device *device = container_of(kobj, struct btrfs_device, + devid_kobj); + + if (device->bdev) + return scnprintf(buf, PAGE_SIZE, "%u:%u\n", + MAJOR(device->bdev->bd_dev), + MINOR(device->bdev->bd_dev)); + else + return scnprintf(buf, PAGE_SIZE, "N/A\n"); +} + +BTRFS_ATTR(devid, major_minor, btrfs_devinfo_major_minor_show); + /* * Information about one device. 
* @@ -1661,6 +1677,7 @@ static struct attribute *devid_attrs[] = { BTRFS_ATTR_PTR(devid, scrub_speed_max), BTRFS_ATTR_PTR(devid, writeable), BTRFS_ATTR_PTR(devid, allocation_hint), + BTRFS_ATTR_PTR(devid, major_minor), NULL }; ATTRIBUTE_GROUPS(devid); From patchwork Wed Jan 26 20:32:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12725734 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43839C6369B for ; Wed, 26 Jan 2022 20:32:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230438AbiAZUcZ (ORCPT ); Wed, 26 Jan 2022 15:32:25 -0500 Received: from michael.mail.tiscali.it ([213.205.33.246]:59406 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S230413AbiAZUcW (ORCPT ); Wed, 26 Jan 2022 15:32:22 -0500 Received: from venice.bhome ([78.14.151.50]) by michael.mail.tiscali.it with id nYYG2600e15VSme01YYMMr; Wed, 26 Jan 2022 20:32:21 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 7/7] Add /sys/fs/btrfs/features/allocation_hint Date: Wed, 26 Jan 2022 21:32:14 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1643229141; bh=Bg8phRthP2z1QRlZ+UGcO8P2mq2f87aWiAEmus/xuXg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=pDFYOtcL+beWcoTntrZT+0vPa/jGzrqO/arAN3QBfV9IoW9tDDEZqSsDKJd5YxsS2 3Nr46b7RkBM37zZ6SZ2KdsDZ/T2cJhZiE/c1iAo5ms6tmd77seXkn27kEdBNhb0CV6 
NizS9FhK6q3bGb6KFf31b7esJCvMlg+KlfHhADfc= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Add a new static feature file in sysfs to simplify checking whether the allocation_hint feature is available. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index dee23669a00f..0f4a7ab79fe5 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -403,6 +403,20 @@ static ssize_t supported_sectorsizes_show(struct kobject *kobj, BTRFS_ATTR(static_feature, supported_sectorsizes, supported_sectorsizes_show); +static ssize_t allocation_hint_show(struct kobject *kobj, + struct kobj_attribute *a, + char *buf) +{ + ssize_t ret; + + /* The allocation_hint feature is always available */ + ret = sysfs_emit(buf, "1\n"); + + return ret; +} +BTRFS_ATTR(static_feature, allocation_hint, + allocation_hint_show); + /* * Features which only depend on kernel version. * @@ -415,6 +429,7 @@ static struct attribute *btrfs_supported_static_feature_attrs[] = { BTRFS_ATTR_PTR(static_feature, send_stream_version), BTRFS_ATTR_PTR(static_feature, supported_rescue_options), BTRFS_ATTR_PTR(static_feature, supported_sectorsizes), + BTRFS_ATTR_PTR(static_feature, allocation_hint), NULL };
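The device selection policy described in patch 4/7 can be tried out in userspace. The following is only a rough sketch under stated assumptions, not the kernel code: `struct dev_info`, `hint_priority`, `cmp_dev_info` and `select_devices` are hypothetical names made up for this illustration, device ids are simply the array index plus one, and the C library `qsort()` stands in for the kernel `sort()`. It filters out the disks excluded by their hint, sorts by priority and then by available space, keeps the smallest set of priority groups that reaches `devs_min` devices, and finally re-sorts the survivors by available space only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hint values, numerically as defined in patch 1/7. */
enum {
	HINT_DATA_PREFERRED     = 0,
	HINT_METADATA_PREFERRED = 1,
	HINT_METADATA_ONLY      = 2,
	HINT_DATA_ONLY          = 3,
	HINT_COUNT              = 4,
};

/* Priorities for a metadata allocation; negated for a data allocation. */
static const int hint_priority[HINT_COUNT] = {
	[HINT_DATA_ONLY]          = -1,
	[HINT_DATA_PREFERRED]     = 0,
	[HINT_METADATA_PREFERRED] = 1,
	[HINT_METADATA_ONLY]      = 2,
};

/* Hypothetical, reduced stand-in for struct btrfs_device_info. */
struct dev_info {
	int devid;
	uint64_t max_avail;
	int alloc_hint;
};

/* Descending order by alloc_hint, then by max_avail. */
static int cmp_dev_info(const void *a, const void *b)
{
	const struct dev_info *da = a, *db = b;

	if (da->alloc_hint != db->alloc_hint)
		return da->alloc_hint > db->alloc_hint ? -1 : 1;
	if (da->max_avail != db->max_avail)
		return da->max_avail > db->max_avail ? -1 : 1;
	return 0;
}

/*
 * Fill out[] with the candidate devices for a data (for_data != 0) or
 * metadata (for_data == 0) chunk and return how many were kept. A result
 * smaller than devs_min means the caller would have to fail with ENOSPC.
 */
static int select_devices(const int *hints, const uint64_t *avail, int n,
			  int for_data, int devs_min, struct dev_info *out)
{
	int ndevs = 0, kept = 0, i;

	for (i = 0; i < n; i++) {
		int hint = hints[i];

		assert(hint >= 0 && hint < HINT_COUNT);
		/* *_ONLY disks never host the other kind of chunk */
		if (for_data && hint == HINT_METADATA_ONLY)
			continue;
		if (!for_data && hint == HINT_DATA_ONLY)
			continue;
		out[ndevs].devid = i + 1;
		out[ndevs].max_avail = avail[i];
		out[ndevs].alloc_hint = for_data ? -hint_priority[hint]
						 : hint_priority[hint];
		ndevs++;
	}
	qsort(out, ndevs, sizeof(*out), cmp_dev_info);

	/* take whole priority groups until devs_min devices are reached */
	while (kept < ndevs) {
		int hint = out[kept++].alloc_hint;

		while (kept < ndevs && out[kept].alloc_hint == hint)
			kept++;
		if (kept >= devs_min)
			break;
	}

	/* later stages expect ordering by available space only */
	for (i = 0; i < kept; i++)
		out[i].alloc_hint = 0;
	qsort(out, kept, sizeof(*out), cmp_dev_info);
	return kept;
}
```

With four disks tagged DATA_ONLY, DATA_PREFERRED, METADATA_PREFERRED and METADATA_ONLY, a metadata allocation needing one device ends up on the METADATA_ONLY disk alone, while a data allocation needing two devices uses the DATA_ONLY and DATA_PREFERRED disks and never touches the METADATA_ONLY one. The priority table is declared as `int` so that the negation works regardless of whether plain `char` is signed on the target.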