From patchwork Thu Jan 6 17:49:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705612 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F327C433EF for ; Thu, 6 Jan 2022 17:49:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242417AbiAFRte (ORCPT ); Thu, 6 Jan 2022 12:49:34 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56242 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242286AbiAFRtd (ORCPT ); Thu, 6 Jan 2022 12:49:33 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpYUQ; Thu, 06 Jan 2022 17:49:32 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 1/6] btrfs: add flags to give an hint to the chunk allocator Date: Thu, 6 Jan 2022 18:49:18 +0100 Message-Id: <90f0d53b2ee4b1b9fd4cfa2d36994ad6d858753f.1641486794.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1641491372; bh=VQY6O7Tid/B7P16mtGfsFnkVQ4gQfAUmnQyQiwZnzic=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=1lcHOCTReFWwimnASsPR6N+V7Q/b2D1+oOL9IR2duAUmH+PFgpY3thncoiuCQXSrj weuCJJJOxz/IDRov5qIQECRU/TfKBTGqca1U7ihGXbioVPs+G5bDS47PpIB2goqMRR TXBgfyUiUsV3fvORowGVeImcpQFquWtLN8fLiJPY= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Add the following flags to give a hint about which 
chunk should be allocated on which disk: - BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED: preferred for data chunks, but metadata chunks are allowed - BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED: preferred for metadata chunks, but data chunks are allowed - BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY: only metadata chunks are allowed - BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY: only data chunks are allowed Signed-off-by: Goffredo Baroncelli --- include/uapi/linux/btrfs_tree.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h index 5416f1f1a77a..02955d5fcd21 100644 --- a/include/uapi/linux/btrfs_tree.h +++ b/include/uapi/linux/btrfs_tree.h @@ -386,6 +386,22 @@ struct btrfs_key { __u64 offset; } __attribute__ ((__packed__)); +/* dev_item.type */ + +/* btrfs chunk allocation hint */ +#define BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT 2 +/* btrfs chunk allocation hint mask */ +#define BTRFS_DEV_ALLOCATION_HINT_MASK \ + ((1 << BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT) - 1) +/* preferred data chunk, but metadata chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED (0ULL) +/* preferred metadata chunk, but data chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED (1ULL) +/* only metadata chunk are allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY (2ULL) +/* only data chunk allowed */ +#define BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY (3ULL) + struct btrfs_dev_item { /* the internal btrfs device id */ __le64 devid; From patchwork Thu Jan 6 17:49:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22FF8C433EF for ; Thu, 6 Jan 2022 17:49:47 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242466AbiAFRtq (ORCPT ); Thu, 6 Jan 2022 12:49:46 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56282 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242406AbiAFRtd (ORCPT ); Thu, 6 Jan 2022 12:49:33 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpYUn; Thu, 06 Jan 2022 17:49:33 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 2/6] btrfs: export the device allocation_hint property in sysfs Date: Thu, 6 Jan 2022 18:49:19 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1641491373; bh=w00/BdbSia/2z7YnoS/DYr9Oxygds2jIgIPTDlJo5yQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=mQJVS1T8r8ALdikG2T0xJrJpUSyoda8PFNTIV1GKYWrNDcnUQCSxUa7eWe28kDJfE wkmQwCAvd+/dqx5KpEGQggJ3rdwRr0J2Oqc9VUJOHyDNFpHVW5tLPV2lmOZsfthS4K jWi/Vmzz17TbQBijINn9SIqsipf8migZ5pTx5Yyk= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Export the device allocation_hint property via /sys/fs/btrfs//devinfo//allocation_hint Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index beb7f72d50b8..c1c903187e19 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1575,6 +1575,17 @@ static ssize_t btrfs_devinfo_error_stats_show(struct kobject *kobj, } BTRFS_ATTR(devid, error_stats, btrfs_devinfo_error_stats_show); +static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, + struct kobj_attribute *a, char *buf) +{ + struct btrfs_device 
*device = container_of(kobj, struct btrfs_device, + devid_kobj); + + return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", + device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); +} +BTRFS_ATTR(devid, allocation_hint, btrfs_devinfo_allocation_hint_show); + /* * Information about one device. * @@ -1588,6 +1599,7 @@ static struct attribute *devid_attrs[] = { BTRFS_ATTR_PTR(devid, replace_target), BTRFS_ATTR_PTR(devid, scrub_speed_max), BTRFS_ATTR_PTR(devid, writeable), + BTRFS_ATTR_PTR(devid, allocation_hint), NULL }; ATTRIBUTE_GROUPS(devid); From patchwork Thu Jan 6 17:49:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705614 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62436C433F5 for ; Thu, 6 Jan 2022 17:49:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242465AbiAFRtp (ORCPT ); Thu, 6 Jan 2022 12:49:45 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56256 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242413AbiAFRte (ORCPT ); Thu, 6 Jan 2022 12:49:34 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpZV9; Thu, 06 Jan 2022 17:49:33 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 3/6] btrfs: change the device allocation_hint property via sysfs Date: Thu, 6 Jan 2022 18:49:20 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=tiscali.it; s=smtp; t=1641491373; bh=Nuep3DTCEyx94voMz97HYZ0r/jYkfcJtrKSYF0LaLLM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=TlUmE+bvc29UWyjmrVtoK6qFHgTHsrnFFDbfyrUi8aFofEOO4DUr6+jnsiOrZeM/j Uj0z300Rt8aCjfNlttDaxylGEU1dfrUNsDa1wJFTH4m59ivkKqGh4dceBcoyNZHiLL RuwvHo0iHxcHIwH7ecT4OYbksIlXG4ojz7vmX924= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli This patch allows changing the allocation_hint property by writing a numerical value to the file /sys/fs/btrfs//devinfo//allocation_hint. To update this field, the property "allocation_hint" is added to btrfs-progs too. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 62 +++++++++++++++++++++++++++++++++++++++++++++- fs/btrfs/volumes.c | 2 +- fs/btrfs/volumes.h | 2 ++ 3 files changed, 64 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index c1c903187e19..9070d0370343 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1584,7 +1584,67 @@ static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); } -BTRFS_ATTR(devid, allocation_hint, btrfs_devinfo_allocation_hint_show); + +static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, + struct kobj_attribute *a, + const char *buf, size_t len) +{ + struct btrfs_fs_info *fs_info; + struct btrfs_root *root; + struct btrfs_device *device; + int ret; + struct btrfs_trans_handle *trans; + + u64 type, prev_type; + + device = container_of(kobj, struct btrfs_device, devid_kobj); + fs_info = device->fs_info; + if (!fs_info) + return -EPERM; + + root = fs_info->chunk_root; + if (sb_rdonly(fs_info->sb)) + return -EROFS; + + ret = kstrtou64(buf, 0, &type); + if (ret < 0) + return -EINVAL; + + /* for now, allow to touch only the 'allocation hint' bits */ + if (type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) + return -EINVAL; + + /* check if a change is really needed */ 
+ if ((device->type & BTRFS_DEV_ALLOCATION_HINT_MASK) == type) + return len; + + trans = btrfs_start_transaction(root, 1); + if (IS_ERR(trans)) + return PTR_ERR(trans); + + prev_type = device->type; + device->type = (device->type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | type; + + ret = btrfs_update_device(trans, device); + + if (ret < 0) { + btrfs_abort_transaction(trans, ret); + btrfs_end_transaction(trans); + goto abort; + } + + ret = btrfs_commit_transaction(trans); + if (ret < 0) + goto abort; + + return len; +abort: + device->type = prev_type; + return ret; +} +BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, + btrfs_devinfo_allocation_hint_store); + /* * Information about one device. diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b07d382d53a8..643ba7cac22c 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2857,7 +2857,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path return ret; } -static noinline int btrfs_update_device(struct btrfs_trans_handle *trans, +noinline int btrfs_update_device(struct btrfs_trans_handle *trans, struct btrfs_device *device) { int ret; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 005c9e2a491a..4ac3114f5eae 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -631,5 +631,7 @@ int btrfs_bg_type_to_factor(u64 flags); const char *btrfs_bg_type_to_raid_name(u64 flags); int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info); bool btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical); +int btrfs_update_device(struct btrfs_trans_handle *trans, + struct btrfs_device *device); #endif From patchwork Thu Jan 6 17:49:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63AA8C4332F for ; Thu, 6 Jan 2022 17:49:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242424AbiAFRtr (ORCPT ); Thu, 6 Jan 2022 12:49:47 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56242 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242418AbiAFRte (ORCPT ); Thu, 6 Jan 2022 12:49:34 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpZVc; Thu, 06 Jan 2022 17:49:34 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 4/6] btrfs: add allocation_hint mode Date: Thu, 6 Jan 2022 18:49:21 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1641491374; bh=c/1Wz0vc9xGcbbCqDXkke2j2W52xkncr/4fCbWlMW3I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=FER2FBpBAeyMXLdU97hk/78FK0X36o+4VHI9xdNXUY3MlQM9RfQqRo+PBB3/BzpO7 OTcUQ2yK3hH2qoWdFYFFDB5Ihj20757hmTMiUp49cKxzkWIFPXTF+2eWiw+ilypdqk elLjv3SFeQ/8q/IA7XYGICYx5w9yVBQyVGmA3yNc= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli The chunk allocation policy is modified as follows. Each disk may have one of the following tags: - BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED - BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY - BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY - BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED (default) During a *mixed data/metadata* chunk allocation, BTRFS works as usual. During a *data* chunk allocation, space is searched first on the disks tagged BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY. If the space found is not enough (e.g. 
in raid5, only two disks are available), then the disks tagged BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED are also considered. If the space is still not enough, the disks tagged BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED are also considered. If even in this case the space is not sufficient, -ENOSPC is returned. A disk tagged with BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY is never considered for a data BG allocation. During a *metadata* chunk allocation, the same algorithm applies, swapping _DATA_ and _METADATA_. By default the disks are tagged as BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED, so BTRFS behaves as usual. If the user prefers to store the metadata on the faster disks (e.g. SSDs), they can tag these with BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED: in this case the metadata BGs go to the BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED disks and the data BGs to the other ones. When one set of disks is full, the other is considered. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/volumes.c | 113 +++++++++++++++++++++++++++++++++++++++++++-- fs/btrfs/volumes.h | 1 + 2 files changed, 111 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 643ba7cac22c..a3b5c9653101 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -184,6 +184,27 @@ enum btrfs_raid_types __attribute_const__ btrfs_bg_flags_to_raid_index(u64 flags return BTRFS_RAID_SINGLE; /* BTRFS_BLOCK_GROUP_SINGLE */ } +#define BTRFS_DEV_ALLOCATION_HINT_COUNT (1ULL << \ + BTRFS_DEV_ALLOCATION_HINT_BIT_COUNT) + +/* + * The order of BTRFS_DEV_ALLOCATION_HINT_* values are not + * good, because BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED is 0 + * (for backward compatibility reason), and the other + * values are greater (because the field is unsigned). So we + * need a map that rearranges the order giving to _DATA_PREFERRED + * an intermediate priority. + * These values give to METADATA_ONLY the highest priority, and are + * valid for metadata BG allocation. 
When a data + * BG is allocated we negate these values to reverse the priority. + */ +static const signed char alloc_hint_map[BTRFS_DEV_ALLOCATION_HINT_COUNT] = { + [BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY] = -1, + [BTRFS_DEV_ALLOCATION_HINT_DATA_PREFERRED] = 0, + [BTRFS_DEV_ALLOCATION_HINT_METADATA_PREFERRED] = 1, + [BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY] = 2, +}; + const char *btrfs_bg_type_to_raid_name(u64 flags) { const int index = btrfs_bg_flags_to_raid_index(flags); @@ -5035,13 +5056,18 @@ static int btrfs_add_system_chunk(struct btrfs_fs_info *fs_info, } /* - * sort the devices in descending order by max_avail, total_avail + * sort the devices in descending order by alloc_hint, + * max_avail, total_avail */ static int btrfs_cmp_device_info(const void *a, const void *b) { const struct btrfs_device_info *di_a = a; const struct btrfs_device_info *di_b = b; + if (di_a->alloc_hint > di_b->alloc_hint) + return -1; + if (di_a->alloc_hint < di_b->alloc_hint) + return 1; if (di_a->max_avail > di_b->max_avail) return -1; if (di_a->max_avail < di_b->max_avail) @@ -5204,6 +5230,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, int ndevs = 0; u64 max_avail; u64 dev_offset; + int hint; /* * in the first pass through the devices list, we gather information @@ -5256,17 +5283,95 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, devices_info[ndevs].max_avail = max_avail; devices_info[ndevs].total_avail = total_avail; devices_info[ndevs].dev = device; + + if ((ctl->type & BTRFS_BLOCK_GROUP_DATA) && + (ctl->type & BTRFS_BLOCK_GROUP_METADATA)) { + /* + * if mixed bg set all the alloc_hint + * fields to the same value, so the sorting + * is not affected + */ + devices_info[ndevs].alloc_hint = 0; + } else if (ctl->type & BTRFS_BLOCK_GROUP_DATA) { + hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + + /* + * skip BTRFS_DEV_METADATA_ONLY disks + */ + if (hint == BTRFS_DEV_ALLOCATION_HINT_METADATA_ONLY) + continue; + /* + * if a data chunk must 
be allocated, + * sort also by hint (data disk + * higher priority) + */ + devices_info[ndevs].alloc_hint = -alloc_hint_map[hint]; + } else { /* BTRFS_BLOCK_GROUP_METADATA */ + hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + + /* + * skip BTRFS_DEV_DATA_ONLY disks + */ + if (hint == BTRFS_DEV_ALLOCATION_HINT_DATA_ONLY) + continue; + /* + * if a metadata chunk must be allocated, + * sort also by hint (metadata hint + * higher priority) + */ + devices_info[ndevs].alloc_hint = alloc_hint_map[hint]; + } + ++ndevs; } ctl->ndevs = ndevs; + return 0; +} + +static void sort_and_reduce_device_info(struct alloc_chunk_ctl *ctl, + struct btrfs_device_info *devices_info) +{ + int ndevs, hint, i; + + ndevs = ctl->ndevs; /* - * now sort the devices by hole size / available space + * now sort the devices by hint / hole size / available space */ sort(devices_info, ndevs, sizeof(struct btrfs_device_info), btrfs_cmp_device_info, NULL); - return 0; + /* + * select the minimum set of disks grouped by hint that + * can host the chunk + */ + ndevs = 0; + while (ndevs < ctl->ndevs) { + hint = devices_info[ndevs++].alloc_hint; + while (ndevs < ctl->ndevs) { + if (devices_info[ndevs].alloc_hint != hint) + break; + ndevs++; + } + if (ndevs >= ctl->devs_min) + break; + } + + ctl->ndevs = ndevs; + + /* + * the next layers require the devices_info ordered by + * max_avail. If we are returning two (or more) different + * group of alloc_hint, this is not always true. So sort + * these again. 
+ */ + + for (i = 0 ; i < ndevs ; i++) + devices_info[i].alloc_hint = 0; + + sort(devices_info, ndevs, sizeof(struct btrfs_device_info), + btrfs_cmp_device_info, NULL); + } static int decide_stripe_size_regular(struct alloc_chunk_ctl *ctl, @@ -5518,6 +5623,8 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans, goto out; } + sort_and_reduce_device_info(&ctl, devices_info); + ret = decide_stripe_size(fs_devices, &ctl, devices_info); if (ret < 0) { block_group = ERR_PTR(ret); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 4ac3114f5eae..b1f92a2d13cf 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -399,6 +399,7 @@ struct btrfs_device_info { u64 dev_offset; u64 max_avail; u64 total_avail; + int alloc_hint; }; struct btrfs_raid_attr { From patchwork Thu Jan 6 17:49:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705616 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D173CC433FE for ; Thu, 6 Jan 2022 17:49:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242406AbiAFRtr (ORCPT ); Thu, 6 Jan 2022 12:49:47 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56282 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242422AbiAFRtf (ORCPT ); Thu, 6 Jan 2022 12:49:35 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpaW3; Thu, 06 Jan 2022 17:49:34 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 5/6] btrfs: rename 
dev_item->type to dev_item->flags Date: Thu, 6 Jan 2022 18:49:22 +0100 Message-Id: <4243d3e8a52684445ed66566819be33ae0cecc5a.1641486794.git.kreijack@inwind.it> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1641491374; bh=7wPojwQrpcADXRunCSqA7kFUjo63K/fnp89nJtwUQqg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=dJ2vNM75fcqLF+GAXyHP4xw6juVptaE+TF4uEFZtz7nocy2dAVqYDwSVnZWiHtePA ttGtUfbXRw1YxoUzccGbEbeqIRu24tBiAA0IlN8C5i44FIYUAAVlNO5DdNU/WHRa4N 0P+H4pwRq3wliVZq3NOrLdFi0b/XrK4Ha7v/3XBY= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Rename the field type of dev_item from 'type' to 'flags' changing the struct btrfs_device and btrfs_dev_item. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/ctree.h | 4 ++-- fs/btrfs/disk-io.c | 2 +- fs/btrfs/sysfs.c | 17 +++++++++-------- fs/btrfs/volumes.c | 10 +++++----- fs/btrfs/volumes.h | 4 ++-- include/uapi/linux/btrfs_tree.h | 4 ++-- 6 files changed, 21 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index b4a9b1c58d22..86065282a1c7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1661,7 +1661,7 @@ static inline void btrfs_set_device_total_bytes(const struct extent_buffer *eb, } -BTRFS_SETGET_FUNCS(device_type, struct btrfs_dev_item, type, 64); +BTRFS_SETGET_FUNCS(device_flags, struct btrfs_dev_item, flags, 64); BTRFS_SETGET_FUNCS(device_bytes_used, struct btrfs_dev_item, bytes_used, 64); BTRFS_SETGET_FUNCS(device_io_align, struct btrfs_dev_item, io_align, 32); BTRFS_SETGET_FUNCS(device_io_width, struct btrfs_dev_item, io_width, 32); @@ -1674,7 +1674,7 @@ BTRFS_SETGET_FUNCS(device_seek_speed, struct btrfs_dev_item, seek_speed, 8); BTRFS_SETGET_FUNCS(device_bandwidth, struct btrfs_dev_item, bandwidth, 8); BTRFS_SETGET_FUNCS(device_generation, struct btrfs_dev_item, generation, 64); 
-BTRFS_SETGET_STACK_FUNCS(stack_device_type, struct btrfs_dev_item, type, 64); +BTRFS_SETGET_STACK_FUNCS(stack_device_flags, struct btrfs_dev_item, flags, 64); BTRFS_SETGET_STACK_FUNCS(stack_device_total_bytes, struct btrfs_dev_item, total_bytes, 64); BTRFS_SETGET_STACK_FUNCS(stack_device_bytes_used, struct btrfs_dev_item, diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 87a5addbedf6..5b1f66eeed46 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4326,7 +4326,7 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors) continue; btrfs_set_stack_device_generation(dev_item, 0); - btrfs_set_stack_device_type(dev_item, dev->type); + btrfs_set_stack_device_flags(dev_item, dev->flags); btrfs_set_stack_device_id(dev_item, dev->devid); btrfs_set_stack_device_total_bytes(dev_item, dev->commit_total_bytes); diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 9070d0370343..42921432c9dc 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1582,7 +1582,7 @@ static ssize_t btrfs_devinfo_allocation_hint_show(struct kobject *kobj, devid_kobj); return scnprintf(buf, PAGE_SIZE, "0x%08llx\n", - device->type & BTRFS_DEV_ALLOCATION_HINT_MASK); + device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK); } static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, @@ -1595,7 +1595,7 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, int ret; struct btrfs_trans_handle *trans; - u64 type, prev_type; + u64 flags, prev_flags; device = container_of(kobj, struct btrfs_device, devid_kobj); fs_info = device->fs_info; @@ -1606,24 +1606,25 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, if (sb_rdonly(fs_info->sb)) return -EROFS; - ret = kstrtou64(buf, 0, &type); + ret = kstrtou64(buf, 0, &flags); if (ret < 0) return -EINVAL; /* for now, allow to touch only the 'allocation hint' bits */ - if (type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) + if (flags & ~BTRFS_DEV_ALLOCATION_HINT_MASK) return -EINVAL; /* 
check if a change is really needed */ - if ((device->type & BTRFS_DEV_ALLOCATION_HINT_MASK) == type) + if ((device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK) == flags) return len; trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) return PTR_ERR(trans); - prev_type = device->type; - device->type = (device->type & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | type; + prev_flags = device->flags; + device->flags = (device->flags & ~BTRFS_DEV_ALLOCATION_HINT_MASK) | + flags; ret = btrfs_update_device(trans, device); @@ -1639,7 +1640,7 @@ static ssize_t btrfs_devinfo_allocation_hint_store(struct kobject *kobj, return len; abort: - device->type = prev_type; + device->flags = prev_flags; return ret; } BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index a3b5c9653101..496bb215b110 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1888,7 +1888,7 @@ static int btrfs_add_dev_item(struct btrfs_trans_handle *trans, btrfs_set_device_id(leaf, dev_item, device->devid); btrfs_set_device_generation(leaf, dev_item, 0); - btrfs_set_device_type(leaf, dev_item, device->type); + btrfs_set_device_flags(leaf, dev_item, device->flags); btrfs_set_device_io_align(leaf, dev_item, device->io_align); btrfs_set_device_io_width(leaf, dev_item, device->io_width); btrfs_set_device_sector_size(leaf, dev_item, device->sector_size); @@ -2909,7 +2909,7 @@ noinline int btrfs_update_device(struct btrfs_trans_handle *trans, dev_item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dev_item); btrfs_set_device_id(leaf, dev_item, device->devid); - btrfs_set_device_type(leaf, dev_item, device->type); + btrfs_set_device_flags(leaf, dev_item, device->flags); btrfs_set_device_io_align(leaf, dev_item, device->io_align); btrfs_set_device_io_width(leaf, dev_item, device->io_width); btrfs_set_device_sector_size(leaf, dev_item, device->sector_size); @@ -5293,7 +5293,7 @@ static int gather_device_info(struct btrfs_fs_devices 
*fs_devices, */ devices_info[ndevs].alloc_hint = 0; } else if (ctl->type & BTRFS_BLOCK_GROUP_DATA) { - hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + hint = device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK; /* * skip BTRFS_DEV_METADATA_ONLY disks @@ -5307,7 +5307,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices, */ devices_info[ndevs].alloc_hint = -alloc_hint_map[hint]; } else { /* BTRFS_BLOCK_GROUP_METADATA */ - hint = device->type & BTRFS_DEV_ALLOCATION_HINT_MASK; + hint = device->flags & BTRFS_DEV_ALLOCATION_HINT_MASK; /* * skip BTRFS_DEV_DATA_ONLY disks @@ -7303,7 +7303,7 @@ static void fill_device_from_item(struct extent_buffer *leaf, device->commit_total_bytes = device->disk_total_bytes; device->bytes_used = btrfs_device_bytes_used(leaf, dev_item); device->commit_bytes_used = device->bytes_used; - device->type = btrfs_device_type(leaf, dev_item); + device->flags = btrfs_device_flags(leaf, dev_item); device->io_align = btrfs_device_io_align(leaf, dev_item); device->io_width = btrfs_device_io_width(leaf, dev_item); device->sector_size = btrfs_device_sector_size(leaf, dev_item); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index b1f92a2d13cf..1e559a1735f9 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -96,8 +96,8 @@ struct btrfs_device { /* optimal io width for this device */ u32 io_width; - /* type and info about this device */ - u64 type; + /* device flags (e.g. allocation hint) */ + u64 flags; /* minimal io size for this device */ u32 sector_size; diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h index 02955d5fcd21..232c66e7bc43 100644 --- a/include/uapi/linux/btrfs_tree.h +++ b/include/uapi/linux/btrfs_tree.h @@ -421,8 +421,8 @@ struct btrfs_dev_item { /* minimal io size for this device */ __le32 sector_size; - /* type and info about this device */ - __le64 type; + /* device flags (e.g. 
allocation hint) */ + __le64 flags; /* expected generation for this device */ __le64 generation; From patchwork Thu Jan 6 17:49:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Goffredo Baroncelli X-Patchwork-Id: 12705615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7EC2C43217 for ; Thu, 6 Jan 2022 17:49:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242428AbiAFRtq (ORCPT ); Thu, 6 Jan 2022 12:49:46 -0500 Received: from santino.mail.tiscali.it ([213.205.33.245]:56256 "EHLO smtp.tiscali.it" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S242424AbiAFRtf (ORCPT ); Thu, 6 Jan 2022 12:49:35 -0500 Received: from venice.bhome ([84.220.25.125]) by santino.mail.tiscali.it with id fVpV2600Z2hwt0401VpaWQ; Thu, 06 Jan 2022 17:49:34 +0000 x-auth-user: kreijack@tiscali.it From: Goffredo Baroncelli To: linux-btrfs@vger.kernel.org Cc: Zygo Blaxell , Josef Bacik , David Sterba , Sinnamohideen Shafeeq , Paul Jones , Boris Burkov , Goffredo Baroncelli Subject: [PATCH 6/6] btrfs: add major and minor to sysfs Date: Thu, 6 Jan 2022 18:49:23 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Reply-To: Goffredo Baroncelli MIME-Version: 1.0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tiscali.it; s=smtp; t=1641491374; bh=UXy9HFCKUTueYPHP7sdmquri+adSS+j82PaE/CGO1+Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:Reply-To; b=ssC/Ml08fuv8gBRhgjObvLCXqCO29aRtsJWjrxrIjD6fR6doj2z/X+32zl1yB0Alr auo882ELLTDleyyq2OZxhTvbJ0XkAnHDPHqIYTCyt2NY9AgK/9b3637XThBNCGKGuW OdnyEk9EJXhykSFbtdFttVgp27DDVugt2ub4hs0I= Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Goffredo Baroncelli Add the following property to btrfs sysfs 
/sys/fs/btrfs//devinfo//major_minor This would help to figure out which block device is involved in which filesystem. Signed-off-by: Goffredo Baroncelli --- fs/btrfs/sysfs.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index 42921432c9dc..dee23669a00f 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1647,6 +1647,22 @@ BTRFS_ATTR_RW(devid, allocation_hint, btrfs_devinfo_allocation_hint_show, btrfs_devinfo_allocation_hint_store); +static ssize_t btrfs_devinfo_major_minor_show(struct kobject *kobj, + struct kobj_attribute *a, char *buf) +{ + struct btrfs_device *device = container_of(kobj, struct btrfs_device, + devid_kobj); + + if (device->bdev) + return scnprintf(buf, PAGE_SIZE, "%d:%d\n", + MAJOR(device->bdev->bd_dev), + MINOR(device->bdev->bd_dev)); + else + return scnprintf(buf, PAGE_SIZE, "N/A\n"); +} + +BTRFS_ATTR(devid, major_minor, btrfs_devinfo_major_minor_show); + /* * Information about one device. * @@ -1661,6 +1677,7 @@ static struct attribute *devid_attrs[] = { BTRFS_ATTR_PTR(devid, scrub_speed_max), BTRFS_ATTR_PTR(devid, writeable), BTRFS_ATTR_PTR(devid, allocation_hint), + BTRFS_ATTR_PTR(devid, major_minor), NULL }; ATTRIBUTE_GROUPS(devid);
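
For readers who want to experiment with the policy from patch 4/6 outside the kernel, the following is a minimal user-space sketch, not the kernel code. It mirrors the three steps of the patch: skip the devices that are forbidden for the requested chunk type, sort by hint priority and then by available space, and keep the smallest set of whole hint groups that reaches devs_min. The names struct dev_info, hint_prio, cmp_dev and select_devices are hypothetical and exist only for this illustration; only the numeric hint values come from patch 1/6. Mixed data/metadata chunks and total_avail are left out for brevity.

```c
#include <stdlib.h>

/* Hint values as defined in patch 1/6 (short names are for this sketch only). */
enum {
	DATA_PREFERRED = 0,
	METADATA_PREFERRED = 1,
	METADATA_ONLY = 2,
	DATA_ONLY = 3,
};

/* Hypothetical, reduced stand-in for struct btrfs_device_info. */
struct dev_info {
	int id;                       /* index of the device in the input arrays */
	unsigned long long max_avail; /* largest free hole on the device */
	int alloc_hint;               /* sort priority, higher is tried first */
};

/* Priority for a metadata chunk; negated for a data chunk, as in patch 4/6. */
static const int hint_prio[4] = {
	[DATA_ONLY] = -1,
	[DATA_PREFERRED] = 0,
	[METADATA_PREFERRED] = 1,
	[METADATA_ONLY] = 2,
};

/* Descending order by alloc_hint, then by max_avail. */
static int cmp_dev(const void *a, const void *b)
{
	const struct dev_info *da = a, *db = b;

	if (da->alloc_hint != db->alloc_hint)
		return da->alloc_hint > db->alloc_hint ? -1 : 1;
	if (da->max_avail != db->max_avail)
		return da->max_avail > db->max_avail ? -1 : 1;
	return 0;
}

/*
 * Select the devices for a data (is_data != 0) or metadata chunk.
 * out[] must have room for n entries. Returns how many devices were kept
 * at the front of out[], ordered by descending max_avail.
 */
static int select_devices(const int *hints, const unsigned long long *avail,
			  int n, int is_data, int devs_min,
			  struct dev_info *out)
{
	int total = 0, ndevs = 0, i;

	for (i = 0; i < n; i++) {
		/* *_ONLY devices never host the other chunk type */
		if (is_data && hints[i] == METADATA_ONLY)
			continue;
		if (!is_data && hints[i] == DATA_ONLY)
			continue;
		out[total].id = i;
		out[total].max_avail = avail[i];
		out[total].alloc_hint = is_data ? -hint_prio[hints[i]]
						: hint_prio[hints[i]];
		total++;
	}

	qsort(out, total, sizeof(*out), cmp_dev);

	/* take whole hint groups until at least devs_min devices are selected */
	while (ndevs < total) {
		int hint = out[ndevs++].alloc_hint;

		while (ndevs < total && out[ndevs].alloc_hint == hint)
			ndevs++;
		if (ndevs >= devs_min)
			break;
	}

	/* later stages expect ordering by free space only, so drop the hint */
	for (i = 0; i < ndevs; i++)
		out[i].alloc_hint = 0;
	qsort(out, ndevs, sizeof(*out), cmp_dev);

	return ndevs;
}
```

With four devices tagged METADATA_PREFERRED, DATA_PREFERRED, DATA_PREFERRED and METADATA_ONLY, a metadata chunk with devs_min = 1 lands on the METADATA_ONLY device alone, while a data chunk with devs_min = 3 has to fall back to the METADATA_PREFERRED device after taking both DATA_PREFERRED ones. Using int for the priority table sidesteps the question of whether plain char is signed on a given architecture.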