From patchwork Tue Aug 20 04:52:44 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102829 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17C5E1890 for ; Tue, 20 Aug 2019 04:53:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EBAEF20C01 for ; Tue, 20 Aug 2019 04:53:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="RErLMkGc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729072AbfHTExG (ORCPT ); Tue, 20 Aug 2019 00:53:06 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728878AbfHTExF (ORCPT ); Tue, 20 Aug 2019 00:53:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276786; x=1597812786; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gCG/mFoW80K9QLMTeER2ZZ4y5gugUj9OImIyGYzEti4=; b=RErLMkGc1sxdiN3RMIKoowbA9gcaMh4/bQObaqXVh2hkpRRMeOLFoujI Yy3Uc9neZrES0WZHZZv1UszI04zvYI4SPr278SwyPcDLXv7qvphFA3dqa Zu6TKMDrCEfSobdWWd34qfABWhFcKygX9dF6PI+DKlXpT8Jd/HxR5X4W/ 5ZwAGDjMbxGBCC/2DpDJ1TxIDo+PcD5FpZi+E34MvWMwbmKMEOfaSn8u2 TftuHzXWG6ZGXbJw2fsVBeS3NA494MaAfiVIJDq0fjm3KuLN+6t74C5M5 UbhvEI745h5N8HHf09EgAIycXXHRhRPQeK8RzTuvSRQzjvNiwRRREOecp A==; IronPort-SDR: 6rS9c7OxbQE/b/O7LK55mdlqoDkLfXokRcNFG6swCM1yJtH9YHGs3J74rgjPoSoKTc1GN6EWZe d+bbXW1vn18I72NR+GXP+HXtKa5NcEskrtRaq4mDpUmjozJ1MV6+DQIUYM/aePQV+SOMLlNmIS uDnnzMHJC8ids9eV6ZZHBHV6se1dhj5JTGlANYPe7GRFidliMA+yHctq3fkGiQyH+CuLyTAPn6 
uqcoTtNzTmEk3VFfQThh7Bm8v/zVnqjhFp+QAugK4pZ4BBzGVgqDgvs3wB9PCQRmb6mPo5nIrd zBc= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136282" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:05 +0800 IronPort-SDR: qowQe72KJULUleqBNANvk0UU2CiZeC02kmycOwhWkF7xxDoUhKGVS5KI7KL2QJE4xyIUVIXKQ7 xOfr4aAyJTKUQgHREaL9DQUJ4nd8Y3626JrHyZxbNsSMJaU5H10XDN8laDnuU3UjR6db+jMYeg GYV+UqUCHO/2EpAm3LgHeYc+IxahytchA/T1bnPELK6KWrDo1r1LFDPL6ENuJFLODzaWS+5kwt bGlwpKdatj33XiCczFiYFw+KeRPNAgjFpOqqO9MFApVbHE7PtsdBw2G3WrrfrZ69RfuoTHukA7 Jj8JyQOiHY8WrVSDhpjePoQR Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:29 -0700 IronPort-SDR: YxAS8DCbO82RXa5UE3nd3/S00Ap2m4r3aqVq69IS6RV1QrGRo8oesyz+QaCZsAkAamNHL8y5w5 AN+2YWHOnYxnY6NXzXYGPwmndJ37Gu3MdIQ1IRqsyJWPkdKSnqoAEJIlsDgdogDRfS2hg5Fo7G 0HBVNXnN34fFpobcLfRJysb08pPp1gnVoBBZi5+0yOHBtJZ0fzruuLIOzbrOKEwsE8c9wk0//C 5UZlZUUCImTTiPzS90l2kZvj5by66c0l/HeCJWFUFpl41/LoPmw1/pKKwjmxmOumSVqUkAFJ0O drQ= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:02 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 01/15] btrfs-progs: utils: Introduce queue_param helper function Date: Tue, 20 Aug 2019 13:52:44 +0900 Message-Id: <20190820045258.1571640-2-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Introduce the queue_param 
helper function to get a device request queue parameter. This helper will be used later to query information of a zoned device. Furthermore, rewrite is_ssd() using the helper function. Signed-off-by: Damien Le Moal [Naohiro] fixed error return value Signed-off-by: Naohiro Aota --- common/device-utils.c | 46 +++++++++++++++++++++++++++++++++++++++++++ common/device-utils.h | 1 + mkfs/main.c | 40 ++----------------------------------- 3 files changed, 49 insertions(+), 38 deletions(-) diff --git a/common/device-utils.c b/common/device-utils.c index b03d62faaf21..7fa9386f4677 100644 --- a/common/device-utils.c +++ b/common/device-utils.c @@ -252,3 +252,49 @@ u64 get_partition_size(const char *dev) return result; } +/* + * Get a device request queue parameter. + */ +int queue_param(const char *file, const char *param, char *buf, size_t len) +{ + blkid_probe probe; + char wholedisk[PATH_MAX]; + char sysfs_path[PATH_MAX]; + dev_t devno; + int fd; + int ret; + + probe = blkid_new_probe_from_filename(file); + if (!probe) + return 0; + + /* Device number of this disk (possibly a partition) */ + devno = blkid_probe_get_devno(probe); + if (!devno) { + blkid_free_probe(probe); + return 0; + } + + /* Get whole disk name (not full path) for this devno */ + ret = blkid_devno_to_wholedisk(devno, + wholedisk, sizeof(wholedisk), NULL); + if (ret) { + blkid_free_probe(probe); + return 0; + } + + snprintf(sysfs_path, PATH_MAX, "/sys/block/%s/queue/%s", + wholedisk, param); + + blkid_free_probe(probe); + + fd = open(sysfs_path, O_RDONLY); + if (fd < 0) + return 0; + + len = read(fd, buf, len); + close(fd); + + return len; +} + diff --git a/common/device-utils.h b/common/device-utils.h index 70d19cae3e50..d1799323d002 100644 --- a/common/device-utils.h +++ b/common/device-utils.h @@ -29,5 +29,6 @@ u64 disk_size(const char *path); u64 btrfs_device_size(int fd, struct stat *st); int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret, u64 max_block_count, unsigned opflags); 
+int queue_param(const char *file, const char *param, char *buf, size_t len); #endif diff --git a/mkfs/main.c b/mkfs/main.c index 971cb39534fd..948c84be5f39 100644 --- a/mkfs/main.c +++ b/mkfs/main.c @@ -426,49 +426,13 @@ static int zero_output_file(int out_fd, u64 size) static int is_ssd(const char *file) { - blkid_probe probe; - char wholedisk[PATH_MAX]; - char sysfs_path[PATH_MAX]; - dev_t devno; - int fd; char rotational; int ret; - probe = blkid_new_probe_from_filename(file); - if (!probe) + ret = queue_param(file, "rotational", &rotational, 1); + if (ret < 1) return 0; - /* Device number of this disk (possibly a partition) */ - devno = blkid_probe_get_devno(probe); - if (!devno) { - blkid_free_probe(probe); - return 0; - } - - /* Get whole disk name (not full path) for this devno */ - ret = blkid_devno_to_wholedisk(devno, - wholedisk, sizeof(wholedisk), NULL); - if (ret) { - blkid_free_probe(probe); - return 0; - } - - snprintf(sysfs_path, PATH_MAX, "/sys/block/%s/queue/rotational", - wholedisk); - - blkid_free_probe(probe); - - fd = open(sysfs_path, O_RDONLY); - if (fd < 0) { - return 0; - } - - if (read(fd, &rotational, 1) < 1) { - close(fd); - return 0; - } - close(fd); - return rotational == '0'; } From patchwork Tue Aug 20 04:52:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102831 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4EBA31864 for ; Tue, 20 Aug 2019 04:53:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2277920C01 for ; Tue, 20 Aug 2019 04:53:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="UPBNiqUW" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729124AbfHTExI (ORCPT ); Tue, 20 Aug 2019 00:53:08 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728878AbfHTExH (ORCPT ); Tue, 20 Aug 2019 00:53:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276787; x=1597812787; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5X6DLB2CfRgFLonMj3pABXYl/oqPE52q61YLSYp3yXM=; b=UPBNiqUWRPQPIO+/T49WbhdeOSmnFFaMo4r6dqTwipkUGuLL3yau98Ei MR/uSNkY/1mjyhXsjPpjsJrWUn3vw4PvtQYXTz1VgheQFewo1WMBTimZt r2Ek1Jfk5dompSou4Z1WgMHWahF6gy0yIKnmpvP61OdUEEjgRg4llFq+P w/XR+xM7tnLY6TWhTRg1WyfQ5+UeLde5Kq8bRJKC3s2dnN9Y7d2z5Nrwj Je9G27j+jQlZ3gPmMfBfqWIYyECT3MsIxVVPciNLZnixGAp+/knsGgJWQ aqbir2nELHgflAyxkyPQ8THRH1FOhAwoH2JWo9GaZbW7OCUiRkcx3zviU g==; IronPort-SDR: ATUrjxdxbA6rxUlfsliqvLmPbmObRjcYxLuuqX3UHWddoHc1G6XlpKd8iUorjpXvW/s0CxPv+J dwvghyMkVuzikpBGEAH86LBOkxlqdLID+mcu8JOMZpqle5kadEk6Eu835Lz0QmB07OeYKTP9iq k+Zgr5Q0l1fPjuTvDnHmfTQJ0FuYg7nJ9ZrTeT801XNIZhVrNiIFe41nXmgKMVNk0d2NDFueFH 6Y4YNLRStjr/LE7awgJf5dfXtrOpRYxOGQokiqOAXMiihyRd2husIu93ZmlGHcYWiouh/pB3Hf H5o= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136285" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:07 +0800 IronPort-SDR: TdQ8YFsytzClU5ZmLsVFdVrZ3GVAR87wCxQ2ldwRgxBHqBMtNqRwsPIoP4iKw4tAc/h73xpqLA 9Hmg9TSq2m6P1gtGAgQgAAWDG7m31tPm6l+pMLJhW67ObYRzAxCuUxbmWzG8hkjpVtLy4JXYT7 PLr/RyuG07zjQvp7+TwzIeh7Vj95+kM2ypwEvsg38ZVkNbHb9nBgLzC7kby8juLAXFVQOoVIf5 JUtjEisZYFvy+iQHJADws2kfsJAffjzbMVKb1CS3GQqnyI/ikTTc1bZbpkkPEbZlht4SVkEBMh rIK5xGx5g0Ea+QFoPgmXYq+C Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:31 -0700 
IronPort-SDR: VkOr8AkTfNQl2ui+rC9B6PKsGaNYrreLz7JWW3bOQ6ITGUZzF7dcYA08v89+AYfiUG22Etbrby w3QKZfQ67DGIoRYpdUav7Qqk++LMrjZqHbOhSZ1cNPgdjKWkjNCFBqOvBOzvxXbV9vGhpr5zK0 4ENmisSgqWvNX4W+mq494PU25TLvzUOy43/9Zr+Lfnp3y0jzdg+JzOKtwE55/8ZC/eY+Kk2qiy Ov1xrVMI03+IZj1saitEhifrKOxLker0a93+JvCwTin6bq+Qf8yK/B6lgw0d5U+iBg6ITUB1Z8 eVY= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:04 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 02/15] btrfs-progs: introduce raid parameters variables Date: Tue, 20 Aug 2019 13:52:45 +0900 Message-Id: <20190820045258.1571640-3-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Userland btrfs_alloc_chunk() and its kernel-side counterpart __btrfs_alloc_chunk() have diverged so much that it is difficult to use the kernel code as is. This commit introduces some RAID parameter variables and reads them from btrfs_raid_array in the same way as the kernel does. 
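As a rough illustration of the table-driven lookup described above, the sketch below reads the parameters for one profile from a small attribute table and applies the "devs_max == 0 means no fixed limit" fallback. Everything named example_* or EX_* is hypothetical: the table is a much reduced stand-in for btrfs_raid_array, EXAMPLE_MAX_DEVS stands in for BTRFS_MAX_DEVS(info), and the values are illustrative only.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for one btrfs_raid_array entry. */
struct example_raid_attr {
	int sub_stripes;	/* sub_stripes info for map */
	int dev_stripes;	/* stripes per dev */
	int devs_max;		/* max devs to use, 0 means no fixed limit */
	int devs_min;		/* min devs needed */
	int devs_increment;	/* ndevs has to be a multiple of this */
	int ncopies;		/* how many copies the data has */
	int nparity;		/* stripes worth of parity information */
};

/* Hypothetical profile indexes; not the real raid index values. */
enum { EX_RAID10, EX_RAID1, EX_SINGLE, EX_NR_RAID_TYPES };

/* Illustrative values only (assumption, not copied from the sources). */
static const struct example_raid_attr example_raid_array[EX_NR_RAID_TYPES] = {
	[EX_RAID10] = { .sub_stripes = 2, .dev_stripes = 1, .devs_max = 0,
			.devs_min = 4, .devs_increment = 2, .ncopies = 2,
			.nparity = 0 },
	[EX_RAID1]  = { .sub_stripes = 1, .dev_stripes = 1, .devs_max = 2,
			.devs_min = 2, .devs_increment = 2, .ncopies = 2,
			.nparity = 0 },
	[EX_SINGLE] = { .sub_stripes = 1, .dev_stripes = 1, .devs_max = 1,
			.devs_min = 1, .devs_increment = 1, .ncopies = 1,
			.nparity = 0 },
};

/* Placeholder for the per-filesystem device limit. */
#define EXAMPLE_MAX_DEVS 64

/* Copy the parameters for one profile and resolve the devs_max fallback. */
static struct example_raid_attr example_load_raid_params(int index)
{
	struct example_raid_attr p;

	assert(index >= 0 && index < EX_NR_RAID_TYPES);
	p = example_raid_array[index];
	if (!p.devs_max)
		p.devs_max = EXAMPLE_MAX_DEVS;
	return p;
}
```

Reading every parameter from one table keeps the per-profile special cases (such as the hard-coded sub_stripes = 2 for RAID10 removed by this patch) in a single place.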
Signed-off-by: Naohiro Aota --- volumes.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/volumes.c b/volumes.c index 0e6fb1dbce15..f99fddc7cf6f 100644 --- a/volumes.c +++ b/volumes.c @@ -993,7 +993,19 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, int num_stripes = 1; int max_stripes = 0; int min_stripes = 1; - int sub_stripes = 0; + int sub_stripes; /* sub_stripes info for map */ + int dev_stripes __attribute__((unused)); + /* stripes per dev */ + int devs_max; /* max devs to use */ + int devs_min __attribute__((unused)); + /* min devs needed */ + int devs_increment __attribute__((unused)); + /* ndevs has to be a multiple of this */ + int ncopies __attribute__((unused)); + /* how many copies to data has */ + int nparity __attribute__((unused)); + /* number of stripes worth of bytes to + store parity information */ int looped = 0; int ret; int index; @@ -1005,6 +1017,18 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, return -ENOSPC; } + index = btrfs_bg_flags_to_raid_index(type); + + sub_stripes = btrfs_raid_array[index].sub_stripes; + dev_stripes = btrfs_raid_array[index].dev_stripes; + devs_max = btrfs_raid_array[index].devs_max; + if (!devs_max) + devs_max = BTRFS_MAX_DEVS(info); + devs_min = btrfs_raid_array[index].devs_min; + devs_increment = btrfs_raid_array[index].devs_increment; + ncopies = btrfs_raid_array[index].ncopies; + nparity = btrfs_raid_array[index].nparity; + if (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { if (type & BTRFS_BLOCK_GROUP_SYSTEM) { calc_size = SZ_8M; @@ -1051,7 +1075,6 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, if (num_stripes < 4) return -ENOSPC; num_stripes &= ~(u32)1; - sub_stripes = 2; min_stripes = 4; } if (type & (BTRFS_BLOCK_GROUP_RAID5)) { From patchwork Tue Aug 20 04:52:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102837 Return-Path: 
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17CF014DB for ; Tue, 20 Aug 2019 04:53:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EA15520C01 for ; Tue, 20 Aug 2019 04:53:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="OEoa4Ssa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729153AbfHTExL (ORCPT ); Tue, 20 Aug 2019 00:53:11 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728206AbfHTExK (ORCPT ); Tue, 20 Aug 2019 00:53:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276790; x=1597812790; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6o1gldu8mLQ/sEw6Hc3j3Y+Xm+tfLgqyimdGRfehMrY=; b=OEoa4SsawK/Svjs3Hhz/w0cJP7zFcHzY35teOO5w4k2uqryhl5zxcTz3 0jMRI+FFOqVRADS+f/eKw8Xtw4+Vdr2a/JSvaOmyR6aA8MP8Vu3CvSTJg zKKLr025TBotGMjerO1QgB+gwlapbuOHaiMkVUNqC4849q69EMJLx5wk6 I51KwWkdqnMsAROJcDnj9QU4JO/2NZSs2hHZ9XOU3s0kJ7ndsxYufxecj RmkMImDOh50vKGzJmxysRZ7xoitINXWUcya/9AjWYEikjLdKPmVvKyWla doj+K3+hn3Y2Qoq1JHu+2E8cyNo+2nz1D9q8rUiIIiECrm/zFsfmHJD+M g==; IronPort-SDR: 1Eq83y2IqhZU07NKss5kgPZzmYYtlSUhFCqYoTy5AuCR0ZW52Kfi3ZkHBxMzeqRv4u19bnGpp0 rTeSkyACRCkclwk22AApF78PFxxP19C+vDscc4LGvF4yxaH4V7s7TiTUB+TVxDWTFZ9yuasNLw ieIwjk4NcFKNDPfqVxkz9iL+tnGaxbYhxRouXM36rHCRw/JPZN9z/KTmmFO8ne2EXcNeuj72TH j0OOTpHyoy4VMS4D20QzityApgLuBNUYCWBy39LPxGegd/xaV8qrQGP5M82bzvCRAFPt9UTvDL gUE= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136287" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by 
ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:09 +0800 IronPort-SDR: XmAOGXeFtaUBDTxU1nHe86mZ+Q/GMwtTXlpSLRqHdcVVyGEIO/+4i+ajNE1wbWniqbhKYyH8z/ yR+f9FOZ/0ItCPj0kIO/J3BLDLLSgp+lSaBezlANVLk0VHHRGFIjtgPfwcMjD1FFa9d/BFsQ31 Gh2iad/S+DXIBvdWc1HL3DcdNhf3ls5erHTvQEIsoc1dls5Pqy3A+k7XY3FjViQTseKKyXHUZa 23HZG1zSA2TdIAZndeQTrdv4RUoyVG9Yrw7heWHjCm9EwIAcwk8CpIvnGADslRox4vii3/9rQX cEDo97gHEJg3FfylHFZNGVRo Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:33 -0700 IronPort-SDR: 6y+7ppOvnT2w3lX3jCqoq1rBA0qLVG0ODTiNzR8yb7OMqa+pI5ADEo69u5FtnldLtCQiMaQmYz Pg4IRwdpxe1HkgAWq8EBZEU+jZiB8MTh5jqpKBt+z0sOd/XDF/VkT+iQNNIxMBA5pTCvbCUkhB 9X39yr4kBFHC/YR/7kziYoznTwnK+YA0Dkc3pprgf8dsfFirDG5QiGzLl/odpnGYGNiGO515L8 DDTIuF00Cp5oVmR4hWTwCMxBWdCpIJLrt93ZMeYqZGBTDs28Lnfc1o+BHchAQy3dMiT0Ja1kcP Yc8= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:06 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 03/15] btrfs-progs: build: Check zoned block device support Date: Tue, 20 Aug 2019 13:52:46 +0900 Message-Id: <20190820045258.1571640-4-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org If the installed kernel headers support zoned block devices, the file /usr/include/linux/blkzoned.h will be present. Check for this header and define BTRFS_ZONED if it is found. If the header is present, the HMZONED feature is enabled by default; otherwise it is disabled. 
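A minimal sketch of how C code might consume the resulting define follows. It assumes configure makes BTRFS_ZONED visible through a config header or the compiler command line; example_zoned_built_in() is a hypothetical helper used only for illustration.

```c
#include <stdbool.h>

#ifdef BTRFS_ZONED
/* Only pulled in when configure found the zoned block device header. */
#include <linux/blkzoned.h>
#endif

/* Hypothetical helper: report whether zoned support was compiled in. */
static bool example_zoned_built_in(void)
{
#ifdef BTRFS_ZONED
	return true;
#else
	return false;
#endif
}
```

Keeping the header include behind the same guard as the code that needs it lets a build on a system without linux/blkzoned.h succeed with the feature simply absent.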
Signed-off-by: Damien Le Moal Signed-off-by: Naohiro Aota --- configure.ac | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/configure.ac b/configure.ac index cf792eb5488b..c637f72a8fe6 100644 --- a/configure.ac +++ b/configure.ac @@ -206,6 +206,18 @@ else AC_DEFINE([HAVE_OWN_FIEMAP_EXTENT_SHARED_DEFINE], [0], [We did not define FIEMAP_EXTENT_SHARED]) fi +AC_CHECK_HEADER(linux/blkzoned.h, [blkzoned_found=yes], [blkzoned_found=no]) +AC_ARG_ENABLE([zoned], + AS_HELP_STRING([--disable-zoned], [disable zoned block device support]), + [], [enable_zoned=$blkzoned_found] +) + +AS_IF([test "x$enable_zoned" = xyes], [ + AC_CHECK_HEADER(linux/blkzoned.h, [], + [AC_MSG_ERROR([Couldn't find linux/blkzoned.h])]) + AC_DEFINE([BTRFS_ZONED], [1], [enable zoned block device support]) +]) + dnl Define _LIBS= and _CFLAGS= by pkg-config dnl dnl The default PKG_CHECK_MODULES() action-if-not-found is end the @@ -307,6 +319,7 @@ AC_MSG_RESULT([ btrfs-restore zstd: ${enable_zstd} Python bindings: ${enable_python} Python interpreter: ${PYTHON} + zoned device: ${enable_zoned} Type 'make' to compile. 
]) From patchwork Tue Aug 20 04:52:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102841 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EC45F14DB for ; Tue, 20 Aug 2019 04:53:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CB49220C01 for ; Tue, 20 Aug 2019 04:53:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="DIuMUSZx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729160AbfHTExN (ORCPT ); Tue, 20 Aug 2019 00:53:13 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728206AbfHTExL (ORCPT ); Tue, 20 Aug 2019 00:53:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276791; x=1597812791; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hPZsva0Rs7IA/IUggNKiwBxmrFQNJUWM1nqdShu19v8=; b=DIuMUSZxqsD6x6FGa4d6qx9RycZbVAe9BKc1RsM9v5FX21E/8QBbBEqv aVzivQe9Dg+hXkheWcEzA+BI1ZE3xNCVIpnAZxsf3+Gi+s4KIZA21Fli+ ablppSHMu4H8L4Hjd7+gi4gCfTeFgu3njqWPa3VIPBFrVhhNjHzulPK20 XhLKsUuPS76lt/Knry+Rwtudx+os0j7mBHENvSWoVzEGNi/foePw5WURK p80rWpY/BXtTr8FOroawr1BOMOXJUtUcMd3OCXP1Ll0ScjK5N/GzqUkMM J6cAF5WkTRHgyn2YjTN/tlbrejg7LWM000IAHst3RTkUsuTgeswOHhGnh g==; IronPort-SDR: 7xoLp7TsIS9mAtWewz4ts7oZ+zjgtCiibg4ny44J/vs+QY1LGC3zxeXnQa3p+kUcmPsdrt8paq 1TtmoEtzgxZ/9azMlhyalUf3SE+/gh26/rYcEdfGNfjXtm7XszRjn3dTuSr53bGORQOkJAfjoT V9PiuqdLP5Fs1YDlHNjU5l4DDtGLNJAzGFewUmFxcaK5ZJ6WGJiYC16tucNTNHDvIZS4vNDs0g 
nCq99hzngyhxSF3bpO4KQIMcuvAJoRuEpjZAiBIqyUQ8DmyX61tdVmoEgD3mp8yU25q96XcGGT B8s= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136290" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:11 +0800 IronPort-SDR: dIAFyaThvY2RcpbJN3FLyWAGLlE0NhI1mnryZuc2hYvE0QMCsR8BVCbLK4CwTOGLWtC05ks0BN LK3glZxaJo1lgGb5Emq6Oj6uOqspveVBEe25aObiDeepNb8PQNfA3SacrEY3O8GvWincrl3tTW qaBCCSzKqBT2jyv53tHrmsRhtNIoKOrA8CatqShI/nAZ0VEpfavHAL6xtHwSTmSeB91sPNM8jh htoeiRb5dseMP28TZMJPR0dlxEyBQDtakNCB5IQn3REn4V0GDSTw87MWKOlJSDdWKpGRWajY2h gCX4mi6kAD1vnVOzvM7+TGzG Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:35 -0700 IronPort-SDR: dysHhv9kqpT/qEDZP4Xxyq9BXkMm5rQZ7AUPjgXrfXdhnIY8JeulYaRSjVlG98GmVMepAre4op Ul8CqVCZ4Q4QTXflkPtDY96qrygaBUzC1zWpIIMeAIfTkAyE4vFebXPmJIcbyeMnvHCLg9U8bo MyCJr3MVmtWXncV6INBcbIR3VP6QJXq0PRD3G0OsI5utQGr1kUkgRo7TeF8nkjgoQ+tpi83fCw pp2uDxGaySH5VcDCvNOAGO2PAnc4GDaKN8HFMSXqO3m/dJkBqvOrJlU/OCmouqrTy3QRMTI3K5 ozU= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:08 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 04/15] btrfs-progs: add new HMZONED feature flag Date: Tue, 20 Aug 2019 13:52:47 +0900 Message-Id: <20190820045258.1571640-5-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org With this feature enabled, a zoned block 
device aware btrfs allocates block groups aligned to the device zones and always writes to sequential zones at the zone write pointer position. Enabling this feature also forcibly disables conversion from ext4 volumes. Note: this flag can be moved to COMPAT_RO, so that older kernels can read but not write zoned block devices formatted with btrfs. Signed-off-by: Naohiro Aota --- cmds/inspect-dump-super.c | 3 ++- common/fsfeatures.c | 8 ++++++++ common/fsfeatures.h | 2 +- ctree.h | 4 +++- libbtrfsutil/btrfs.h | 2 ++ 5 files changed, 16 insertions(+), 3 deletions(-) diff --git a/cmds/inspect-dump-super.c b/cmds/inspect-dump-super.c index 65fb3506eac6..e942d37f8c9b 100644 --- a/cmds/inspect-dump-super.c +++ b/cmds/inspect-dump-super.c @@ -229,7 +229,8 @@ static struct readable_flag_entry incompat_flags_array[] = { DEF_INCOMPAT_FLAG_ENTRY(RAID56), DEF_INCOMPAT_FLAG_ENTRY(SKINNY_METADATA), DEF_INCOMPAT_FLAG_ENTRY(NO_HOLES), - DEF_INCOMPAT_FLAG_ENTRY(METADATA_UUID) + DEF_INCOMPAT_FLAG_ENTRY(METADATA_UUID), + DEF_INCOMPAT_FLAG_ENTRY(HMZONED) }; static const int incompat_flags_num = sizeof(incompat_flags_array) / sizeof(struct readable_flag_entry); diff --git a/common/fsfeatures.c b/common/fsfeatures.c index 50934bd161b0..b5bbecd8cf62 100644 --- a/common/fsfeatures.c +++ b/common/fsfeatures.c @@ -86,6 +86,14 @@ static const struct btrfs_fs_feature { VERSION_TO_STRING2(4,0), NULL, 0, "no explicit hole extents for files" }, +#ifdef BTRFS_ZONED + { "hmzoned", BTRFS_FEATURE_INCOMPAT_HMZONED, + "hmzoned", + NULL, 0, + NULL, 0, + NULL, 0, + "support Host-Managed Zoned devices" }, +#endif /* Keep this one last */ { "list-all", BTRFS_FEATURE_LIST_ALL, NULL } }; diff --git a/common/fsfeatures.h b/common/fsfeatures.h index 3cc9452a3327..0918ee1aa113 100644 --- a/common/fsfeatures.h +++ b/common/fsfeatures.h @@ -25,7 +25,7 @@ | BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA) /* - * Avoid multi-device features (RAID56) and mixed block groups + * Avoid multi-device features (RAID56), mixed block 
groups, and hmzoned device */ #define BTRFS_CONVERT_ALLOWED_FEATURES \ (BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF \ diff --git a/ctree.h b/ctree.h index 0d12563b7261..a56e18119069 100644 --- a/ctree.h +++ b/ctree.h @@ -490,6 +490,7 @@ struct btrfs_super_block { #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8) #define BTRFS_FEATURE_INCOMPAT_NO_HOLES (1ULL << 9) #define BTRFS_FEATURE_INCOMPAT_METADATA_UUID (1ULL << 10) +#define BTRFS_FEATURE_INCOMPAT_HMZONED (1ULL << 11) #define BTRFS_FEATURE_COMPAT_SUPP 0ULL @@ -513,7 +514,8 @@ struct btrfs_super_block { BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS | \ BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA | \ BTRFS_FEATURE_INCOMPAT_NO_HOLES | \ - BTRFS_FEATURE_INCOMPAT_METADATA_UUID) + BTRFS_FEATURE_INCOMPAT_METADATA_UUID | \ + BTRFS_FEATURE_INCOMPAT_HMZONED) /* * A leaf is full of items. offset and size tell us where to find diff --git a/libbtrfsutil/btrfs.h b/libbtrfsutil/btrfs.h index 944d50132456..5c415240f74c 100644 --- a/libbtrfsutil/btrfs.h +++ b/libbtrfsutil/btrfs.h @@ -268,6 +268,8 @@ struct btrfs_ioctl_fs_info_args { #define BTRFS_FEATURE_INCOMPAT_RAID56 (1ULL << 7) #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8) #define BTRFS_FEATURE_INCOMPAT_NO_HOLES (1ULL << 9) +/* Missing */ +#define BTRFS_FEATURE_INCOMPAT_HMZONED (1ULL << 11) struct btrfs_ioctl_feature_flags { __u64 compat_flags; From patchwork Tue Aug 20 04:52:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102845 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD6481395 for ; Tue, 20 Aug 2019 04:53:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 78B7222CF5 for ; Tue, 20 Aug 2019 04:53:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail 
reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="cdh6KlSs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729172AbfHTExP (ORCPT ); Tue, 20 Aug 2019 00:53:15 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729159AbfHTExO (ORCPT ); Tue, 20 Aug 2019 00:53:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276793; x=1597812793; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ye/0Ll1y3LvM9rwKSs/3RvcW7W0urkL6yR9I1InYY8k=; b=cdh6KlSsAn9y0PFrFy3CK6YGNB0fDkqDAD/FsIpUBAlqhnk0+LDZ6/zx dvizhLF55NX4WUFREPMdriyVPT+Q17ZRo7GZe1HqTr88tPPxSt30+gTYI HA73rF27KlxkKVQzKDa1xkfeExMtoc3Q9FoTqVb9k3vNpbCm+Di+F1ue+ bkmGFiS2PX3CZx1aLBTQhH9KMGi4pBAnPPMPnN1dt8hsAgFOsXnxbX5nS 1zno0IhldeZvdRR6iudX+Ng4N9lq4BO1xjvSOJ1eWu+jAeRP/OPWN5ffD pyl6pVtV0bxze1W+V7TF6zkp0jKG5WNidkQ8WmUEw9d1qtaz6qwH20P5q w==; IronPort-SDR: 0VetXvcrUV41McEMiWQMcf7xVxMhP2Ra7q2/gLUIBHpVLCMyFO0REfzA/PJ7q7ANr8i9hub9Rx O2uMLBNbUv2kUW9uU0TD+M5DhaDcGqWI2thvM6a/LYPeazrFWirq6kiWTWlkhM7STOAvoaAfEb oZtLZ0IaQ5A+oVS1h6HolHliUdsvIEFrP4pkI2TRbsu9SKki3vqdMdFrSxcT+xUoFsX+/ibluZ HEvil8SYzFytA/MZ21hLGGOejXviN//1EOY2NmX43AoVLOF5+NBP22Uu5hGWNHYoMKytGANwzH x0E= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136292" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:13 +0800 IronPort-SDR: 5YbtxIAPwLfyqVC1vxYoN0sRQUyIlnUtV8r5y3MsXv+mElcePhw13XXZ7uuFB3ox2aCeuRcogi 6zcSI6J89rgg0Gu9yID4ZhsDF14W41IWWMSTeEw1ajHc0027PkoJeF6ou6m+TXT6AOhdVKsN9f K+Ty4yUWnX4OsSqvWmiJuOSGLZZRZuxtGNIgWEZedH5GzPoI/TCjCWtv0eIc+J/UAyqn7hIBjM 5hbFJhEgKMfGjTRb+hq3jpGA8exTqQcIPhmoeJR4tJU3e8JBCsf+565sHc5BPJp3C+wVzGNnQT X11qos447oXTE2+4Na3x8+fY Received: from 
uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:37 -0700 IronPort-SDR: GJC69ZjcVhr/cWtvXF/DWaeCUovbRHyrC65vKt1UxSg+9WiuUwN0gRGbzxvnaGh5mTgc807SHO XOJ2JQGusuA3Sa6VcLOVQt2c6Pt2dijgjJypGkuslR+ZFDmX8iCRicGURUAc0I+mV249q4V46e i3sw7+nBL6jmrYyqQTF/CEGWw69NZ312f8819EkFCzcCepv+1LLlg5jFpaQUPvXybnQh5mDwWE WKD/M7BO++zpaX5d8HCiJCJ0A7ImeFtqH2CNZ9cUfKo7RRCSNuGblta+z+98yhgTiKu6XE7KbB YS8= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:10 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 05/15] btrfs-progs: Introduce zone block device helper functions Date: Tue, 20 Aug 2019 13:52:48 +0900 Message-Id: <20190820045258.1571640-6-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This patch introduces several zone-related functions: btrfs_get_zone_info() to get zone information from the specified device and put the information in zinfo, and zone_is_sequential() to check if a zone is a sequential write required zone. btrfs_get_zone_info() intentionally works with "struct btrfs_zone_info" instead of "struct btrfs_device". We need to load zone information at btrfs_prepare_device(), but there is no "struct btrfs_device" at that time. 
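To make the zone lookup concrete, here is a simplified sketch of the check. It is not the code added by this patch: example_zone_info, the EX_* constants, example_zone_bytes() and example_zone_is_sequential() are hypothetical stand-ins that mirror only the fields needed, the zone type values are placeholders rather than the BLK_ZONE_TYPE_* constants from linux/blkzoned.h, and the out-of-range check is an addition made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the zoned model and zone type values. */
enum example_zoned_model {
	EX_ZONED_NONE,
	EX_ZONED_HOST_AWARE,
	EX_ZONED_HOST_MANAGED,
};

enum example_zone_type {
	EX_ZONE_CONVENTIONAL,
	EX_ZONE_SEQWRITE_REQ,
	EX_ZONE_SEQWRITE_PREF,
};

/* Reduced zone information: model, zone size in bytes, per-zone type. */
struct example_zone_info {
	enum example_zoned_model model;
	uint64_t zone_size;
	size_t nr_zones;
	const enum example_zone_type *zones;
};

/* Convert a size given in 512-byte sectors to bytes. */
static uint64_t example_zone_bytes(uint64_t chunk_sectors)
{
	return chunk_sectors << 9;
}

/* Return true if bytenr falls in a sequential write required zone. */
static bool example_zone_is_sequential(const struct example_zone_info *zinfo,
				       uint64_t bytenr)
{
	uint64_t zno;

	if (zinfo->model == EX_ZONED_NONE)
		return false;

	assert(zinfo->zone_size != 0);
	zno = bytenr / zinfo->zone_size;

	/* Sketch-only guard: offsets past the last zone are not sequential. */
	if (zno >= zinfo->nr_zones)
		return false;

	return zinfo->zones[zno] == EX_ZONE_SEQWRITE_REQ;
}
```

Because the helper takes only the zone information structure, it can be called before any device structure exists, which is the reason given above for keeping the zone helpers independent of "struct btrfs_device".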
Signed-off-by: Naohiro Aota --- Makefile | 2 +- common/hmzoned.c | 218 +++++++++++++++++++++++++++++++++++++++++++++++ common/hmzoned.h | 67 +++++++++++++++ 3 files changed, 286 insertions(+), 1 deletion(-) create mode 100644 common/hmzoned.c create mode 100644 common/hmzoned.h diff --git a/Makefile b/Makefile index 82417d19a9f8..60a9e8992864 100644 --- a/Makefile +++ b/Makefile @@ -140,7 +140,7 @@ objects = ctree.o disk-io.o kernel-lib/radix-tree.o extent-tree.o print-tree.o \ inode.o file.o find-root.o free-space-tree.o common/help.o send-dump.o \ common/fsfeatures.o kernel-lib/tables.o kernel-lib/raid56.o transaction.o \ delayed-ref.o common/format-output.o common/path-utils.o \ - common/device-utils.o common/device-scan.o + common/device-utils.o common/device-scan.o common/hmzoned.o cmds_objects = cmds/subvolume.o cmds/filesystem.o cmds/device.o cmds/scrub.o \ cmds/inspect.o cmds/balance.o cmds/send.o cmds/receive.o \ cmds/quota.o cmds/qgroup.o cmds/replace.o check/main.o \ diff --git a/common/hmzoned.c b/common/hmzoned.c new file mode 100644 index 000000000000..7114943458ef --- /dev/null +++ b/common/hmzoned.c @@ -0,0 +1,218 @@ +/* + * Copyright (C) 2019 Western Digital Corporation or its affiliates. + * Authors: + * Naohiro Aota + * Damien Le Moal + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 02111-1307, USA. 
+ */ + +#include + +#include "common/utils.h" +#include "common/device-utils.h" +#include "common/messages.h" +#include "mkfs/common.h" +#include "common/hmzoned.h" + +#define BTRFS_REPORT_NR_ZONES 8192 + +enum btrfs_zoned_model zoned_model(const char *file) +{ + char model[32]; + int ret; + + ret = queue_param(file, "zoned", model, sizeof(model)); + if (ret <= 0) + return ZONED_NONE; + + if (strncmp(model, "host-aware", 10) == 0) + return ZONED_HOST_AWARE; + if (strncmp(model, "host-managed", 12) == 0) + return ZONED_HOST_MANAGED; + + return ZONED_NONE; +} + +size_t zone_size(const char *file) +{ + char chunk[32]; + int ret; + + ret = queue_param(file, "chunk_sectors", chunk, sizeof(chunk)); + if (ret <= 0) + return 0; + + return strtoul((const char *)chunk, NULL, 10) << 9; +} + +#ifdef BTRFS_ZONED +bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr) +{ + unsigned int zno; + + if (zinfo->model == ZONED_NONE) + return false; + + zno = bytenr / zinfo->zone_size; + + /* + * Only sequential write required zones on host-managed + * devices cannot be written randomly. + */ + return zinfo->zones[zno].type == BLK_ZONE_TYPE_SEQWRITE_REQ; +} + +static int report_zones(int fd, const char *file, u64 block_count, + struct btrfs_zone_info *zinfo) +{ + size_t zone_bytes = zone_size(file); + size_t rep_size; + u64 sector = 0; + struct blk_zone_report *rep; + struct blk_zone *zone; + unsigned int i, n = 0; + int ret; + + /* + * Zones are guaranteed (by the kernel) to be a power of 2 number of + * sectors. Check this here and make sure that zones are not too + * small. 
+ */ + if (!zone_bytes || !is_power_of_2(zone_bytes)) { + error("Illegal zone size %zu (not a power of 2)", zone_bytes); + exit(1); + } + if (zone_bytes < BTRFS_MKFS_SYSTEM_GROUP_SIZE) { + error("Illegal zone size %zu (smaller than %d)", zone_bytes, + BTRFS_MKFS_SYSTEM_GROUP_SIZE); + exit(1); + } + + /* Allocate the zone information array */ + zinfo->zone_size = zone_bytes; + zinfo->nr_zones = block_count / zone_bytes; + if (block_count & (zone_bytes - 1)) + zinfo->nr_zones++; + zinfo->zones = calloc(zinfo->nr_zones, sizeof(struct blk_zone)); + if (!zinfo->zones) { + error("No memory for zone information"); + exit(1); + } + + /* Allocate a zone report */ + rep_size = sizeof(struct blk_zone_report) + + sizeof(struct blk_zone) * BTRFS_REPORT_NR_ZONES; + rep = malloc(rep_size); + if (!rep) { + error("No memory for zones report"); + exit(1); + } + + /* Get zone information */ + zone = (struct blk_zone *)(rep + 1); + while (n < zinfo->nr_zones) { + memset(rep, 0, rep_size); + rep->sector = sector; + rep->nr_zones = BTRFS_REPORT_NR_ZONES; + + ret = ioctl(fd, BLKREPORTZONE, rep); + if (ret != 0) { + error("ioctl BLKREPORTZONE failed (%s)", + strerror(errno)); + exit(1); + } + + if (!rep->nr_zones) + break; + + for (i = 0; i < rep->nr_zones; i++) { + if (n >= zinfo->nr_zones) + break; + memcpy(&zinfo->zones[n], &zone[i], + sizeof(struct blk_zone)); + n++; + } + + sector = zone[rep->nr_zones - 1].start + + zone[rep->nr_zones - 1].len; + } + + /* + * We need at least one random write zone (a conventional zone or + * a sequential write preferred zone on a host-aware device). 
+ */ + if (zone_is_sequential(zinfo, 0)) { + error("No conventional zone at block 0"); + exit(1); + } + + free(rep); + + return 0; +} + +#endif + +int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, + struct btrfs_zone_info *zinfo) +{ + struct stat st; + int ret; + + memset(zinfo, 0, sizeof(struct btrfs_zone_info)); + + ret = fstat(fd, &st); + if (ret < 0) { + error("unable to stat %s", file); + return 1; + } + + if (!S_ISBLK(st.st_mode)) + return 0; + + /* Check zone model */ + zinfo->model = zoned_model(file); + if (zinfo->model == ZONED_NONE) + return 0; + + if (zinfo->model == ZONED_HOST_MANAGED && !hmzoned) { + error( +"%s: host-managed zoned block device (enable zone block device support with -O hmzoned)", + file); + return 1; + } + + if (!hmzoned) { + /* Treat host-aware devices as regular devices */ + zinfo->model = ZONED_NONE; + return 0; + } + +#ifdef BTRFS_ZONED + /* Get zone information */ + ret = report_zones(fd, file, btrfs_device_size(fd, &st), zinfo); + if (ret != 0) + return ret; +#else + error("%s: Unsupported host-%s zoned block device", file, + zinfo->model == ZONED_HOST_MANAGED ? "managed" : "aware"); + if (zinfo->model == ZONED_HOST_MANAGED) + return 1; + + error("%s: handling host-aware block device as a regular disk", file); +#endif + return 0; +} diff --git a/common/hmzoned.h b/common/hmzoned.h new file mode 100644 index 000000000000..fbcaaf2da20e --- /dev/null +++ b/common/hmzoned.h @@ -0,0 +1,67 @@ +/* + * Copyright (C) 2019 Western Digital Corporation or its affiliates. + * Authors: + * Naohiro Aota + * Damien Le Moal + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 021110-1307, USA. + */ + +#ifndef __BTRFS_HMZONED_H__ +#define __BTRFS_HMZONED_H__ + +#ifdef BTRFS_ZONED +#include +#else +struct blk_zone { + int dummy; +}; +#endif /* BTRFS_ZONED */ + +/* + * Zoned block device models. + */ +enum btrfs_zoned_model { + ZONED_NONE = 0, + ZONED_HOST_AWARE, + ZONED_HOST_MANAGED, +}; + +/* + * Zone information for a zoned block device. + */ +struct btrfs_zone_info { + enum btrfs_zoned_model model; + size_t zone_size; + struct blk_zone *zones; + unsigned int nr_zones; +}; + +enum btrfs_zoned_model zoned_model(const char *file); +size_t zone_size(const char *file); +int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, + struct btrfs_zone_info *zinfo); + +#ifdef BTRFS_ZONED +bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr); +#else +static inline bool zone_is_sequential(struct btrfs_zone_info *zinfo, + u64 bytenr) +{ + return true; +} +#endif /* BTRFS_ZONED */ + +#endif /* __BTRFS_HMZONED_H__ */ From patchwork Tue Aug 20 04:52:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102847 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5ACAE1395 for ; Tue, 20 Aug 2019 04:53:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3AA9D22CF4 for ; Tue, 20 Aug 2019 04:53:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="mqt6ZBrA" Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S1729171AbfHTExR (ORCPT ); Tue, 20 Aug 2019 00:53:17 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728206AbfHTExQ (ORCPT ); Tue, 20 Aug 2019 00:53:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276795; x=1597812795; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=64YBSe+3BGLkturLsVCskGQwbaQ+oBED/ZGLxkC0Ldo=; b=mqt6ZBrAJVro5W0yGF5GXINOj5cNW7NpJQ71nHOkTWmWGUd8N7GmXfas Z3UTAiCJ6+Ip09Wv1sJkYIhFz3b8l1fQAAe92cLo/eF22NMfPuIT57CQk WGGIJRnzCaQsvna4/VRVoQAfRFUq3w7HahlWC0UX2SF8yrTkKHseo7xKk CqvysjRqFJSNPZjPOZ973zIYm48g50yuWbFuev3F2wcwUx1VbQ44z+HGS /5roHbVZ+y539ppo5JcRd+UtIozJ6BoztILaoiDzHrFnIxNe4BNOq9b3P uxXhClgva3SS6qOMEdh0nFdyjScpWF6M9Ye741qVw9eKM0HNOv7dmjJzs Q==; IronPort-SDR: 8QpPO+3DLIbScJllHuCcuweQ+BPUFRdnLQM9Z68LTZ7SwEJugPKYyY2dzMcdkLumYsXGcaxDyi zzkVmn5VYxuROwu9sPEUKErY1CL6KTDr207UdtlLmeQ0VhXVCZ4d28QHWYg3204iMuQOrtUtxW M5aFJH0ycOTnJINTD0RFlMBCWnrQHANFRvaPdkS0V9IoVkWMuq8J5rjmEJR2Rqm5Sy4fvuD74+ 8WQclivfIFrKgvuxqgKsr9Ka9MjITrQaSZ8uv/XqZqAp3Gx7AMbFicab/RwQmVX+9WMHKeEXAA uxA= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136296" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:15 +0800 IronPort-SDR: VszuqIMEHJjThlmylJOZUo9+Xa57auHR0+RJxQ0oRjWcQlY/s9Qx12OkjqNEQ+bkJPE8bpVMg2 a3p3v5T19HuWpkiNspeFSTVgXM61sHD51EK7wKH+LM/oR+Q/4tVCNIGQrSS9j0umU00dxhG1h6 zWYrcAp05oVhGju/JbazRdTvq4hE/yfu41HBSO5RB17A0N42RvXE32fs6usfnTlYJNDYHeW2Z1 av1bzyRgM/W/HSxmzPoD3ZHVLidJSvwLRlu8SylBwH6Gr67gW/fVtshkyU3lLcar3pjtuSO9gj 3Ufkg/t9nzXRU4SBH2JKiEF6 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:39 -0700 IronPort-SDR: 
J4M3QSho2ScOdD5lZ2G6LcRlaWzt53vL3f8ITj1x+5FXOSDbXjiO+Fxx3nUzAQpUo0fKzsl11n 8BTPCdq5fTlVRmDnIbZkGwEQtEKOptDsujZBHjy1hvMFk1p1Q2pcA7JONW1DZyRzS725A1B5B1 A4zSVc+11GFcAxeTJj+p6ww8Rg8Zc3Tbs868xsgCfWsdOkhTHpobknQ3ZD++LeR6au1svXyRfb XT/RbA7T5T3S4tC42LocOi6OkINsjxPSLY9SXDFUMcaGA8CuUdnyWrHJ8VAvQxjN4uYduutvJo XDc=
Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:12 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 06/15] btrfs-progs: load and check zone information
Date: Tue, 20 Aug 2019 13:52:49 +0900
Message-Id: <20190820045258.1571640-7-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

This patch checks whether a device added to btrfs is a zoned block
device. If it is, the zone information and the zone size of the device
are loaded. For a btrfs volume composed of multiple zoned block devices,
all devices must have the same zone size.
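
For illustration only (not part of the patch): a minimal sketch of the
"same zone size on every device" rule, written to follow the shape of
the check this patch adds to btrfs_open_devices(). The function name and
the array-based interface are hypothetical; the patch itself walks the
fs_devices device list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check that all devices share one zone size. The first non-zero size
 * seen becomes the reference; any later device with a different size
 * makes the check fail with -EINVAL. On success the common size is
 * stored in *common (0 if no device reported a zone size).
 */
static int sketch_check_zone_sizes(const uint64_t *zone_sizes, size_t nr,
				   uint64_t *common)
{
	uint64_t ref = 0;
	size_t i;

	assert(common != NULL);

	for (i = 0; i < nr; i++) {
		if (!ref)
			ref = zone_sizes[i];
		else if (zone_sizes[i] != ref)
			return -EINVAL;
	}

	*common = ref;
	return 0;
}
```

With two devices reporting 256 MiB zones the check passes and returns
that size; pairing a 256 MiB device with a 128 MiB one is rejected.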
Signed-off-by: Naohiro Aota
---
 common/device-scan.c | 10 ++++++++++
 volumes.c            | 18 ++++++++++++++++++
 volumes.h            |  6 ++++++
 3 files changed, 34 insertions(+)

diff --git a/common/device-scan.c b/common/device-scan.c
index 2c5ae225f710..5df77b9b68d7 100644
--- a/common/device-scan.c
+++ b/common/device-scan.c
@@ -135,6 +135,16 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 		goto out;
 	}
 
+	ret = btrfs_get_zone_info(fd, path, fs_info->fs_devices->hmzoned,
+				  &device->zone_info);
+	if (ret)
+		goto out;
+	if (device->zone_info.zone_size != fs_info->fs_devices->zone_size) {
+		error("Device zone size differ");
+		ret = -EINVAL;
+		goto out;
+	}
+
 	disk_super = (struct btrfs_super_block *)buf;
 	dev_item = &disk_super->dev_item;
 
diff --git a/volumes.c b/volumes.c
index f99fddc7cf6f..a0ebed547faa 100644
--- a/volumes.c
+++ b/volumes.c
@@ -196,6 +196,8 @@ static int device_list_add(const char *path,
 	u64 found_transid = btrfs_super_generation(disk_super);
 	bool metadata_uuid = (btrfs_super_incompat_flags(disk_super) &
 		BTRFS_FEATURE_INCOMPAT_METADATA_UUID);
+	bool hmzoned = btrfs_super_incompat_flags(disk_super) &
+		BTRFS_FEATURE_INCOMPAT_HMZONED;
 
 	if (metadata_uuid)
 		fs_devices = find_fsid(disk_super->fsid,
@@ -285,6 +287,7 @@ static int device_list_add(const char *path,
 	if (fs_devices->lowest_devid > devid) {
 		fs_devices->lowest_devid = devid;
 	}
+	fs_devices->hmzoned = hmzoned;
 	*fs_devices_ret = fs_devices;
 	return 0;
 }
@@ -355,6 +358,8 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices, int flags)
 	struct btrfs_device *device;
 	int ret;
 
+	fs_devices->zone_size = 0;
+
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
 		if (!device->name) {
 			printk("no name for device %llu, skip it now\n", device->devid);
@@ -378,6 +383,19 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices, int flags)
 		device->fd = fd;
 		if (flags & O_RDWR)
 			device->writeable = 1;
+
+		ret = btrfs_get_zone_info(fd, device->name, fs_devices->hmzoned,
+					  &device->zone_info);
+		if (ret != 0)
+			goto fail;
+		if (!fs_devices->zone_size) {
+			fs_devices->zone_size = device->zone_info.zone_size;
+		} else if (device->zone_info.zone_size !=
+			   fs_devices->zone_size) {
+			error("Device zone size differ");
+			ret = -EINVAL;
+			goto fail;
+		}
 	}
 	return 0;
 fail:
diff --git a/volumes.h b/volumes.h
index 586588c871ab..edbb0f36aa75 100644
--- a/volumes.h
+++ b/volumes.h
@@ -22,12 +22,15 @@
 #include "kerncompat.h"
 #include "ctree.h"
 
+#include "common/hmzoned.h"
+
 #define BTRFS_STRIPE_LEN	SZ_64K
 
 struct btrfs_device {
 	struct list_head dev_list;
 	struct btrfs_root *dev_root;
 	struct btrfs_fs_devices *fs_devices;
+	struct btrfs_zone_info zone_info;
 
 	u64 total_ios;
 
@@ -87,6 +90,9 @@ struct btrfs_fs_devices {
 
 	int seeding;
 	struct btrfs_fs_devices *seed;
+
+	u64 zone_size;
+	bool hmzoned;
 };
 
 struct btrfs_bio_stripe {

From patchwork Tue Aug 20 04:52:50 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 11102851
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F63F1399 for ; Tue, 20 Aug 2019 04:53:19 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4FD2522CF7 for ; Tue, 20 Aug 2019 04:53:19 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="EfSXcnX3"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729189AbfHTExS (ORCPT ); Tue, 20 Aug 2019 00:53:18 -0400
Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729178AbfHTExR (ORCPT ); Tue, 20 Aug 2019 00:53:17 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276797;
x=1597812797; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UWf4fXO5O18k5bQ8f/qM73iKo11oH8s2PQ6abn0uCnM=; b=EfSXcnX3tfA+HCYN7fMVC9WCOebDv+t8SRMDCO+cCUoiKuq3n8tYeWsm WrzPbzJDfLiMoVzcAttKAdZZ9JdVmD2t8SDsaWrDPsgSklHIp/CsQEgW+ kKfi29x3nwXbZjyMDea2MW2SQhtZApp5jjAPXqYGz3jGvA+uIzJ8i0CUi WDJdIMWGcZekqF98z/bjob7+Iofq1EoQAA0gvSo1NXVE7oq9ZWPN6k0v2 3N1LEKfZoiYR1k4JwqT8HwxE+k6O1QgNAzwK2Or1UcS0hP5RuPCSkuXJu vctqwkiSCIGXAwbBWgvoUDwAp11Aa/mJx5TrGHw0A6bTT8FYFboeJOHp0 g==; IronPort-SDR: GXzbshbfMnvJEAyAELo0lpSHQnTNLkA/LOrvwIh+zzW+HCU3jlEYSTT4LTmMhI4JESjmLEdxLP sS3Jz6NorY8rpwzllRdO8xHxzFY1BGHen+WNsTJE+Qp+f3U/Ja8rtHxf7in1bcSDtBF7MzN/FY +Im0LPeD0/sqe05PgjjX9eiHbal9l4nHCbjinTcgfQsS/1mFvi5NEicKQ4HNWWPLGvPJv4R5jb FLntUeVBGupgfgd3+yzF76fvv+xus5F1weOsfuI2hh65tvlUiB1TL0p2wtJ0OFXWWX7WbSkqwQ np0= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136301" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:17 +0800 IronPort-SDR: k0bJbJorqWMWxS/mOBivz2SFZ0ZLkpuOqFGK8RXXr3LkFjVQRHgMSqec2+52cZ4aub7grJCVaz ncdEXUhv1dosDfOr/EyzeT8CLKFJT5UHMpMjBlp3dBIsrAbFD88HrSL3w75kPctQ1mGH0frROT LWchrBPgrXFI2e3lnmj3pvCaZydwJtiarE0GYEgR5FILFv3GYL0R0Tt2pwB1y1kyITXeebVrDx HNlRL4yynqlZiq+dz2K6Yz9BqkQLro2ISY6QWFaOIs5GUuUgH5KYOXBDfmrYoUc79V6j86v5MU KOuTZVqc15sdnXzDAL5ePQtU Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:41 -0700 IronPort-SDR: kMJkeTDCg4Tn+tsZQsrigbvurLdZWCki6PPg/JyyavYU0+G+xj16SG4yNoFVZ1umQIE6kub03s a/Oz95ZOxbqNRMawAWwVqSiblJGnPUZjuOLdBdFV5PJol1fTwZa/l1ZaQ9JCqizni3bDYQ+sQQ w0AWe+HpU0Rkiu5d7nprOnE/Atj+cHTYZCKgN10RVN6vWrtxLUiYgAH48XvNKDKkq+fs6cIAik y3Co32Z85N70R9q9vGWwv+8MGPnw+iDA+ALxYvU4rVKyFY1Ap4J4/fMG69bJo52LZ8wsjq3d/9 LNA= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with 
ESMTP; 19 Aug 2019 21:53:13 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 07/15] btrfs-progs: avoid writing super block to sequential zones
Date: Tue, 20 Aug 2019 13:52:50 +0900
Message-Id: <20190820045258.1571640-8-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

It is not possible to write a super block copy in a sequential write
required zone, because such a zone does not allow the in-place updates
that super blocks require. This patch limits the possible super block
locations to zones accepting random writes. In particular, the zone
containing the first block of the device or partition being formatted
must accept random writes.
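
For illustration only (not part of the patch): a minimal sketch of how
super block copies can be restricted to zones that accept random writes.
The offsets follow the usual btrfs layout (primary copy at 64 KiB,
mirrors at 64 MiB and 256 GiB, 4 KiB each); the sketch_* names and the
boolean per-zone array are hypothetical simplifications of the zone
information used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SB_SIZE		4096ULL	/* size of one super block copy */
#define SKETCH_SB_MIRROR_MAX	3	/* primary copy plus two mirrors */

/* Byte offset of super block copy number 'mirror'. */
static uint64_t sketch_sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	if (mirror)
		return start << (12 * mirror);
	return 64ULL * 1024;
}

/*
 * Collect the offsets of the super block copies that may be written.
 * zone_seq[i] is true when zone i is sequential write required. Copies
 * beyond the device end stop the scan; copies in sequential zones are
 * skipped. Returns the number of offsets stored in out[].
 */
static int sketch_pick_sb_copies(const bool *zone_seq, unsigned int nr_zones,
				 uint64_t zone_size, uint64_t dev_bytes,
				 uint64_t *out)
{
	int n = 0;
	int i;

	assert(zone_size != 0);

	for (i = 0; i < SKETCH_SB_MIRROR_MAX; i++) {
		uint64_t bytenr = sketch_sb_offset(i);
		uint64_t zno;

		if (bytenr + SKETCH_SB_SIZE > dev_bytes)
			break;

		zno = bytenr / zone_size;
		if (zno >= nr_zones || zone_seq[zno])
			continue;

		out[n++] = bytenr;
	}

	return n;
}
```

On a 1 TiB device with 256 MiB zones where only zone 0 is conventional,
the first two copies land in zone 0 and are kept, while the copy at
256 GiB falls into a sequential zone and is skipped.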
Signed-off-by: Naohiro Aota
---
 disk-io.c | 10 ++++++++++
 volumes.h |  4 ++++
 2 files changed, 14 insertions(+)

diff --git a/disk-io.c b/disk-io.c
index be44eead5cef..5dd55723b9b7 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -1632,6 +1632,14 @@ static int write_dev_supers(struct btrfs_fs_info *fs_info,
 				      BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE);
 		btrfs_csum_final(crc, &sb->csum[0]);
 
+		if (btrfs_dev_is_sequential(device, fs_info->super_bytenr)) {
+			errno = EIO;
+			error(
+"failed to write super block for devid %llu: require random write zone: %m",
+				device->devid);
+			return -EIO;
+		}
+
 		/*
 		 * super_copy is BTRFS_SUPER_INFO_SIZE bytes and is
 		 * zero filled, we can use it directly
@@ -1660,6 +1668,8 @@ static int write_dev_supers(struct btrfs_fs_info *fs_info,
 		bytenr = btrfs_sb_offset(i);
 		if (bytenr + BTRFS_SUPER_INFO_SIZE > device->total_bytes)
 			break;
+		if (btrfs_dev_is_sequential(device, bytenr))
+			continue;
 
 		btrfs_set_super_bytenr(sb, bytenr);
 
diff --git a/volumes.h b/volumes.h
index edbb0f36aa75..b5e7a07df5a8 100644
--- a/volumes.h
+++ b/volumes.h
@@ -319,4 +319,8 @@ int btrfs_fix_device_size(struct btrfs_fs_info *fs_info,
 			  struct btrfs_device *device);
 int btrfs_fix_super_size(struct btrfs_fs_info *fs_info);
 int btrfs_fix_device_and_super_size(struct btrfs_fs_info *fs_info);
+static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
+{
+	return zone_is_sequential(&device->zone_info, pos);
+}
 #endif

From patchwork Tue Aug 20 04:52:51 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 11102855
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 78A421395 for ; Tue, 20 Aug 2019 04:53:22 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4EF9F22CF4 for ; Tue, 20 Aug 2019 04:53:22
+0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="lk5pXW40" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729202AbfHTExV (ORCPT ); Tue, 20 Aug 2019 00:53:21 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729178AbfHTExT (ORCPT ); Tue, 20 Aug 2019 00:53:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276799; x=1597812799; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=s3avTnFnXBTbDCO3GzkRgYC1I46d7Y9pb8l8EceqUII=; b=lk5pXW40gCHFq9hTX/FEpPL2ETcoMKaJ/YT/j7eJZUi43hzCGppBzRQ+ 6e+nqSvnWiZXiPvLsAdSJKI6+XcCu2EXPL/y+IYpNdLlsSNbgH3JO+p0T yBjH5hHOA/YNMv6RgZnd2wxXGLa5KHsTzJ7sr2nkO08dAblELdkfG+Tz1 ZNaZs8yPCwcb+g/Tdgd93uW26Wrqk0fK1u1v3yOAEnZavsrtVaPx8V4Zt ExSOZx9ufmHM2vT151wGMu1u3vLxsgV1ApU2GpZd0PgzmjD3eFHwkt5Wb QZSgG8/L5tCQQ4BLAwxROqcykZoRR4UUgqmpPaztwE0uehFT/CSY6jcGF A==; IronPort-SDR: nyGt6vUBofBKgxyXLZ5OHlbOUMJ36386R7RIXEh2V9J7yCWxDo4fxKTsMK1UPm98jDLOyyS6me 1ZvEfmnFmp/5slKCBTNdbk11KGDWPXQbhpuzNUlcFyuzUSQEUJdt0SQjJDOFBaQJJgSXSXsNZ9 WDt3qJrqB7p0LF8NTTkdg2x7N/0xIIpJohPwh6gwNgPSGgdV1wX7WP22GruKHrJhX8KkQQpwAJ EAbS+g0++4Ej3IIOW2cZz3453A75+LpU/2B56SIMwmYAH+xcjcNYJI6LSSFMI1JrHGh2890WkL NUg= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136304" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:19 +0800 IronPort-SDR: o453GWZAnj6t2w3zcyBiPlBLe+fpWlsbnG8AbpYmxD8wRPzq6h6snn3QDWOvh8U2K0SPNAdeou g64wRJeLeAYsOuMqmoLHiOaQ+Xw86gCV17xYAxYaIR3fiYqs4q3tkHzwQTCk/BoWfp4SNYZO4H MaAtz8T8dQ62qp7k8MNdqA9BEWw4cDD2HNV2Yz3V3xrnuPN+tWu2wskcmb6igDk5lcnwIpqy5w 
lVZctDGa9lXdPSYEOC4wULD9mewF5/gxMkSS9kQfkqPtejS6jUCQ5LasZPG95TgwKtH0nVVjYw zNWgZsdrQSW29M0fEjCHPnni
Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:43 -0700
IronPort-SDR: SJFkfG209BS0q3KqKVtzNsynTn+a9a7SfkLJy7Cr9ufA/1Qnhfbfzg4SzX9HVlemagTprrb9dp ZS+RcXMVBkbTmw8s6TlsK1i2svtaiZtXdRRPCnGTNGqbmfJ4ouBFw9LaTKCEfi42l2Sanr/3KG EvsDulrtS7CfIVWlqQR8acSQGoXcOXKCnaFwLoZQkyvnIghQgGeSxBp6/GTA4hulqehtg0dJkr uhi3k5IVDDn1WK50u1DFAQVrSC0bc8Jc7WvVYFAdxRPqmml0jSLtYWGoIT2uL0kpFxazsVwk/b n38=
Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:15 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 08/15] btrfs-progs: support discarding zoned device
Date: Tue, 20 Aug 2019 13:52:51 +0900
Message-Id: <20190820045258.1571640-9-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

All zones of a zoned block device should be reset before writing.
Support this by introducing PREP_DEVICE_HMZONED.

This commit exports discard_blocks() and uses it from
btrfs_discard_all_zones().
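
For illustration only (not part of the patch): a minimal sketch of the
per-zone decision made while preparing a zoned device. Conventional
zones are discarded as byte ranges, non-empty sequential zones are reset
as sector ranges, and empty sequential zones need no work. All sketch_*
names and structures are hypothetical; the patch operates on "struct
blk_zone" entries and issues the ioctls itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_action {
	SKETCH_NONE,	/* empty sequential zone: nothing to do */
	SKETCH_DISCARD,	/* conventional zone: discard the byte range */
	SKETCH_RESET,	/* used sequential zone: reset the write pointer */
};

/* Hypothetical, simplified view of one zone. */
struct sketch_zone {
	uint64_t start_sector;	/* zone start in 512-byte sectors */
	bool conventional;
	bool empty;
};

/* One planned operation; units depend on the action, see below. */
struct sketch_step {
	enum sketch_action action;
	uint64_t start;	/* bytes for DISCARD, sectors for RESET */
	uint64_t len;	/* bytes for DISCARD, sectors for RESET */
};

static struct sketch_step sketch_plan_zone(const struct sketch_zone *z,
					   uint64_t zone_size)
{
	struct sketch_step step = { SKETCH_NONE, 0, 0 };

	assert(zone_size % 512 == 0);

	if (z->conventional) {
		step.action = SKETCH_DISCARD;
		step.start = z->start_sector << 9;	/* sectors to bytes */
		step.len = zone_size;
	} else if (!z->empty) {
		step.action = SKETCH_RESET;
		step.start = z->start_sector;
		step.len = zone_size >> 9;		/* bytes to sectors */
	}

	return step;
}
```

Keeping the unit conversion in one place makes the difference visible:
discards are expressed in bytes, zone resets in 512-byte sectors.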
Signed-off-by: Naohiro Aota --- common/device-utils.c | 23 +++++++++++++++++++++-- common/device-utils.h | 2 ++ common/hmzoned.c | 27 +++++++++++++++++++++++++++ common/hmzoned.h | 5 +++++ 4 files changed, 55 insertions(+), 2 deletions(-) diff --git a/common/device-utils.c b/common/device-utils.c index 7fa9386f4677..c7046c22a9fb 100644 --- a/common/device-utils.c +++ b/common/device-utils.c @@ -29,6 +29,7 @@ #include "common/internal.h" #include "common/messages.h" #include "common/utils.h" +#include "common/hmzoned.h" #ifndef BLKDISCARD #define BLKDISCARD _IO(0x12,119) @@ -49,7 +50,7 @@ static int discard_range(int fd, u64 start, u64 len) /* * Discard blocks in the given range in 1G chunks, the process is interruptible */ -static int discard_blocks(int fd, u64 start, u64 len) +int discard_blocks(int fd, u64 start, u64 len) { while (len > 0) { /* 1G granularity */ @@ -155,6 +156,7 @@ out: int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret, u64 max_block_count, unsigned opflags) { + struct btrfs_zone_info zinfo; u64 block_count; struct stat st; int i, ret; @@ -173,7 +175,24 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret, if (max_block_count) block_count = min(block_count, max_block_count); - if (opflags & PREP_DEVICE_DISCARD) { + ret = btrfs_get_zone_info(fd, file, opflags & PREP_DEVICE_HMZONED, + &zinfo); + if (ret < 0) + return 1; + + if (opflags & PREP_DEVICE_HMZONED) { + printf("Resetting device zones %s (%u zones) ...\n", + file, zinfo.nr_zones); + /* + * We cannot ignore zone discard (reset) errors for a zoned + * block device as this could result in the inability to + * write to non-empty sequential zones of the device. + */ + if (btrfs_discard_all_zones(fd, &zinfo)) { + error("failed to reset device '%s' zones", file); + return 1; + } + } else if (opflags & PREP_DEVICE_DISCARD) { /* * We intentionally ignore errors from the discard ioctl. 
It * is not necessary for the mkfs functionality but just an diff --git a/common/device-utils.h b/common/device-utils.h index d1799323d002..885a46937e0d 100644 --- a/common/device-utils.h +++ b/common/device-utils.h @@ -23,7 +23,9 @@ #define PREP_DEVICE_ZERO_END (1U << 0) #define PREP_DEVICE_DISCARD (1U << 1) #define PREP_DEVICE_VERBOSE (1U << 2) +#define PREP_DEVICE_HMZONED (1U << 3) +int discard_blocks(int fd, u64 start, u64 len); u64 get_partition_size(const char *dev); u64 disk_size(const char *path); u64 btrfs_device_size(int fd, struct stat *st); diff --git a/common/hmzoned.c b/common/hmzoned.c index 7114943458ef..70de111f22da 100644 --- a/common/hmzoned.c +++ b/common/hmzoned.c @@ -216,3 +216,30 @@ int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, #endif return 0; } + +/* + * Discard blocks in the zones of a zoned block device. + * Process this with zone size granularity so that blocks in + * conventional zones are discarded using discard_range and + * blocks in sequential zones are discarded though a zone reset. 
+ */ +int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo) +{ + unsigned int i; + + /* Zone size granularity */ + for (i = 0; i < zinfo->nr_zones; i++) { + if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL) { + discard_blocks(fd, zinfo->zones[i].start << 9, + zinfo->zone_size); + } else if (zinfo->zones[i].cond != BLK_ZONE_COND_EMPTY) { + struct blk_zone_range range = { + zinfo->zones[i].start, + zinfo->zone_size >> 9 }; + if (ioctl(fd, BLKRESETZONE, &range) < 0) + return errno; + } + } + + return 0; +} diff --git a/common/hmzoned.h b/common/hmzoned.h index fbcaaf2da20e..c4e20ae71d21 100644 --- a/common/hmzoned.h +++ b/common/hmzoned.h @@ -56,12 +56,17 @@ int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, #ifdef BTRFS_ZONED bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr); +int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo); #else static inline bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr) { return true; } +static inline int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo) +{ + return -EOPNOTSUPP; +} #endif /* BTRFS_ZONED */ #endif /* __BTRFS_HMZONED_H__ */ From patchwork Tue Aug 20 04:52:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102859 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1A4A1395 for ; Tue, 20 Aug 2019 04:53:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C80C922CF4 for ; Tue, 20 Aug 2019 04:53:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="SVQ3/aTZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S1729192AbfHTExX (ORCPT ); Tue, 20 Aug 2019 00:53:23 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729194AbfHTExV (ORCPT ); Tue, 20 Aug 2019 00:53:21 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276801; x=1597812801; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7TVvSLmTvJ7ZjC/YqB2D4J4WF4j++F7FE/ZWWk3V+Mk=; b=SVQ3/aTZhcrOnPMF5qWnUKeYBY3ol/NoSqzliZIxQuECtzEuJR8ub3B7 tjGEWUm+VrGjRtZe4J3IJ9p7GYs+fdlCxYpRTxJ4F6NcGb6XOYnlCPKm3 cChkY2WMVQK44J6wznW/tdZ6EejlGqXQgTtAVeSN3Ggp/B1zhU2muw7TB HwjX5z6JNmeuN/RwExRYEpSnqJ6XNGNzimkA9ef4yoCg5I+vt9nefofwF iZ+T40nDqxqoqG43Smhv9C0FhJiRFCdh5BwoGXR2CgsK0RGDB9eak8HBp FLgZbyzIgQtBykPlmmKdRQHchdhqeAKu0bCYAEftjk+giTukevgeMteEz A==; IronPort-SDR: KmoEXUt99knkJsZaaklAse2zZ6rtDGvM7UmbhwcnQxNh2sjVI6s13z7VLfESO8KSwfkV3EuaJT 7477MAo5XHDcZCYaI33XP2ANiv8uw45gCMjGoYOw9C/LL+EknQp1n3tyF1Ii509JJdnhTH3p+r npUfxFqcreiEvWRRo/e/11wsypAoi3blRscyNbj/zP4Aa+yv9rWag6V/tb4bL+2Yk4N6Yabooe 2+RRNedKHxAbNjMQEvgMFWHPyF6TEjygOXtQbDjlmF76aNs0bTYWgGWaJUsD1bvNB0fWPwDXjy Jv4= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136306" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:20 +0800 IronPort-SDR: hiCcHc6Xiwt3zzOV3oy0RWj5eQT29hzYgLkuTkjlIkerhFOwu4TSSxreUI4HJUJ0lOQFRtcDr6 1GDWGIgDU8bqhSSoUxdI9r4xQ31aCqB+RvIroijTm5RENCJwGbwISyHiyhAbxmWIuFHIcd+NwZ a8pb0W5+TPsG+ytCIZlRE8XHCV/vaMbLQAtb7kQS2KDlY/eiLFpj/iHOF82DG9pIbCqp2GT4rB DEyvnBNzBm5cj4tu84roOpiXuy87s61E8qEmioZn6GLXb2FyASbL2LvJH+3w3wqHClDfBcVFIc 1o0m2RWija9NWwzpZTwotIdM Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:45 -0700 IronPort-SDR: 
gL0md5HAY+HJKOKP+IyQgyxGOsU/DcBat+CNuHMWGG9UuJOMsLOcBJBwxrbVfWnLyPsKWplJTe g0KJQAVhuZSSfYnbjMPdkwXtn6cxvirr121QfT5XaUnxZ9/f7xqD3J4XSQdndqm3qLlcY6GeMF qOQM6Wp3gvntkrzOHxAgxLgHrvfmPbGJycMzzeTKBJan0ESHucI5w68BB1nJydwUwmdTJsHxaY bH0Jn4S0tnvJDcR4Sw0p8O3tO2Bdr8BonOzD1lETw8lMOG24fF03LvY/TMz/lnnVLu8fUnByAr iGk=
Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:17 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 09/15] btrfs-progs: support zero out on zoned block device
Date: Tue, 20 Aug 2019 13:52:52 +0900
Message-Id: <20190820045258.1571640-10-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

If we zero out a region in a sequential write required zone, we cannot
write to that region again until the zone is reset. Thus, zeroing out
must not touch sequential write required zones.

zero_dev_clamped() is modified to take the zone information. If the
device is host-managed, it calls zero_zone_blocks(), which avoids
writing to sequential write required zones.
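
For illustration only (not part of the patch): a minimal sketch of the
range splitting that zero_zone_blocks() relies on. The range is cut at
zone boundaries so that each piece lies in exactly one zone, and pieces
in sequential zones are skipped. Instead of writing, the hypothetical
sketch_zero_plan() only counts the bytes that would be zeroed, which
keeps it runnable without a device. A power-of-2 zone size is assumed,
as the zone report code in this series requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Walk [start, start + len) zone by zone and return the number of bytes
 * that lie in zones accepting random writes. zone_seq[i] is true when
 * zone i is sequential write required.
 */
static uint64_t sketch_zero_plan(const bool *zone_seq, unsigned int nr_zones,
				 uint64_t zone_size, uint64_t start,
				 uint64_t len)
{
	uint64_t ofst = start;
	uint64_t zeroed = 0;

	/* zone_size must be a non-zero power of 2 for the masking below. */
	assert(zone_size != 0 && (zone_size & (zone_size - 1)) == 0);

	while (len > 0) {
		/* Bytes left in the zone that contains ofst. */
		uint64_t room = zone_size - (ofst & (zone_size - 1));
		uint64_t count = len < room ? len : room;
		uint64_t zno = ofst / zone_size;

		/* Sketch assumption: offsets past the last zone are skipped. */
		if (zno < nr_zones && !zone_seq[zno])
			zeroed += count;

		len -= count;
		ofst += count;
	}

	return zeroed;
}
```

For zones laid out as conventional, sequential, conventional with a
1 MiB zone size, a 2 MiB request starting at 512 KiB covers half of the
first zone, all of the second and half of the third, so only 1 MiB is
eligible for zeroing.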
Signed-off-by: Naohiro Aota --- common/device-utils.c | 14 +++++++++----- common/device-utils.h | 1 + common/hmzoned.c | 29 +++++++++++++++++++++++++++++ common/hmzoned.h | 7 +++++++ 4 files changed, 46 insertions(+), 5 deletions(-) diff --git a/common/device-utils.c b/common/device-utils.c index c7046c22a9fb..840399c860c3 100644 --- a/common/device-utils.c +++ b/common/device-utils.c @@ -67,7 +67,7 @@ int discard_blocks(int fd, u64 start, u64 len) return 0; } -static int zero_blocks(int fd, off_t start, size_t len) +int zero_blocks(int fd, off_t start, size_t len) { char *buf = malloc(len); int ret = 0; @@ -86,7 +86,8 @@ static int zero_blocks(int fd, off_t start, size_t len) #define ZERO_DEV_BYTES SZ_2M /* don't write outside the device by clamping the region to the device size */ -static int zero_dev_clamped(int fd, off_t start, ssize_t len, u64 dev_size) +static int zero_dev_clamped(int fd, struct btrfs_zone_info *zinfo, off_t start, + ssize_t len, u64 dev_size) { off_t end = max(start, start + len); @@ -99,6 +100,9 @@ static int zero_dev_clamped(int fd, off_t start, ssize_t len, u64 dev_size) start = min_t(u64, start, dev_size); end = min_t(u64, end, dev_size); + if (zinfo->model == ZONED_HOST_MANAGED) + return zero_zone_blocks(fd, zinfo, start, end - start); + return zero_blocks(fd, start, end - start); } @@ -206,12 +210,12 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret, } } - ret = zero_dev_clamped(fd, 0, ZERO_DEV_BYTES, block_count); + ret = zero_dev_clamped(fd, &zinfo, 0, ZERO_DEV_BYTES, block_count); for (i = 0 ; !ret && i < BTRFS_SUPER_MIRROR_MAX; i++) - ret = zero_dev_clamped(fd, btrfs_sb_offset(i), + ret = zero_dev_clamped(fd, &zinfo, btrfs_sb_offset(i), BTRFS_SUPER_INFO_SIZE, block_count); if (!ret && (opflags & PREP_DEVICE_ZERO_END)) - ret = zero_dev_clamped(fd, block_count - ZERO_DEV_BYTES, + ret = zero_dev_clamped(fd, &zinfo, block_count - ZERO_DEV_BYTES, ZERO_DEV_BYTES, block_count); if (ret < 0) { diff --git 
a/common/device-utils.h b/common/device-utils.h index 885a46937e0d..7d5b622b8957 100644 --- a/common/device-utils.h +++ b/common/device-utils.h @@ -26,6 +26,7 @@ #define PREP_DEVICE_HMZONED (1U << 3) int discard_blocks(int fd, u64 start, u64 len); +int zero_blocks(int fd, off_t start, size_t len); u64 get_partition_size(const char *dev); u64 disk_size(const char *path); u64 btrfs_device_size(int fd, struct stat *st); diff --git a/common/hmzoned.c b/common/hmzoned.c index 70de111f22da..12eb8f551853 100644 --- a/common/hmzoned.c +++ b/common/hmzoned.c @@ -243,3 +243,32 @@ int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo) return 0; } + +int zero_zone_blocks(int fd, struct btrfs_zone_info *zinfo, off_t start, + size_t len) +{ + size_t zone_len = zinfo->zone_size; + off_t ofst = start; + size_t count; + int ret; + + /* Make sure that zero_blocks does not write sequential zones */ + while (len > 0) { + + /* Limit zero_blocks to a single zone */ + count = min_t(size_t, len, zone_len); + if (count > zone_len - (ofst & (zone_len - 1))) + count = zone_len - (ofst & (zone_len - 1)); + + if (!zone_is_sequential(zinfo, ofst)) { + ret = zero_blocks(fd, ofst, count); + if (ret != 0) + return ret; + } + + len -= count; + ofst += count; + } + + return 0; +} diff --git a/common/hmzoned.h b/common/hmzoned.h index c4e20ae71d21..75812716ffd9 100644 --- a/common/hmzoned.h +++ b/common/hmzoned.h @@ -57,6 +57,8 @@ int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, #ifdef BTRFS_ZONED bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr); int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo); +int zero_zone_blocks(int fd, struct btrfs_zone_info *zinfo, off_t start, + size_t len); #else static inline bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr) @@ -67,6 +69,11 @@ static inline int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo) { return -EOPNOTSUPP; } +static int zero_zone_blocks(int fd, struct 
btrfs_zone_info *zinfo, off_t start, + size_t len) +{ + return -EOPNOTSUPP; +} #endif /* BTRFS_ZONED */ #endif /* __BTRFS_HMZONED_H__ */ From patchwork Tue Aug 20 04:52:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102863 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E6C431395 for ; Tue, 20 Aug 2019 04:53:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C627A22CF5 for ; Tue, 20 Aug 2019 04:53:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="X1I7QPOD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729210AbfHTExX (ORCPT ); Tue, 20 Aug 2019 00:53:23 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729190AbfHTExX (ORCPT ); Tue, 20 Aug 2019 00:53:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276803; x=1597812803; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ai677rvWpTtGocwHpjzEcXgOD/ERjn3uyQqzNuN/eTU=; b=X1I7QPODNrmP5n6qfVBx9Jvz5J7EvrIL6XYK+X9FgWvrTnm2xaiXjZ0H ALfFPtW7XB074h7qIG9IhDxoUTEaUn0U4QHd/RtjnGKx5QvNRrIBrXYXN 7hC8++H444bfKj0aOQhB4RcmZdcJ7DoLOCGkUjZGA7tUG2S0CUohd9mLz vndDsKne70noXYBV0Im2kVW2EwOaoPVM6QWYUkhulCGjyX2oIUc9ARUgE YZtYxLu3zcZrKaIKKO9s8JDkHhrJfO9CJWSJYH1nyhrfuHaSk+I4thAGm dp0k5LsJQa3kS4yrw6zW1jdRrHRMPDOhuFR5JW0IYNCKi1NKNWrUWZfpE w==; IronPort-SDR: TRTg6VnBOEZ6l4Q2nT4YsuoCaSb/cm7fp8kX8vEwMIABvSudHq9Fi4NJaoZ6F4iaaA7kkmsqlw 
byYj6vCkY+YPLrV/Cc7qN/mNi3MvSf5AvHFdoFvGheuIhKOBPc8j6DhuYm+x/335TqGo9yK7B6 HpeHYPpgjk/drkLZLI7B4N6bQ/otos8UW183kXl0m789JHE5Myo7apwTQWP1HtFGv3vO4qhsEh djqUyszdqhEKkhmr+Ti18907RUpvkML6SygTVsfSk2WkfP8172lWbgNSbWJ7/WNH3HTouprI3B fSk= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136311" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:22 +0800 IronPort-SDR: RZ1QMgk56b6PgUsx0kcPjnWs6iOSsTM2xUIi3GTnegtWvS7KFr0PKYPFDwcfF39RYf5SQIuY7d 4m2wMDyB5B7xwC6iLf1WjdzRbmU9j0yrwPui9gT2llyEaCEYZslbZJZicyOC0zer8VA/mZk+Lu gGHmRkls7SUomrjCmwcKtoGhtFXPC3yYYBfWmvqE491I/xyO3xBhvZexQvYtfF666B1yfIYZUo HE8BfSVjyChVAD8zsRU0JJ8AGA/mO9pnZeFmdgKibRMy868+lTN1SY6MUmOd1grKejfyseUE8b N5G7ljFPwhLB00DvPLJf3Um0 Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:46 -0700 IronPort-SDR: my0DR+dwR3izy2tnqKK06TihZuC7EGipzY1zZVH2EeeO/ltdnvGFSBNMOSUjnPWPhhjRFhYQBq w6gUE2wod+GitNZhcZ0rbx/uDR4bef+bmzeJgPq4RV6kOVpcx/ugW3z4aAwqYbkDIttHhzmOBR dPXYBdJbzEW/+KdLxQ/L4mAh0TskHsS57qj1mKN9iujgPAgPoUQLJBe+rQF1SiztF0mZlJWc6B WIfwiilrc6IunTfGEBeXtsJL8RdJFKEtnFL2NbHmCRGmp7ctWkZ033tHRFR/RS54znIb1V4Exx /EM= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:19 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 10/15] btrfs-progs: align device extent allocation to zone boundary Date: Tue, 20 Aug 2019 13:52:53 +0900 Message-Id: <20190820045258.1571640-11-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> 
MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In HMZONED mode, align the device extents to zone boundaries so that a zone reset affects only the device extent and does not change the state of blocks in the neighboring device extents. Also, check that a region allocation is always over empty same-type zones and it is not over any locations of super block copies. Signed-off-by: Naohiro Aota --- common/hmzoned.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++ common/hmzoned.h | 2 ++ kerncompat.h | 2 ++ volumes.c | 72 ++++++++++++++++++++++++++++++++++++++++------ volumes.h | 7 +++++ 5 files changed, 149 insertions(+), 8 deletions(-) diff --git a/common/hmzoned.c b/common/hmzoned.c index 12eb8f551853..b1d9f5574d35 100644 --- a/common/hmzoned.c +++ b/common/hmzoned.c @@ -26,6 +26,8 @@ #include "common/messages.h" #include "mkfs/common.h" #include "common/hmzoned.h" +#include "volumes.h" +#include "disk-io.h" #define BTRFS_REPORT_NR_ZONES 8192 @@ -272,3 +274,75 @@ int zero_zone_blocks(int fd, struct btrfs_zone_info *zinfo, off_t start, return 0; } + +static inline bool btrfs_dev_is_empty_zone(struct btrfs_device *device, u64 pos) +{ + struct btrfs_zone_info *zinfo = &device->zone_info; + unsigned int zno; + + if (!zone_is_sequential(zinfo, pos)) + return true; + + zno = pos / zinfo->zone_size; + return zinfo->zones[zno].cond == BLK_ZONE_COND_EMPTY; +} + + +/* + * btrfs_check_allocatable_zones - check if specified region is + * suitable for allocation + * @device: the device to allocate a region + * @pos: the position of the region + * @num_bytes: the size of the region + * + * In non-ZONED device, anywhere is suitable for allocation. In ZONED + * device, check if + * 1) the region is not on non-empty zones, + * 2) all zones in the region have the same zone type, + * 3) it does not contain super block location, if the zones are + * sequential.
+ */ +bool btrfs_check_allocatable_zones(struct btrfs_device *device, u64 pos, + u64 num_bytes) +{ + struct btrfs_zone_info *zinfo = &device->zone_info; + u64 nzones, begin, end; + u64 sb_pos; + bool is_sequential; + int i; + + if (zinfo->model == ZONED_NONE) + return true; + + nzones = num_bytes / zinfo->zone_size; + begin = pos / zinfo->zone_size; + end = begin + nzones; + + ASSERT(IS_ALIGNED(pos, zinfo->zone_size)); + ASSERT(IS_ALIGNED(num_bytes, zinfo->zone_size)); + + if (end > zinfo->nr_zones) + return false; + + is_sequential = btrfs_dev_is_sequential(device, pos); + if (is_sequential) { + for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) { + sb_pos = btrfs_sb_offset(i); + if (!(sb_pos + BTRFS_SUPER_INFO_SIZE <= pos || + pos + num_bytes <= sb_pos)) + return false; + } + } + + while (num_bytes) { + if (!btrfs_dev_is_empty_zone(device, pos)) + return false; + if (is_sequential != btrfs_dev_is_sequential(device, pos)) + return false; + + pos += zinfo->zone_size; + num_bytes -= zinfo->zone_size; + } + + return true; +} diff --git a/common/hmzoned.h b/common/hmzoned.h index 75812716ffd9..93759291871f 100644 --- a/common/hmzoned.h +++ b/common/hmzoned.h @@ -53,6 +53,8 @@ enum btrfs_zoned_model zoned_model(const char *file); size_t zone_size(const char *file); int btrfs_get_zone_info(int fd, const char *file, bool hmzoned, struct btrfs_zone_info *zinfo); +bool btrfs_check_allocatable_zones(struct btrfs_device *device, u64 pos, + u64 num_bytes); #ifdef BTRFS_ZONED bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr); diff --git a/kerncompat.h b/kerncompat.h index 9fdc58e25d43..b5105df20a3e 100644 --- a/kerncompat.h +++ b/kerncompat.h @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -345,6 +346,7 @@ static inline void assert_trace(const char *assertion, const char *filename, /* Alignment check */ #define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0) +#define ALIGN(x, a) __ALIGN_KERNEL((x), (a)) static inline int
is_power_of_2(unsigned long n) { diff --git a/volumes.c b/volumes.c index a0ebed547faa..14ca3df0efdb 100644 --- a/volumes.c +++ b/volumes.c @@ -465,6 +465,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device, int slot; struct extent_buffer *l; u64 min_search_start; + u64 zone_size; /* * We don't want to overwrite the superblock on the drive nor any area @@ -473,6 +474,13 @@ static int find_free_dev_extent_start(struct btrfs_device *device, */ min_search_start = max(root->fs_info->alloc_start, (u64)SZ_1M); search_start = max(search_start, min_search_start); + /* + * For a zoned block device, skip the first zone of the device + * entirely. + */ + zone_size = device->zone_info.zone_size; + search_start = max_t(u64, search_start, zone_size); + search_start = btrfs_zone_align(device, search_start); path = btrfs_alloc_path(); if (!path) @@ -481,6 +489,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device, max_hole_start = search_start; max_hole_size = 0; +again: if (search_start >= search_end) { ret = -ENOSPC; goto out; @@ -525,6 +534,13 @@ static int find_free_dev_extent_start(struct btrfs_device *device, goto next; if (key.offset > search_start) { + if (!btrfs_check_allocatable_zones(device, search_start, + num_bytes)) { + search_start += zone_size; + btrfs_release_path(path); + goto again; + } + hole_size = key.offset - search_start; /* @@ -567,6 +583,13 @@ next: * search_end may be smaller than search_start. 
*/ if (search_end > search_start) { + if (!btrfs_check_allocatable_zones(device, search_start, + num_bytes)) { + search_start += zone_size; + btrfs_release_path(path); + goto again; + } + hole_size = search_end - search_start; if (hole_size > max_hole_size) { @@ -610,6 +633,10 @@ int btrfs_insert_dev_extent(struct btrfs_trans_handle *trans, struct extent_buffer *leaf; struct btrfs_key key; + /* Check alignment to zone for a zoned block device */ + ASSERT(device->zone_info.model != ZONED_HOST_MANAGED || + IS_ALIGNED(start, device->zone_info.zone_size)); + path = btrfs_alloc_path(); if (!path) return -ENOMEM; @@ -1012,17 +1039,13 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, int max_stripes = 0; int min_stripes = 1; int sub_stripes; /* sub_stripes info for map */ - int dev_stripes __attribute__((unused)); - /* stripes per dev */ + int dev_stripes; /* stripes per dev */ int devs_max; /* max devs to use */ - int devs_min __attribute__((unused)); - /* min devs needed */ + int devs_min; /* min devs needed */ int devs_increment __attribute__((unused)); /* ndevs has to be a multiple of this */ - int ncopies __attribute__((unused)); - /* how many copies to data has */ - int nparity __attribute__((unused)); - /* number of stripes worth of bytes to + int ncopies; /* how many copies to data has */ + int nparity; /* number of stripes worth of bytes to store parity information */ int looped = 0; int ret; @@ -1030,6 +1053,8 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, int stripe_len = BTRFS_STRIPE_LEN; struct btrfs_key key; u64 offset; + bool hmzoned = info->fs_devices->hmzoned; + u64 zone_size = info->fs_devices->zone_size; if (list_empty(dev_list)) { return -ENOSPC; @@ -1116,13 +1141,39 @@ int btrfs_alloc_chunk(struct btrfs_trans_handle *trans, btrfs_super_stripesize(info->super_copy)); } + if (hmzoned) { + calc_size = zone_size; + max_chunk_size = round_down(max_chunk_size, zone_size); + } + /* we don't want a chunk larger than 10% of the FS */ 
percent_max = div_factor(btrfs_super_total_bytes(info->super_copy), 1); max_chunk_size = min(percent_max, max_chunk_size); + if (hmzoned) { + int min_num_stripes = devs_min * dev_stripes; + int min_data_stripes = (min_num_stripes - nparity) / ncopies; + u64 min_chunk_size = min_data_stripes * zone_size; + + max_chunk_size = max(round_down(max_chunk_size, + zone_size), + min_chunk_size); + } + again: if (chunk_bytes_by_type(type, calc_size, num_stripes, sub_stripes) > max_chunk_size) { + if (hmzoned) { + /* + * calc_size is fixed in HMZONED. Reduce + * num_stripes instead. + */ + num_stripes = max_chunk_size / calc_size; + if (num_stripes < min_stripes) + return -ENOSPC; + goto again; + } + calc_size = max_chunk_size; calc_size /= num_stripes; calc_size /= stripe_len; @@ -1133,6 +1184,9 @@ again: calc_size /= stripe_len; calc_size *= stripe_len; + + ASSERT(!hmzoned || calc_size == zone_size); + INIT_LIST_HEAD(&private_devs); cur = dev_list->next; index = 0; @@ -1214,6 +1268,8 @@ again: if (ret < 0) goto out_chunk_map; + ASSERT(!zone_size || IS_ALIGNED(dev_offset, zone_size)); + device->bytes_used += calc_size; ret = btrfs_update_device(trans, device); if (ret < 0) diff --git a/volumes.h b/volumes.h index b5e7a07df5a8..d1326d068ca3 100644 --- a/volumes.h +++ b/volumes.h @@ -323,4 +323,11 @@ static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos) { return zone_is_sequential(&device->zone_info, pos); } +static inline u64 btrfs_zone_align(struct btrfs_device *device, u64 pos) +{ + if (device->zone_info.model == ZONED_NONE) + return pos; + + return ALIGN(pos, device->zone_info.zone_size); +} #endif From patchwork Tue Aug 20 04:52:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102869 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP 
id 3F96214DB for ; Tue, 20 Aug 2019 04:53:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 14C6F22CF4 for ; Tue, 20 Aug 2019 04:53:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="M1XbDQ5U" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729222AbfHTExZ (ORCPT ); Tue, 20 Aug 2019 00:53:25 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729190AbfHTExZ (ORCPT ); Tue, 20 Aug 2019 00:53:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276804; x=1597812804; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6wqBtM5vCnBfxci3OwPFbWOSVcPwW9ETDEy611DquL4=; b=M1XbDQ5UJ2SR7MJ/MQq5zP3c/mHThRpm6mRs6PyJTeMoTB4gjuzIXKDE CGnrv3rnd8SdFlBNZ/KQqPJDTwk5liWOjKml4kynsvt/AUpJme6N+TTST i2OhVj77lGhETCZPYsHAveVN9knnCn00pL0Q+KtEWHylfGX/HYGZiO5S7 QV3W9Fns9stlefj9npfAy8jgWwLenQ/rP4Y4HWXFkNsge1YIBwTbIJBt6 1Hs6ZiBiVc29aFoTubXB/N/Gna/yE50Ha5T2jlZ4DLQW6+AsiV4Skx90L GJqjyGr6Ehe5gpSv5jJTJcbCUffNSPqvKTTKlxn/EzJPM3fW/xN0kegbs w==; IronPort-SDR: hYXu+b47PBKytIqWwi1Ho9OYDKpbEGJdMl5/aw11D5X8LNYGggcS7aeyJbOW1D6ov24Kc6H+AJ /kC1E7T4bhlZpOtg7LVQaAY5COLxyERECqxbNuF2b+1TWBwsgKNdV/aFDN65+2Yzwz1n0CCdZw 9AbmvLetHty4JXS2KGYSaJYmk/a3eSHY8/uym+KKEswRE5hjNzdyIxDL1W+UfzWB7O/tUWC3LS zFu6tDs8B+9wDUa+LkW7I9oJTnUYTRQ+/f6d+V9+Yc+KJhkOZQMx1iv+GrivK1SO22o4Wex8mx 2QM= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136314" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:24 +0800 IronPort-SDR: RXvp/CUKsTFaSXSRSy3mASo8npG1uB3zR92AohctyapfQWx0I56eqrB5pTAGxRMshcMGXLRh+L 
N7yfy9vWFl6eAd+1AKI055omsBiwixU82qvth8jxiYmQixHkcKhS7K6BpYkVU+7gBu8UJHNRzY Stlyg1zofD6G9n3LjT+1vCQrTnYSiWEeBPO5zQK4tm2RKa940BXbfrdbWCOuhlwIBC1exfmdGg hc4xPmHVIkDTQwh2TswzalgxMLjqCSnWKHvjtsSV2cY+0MAo0kPs0k5DXeqtu6A69+TBT3j4y7 lngcAWUUS/EGN4uEQKZ2aeop Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:48 -0700 IronPort-SDR: I+fkYedZC88d6yxFs0d42SQhrxqFWEOgd2XYptylaI+iur4QTqgT5fSPps2gOh98eNnQD9iy1N ys1FAA0gVBW7CM3Our20Eq7s9xZ4amyi0vWiB1FX4kOdHDfSV7GG/ygsBMtgmE0dw7HVoaWanX lEm1SrW06ewgHeVSlS+Lodd8mnAKcpSWptsPyfSya8/VCVdKF4PaPjUupSvFwzBSFJDYCgHAbY 2plRyAtyfhrrH1hvs9aw5FX2xNWp/SJMajvLj+xR6DkukKqceYkZIfNqFR/LwYRKKI/Pc8W62K 7r4= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:21 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 11/15] btrfs-progs: do sequential allocation in HMZONED mode Date: Tue, 20 Aug 2019 13:52:54 +0900 Message-Id: <20190820045258.1571640-12-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org On HMZONED drives, writes must always be sequential and directed at a block group zone write pointer position. Thus, block allocation in a block group must also be done sequentially using an allocation pointer equal to the block group zone write pointer plus the number of blocks allocated but not yet written. 
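The allocation-pointer rule described above can be sketched as a bump allocator over one block group. This is a simplified illustration, not the code from the patch: struct demo_block_group and demo_alloc_seq() are hypothetical names, and the fields only mirror the roles of key.objectid, key.offset and alloc_offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a block group in sequential mode. */
struct demo_block_group {
	uint64_t start;        /* logical start of the block group */
	uint64_t length;       /* size of the block group in bytes */
	uint64_t alloc_offset; /* next allocatable byte, relative to start */
};

/*
 * Allocate num bytes at the allocation pointer and advance it.
 * Free space behind the pointer is never reused; when the remaining
 * space is too small the caller has to move on to another block group.
 */
static bool demo_alloc_seq(struct demo_block_group *bg, uint64_t num,
			   uint64_t *start_ret)
{
	if (bg->length - bg->alloc_offset < num)
		return false;

	*start_ret = bg->start + bg->alloc_offset;
	bg->alloc_offset += num;
	return true;
}
```

Because the pointer only moves forward, the order of allocations matches the order in which the device expects the writes to arrive.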
Signed-off-by: Naohiro Aota --- common/hmzoned.c | 212 +++++++++++++++++++++++++++++++++++++++++++++++ common/hmzoned.h | 7 ++ ctree.h | 16 ++++ extent-tree.c | 15 ++++ 4 files changed, 250 insertions(+) diff --git a/common/hmzoned.c b/common/hmzoned.c index b1d9f5574d35..0e54144259b7 100644 --- a/common/hmzoned.c +++ b/common/hmzoned.c @@ -31,6 +31,9 @@ #define BTRFS_REPORT_NR_ZONES 8192 +/* Invalid allocation pointer value for missing devices */ +#define WP_MISSING_DEV ((u64)-1) + enum btrfs_zoned_model zoned_model(const char *file) { char model[32]; @@ -346,3 +349,212 @@ bool btrfs_check_allocatable_zones(struct btrfs_device *device, u64 pos, return true; } + +int btrfs_load_block_group_zone_info(struct btrfs_fs_info *fs_info, + struct btrfs_block_group_cache *cache) +{ + struct btrfs_device *device; + struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree; + struct cache_extent *ce; + struct map_lookup *map; + u64 logical = cache->key.objectid; + u64 length = cache->key.offset; + u64 physical = 0; + int ret = 0; + int alloc_type; + int i, j; + u64 zone_size = fs_info->fs_devices->zone_size; + u64 *alloc_offsets = NULL; + + if (!btrfs_fs_incompat(fs_info, HMZONED)) + return 0; + + /* Sanity check */ + if (logical == BTRFS_BLOCK_RESERVED_1M_FOR_SUPER) { + if (length + SZ_1M != zone_size) { + error("unaligned initial system block group"); + return -EIO; + } + } else if (!IS_ALIGNED(length, zone_size)) { + error("unaligned block group at %llu + %llu", logical, length); + return -EIO; + } + + /* Get the chunk mapping */ + ce = search_cache_extent(&map_tree->cache_tree, logical); + if (!ce) { + error("failed to find block group at %llu", logical); + return -ENOENT; + } + map = container_of(ce, struct map_lookup, ce); + + /* + * Get the zone type: if the group is mapped to a non-sequential zone, + * there is no need for the allocation offset (fit allocation is OK). 
+ */ + alloc_type = -1; + alloc_offsets = calloc(map->num_stripes, sizeof(*alloc_offsets)); + if (!alloc_offsets) { + error("failed to allocate alloc_offsets"); + return -ENOMEM; + } + + for (i = 0; i < map->num_stripes; i++) { + bool is_sequential; + struct blk_zone zone; + + device = map->stripes[i].dev; + physical = map->stripes[i].physical; + + is_sequential = btrfs_dev_is_sequential(device, physical); + if (alloc_type == -1) + alloc_type = is_sequential ? + BTRFS_ALLOC_SEQ : BTRFS_ALLOC_FIT; + + if ((is_sequential && alloc_type != BTRFS_ALLOC_SEQ) || + (!is_sequential && alloc_type == BTRFS_ALLOC_SEQ)) { + error("found block group of mixed zone types"); + ret = -EIO; + goto out; + } + + if (!is_sequential) + continue; + + /* + * The group is mapped to a sequential zone. Get the zone write + * pointer to determine the allocation offset within the zone. + */ + WARN_ON(!IS_ALIGNED(physical, zone_size)); + zone = device->zone_info.zones[physical / zone_size]; + + switch (zone.cond) { + case BLK_ZONE_COND_OFFLINE: + case BLK_ZONE_COND_READONLY: + error("Offline/readonly zone %llu", + physical / fs_info->fs_devices->zone_size); + ret = -EIO; + goto out; + case BLK_ZONE_COND_EMPTY: + alloc_offsets[i] = 0; + break; + case BLK_ZONE_COND_FULL: + alloc_offsets[i] = zone_size; + break; + default: + /* Partially used zone */ + alloc_offsets[i] = ((zone.wp - zone.start) << 9); + break; + } + } + + if (alloc_type == BTRFS_ALLOC_FIT) + goto out; + + switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { + case 0: /* single */ + case BTRFS_BLOCK_GROUP_DUP: + case BTRFS_BLOCK_GROUP_RAID1: + cache->alloc_offset = WP_MISSING_DEV; + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV) + continue; + if (cache->alloc_offset == WP_MISSING_DEV) + cache->alloc_offset = alloc_offsets[i]; + if (alloc_offsets[i] == cache->alloc_offset) + continue; + + error("write pointer mismatch: block group %llu", + logical); + ret = -EIO; + goto out; + } + break; + case 
BTRFS_BLOCK_GROUP_RAID0: + cache->alloc_offset = 0; + for (i = 0; i < map->num_stripes; i++) { + if (alloc_offsets[i] == WP_MISSING_DEV) { + error("cannot recover write pointer: block group %llu", + logical); + ret = -EIO; + goto out; + } + + if (alloc_offsets[0] < alloc_offsets[i]) { + error( + "write pointer mismatch: block group %llu", + logical); + ret = -EIO; + goto out; + + } + + cache->alloc_offset += alloc_offsets[i]; + } + break; + case BTRFS_BLOCK_GROUP_RAID10: + /* + * Pass1: check write pointer of RAID1 level: each pointer + * should be equal. + */ + for (i = 0; i < map->num_stripes / map->sub_stripes; i++) { + int base = i * map->sub_stripes; + u64 offset = WP_MISSING_DEV; + + for (j = 0; j < map->sub_stripes; j++) { + if (alloc_offsets[base + j] == WP_MISSING_DEV) + continue; + if (offset == WP_MISSING_DEV) + offset = alloc_offsets[base+j]; + if (alloc_offsets[base + j] == offset) + continue; + + error( + "write pointer mismatch: block group %llu", + logical); + ret = -EIO; + goto out; + } + for (j = 0; j < map->sub_stripes; j++) + alloc_offsets[base + j] = offset; + } + + /* Pass2: check write pointer of RAID0 level */ + cache->alloc_offset = 0; + for (i = 0; i < map->num_stripes / map->sub_stripes; i++) { + int base = i * map->sub_stripes; + + if (alloc_offsets[base] == WP_MISSING_DEV) { + error( + "cannot recover write pointer: block group %llu", + logical); + ret = -EIO; + goto out; + } + + if (alloc_offsets[0] < alloc_offsets[base]) { + error( + "write pointer mismatch: block group %llu", + logical); + ret = -EIO; + goto out; + } + + cache->alloc_offset += alloc_offsets[base]; + } + break; + case BTRFS_BLOCK_GROUP_RAID5: + case BTRFS_BLOCK_GROUP_RAID6: + /* RAID5/6 is not supported yet */ + default: + error("Unsupported profile %llu", + map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK); + ret = -EINVAL; + goto out; + } + +out: + cache->alloc_type = alloc_type; + free(alloc_offsets); + return ret; +} diff --git a/common/hmzoned.h b/common/hmzoned.h
index 93759291871f..dca7588f840b 100644 --- a/common/hmzoned.h +++ b/common/hmzoned.h @@ -61,6 +61,8 @@ bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr); int btrfs_discard_all_zones(int fd, struct btrfs_zone_info *zinfo); int zero_zone_blocks(int fd, struct btrfs_zone_info *zinfo, off_t start, size_t len); +int btrfs_load_block_group_zone_info(struct btrfs_fs_info *fs_info, + struct btrfs_block_group_cache *cache); #else static inline bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr) @@ -76,6 +78,11 @@ static int zero_zone_blocks(int fd, struct btrfs_zone_info *zinfo, off_t start, { return -EOPNOTSUPP; } +static inline int btrfs_load_block_group_zone_info( + struct btrfs_fs_info *fs_info, struct btrfs_block_group_cache *cache) +{ + return 0; +} #endif /* BTRFS_ZONED */ #endif /* __BTRFS_HMZONED_H__ */ diff --git a/ctree.h b/ctree.h index a56e18119069..d38708b8a6c5 100644 --- a/ctree.h +++ b/ctree.h @@ -1087,6 +1087,20 @@ struct btrfs_space_info { struct list_head list; }; +/* Block group allocation types */ +enum btrfs_alloc_type { + + /* Regular first fit allocation */ + BTRFS_ALLOC_FIT = 0, + + /* + * Sequential allocation: this is for HMZONED mode and + * will result in ignoring free space before a block + * group allocation offset. 
+ */ + BTRFS_ALLOC_SEQ = 1, +}; + struct btrfs_block_group_cache { struct cache_extent cache; struct btrfs_key key; @@ -1109,6 +1123,8 @@ struct btrfs_block_group_cache { */ u32 bitmap_low_thresh; + enum btrfs_alloc_type alloc_type; + u64 alloc_offset; }; struct btrfs_device; diff --git a/extent-tree.c b/extent-tree.c index 932af2c644bd..35fddfbd9acc 100644 --- a/extent-tree.c +++ b/extent-tree.c @@ -251,6 +251,14 @@ again: if (cache->ro || !block_group_bits(cache, data)) goto new_group; + if (cache->alloc_type == BTRFS_ALLOC_SEQ) { + if (cache->key.offset - cache->alloc_offset < num) + goto new_group; + *start_ret = cache->key.objectid + cache->alloc_offset; + cache->alloc_offset += num; + return 0; + } + while(1) { ret = find_first_extent_bit(&root->fs_info->free_space_cache, last, &start, &end, EXTENT_DIRTY); @@ -2724,6 +2732,10 @@ int btrfs_read_block_groups(struct btrfs_root *root) BUG_ON(ret); cache->space_info = space_info; + ret = btrfs_load_block_group_zone_info(info, cache); + if (ret) + goto error; + /* use EXTENT_LOCKED to prevent merging */ set_extent_bits(block_group_cache, found_key.objectid, found_key.objectid + found_key.offset - 1, @@ -2753,6 +2765,9 @@ btrfs_add_block_group(struct btrfs_fs_info *fs_info, u64 bytes_used, u64 type, cache->key.objectid = chunk_offset; cache->key.offset = size; + ret = btrfs_load_block_group_zone_info(fs_info, cache); + BUG_ON(ret); + cache->key.type = BTRFS_BLOCK_GROUP_ITEM_KEY; btrfs_set_block_group_used(&cache->item, bytes_used); btrfs_set_block_group_chunk_objectid(&cache->item, From patchwork Tue Aug 20 04:52:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 11102871 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F11F71395 for ; Tue, 20 Aug 2019 04:53:28 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D086C20C01 for ; Tue, 20 Aug 2019 04:53:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="dLd9uv4V" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729228AbfHTEx1 (ORCPT ); Tue, 20 Aug 2019 00:53:27 -0400 Received: from esa5.hgst.iphmx.com ([216.71.153.144]:11098 "EHLO esa5.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729190AbfHTEx0 (ORCPT ); Tue, 20 Aug 2019 00:53:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1566276806; x=1597812806; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SAnP0DvB0b+BvO+0g0H5T4EmJJtUjqsPCUh699GxsZU=; b=dLd9uv4V+30LsUv8mNgYZ0u0xbcF2Wk5ZN5w+lADdHMs1Mmo6fXgXYeF ltOhN+tLNZzdSfChoBd/y8ZgY1PTpm9efgZqMdnDGmkuY+1ssNV5bZxML cYYkrpUfm9/TdtSak/iBnYqVOPe3plgnLAFuy4JfaMu0o+X0TdzzkhJlo iXc4DZyCPs1OdN52WsxHmTO4H1gbjkj2QIyVe6zuGappEtsv6BdUGLt9K raiDWqwoPJ8n1WiHKVDa+jgUODEdt6+rTs6peDdgN2jnWDkx9DOITkEyY YZ9wUhVspjDQp9Re/O/+zCMsAU6t8NfW3EwOtayo47Xs1m/9zB2G/3ZyH w==; IronPort-SDR: RgyG8UmN0lFL36rAm9aKE8Cm4ZIyPBNlLId1B7Oy29qvKgp2RAbr3QVA/Uu3yy8nUmvMLDQZTL nSLEBNpb7QL0vkYKAnfr79I2eqCyG/y0qnOfx8dPd2s1u/usYLLP6t3Mkt1mEyd7m6AF3KvJnl hy0PeYUGOLTe1l5CRVOtnU/5XcTeR7MFRFH64SOlmOSJDm3GM9grAGos+OgUVibMhdyLJS6DHa TD6HgEJbdyveD36Fvu6PxIwiFJx61yO5xpAM7F9Te5RL1W8uo57hU3+ro3kCMm0WQgEYHFfvgp AFo= X-IronPort-AV: E=Sophos;i="5.64,407,1559491200"; d="scan'208";a="117136317" Received: from uls-op-cesaip01.wdc.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 20 Aug 2019 12:53:26 +0800 IronPort-SDR: gs+1qRxJm2mvn2AN21korKrJErDYyjkHewc+lx67AMSo7WpVCwaYjeP7g3RboZp122gcOjSNQb Noc7n/7kQbHcPj4qtbkzxZQHdfKJPbise0nhH6+nO7PyHUP9fn9Foeqf+IyyG/+ZH7ZAsoutHz 
h2XfCGnAV0jJs+pUbHk0xl8ipG5JAJGyUeatStOx4p7ZiXbTo0YLkgFqBI6fOMcthy57hvhhkV HxMzF5Q6amg3Mv5nuoCI0V1mFzsfxyvxctsGXfi9+yuHlZ4IVFWEWXrT721fH1B3aeYyvUxLwu +7WZPMmUCg3HZm0Hhq9HWO8z Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Aug 2019 21:50:50 -0700 IronPort-SDR: VIchjVqil0QIEtBBpawPK1VhwNh3EOikcg1xX0rBJ7YU5Y5W0sEzKr+D45MWvyTHgMps3osHIt NeBQXFjfGYhX/lwjzJYv1Qa4qWggkRlQ3dJjvld62r/HAxXM4e78JiGN8PrFqfxVNF+6YxOtha 21n8OzTQalQoQ2jJwQGQC+EXiT1MW0v0ly8x2KCUZPO94VqL2SeF6pl4EPpEoXpfY//IUae+L0 gqEUvaBPPVB/Aie7clDT4HjekEuFoPxKxjO4ZKOufUOG07AhW5JpvLasPB7XnBrCG68Bedulr2 qHc= Received: from naota.dhcp.fujisawa.hgst.com (HELO naota.fujisawa.hgst.com) ([10.149.53.115]) by uls-op-cesaip02.wdc.com with ESMTP; 19 Aug 2019 21:53:23 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org, David Sterba Cc: Chris Mason , Josef Bacik , Nikolay Borisov , Damien Le Moal , Matias Bjorling , Johannes Thumshirn , Hannes Reinecke , linux-fsdevel@vger.kernel.org, Naohiro Aota Subject: [PATCH v3 12/15] btrfs-progs: redirty clean extent buffers in seq Date: Tue, 20 Aug 2019 13:52:55 +0900 Message-Id: <20190820045258.1571640-13-naohiro.aota@wdc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com> References: <20190820045258.1571640-1-naohiro.aota@wdc.com> MIME-Version: 1.0 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Tree-manipulating operations such as merging nodes often release once-allocated tree nodes. Btrfs cleans such nodes so that the pages in the node are not uselessly written out. On HMZONED drives, however, this optimization blocks subsequent I/Os, because cancelling the write-out of the freed blocks breaks the sequential write sequence expected by the device. This patch checks whether the next dirty extent buffer is contiguous with the previously written one.
If not, it redirties the extent buffers between the previous one and the
next one, so that all dirty buffers are written sequentially.

Signed-off-by: Naohiro Aota
---
 common/hmzoned.c | 28 ++++++++++++++++++++++++++++
 common/hmzoned.h |  2 ++
 ctree.h          |  1 +
 transaction.c    |  7 +++++++
 4 files changed, 38 insertions(+)

diff --git a/common/hmzoned.c b/common/hmzoned.c
index 0e54144259b7..1b3830b429ab 100644
--- a/common/hmzoned.c
+++ b/common/hmzoned.c
@@ -555,6 +555,34 @@ int btrfs_load_block_group_zone_info(struct btrfs_fs_info *fs_info,

 out:
 	cache->alloc_type = alloc_type;
+	cache->write_offset = cache->alloc_offset;
 	free(alloc_offsets);
 	return ret;
 }
+
+bool btrfs_redirty_extent_buffer_for_hmzoned(struct btrfs_fs_info *fs_info,
+					     u64 start, u64 end)
+{
+	u64 next;
+	struct btrfs_block_group_cache *cache;
+	struct extent_buffer *eb;
+
+	cache = btrfs_lookup_first_block_group(fs_info, start);
+	BUG_ON(!cache);
+
+	if (cache->alloc_type != BTRFS_ALLOC_SEQ)
+		return false;
+
+	if (cache->key.objectid + cache->write_offset < start) {
+		next = cache->key.objectid + cache->write_offset;
+		BUG_ON(next + fs_info->nodesize > start);
+		eb = btrfs_find_create_tree_block(fs_info, next);
+		btrfs_mark_buffer_dirty(eb);
+		free_extent_buffer(eb);
+		return true;
+	}
+
+	cache->write_offset += (end + 1 - start);
+
+	return false;
+}
diff --git a/common/hmzoned.h b/common/hmzoned.h
index dca7588f840b..bcbf6ea15c0b 100644
--- a/common/hmzoned.h
+++ b/common/hmzoned.h
@@ -55,6 +55,8 @@ int btrfs_get_zone_info(int fd, const char *file, bool hmzoned,
 			struct btrfs_zone_info *zinfo);
 bool btrfs_check_allocatable_zones(struct btrfs_device *device, u64 pos,
 				   u64 num_bytes);
+bool btrfs_redirty_extent_buffer_for_hmzoned(struct btrfs_fs_info *fs_info,
+					     u64 start, u64 end);

 #ifdef BTRFS_ZONED
 bool zone_is_sequential(struct btrfs_zone_info *zinfo, u64 bytenr);
diff --git a/ctree.h b/ctree.h
index d38708b8a6c5..cd315814614a 100644
--- a/ctree.h
+++ b/ctree.h
@@ -1125,6 +1125,7 @@ struct btrfs_block_group_cache {

 	enum btrfs_alloc_type alloc_type;
 	u64 alloc_offset;
+	u64 write_offset;
 };

 struct btrfs_device;
diff --git a/transaction.c b/transaction.c
index 45bb9e1f9de6..7b37f12f118f 100644
--- a/transaction.c
+++ b/transaction.c
@@ -18,6 +18,7 @@
 #include "disk-io.h"
 #include "transaction.h"
 #include "delayed-ref.h"
+#include "common/hmzoned.h"

 #include "common/messages.h"

@@ -136,10 +137,16 @@ int __commit_transaction(struct btrfs_trans_handle *trans,
 	int ret;

 	while(1) {
+again:
 		ret = find_first_extent_bit(tree, 0, &start, &end,
 					    EXTENT_DIRTY);
 		if (ret)
 			break;
+
+		if (btrfs_redirty_extent_buffer_for_hmzoned(fs_info, start,
+							    end))
+			goto again;
+
 		while(start <= end) {
 			eb = find_first_extent_buffer(tree, start);
 			BUG_ON(!eb || eb->start != start);

From patchwork Tue Aug 20 04:52:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 11102875
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 13/15] btrfs-progs: mkfs: Zoned block device support
Date: Tue, 20 Aug 2019 13:52:56 +0900
Message-Id: <20190820045258.1571640-14-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

This patch sizes the temporary system group chunk to the device zone
size, minus the area reserved by BTRFS_BLOCK_RESERVED_1M_FOR_SUPER. It
also sets PREP_DEVICE_HMZONED when the user enables the HMZONED feature.

The HMZONED feature is enabled with the option "-O hmzoned". For now, it
is incompatible with source directory setup (-r).

Signed-off-by: Naohiro Aota
---
 mkfs/common.c | 20 ++++++++++-----
 mkfs/common.h |  1 +
 mkfs/main.c   | 67 +++++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index caca5e707233..6b5c5500da67 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -154,6 +154,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	int skinny_metadata = !!(cfg->features &
 				 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA);
 	u64 num_bytes;
+	u64 system_group_size;

 	buf = malloc(sizeof(*buf) + max(cfg->sectorsize, cfg->nodesize));
 	if (!buf)
@@ -203,7 +204,10 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	btrfs_set_super_stripesize(&super, cfg->stripesize);
 	btrfs_set_super_csum_type(&super, BTRFS_CSUM_TYPE_CRC32);
 	btrfs_set_super_chunk_root_generation(&super, 1);
-	btrfs_set_super_cache_generation(&super, -1);
+	if (cfg->features & BTRFS_FEATURE_INCOMPAT_HMZONED)
+		btrfs_set_super_cache_generation(&super, 0);
+	else
+		btrfs_set_super_cache_generation(&super, -1);
 	btrfs_set_super_incompat_flags(&super, cfg->features);
 	if (cfg->label)
 		__strncpy_null(super.label, cfg->label, BTRFS_LABEL_SIZE - 1);
@@ -314,12 +318,17 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	btrfs_set_item_offset(buf, btrfs_item_nr(nritems), itemoff);
 	btrfs_set_item_size(buf, btrfs_item_nr(nritems), item_size);

+	if (cfg->features & BTRFS_FEATURE_INCOMPAT_HMZONED)
+		system_group_size = cfg->zone_size -
+			BTRFS_BLOCK_RESERVED_1M_FOR_SUPER;
+	else
+		system_group_size = BTRFS_MKFS_SYSTEM_GROUP_SIZE;
+
 	dev_item = btrfs_item_ptr(buf, nritems, struct btrfs_dev_item);
 	btrfs_set_device_id(buf, dev_item, 1);
 	btrfs_set_device_generation(buf, dev_item, 0);
 	btrfs_set_device_total_bytes(buf, dev_item, num_bytes);
-	btrfs_set_device_bytes_used(buf, dev_item,
-				    BTRFS_MKFS_SYSTEM_GROUP_SIZE);
+	btrfs_set_device_bytes_used(buf, dev_item, system_group_size);
 	btrfs_set_device_io_align(buf, dev_item, cfg->sectorsize);
 	btrfs_set_device_io_width(buf, dev_item, cfg->sectorsize);
 	btrfs_set_device_sector_size(buf, dev_item, cfg->sectorsize);
@@ -347,7 +356,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 	btrfs_set_item_size(buf, btrfs_item_nr(nritems), item_size);

 	chunk = btrfs_item_ptr(buf, nritems, struct btrfs_chunk);
-	btrfs_set_chunk_length(buf, chunk, BTRFS_MKFS_SYSTEM_GROUP_SIZE);
+	btrfs_set_chunk_length(buf, chunk, system_group_size);
 	btrfs_set_chunk_owner(buf, chunk, BTRFS_EXTENT_TREE_OBJECTID);
 	btrfs_set_chunk_stripe_len(buf, chunk, BTRFS_STRIPE_LEN);
 	btrfs_set_chunk_type(buf, chunk, BTRFS_BLOCK_GROUP_SYSTEM);
@@ -413,8 +422,7 @@ int make_btrfs(int fd, struct btrfs_mkfs_config *cfg)
 		    (unsigned long)btrfs_dev_extent_chunk_tree_uuid(dev_extent),
 		    BTRFS_UUID_SIZE);

-	btrfs_set_dev_extent_length(buf, dev_extent,
-				    BTRFS_MKFS_SYSTEM_GROUP_SIZE);
+	btrfs_set_dev_extent_length(buf, dev_extent, system_group_size);
 	nritems++;

 	btrfs_set_header_bytenr(buf, cfg->blocks[MKFS_DEV_TREE]);
diff --git a/mkfs/common.h b/mkfs/common.h
index 28912906d0a9..d0e4c7b2c906 100644
--- a/mkfs/common.h
+++ b/mkfs/common.h
@@ -53,6 +53,7 @@ struct btrfs_mkfs_config {
 	u64 features;
 	/* Size of the filesystem in bytes */
 	u64 num_bytes;
+	u64 zone_size;

 	/* Output fields, set during creation */

diff --git a/mkfs/main.c b/mkfs/main.c
index 948c84be5f39..83463f8d819a 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -68,8 +68,13 @@ static int create_metadata_block_groups(struct btrfs_root *root, int mixed,
 	u64 bytes_used;
 	u64 chunk_start = 0;
 	u64 chunk_size = 0;
+	u64 system_group_size = BTRFS_MKFS_SYSTEM_GROUP_SIZE;
 	int ret;

+	if (fs_info->fs_devices->hmzoned)
+		system_group_size = fs_info->fs_devices->zone_size -
+			BTRFS_BLOCK_RESERVED_1M_FOR_SUPER;
+
 	if (mixed)
 		flags |= BTRFS_BLOCK_GROUP_DATA;

@@ -90,8 +95,8 @@ static int create_metadata_block_groups(struct btrfs_root *root, int mixed,
 	ret = btrfs_make_block_group(trans, fs_info, bytes_used,
 				     BTRFS_BLOCK_GROUP_SYSTEM,
 				     BTRFS_BLOCK_RESERVED_1M_FOR_SUPER,
-				     BTRFS_MKFS_SYSTEM_GROUP_SIZE);
-	allocation->system += BTRFS_MKFS_SYSTEM_GROUP_SIZE;
+				     system_group_size);
+	allocation->system += system_group_size;
 	if (ret)
 		return ret;

@@ -297,11 +302,19 @@ static int create_one_raid_group(struct btrfs_trans_handle *trans,

 static int create_raid_groups(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root, u64 data_profile,
-			      u64 metadata_profile, int mixed,
+			      u64 metadata_profile, int mixed, int hmzoned,
 			      struct mkfs_allocation *allocation)
 {
 	int ret;

+	if (!metadata_profile && hmzoned) {
+		ret = create_one_raid_group(trans, root,
+					    BTRFS_BLOCK_GROUP_SYSTEM,
+					    allocation);
+		if (ret)
+			return ret;
+	}
+
 	if (metadata_profile) {
 		u64 meta_flags = BTRFS_BLOCK_GROUP_METADATA;

@@ -548,6 +561,7 @@ out:
 /* This function will cleanup */
 static int cleanup_temp_chunks(struct btrfs_fs_info *fs_info,
 			       struct mkfs_allocation *alloc,
+			       int hmzoned,
 			       u64 data_profile, u64 meta_profile,
 			       u64 sys_profile)
 {
@@ -599,7 +613,11 @@ static int cleanup_temp_chunks(struct btrfs_fs_info *fs_info,
 				     struct btrfs_block_group_item);
 		if (is_temp_block_group(path.nodes[0], bgi,
 					data_profile, meta_profile,
-					sys_profile)) {
+					sys_profile) ||
+		    /* need to remove the first sys chunk */
+		    (hmzoned && found_key.objectid ==
+		     BTRFS_BLOCK_RESERVED_1M_FOR_SUPER)) {
+
 			u64 flags = btrfs_disk_block_group_flags(path.nodes[0],
 							     bgi);

@@ -783,6 +801,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	int metadata_profile_opt = 0;
 	int discard = 1;
 	int ssd = 0;
+	int hmzoned = 0;
 	int force_overwrite = 0;
 	char *source_dir = NULL;
 	bool source_dir_set = false;
@@ -796,6 +815,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
 	struct mkfs_allocation allocation = { 0 };
 	struct btrfs_mkfs_config mkfs_cfg;
+	u64 system_group_size;

 	crc32c_optimization_init();

@@ -920,6 +940,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	if (dev_cnt == 0)
 		print_usage(1);

+	hmzoned = features & BTRFS_FEATURE_INCOMPAT_HMZONED;
+
 	if (source_dir_set && dev_cnt > 1) {
 		error("the option -r is limited to a single device");
 		goto error;
@@ -929,6 +951,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		goto error;
 	}

+	if (source_dir_set && hmzoned) {
+		error("The -r and hmzoned feature are incompatible");
+		exit(1);
+	}
+
 	if (*fs_uuid) {
 		uuid_t dummy_uuid;

@@ -960,6 +987,16 @@ int BOX_MAIN(mkfs)(int argc, char **argv)

 	file = argv[optind++];
 	ssd = is_ssd(file);
+	if (hmzoned) {
+		if (zoned_model(file) == ZONED_NONE) {
+			error("%s: not a zoned block device", file);
+			exit(1);
+		}
+		if (!zone_size(file)) {
+			error("%s: zone size undefined", file);
+			exit(1);
+		}
+	}

 	/*
 	 * Set default profiles according to number of added devices.
@@ -1111,7 +1148,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	ret = btrfs_prepare_device(fd, file, &dev_block_count, block_count,
 			(zero_end ? PREP_DEVICE_ZERO_END : 0) |
 			(discard ? PREP_DEVICE_DISCARD : 0) |
-			(verbose ? PREP_DEVICE_VERBOSE : 0));
+			(verbose ? PREP_DEVICE_VERBOSE : 0) |
+			(hmzoned ? PREP_DEVICE_HMZONED : 0));
 	if (ret)
 		goto error;
 	if (block_count && block_count > dev_block_count) {
@@ -1122,9 +1160,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	}

 	/* To create the first block group and chunk 0 in make_btrfs */
-	if (dev_block_count < BTRFS_MKFS_SYSTEM_GROUP_SIZE) {
+	system_group_size = hmzoned ?
+		zone_size(file) : BTRFS_MKFS_SYSTEM_GROUP_SIZE;
+	if (dev_block_count < system_group_size) {
 		error("device is too small to make filesystem, must be at least %llu",
-				(unsigned long long)BTRFS_MKFS_SYSTEM_GROUP_SIZE);
+				(unsigned long long)system_group_size);
 		goto error;
 	}

@@ -1140,6 +1180,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	mkfs_cfg.sectorsize = sectorsize;
 	mkfs_cfg.stripesize = stripesize;
 	mkfs_cfg.features = features;
+	mkfs_cfg.zone_size = zone_size(file);

 	ret = make_btrfs(fd, &mkfs_cfg);
 	if (ret) {
@@ -1150,6 +1191,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)

 	fs_info = open_ctree_fs_info(file, 0, 0, 0,
 			OPEN_CTREE_WRITES | OPEN_CTREE_TEMPORARY_SUPER);
+
 	if (!fs_info) {
 		error("open ctree failed");
 		goto error;
@@ -1223,7 +1265,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 				block_count,
 				(verbose ? PREP_DEVICE_VERBOSE : 0) |
 				(zero_end ? PREP_DEVICE_ZERO_END : 0) |
-				(discard ? PREP_DEVICE_DISCARD : 0));
+				(discard ? PREP_DEVICE_DISCARD : 0) |
+				(hmzoned ? PREP_DEVICE_HMZONED : 0));
 		if (ret) {
 			goto error;
 		}
@@ -1246,7 +1289,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)

 raid_groups:
 	ret = create_raid_groups(trans, root, data_profile,
-				 metadata_profile, mixed, &allocation);
+				 metadata_profile, mixed, hmzoned, &allocation);
 	if (ret) {
 		error("unable to create raid groups: %d", ret);
 		goto out;
@@ -1269,7 +1312,7 @@ raid_groups:
 		goto out;
 	}

-	ret = cleanup_temp_chunks(fs_info, &allocation, data_profile,
+	ret = cleanup_temp_chunks(fs_info, &allocation, hmzoned, data_profile,
 				  metadata_profile, metadata_profile);
 	if (ret < 0) {
 		error("failed to cleanup temporary chunks: %d", ret);
@@ -1320,6 +1363,10 @@ raid_groups:
 			btrfs_group_profile_str(metadata_profile),
 			pretty_size(allocation.system));
 		printf("SSD detected: %s\n", ssd ? "yes" : "no");
+		printf("Zoned device: %s\n", hmzoned ? "yes" : "no");
+		if (hmzoned)
+			printf("Zone size: %s\n",
+			       pretty_size(fs_info->fs_devices->zone_size));
 		btrfs_parse_features_to_string(features_buf, features);
 		printf("Incompat features: %s", features_buf);
 		printf("\n");

From patchwork Tue Aug 20 04:52:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 11102881
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 14/15] btrfs-progs: device-add: support HMZONED device
Date: Tue, 20 Aug 2019 13:52:57 +0900
Message-Id: <20190820045258.1571640-15-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

This patch checks whether the target file system is flagged as HMZONED.
If it is, the device being added is prepared with PREP_DEVICE_HMZONED.
Also add checks to prevent mixing non-zoned devices and zoned devices.

Signed-off-by: Naohiro Aota
---
 cmds/device.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/cmds/device.c b/cmds/device.c
index 24158308a41b..9caa77efd049 100644
--- a/cmds/device.c
+++ b/cmds/device.c
@@ -61,6 +61,9 @@ static int cmd_device_add(const struct cmd_struct *cmd,
 	int discard = 1;
 	int force = 0;
 	int last_dev;
+	int res;
+	int hmzoned;
+	struct btrfs_ioctl_feature_flags feature_flags;

 	optind = 0;
 	while (1) {
@@ -96,12 +99,35 @@ static int cmd_device_add(const struct cmd_struct *cmd,
 	if (fdmnt < 0)
 		return 1;

+	res = ioctl(fdmnt, BTRFS_IOC_GET_FEATURES, &feature_flags);
+	if (res) {
+		error("error getting feature flags '%s': %m", mntpnt);
+		return 1;
+	}
+	hmzoned = feature_flags.incompat_flags & BTRFS_FEATURE_INCOMPAT_HMZONED;
+
 	for (i = optind; i < last_dev; i++){
 		struct btrfs_ioctl_vol_args ioctl_args;
-		int devfd, res;
+		int devfd;
 		u64 dev_block_count = 0;
 		char *path;

+		if (hmzoned && zoned_model(argv[i]) == ZONED_NONE) {
+			error(
+		"cannot add non-zoned device to HMZONED file system '%s'",
+			      argv[i]);
+			ret++;
+			continue;
+		}
+
+		if (!hmzoned && zoned_model(argv[i]) == ZONED_HOST_MANAGED) {
+			error(
+	"cannot add host managed zoned device to non-HMZONED file system '%s'",
+			      argv[i]);
+			ret++;
+			continue;
+		}
+
 		res = test_dev_for_mkfs(argv[i], force);
 		if (res) {
 			ret++;
@@ -117,7 +143,8 @@ static int cmd_device_add(const struct cmd_struct *cmd,

 		res = btrfs_prepare_device(devfd, argv[i], &dev_block_count, 0,
 				PREP_DEVICE_ZERO_END | PREP_DEVICE_VERBOSE |
-				(discard ? PREP_DEVICE_DISCARD : 0));
+				(discard ? PREP_DEVICE_DISCARD : 0) |
+				(hmzoned ? PREP_DEVICE_HMZONED : 0));
 		close(devfd);
 		if (res) {
 			ret++;

From patchwork Tue Aug 20 04:52:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 11102885
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org, David Sterba
Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal, Matias Bjorling, Johannes Thumshirn, Hannes Reinecke, linux-fsdevel@vger.kernel.org, Naohiro Aota
Subject: [PATCH v3 15/15] btrfs-progs: introduce support for device replace HMZONED device
Date: Tue, 20 Aug 2019 13:52:58 +0900
Message-Id: <20190820045258.1571640-16-naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.23.0
In-Reply-To: <20190820045258.1571640-1-naohiro.aota@wdc.com>
References: <20190820045258.1571640-1-naohiro.aota@wdc.com>
MIME-Version: 1.0
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

This patch checks whether the target file system is flagged as HMZONED.
If it is, the replacement target device is prepared with
PREP_DEVICE_HMZONED. Also add checks to prevent mixing non-zoned devices
and zoned devices.

Signed-off-by: Naohiro Aota
---
 cmds/replace.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/cmds/replace.c b/cmds/replace.c
index 2321aa156fe2..670df68a93f7 100644
--- a/cmds/replace.c
+++ b/cmds/replace.c
@@ -119,6 +119,7 @@ static const char *const cmd_replace_start_usage[] = {
 static int cmd_replace_start(const struct cmd_struct *cmd,
 			     int argc, char **argv)
 {
+	struct btrfs_ioctl_feature_flags feature_flags;
 	struct btrfs_ioctl_dev_replace_args start_args = {0};
 	struct btrfs_ioctl_dev_replace_args status_args = {0};
 	int ret;
@@ -126,6 +127,7 @@ static int cmd_replace_start(const struct cmd_struct *cmd,
 	int c;
 	int fdmnt = -1;
 	int fddstdev = -1;
+	int hmzoned;
 	char *path;
 	char *srcdev;
 	char *dstdev = NULL;
@@ -166,6 +168,13 @@ static int cmd_replace_start(const struct cmd_struct *cmd,
 	if (fdmnt < 0)
 		goto leave_with_error;

+	ret = ioctl(fdmnt, BTRFS_IOC_GET_FEATURES, &feature_flags);
+	if (ret) {
+		error("ioctl(GET_FEATURES) on '%s' returns error: %m", path);
+		goto leave_with_error;
+	}
+	hmzoned = feature_flags.incompat_flags & BTRFS_FEATURE_INCOMPAT_HMZONED;
+
 	/* check for possible errors before backgrounding */
 	status_args.cmd = BTRFS_IOCTL_DEV_REPLACE_CMD_STATUS;
 	status_args.result = BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_RESULT;
@@ -260,7 +269,8 @@ static int cmd_replace_start(const struct cmd_struct *cmd,
 	strncpy((char *)start_args.start.tgtdev_name, dstdev,
 		BTRFS_DEVICE_PATH_NAME_MAX);
 	ret = btrfs_prepare_device(fddstdev, dstdev, &dstdev_block_count, 0,
-			PREP_DEVICE_ZERO_END | PREP_DEVICE_VERBOSE);
+			PREP_DEVICE_ZERO_END | PREP_DEVICE_VERBOSE |
+			(hmzoned ? PREP_DEVICE_HMZONED : 0));
 	if (ret)
 		goto leave_with_error;
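
Note on the device-mixing rule: the commit messages of patches 14 and 15
mention checks that prevent mixing non-zoned and zoned devices. Going by
the checks added to cmd_device_add() in patch 14, the rule can be sketched
as a standalone function. This is only an illustration, not code from the
series: the sketch_* names are hypothetical, and the host-aware model is
an assumption (the diffs shown here only reference ZONED_NONE and
ZONED_HOST_MANAGED).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the value returned by zoned_model() in the
 * series. SKETCH_ZONED_HOST_AWARE is an assumption; it is not referenced
 * by the diffs above.
 */
enum sketch_zoned_model {
	SKETCH_ZONED_NONE,
	SKETCH_ZONED_HOST_AWARE,
	SKETCH_ZONED_HOST_MANAGED,
};

/*
 * Mirror the two rejections added to cmd_device_add():
 *  - an HMZONED file system refuses a non-zoned device;
 *  - a non-HMZONED file system refuses a host-managed zoned device.
 * Returns NULL when the device may be added, otherwise the reason.
 */
static const char *sketch_check_device_mix(bool fs_hmzoned,
					   enum sketch_zoned_model model)
{
	if (fs_hmzoned && model == SKETCH_ZONED_NONE)
		return "cannot add non-zoned device to HMZONED file system";

	if (!fs_hmzoned && model == SKETCH_ZONED_HOST_MANAGED)
		return "cannot add host managed zoned device to non-HMZONED file system";

	return NULL;
}
```

Under this reading, a host-aware device passes both checks, so it would be
accepted by either kind of file system. Whether that is intended is a
question for review rather than something the sketch settles.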