From patchwork Fri Sep 30 04:11:37 2016
X-Patchwork-Submitter: Damien Le Moal
X-Patchwork-Id: 9357639
From: Damien Le Moal
To: Jens Axboe
CC: Christoph Hellwig, "Martin K . Petersen", Hannes Reinecke, Shaun Tancheff, Damien Le Moal
Subject: [PATCH v5 5/7] block: Implement support for zoned block devices
Date: Fri, 30 Sep 2016 13:11:37 +0900
Message-ID: <1475208699-27310-6-git-send-email-damien.lemoal@hgst.com>
In-Reply-To: <1475208699-27310-1-git-send-email-damien.lemoal@hgst.com>
References: <1475208699-27310-1-git-send-email-damien.lemoal@hgst.com>
X-Mailing-List: linux-block@vger.kernel.org

From: Hannes Reinecke

Implement zoned block device zone information reporting and reset.
Zone information is reported as struct blk_zone. This implementation
does not differentiate between host-aware and host-managed device
models and is valid for both. Two functions are provided:
blkdev_report_zones for discovering the zone configuration of a
zoned block device, and blkdev_reset_zones for resetting the write
pointer of sequential zones. The helper functions blk_queue_zone_size
and bdev_zone_size are also provided for, as the names suggest,
obtaining the zone size (in 512B sectors) of the zones of the device.

Signed-off-by: Hannes Reinecke

[Damien:
 * Removed the zone cache
 * Implement report zones operation based on earlier proposal
   by Shaun Tancheff ]
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 block/Kconfig                 |   8 ++
 block/Makefile                |   1 +
 block/blk-zoned.c             | 257 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h        |  31 +++++
 include/uapi/linux/Kbuild     |   1 +
 include/uapi/linux/blkzoned.h | 103 +++++++++++++++++
 6 files changed, 401 insertions(+)
 create mode 100644 block/blk-zoned.c
 create mode 100644 include/uapi/linux/blkzoned.h

diff --git a/block/Kconfig b/block/Kconfig
index 1d4d624..6b0ad08 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -89,6 +89,14 @@ config BLK_DEV_INTEGRITY
 	T10/SCSI Data Integrity Field or the T13/ATA External Path
 	Protection. If in doubt, say N.
 
+config BLK_DEV_ZONED
+	bool "Zoned block device support"
+	---help---
+	Block layer zoned block device support. This option enables
+	support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+	Say yes here if you have a ZAC or ZBC storage device.
+
 config BLK_DEV_THROTTLING
 	bool "Block layer bio throttling support"
 	depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 36acdd7..9371bc7 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,5 @@ obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o
 obj-$(CONFIG_BLOCK_COMPAT)	+= compat_ioctl.o
 obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
+obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 obj-$(CONFIG_BLK_MQ_PCI)	+= blk-mq-pci.o
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 0000000..1603573
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,257 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include
+#include
+#include
+#include
+
+static inline sector_t blk_zone_start(struct request_queue *q,
+				      sector_t sector)
+{
+	sector_t zone_mask = blk_queue_zone_size(q) - 1;
+
+	return sector & ~zone_mask;
+}
+
+/*
+ * Check that a zone report belongs to the partition.
+ * If yes, fix its start sector and write pointer, copy it in the
+ * zone information array and return true. Return false otherwise.
+ */
+static bool blkdev_report_zone(struct block_device *bdev,
+			       struct blk_zone *rep,
+			       struct blk_zone *zone)
+{
+	sector_t offset = get_start_sect(bdev);
+
+	if (rep->start < offset)
+		return false;
+
+	rep->start -= offset;
+	if (rep->start + rep->len > bdev->bd_part->nr_sects)
+		return false;
+
+	if (rep->type == BLK_ZONE_TYPE_CONVENTIONAL)
+		rep->wp = rep->start + rep->len;
+	else
+		rep->wp -= offset;
+	memcpy(zone, rep, sizeof(struct blk_zone));
+
+	return true;
+}
+
+/**
+ * blkdev_report_zones - Get zones information
+ * @bdev:	Target block device
+ * @sector:	Sector from which to report zones
+ * @zones:	Array of zone structures where to return the zones information
+ * @nr_zones:	Number of zone structures in the zone array
+ * @gfp_mask:	Memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Get zone information starting from the zone containing @sector.
+ *    The number of zone information reported may be less than the number
+ *    requested by @nr_zones. The number of zones actually reported is
+ *    returned in @nr_zones.
+ */
+int blkdev_report_zones(struct block_device *bdev,
+			sector_t sector,
+			struct blk_zone *zones,
+			unsigned int *nr_zones,
+			gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct blk_zone_report_hdr *hdr;
+	unsigned int nrz = *nr_zones;
+	struct page *page;
+	unsigned int nr_rep;
+	size_t rep_bytes;
+	unsigned int nr_pages;
+	struct bio *bio;
+	struct bio_vec *bv;
+	unsigned int i, n, nz;
+	unsigned int ofst;
+	void *addr;
+	int ret = 0;
+
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_is_zoned(q))
+		return -EOPNOTSUPP;
+
+	if (!nrz)
+		return 0;
+
+	if (sector > bdev->bd_part->nr_sects) {
+		*nr_zones = 0;
+		return 0;
+	}
+
+	/*
+	 * The zone report has a header. So make room for it in the
+	 * payload. Also make sure that the report fits in a single BIO
+	 * that will not be split down the stack.
+	 */
+	rep_bytes = sizeof(struct blk_zone_report_hdr) +
+		sizeof(struct blk_zone) * nrz;
+	rep_bytes = (rep_bytes + PAGE_SIZE - 1) & PAGE_MASK;
+	if (rep_bytes > (queue_max_sectors(q) << 9))
+		rep_bytes = queue_max_sectors(q) << 9;
+
+	nr_pages = min_t(unsigned int, BIO_MAX_PAGES,
+			 rep_bytes >> PAGE_SHIFT);
+	nr_pages = min_t(unsigned int, nr_pages,
+			 queue_max_segments(q));
+
+	bio = bio_alloc(gfp_mask, nr_pages);
+	if (!bio)
+		return -ENOMEM;
+
+	bio->bi_bdev = bdev;
+	bio->bi_iter.bi_sector = blk_zone_start(q, sector);
+	bio_set_op_attrs(bio, REQ_OP_ZONE_REPORT, 0);
+
+	for (i = 0; i < nr_pages; i++) {
+		page = alloc_page(gfp_mask);
+		if (!page) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		if (!bio_add_page(bio, page, PAGE_SIZE, 0)) {
+			__free_page(page);
+			break;
+		}
+	}
+
+	if (i == 0)
+		ret = -ENOMEM;
+	else
+		ret = submit_bio_wait(bio);
+	if (ret)
+		goto out;
+
+	/*
+	 * Process the report result: skip the header and go through the
+	 * reported zones to fix up the zone information for
+	 * partitions. At the same time, return the zone information into
+	 * the zone array.
+	 */
+	n = 0;
+	nz = 0;
+	nr_rep = 0;
+	bio_for_each_segment_all(bv, bio, i) {
+
+		if (!bv->bv_page)
+			break;
+
+		addr = kmap_atomic(bv->bv_page);
+
+		/* Get header in the first page */
+		ofst = 0;
+		if (!nr_rep) {
+			hdr = (struct blk_zone_report_hdr *) addr;
+			nr_rep = hdr->nr_zones;
+			ofst = sizeof(struct blk_zone_report_hdr);
+		}
+
+		/* Fixup and report zones */
+		while (ofst < bv->bv_len &&
+		       n < nr_rep && nz < nrz) {
+			if (blkdev_report_zone(bdev, addr + ofst, &zones[nz]))
+				nz++;
+			ofst += sizeof(struct blk_zone);
+			n++;
+		}
+
+		kunmap_atomic(addr);
+
+		if (n >= nr_rep || nz >= nrz)
+			break;
+
+	}
+
+out:
+	bio_for_each_segment_all(bv, bio, i)
+		__free_page(bv->bv_page);
+	bio_put(bio);
+
+	if (ret == 0)
+		*nr_zones = nz;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(blkdev_report_zones);
+
+/**
+ * blkdev_reset_zones - Reset zones write pointer
+ * @bdev:	Target block device
+ * @sector:	Start sector of the first zone to reset
+ * @nr_sectors:	Number of sectors, at least the length of one zone
+ * @gfp_mask:	Memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Reset the write pointer of the zones contained in the range
+ *    @sector..@sector+@nr_sectors. Specifying the entire disk sector range
+ *    is valid, but the specified range should not contain conventional zones.
+ */
+int blkdev_reset_zones(struct block_device *bdev,
+		       sector_t sector, sector_t nr_sectors,
+		       gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	sector_t zone_sectors;
+	sector_t end_sector = sector + nr_sectors;
+	struct bio *bio;
+	int ret;
+
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_is_zoned(q))
+		return -EOPNOTSUPP;
+
+	if (end_sector > bdev->bd_part->nr_sects)
+		/* Out of range */
+		return -EINVAL;
+
+	/* Check alignment (handle eventual smaller last zone) */
+	zone_sectors = blk_queue_zone_size(q);
+	if (sector & (zone_sectors - 1))
+		return -EINVAL;
+
+	if ((nr_sectors & (zone_sectors - 1)) &&
+	    end_sector != bdev->bd_part->nr_sects)
+		return -EINVAL;
+
+	while (sector < end_sector) {
+
+		bio = bio_alloc(gfp_mask, 0);
+		bio->bi_iter.bi_sector = sector;
+		bio->bi_bdev = bdev;
+		bio_set_op_attrs(bio, REQ_OP_ZONE_RESET, 0);
+
+		ret = submit_bio_wait(bio);
+		bio_put(bio);
+
+		if (ret)
+			return ret;
+
+		sector += zone_sectors;
+
+		/* This may take a while, so be nice to others */
+		cond_resched();
+
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(blkdev_reset_zones);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f19e16b..252043f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 
 struct module;
 struct scsi_ioctl_command;
@@ -302,6 +303,21 @@ struct queue_limits {
 	enum blk_zoned_model	zoned;
 };
 
+#ifdef CONFIG_BLK_DEV_ZONED
+
+struct blk_zone_report_hdr {
+	unsigned int	nr_zones;
+	u8		padding[60];
+};
+
+extern int blkdev_report_zones(struct block_device *bdev,
+			       sector_t sector, struct blk_zone *zones,
+			       unsigned int *nr_zones, gfp_t gfp_mask);
+extern int blkdev_reset_zones(struct block_device *bdev, sector_t sectors,
+			      sector_t nr_sectors, gfp_t gfp_mask);
+
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 struct request_queue {
 	/*
 	 * Together with queue_head for cacheline sharing
@@ -654,6 +670,11 @@ static inline bool blk_queue_is_zoned(struct request_queue *q)
 	}
 }
 
+static inline unsigned int blk_queue_zone_size(struct request_queue *q)
+{
+	return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0;
+}
+
 /*
  * We regard a request as sync, if either a read or a sync write
  */
@@ -1401,6 +1422,16 @@ static inline bool bdev_is_zoned(struct block_device *bdev)
 	return false;
 }
 
+static inline unsigned int bdev_zone_size(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q)
+		return blk_queue_zone_size(q);
+
+	return 0;
+}
+
 static inline int queue_dma_alignment(struct request_queue *q)
 {
 	return q ? q->dma_alignment : 511;
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index dd60439..92466a6 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -70,6 +70,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += blkzoned.h
 header-y += bpf_common.h
 header-y += bpf_perf_event.h
 header-y += bpf.h
diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
new file mode 100644
index 0000000..a381721
--- /dev/null
+++ b/include/uapi/linux/blkzoned.h
@@ -0,0 +1,103 @@
+/*
+ * Zoned block devices handling.
+ *
+ * Copyright (C) 2015 Seagate Technology PLC
+ *
+ * Written by: Shaun Tancheff
+ *
+ * Modified by: Damien Le Moal
+ * Copyright (C) 2016 Western Digital
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+#ifndef _UAPI_BLKZONED_H
+#define _UAPI_BLKZONED_H
+
+#include
+
+/**
+ * enum blk_zone_type - Types of zones allowed in a zoned device.
+ *
+ * @BLK_ZONE_TYPE_CONVENTIONAL: The zone has no write pointer and can be written
+ *                              randomly. Zone reset has no effect on the zone.
+ * @BLK_ZONE_TYPE_SEQWRITE_REQ: The zone must be written sequentially
+ * @BLK_ZONE_TYPE_SEQWRITE_PREF: The zone can be written non-sequentially
+ *
+ * Any other value not defined is reserved and must be considered as invalid.
+ */
+enum blk_zone_type {
+	BLK_ZONE_TYPE_CONVENTIONAL	= 0x1,
+	BLK_ZONE_TYPE_SEQWRITE_REQ	= 0x2,
+	BLK_ZONE_TYPE_SEQWRITE_PREF	= 0x3,
+};
+
+/**
+ * enum blk_zone_cond - Condition [state] of a zone in a zoned device.
+ *
+ * @BLK_ZONE_COND_NOT_WP: The zone has no write pointer, it is conventional.
+ * @BLK_ZONE_COND_EMPTY: The zone is empty.
+ * @BLK_ZONE_COND_IMP_OPEN: The zone is open, but not explicitly opened.
+ * @BLK_ZONE_COND_EXP_OPEN: The zone was explicitly opened by an
+ *                          OPEN ZONE command.
+ * @BLK_ZONE_COND_CLOSED: The zone was [explicitly] closed after writing.
+ * @BLK_ZONE_COND_FULL: The zone is marked as full, possibly by a
+ *                      FINISH ZONE command.
+ * @BLK_ZONE_COND_READONLY: The zone is read-only.
+ * @BLK_ZONE_COND_OFFLINE: The zone is offline (sectors cannot be read/written).
+ *
+ * The Zone Condition state machine in the ZBC/ZAC standards maps the above
+ * definitions as:
+ *   - ZC1: Empty         | BLK_ZONE_COND_EMPTY
+ *   - ZC2: Implicit Open | BLK_ZONE_COND_IMP_OPEN
+ *   - ZC3: Explicit Open | BLK_ZONE_COND_EXP_OPEN
+ *   - ZC4: Closed        | BLK_ZONE_COND_CLOSED
+ *   - ZC5: Full          | BLK_ZONE_COND_FULL
+ *   - ZC6: Read Only     | BLK_ZONE_COND_READONLY
+ *   - ZC7: Offline       | BLK_ZONE_COND_OFFLINE
+ *
+ * Conditions 0x5 to 0xC are reserved by the current ZBC/ZAC spec and should
+ * be considered invalid.
+ */
+enum blk_zone_cond {
+	BLK_ZONE_COND_NOT_WP	= 0x0,
+	BLK_ZONE_COND_EMPTY	= 0x1,
+	BLK_ZONE_COND_IMP_OPEN	= 0x2,
+	BLK_ZONE_COND_EXP_OPEN	= 0x3,
+	BLK_ZONE_COND_CLOSED	= 0x4,
+	BLK_ZONE_COND_READONLY	= 0xD,
+	BLK_ZONE_COND_FULL	= 0xE,
+	BLK_ZONE_COND_OFFLINE	= 0xF,
+};
+
+/**
+ * struct blk_zone - Zone descriptor for BLKREPORTZONE ioctl.
+ *
+ * @start: Zone start in 512 B sector units
+ * @len: Zone length in 512 B sector units
+ * @wp: Zone write pointer location in 512 B sector units
+ * @type: see enum blk_zone_type for possible values
+ * @cond: see enum blk_zone_cond for possible values
+ * @non_seq: Flag indicating that the zone is using non-sequential resources
+ *           (for host-aware zoned block devices only).
+ * @reset: Flag indicating that a zone reset is recommended.
+ * @reserved: Padding to 64 B to match the ZBC/ZAC defined zone descriptor size.
+ *
+ * start, len and wp use the regular 512 B sector unit, regardless of the
+ * device logical block size. The overall structure size is 64 B to match the
+ * ZBC/ZAC defined zone descriptor and allow support for future additional
+ * zone information.
+ */
+struct blk_zone {
+	__u64	start;		/* Zone start sector */
+	__u64	len;		/* Zone length in number of sectors */
+	__u64	wp;		/* Zone write pointer position */
+	__u8	type;		/* Zone type */
+	__u8	cond;		/* Zone condition */
+	__u8	non_seq;	/* Non-sequential write resources active */
+	__u8	reset;		/* Reset write pointer recommended */
+	__u8	reserved[36];
+};
+
+#endif /* _UAPI_BLKZONED_H */
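
Note on the zone arithmetic used above: the sketch below is a userspace illustration, not part of the patch. It mirrors what blk_zone_start and blkdev_report_zone appear to do: round a sector down to its zone start with a mask, then translate a device-relative zone descriptor into partition-relative sectors. The names zone_desc, zone_start and zone_to_partition are hypothetical, and the mask trick assumes the zone size is a power of two, which the patch relies on implicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct blk_zone (fields used here only). */
struct zone_desc {
	uint64_t start;	/* zone start, in 512 B sectors */
	uint64_t len;	/* zone length, in 512 B sectors */
	uint64_t wp;	/* write pointer, in 512 B sectors */
	uint8_t type;	/* 0x1 conventional, 0x2 or 0x3 sequential */
};

#define ZONE_TYPE_CONVENTIONAL 0x1

/*
 * Round a sector down to the start of the zone containing it.
 * Assumption: zone_sectors is a non-zero power of two, so that
 * (zone_sectors - 1) is a valid bit mask.
 */
uint64_t zone_start(uint64_t sector, uint64_t zone_sectors)
{
	assert(zone_sectors && (zone_sectors & (zone_sectors - 1)) == 0);
	return sector & ~(zone_sectors - 1);
}

/*
 * Convert a device-relative zone descriptor to partition-relative sectors.
 * Returns false when the zone starts before the partition or extends past
 * its end. As in the patch, rep may already be partly rewritten when false
 * is returned.
 */
bool zone_to_partition(struct zone_desc *rep, uint64_t part_start,
		       uint64_t part_sectors)
{
	if (rep->start < part_start)
		return false;

	rep->start -= part_start;
	if (rep->start + rep->len > part_sectors)
		return false;

	/* Conventional zones have no write pointer: report the zone end. */
	if (rep->type == ZONE_TYPE_CONVENTIONAL)
		rep->wp = rep->start + rep->len;
	else
		rep->wp -= part_start;

	return true;
}
```

With 256 MiB zones (524288 sectors) and a partition starting at sector 524288, a sequential zone at device sector 1048576 would be reported at partition sector 524288, and its write pointer would shift by the same offset.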