From patchwork Thu Nov 26 23:45:50 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934927
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 01/12] hw/block/nvme: Separate read and write handlers
Date: Fri, 27 Nov 2020 00:45:50 +0100
Message-Id: <20201126234601.689714-2-its@irrelevant.dk>
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>

From: Dmitry Fomichev

With ZNS support in place, the majority of the code in nvme_rw() has become
read- or write-specific. Move these parts into two separate handlers,
nvme_read() and nvme_write(), to make the code more readable and to remove
the multiple is_write checks that previously existed in the I/O path.

This is a refactoring patch; there is no change in functionality.
Signed-off-by: Dmitry Fomichev
Reviewed-by: Niklas Cassel
[kj: rebased]
Signed-off-by: Klaus Jensen
---
 hw/block/nvme.c       | 105 ++++++++++++++++++++++++++++--------------
 hw/block/trace-events |   3 +-
 2 files changed, 73 insertions(+), 35 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7ab53cfcf67d..657d0b8b2922 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1394,6 +1394,61 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
+static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
+    NvmeNamespace *ns = req->ns;
+    uint64_t slba = le64_to_cpu(rw->slba);
+    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
+    uint64_t data_size = nvme_l2b(ns, nlb);
+    uint64_t data_offset;
+    BlockBackend *blk = ns->blkconf.blk;
+    uint16_t status;
+
+    trace_pci_nvme_read(nvme_cid(req), nvme_nsid(ns), nlb, data_size, slba);
+
+    status = nvme_check_mdts(n, data_size);
+    if (status) {
+        trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
+        goto invalid;
+    }
+
+    status = nvme_check_bounds(ns, slba, nlb);
+    if (status) {
+        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        goto invalid;
+    }
+
+    if (NVME_ERR_REC_DULBE(ns->features.err_rec)) {
+        status = nvme_check_dulbe(ns, slba, nlb);
+        if (status) {
+            goto invalid;
+        }
+    }
+
+    status = nvme_map_dptr(n, data_size, req);
+    if (status) {
+        goto invalid;
+    }
+
+    data_offset = nvme_l2b(ns, slba);
+
+    block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+                     BLOCK_ACCT_READ);
+    if (req->qsg.sg) {
+        req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
+                                  BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+    } else {
+        req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
+                                    nvme_rw_cb, req);
+    }
+    return NVME_NO_COMPLETE;
+
+invalid:
+    block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_READ);
+    return status;
+}
+
 static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
@@ -1419,22 +1474,19 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
-
+    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
     uint64_t data_size = nvme_l2b(ns, nlb);
-    uint64_t data_offset = nvme_l2b(ns, slba);
-    enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
-        BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
+    uint64_t data_offset;
     BlockBackend *blk = ns->blkconf.blk;
     uint16_t status;
 
-    trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
-                      nvme_nsid(ns), nlb, data_size, slba);
+    trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
+                         nvme_nsid(ns), nlb, data_size, slba);
 
     status = nvme_check_mdts(n, data_size);
     if (status) {
@@ -1448,42 +1500,26 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    if (acct == BLOCK_ACCT_READ) {
-        if (NVME_ERR_REC_DULBE(ns->features.err_rec)) {
-            status = nvme_check_dulbe(ns, slba, nlb);
-            if (status) {
-                goto invalid;
-            }
-        }
-    }
-
     status = nvme_map_dptr(n, data_size, req);
     if (status) {
         goto invalid;
     }
 
-    block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
+    data_offset = nvme_l2b(ns, slba);
+
+    block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+                     BLOCK_ACCT_WRITE);
     if (req->qsg.sg) {
-        if (acct == BLOCK_ACCT_WRITE) {
-            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
-                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-        } else {
-            req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
-                                      BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-        }
+        req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
+                                   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
    } else {
-        if (acct == BLOCK_ACCT_WRITE) {
-            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
-                                         nvme_rw_cb, req);
-        } else {
-            req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
-                                        nvme_rw_cb, req);
-        }
+        req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
+                                     nvme_rw_cb, req);
     }
     return NVME_NO_COMPLETE;
 
 invalid:
-    block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
+    block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_WRITE);
     return status;
 }
 
@@ -1513,8 +1549,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
     case NVME_CMD_WRITE_ZEROES:
         return nvme_write_zeroes(n, req);
     case NVME_CMD_WRITE:
+        return nvme_write(n, req);
     case NVME_CMD_READ:
-        return nvme_rw(n, req);
+        return nvme_read(n, req);
     case NVME_CMD_COMPARE:
         return nvme_compare(n, req);
     case NVME_CMD_DSM:
diff --git a/hw/block/trace-events b/hw/block/trace-events
index dd3a0b386ef9..cc269b51a1e0 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -40,7 +40,8 @@ pci_nvme_map_prp(uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2,
 pci_nvme_map_sgl(uint16_t cid, uint8_t typ, uint64_t len) "cid %"PRIu16" type 0x%"PRIx8" len %"PRIu64""
 pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'"
 pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'"
-pci_nvme_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
+pci_nvme_read(uint16_t cid, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
+pci_nvme_write(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
 pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'"
 pci_nvme_copy(uint16_t cid, uint32_t nsid, uint16_t nr, uint8_t format) "cid %"PRIu16" nsid %"PRIu32" nr %"PRIu16" format 0x%"PRIx8""
 pci_nvme_copy_source_range(uint64_t slba, uint32_t nlb) "slba 0x%"PRIx64" nlb %"PRIu32""

From patchwork Thu Nov 26 23:45:51 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934921
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 02/12] hw/block/nvme: Merge nvme_write_zeroes() with nvme_write()
Date: Fri, 27 Nov 2020 00:45:51 +0100
Message-Id: <20201126234601.689714-3-its@irrelevant.dk>
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>

From: Dmitry Fomichev

nvme_write() now handles WRITE, WRITE ZEROES, and ZONE_APPEND.
Signed-off-by: Dmitry Fomichev
Reviewed-by: Niklas Cassel
[kj: rebased]
Signed-off-by: Klaus Jensen
---
 hw/block/nvme.c       | 69 +++++++++++++++++--------------------------
 hw/block/trace-events |  1 -
 2 files changed, 27 insertions(+), 43 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 657d0b8b2922..0050ef87cb92 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1449,31 +1449,6 @@ invalid:
     return status;
 }
 
-static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
-{
-    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
-    NvmeNamespace *ns = req->ns;
-    uint64_t slba = le64_to_cpu(rw->slba);
-    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
-    uint64_t offset = nvme_l2b(ns, slba);
-    uint32_t count = nvme_l2b(ns, nlb);
-    uint16_t status;
-
-    trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb);
-
-    status = nvme_check_bounds(ns, slba, nlb);
-    if (status) {
-        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
-        return status;
-    }
-
-    block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0,
-                     BLOCK_ACCT_WRITE);
-    req->aiocb = blk_aio_pwrite_zeroes(req->ns->blkconf.blk, offset, count,
-                                       BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
-    return NVME_NO_COMPLETE;
-}
-
 static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
@@ -1483,15 +1458,18 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
     uint64_t data_size = nvme_l2b(ns, nlb);
     uint64_t data_offset;
     BlockBackend *blk = ns->blkconf.blk;
+    bool wrz = rw->opcode == NVME_CMD_WRITE_ZEROES;
     uint16_t status;
 
     trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
                          nvme_nsid(ns), nlb, data_size, slba);
 
-    status = nvme_check_mdts(n, data_size);
-    if (status) {
-        trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
-        goto invalid;
+    if (!wrz) {
+        status = nvme_check_mdts(n, data_size);
+        if (status) {
+            trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
+            goto invalid;
+        }
     }
 
     status = nvme_check_bounds(ns, slba, nlb);
@@ -1500,22 +1478,30 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    status = nvme_map_dptr(n, data_size, req);
-    if (status) {
-        goto invalid;
-    }
-
     data_offset = nvme_l2b(ns, slba);
 
-    block_acct_start(blk_get_stats(blk), &req->acct, data_size,
-                     BLOCK_ACCT_WRITE);
-    if (req->qsg.sg) {
-        req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
-                                   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+    if (!wrz) {
+        status = nvme_map_dptr(n, data_size, req);
+        if (status) {
+            goto invalid;
+        }
+
+        block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+                         BLOCK_ACCT_WRITE);
+        if (req->qsg.sg) {
+            req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
+                                       BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+        } else {
+            req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
+                                         nvme_rw_cb, req);
+        }
     } else {
-        req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
-                                     nvme_rw_cb, req);
+        block_acct_start(blk_get_stats(blk), &req->acct, 0, BLOCK_ACCT_WRITE);
+        req->aiocb = blk_aio_pwrite_zeroes(blk, data_offset, data_size,
+                                           BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
+                                           req);
     }
+
     return NVME_NO_COMPLETE;
 
 invalid:
@@ -1547,7 +1533,6 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
     case NVME_CMD_FLUSH:
         return nvme_flush(n, req);
     case NVME_CMD_WRITE_ZEROES:
-        return nvme_write_zeroes(n, req);
     case NVME_CMD_WRITE:
         return nvme_write(n, req);
     case NVME_CMD_READ:
diff --git a/hw/block/trace-events b/hw/block/trace-events
index cc269b51a1e0..35ea40c49169 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -47,7 +47,6 @@ pci_nvme_copy(uint16_t cid, uint32_t nsid, uint16_t nr, uint8_t format) "cid %"P
 pci_nvme_copy_source_range(uint64_t slba, uint32_t nlb) "slba 0x%"PRIx64" nlb %"PRIu32""
 pci_nvme_copy_in_complete(uint16_t cid) "cid %"PRIu16""
 pci_nvme_copy_cb(uint16_t cid) "cid %"PRIu16""
-pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32""
 pci_nvme_block_status(int64_t offset, int64_t bytes, int64_t pnum, int ret, bool zeroed) "offset %"PRId64" bytes %"PRId64" pnum %"PRId64" ret 0x%x zeroed %d"
 pci_nvme_dsm(uint16_t cid, uint32_t nsid, uint32_t nr, uint32_t attr) "cid %"PRIu16" nsid %"PRIu32" nr %"PRIu32" attr 0x%"PRIx32""
 pci_nvme_dsm_deallocate(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32""

From patchwork Thu Nov 26 23:45:52 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934925
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 03/12] hw/block/nvme: add commands supported and effects log page
Date: Fri, 27 Nov 2020 00:45:52 +0100
Message-Id: <20201126234601.689714-4-its@irrelevant.dk>
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>

From: Gollu Appalanaidu

This adds support for the Commands
Supported and Effects log page. See NVM Express Spec 1.3d, sec. 5.14.1.5
("Commands Supported and Effects").

Signed-off-by: Gollu Appalanaidu
Signed-off-by: Klaus Jensen
---
 include/block/nvme.h | 25 ++++++++++++++++--
 hw/block/nvme.c      | 61 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 6ea435fd34ab..ffc65dc25f90 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -773,6 +773,24 @@ typedef struct QEMU_PACKED NvmeSmartLog {
     uint8_t     reserved2[320];
 } NvmeSmartLog;
 
+typedef struct NvmeEffectsLog {
+    uint32_t    acs[256];
+    uint32_t    iocs[256];
+    uint8_t     rsvd2048[2048];
+} NvmeEffectsLog;
+
+enum {
+    NVME_EFFECTS_CSUPP      = 1 <<  0,
+    NVME_EFFECTS_LBCC       = 1 <<  1,
+    NVME_EFFECTS_NCC        = 1 <<  2,
+    NVME_EFFECTS_NIC        = 1 <<  3,
+    NVME_EFFECTS_CCC        = 1 <<  4,
+    NVME_EFFECTS_CSE_SINGLE = 1 << 16,
+    NVME_EFFECTS_CSE_MULTI  = 1 << 17,
+    NVME_EFFECTS_CSE_MASK   = 3 << 16,
+    NVME_EFFECTS_UUID_SEL   = 1 << 19,
+};
+
 enum NvmeSmartWarn {
     NVME_SMART_SPARE                  = 1 << 0,
     NVME_SMART_TEMPERATURE            = 1 << 1,
@@ -785,6 +803,7 @@ enum NvmeLogIdentifier {
     NVME_LOG_ERROR_INFO     = 0x01,
     NVME_LOG_SMART_INFO     = 0x02,
     NVME_LOG_FW_SLOT_INFO   = 0x03,
+    NVME_LOG_EFFECTS        = 0x05,
 };
 
 typedef struct QEMU_PACKED NvmePSD {
@@ -901,8 +920,9 @@ enum NvmeIdCtrlFrmw {
 };
 
 enum NvmeIdCtrlLpa {
-    NVME_LPA_NS_SMART = 1 << 0,
-    NVME_LPA_EXTENDED = 1 << 2,
+    NVME_LPA_NS_SMART    = 1 << 0,
+    NVME_LPA_EFFECTS_LOG = 1 << 1,
+    NVME_LPA_EXTENDED    = 1 << 2,
 };
 
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
@@ -1119,5 +1139,6 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096);
 }
 #endif
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 0050ef87cb92..7a5ec843d567 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1782,6 +1782,63 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static void nvme_effects_nvm(NvmeEffectsLog *effects)
+{
+    effects->iocs[NVME_CMD_FLUSH] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC;
+    effects->iocs[NVME_CMD_WRITE] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC;
+    effects->iocs[NVME_CMD_READ] = NVME_EFFECTS_CSUPP;
+    effects->iocs[NVME_CMD_COMPARE] = NVME_EFFECTS_CSUPP;
+    effects->iocs[NVME_CMD_WRITE_ZEROES] = NVME_EFFECTS_CSUPP |
+                                           NVME_EFFECTS_LBCC;
+    effects->iocs[NVME_CMD_DSM] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC;
+    effects->iocs[NVME_CMD_COPY] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC;
+}
+
+static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
+                                 NvmeRequest *req)
+{
+    NvmeEffectsLog effects = (NvmeEffectsLog) {
+        .acs = {
+            [NVME_ADM_CMD_DELETE_SQ]    = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_CREATE_SQ]    = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_GET_LOG_PAGE] = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_DELETE_CQ]    = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_CREATE_CQ]    = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_IDENTIFY]     = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_ABORT]        = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_SET_FEATURES] = NVME_EFFECTS_CSUPP |
+                                          NVME_EFFECTS_CCC |
+                                          NVME_EFFECTS_NIC |
+                                          NVME_EFFECTS_NCC,
+            [NVME_ADM_CMD_GET_FEATURES] = NVME_EFFECTS_CSUPP,
+            [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_EFFECTS_CSUPP
+        },
+    };
+
+    uint32_t trans_len;
+
+    if (off >= sizeof(NvmeEffectsLog)) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    switch (NVME_CC_CSS(n->bar.cc)) {
+    case NVME_CC_CSS_ADMIN_ONLY:
+        break;
+
+    case NVME_CC_CSS_NVM:
+        nvme_effects_nvm(&effects);
+        break;
+
+    default:
+        return NVME_INTERNAL_DEV_ERROR | NVME_DNR;
+    }
+
+    trans_len = MIN(sizeof(effects) - off, buf_len);
+
+    return nvme_dma(n, (uint8_t *)&effects + off, trans_len,
+                    DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -1825,6 +1882,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_smart_info(n, rae, len, off, req);
     case NVME_LOG_FW_SLOT_INFO:
         return nvme_fw_log_info(n, len, off, req);
+    case NVME_LOG_EFFECTS:
+        return nvme_effects_log(n, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -3286,7 +3345,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->acl = 3;
     id->aerl = n->params.aerl;
     id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
-    id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED;
+    id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED | NVME_LPA_EFFECTS_LOG;
 
     /* recommended default value (~70 C) */
     id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);

From patchwork Thu Nov 26 23:45:53 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934931
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 04/12] hw/block/nvme: Generate namespace UUIDs
Date: Fri, 27 Nov 2020 00:45:53 +0100
Message-Id: <20201126234601.689714-5-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Niklas Cassel,
    Dmitry Fomichev, Klaus Jensen, Max Reitz, Keith Busch,
    Stefan Hajnoczi, Klaus Jensen

From: Dmitry Fomichev

In NVMe 1.4, a namespace must report an ID descriptor of UUID type if it
doesn't support EUI64 or NGUID. Add a new namespace property, "uuid", that
provides the user the option to either specify the UUID explicitly or have
a UUID generated automatically every time a namespace is initialized.

Suggested-by: Klaus Jensen
Signed-off-by: Dmitry Fomichev
Reviewed-by: Klaus Jensen
Reviewed-by: Keith Busch
Reviewed-by: Niklas Cassel
---
 hw/block/nvme-ns.h | 1 +
 hw/block/nvme-ns.c | 1 +
 hw/block/nvme.c    | 9 +++++----
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 745d288b09cf..1f8c9c0a92ad 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    QemuUUID uuid;
 
     uint16_t mssrl;
     uint32_t mcl;
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index eb28757c2f17..505f6fb0a654 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -155,6 +155,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+    DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid),
     DEFINE_PROP_UINT16("mssrl", NvmeNamespace, params.mssrl, 128),
     DEFINE_PROP_UINT32("mcl", NvmeNamespace, params.mcl, 128),
     DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 255),
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7a5ec843d567..4f732c13c7e9 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -2068,6 +2068,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 
 static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     uint32_t nsid = le32_to_cpu(c->nsid);
     uint8_t list[NVME_IDENTIFY_DATA_SIZE];
@@ -2087,7 +2088,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    if (unlikely(!nvme_ns(n, nsid))) {
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -2096,12 +2098,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
     /*
      * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
      * structure, a Namespace UUID (nidt = 0x3) must be reported in the
-     * Namespace Identification Descriptor. Add a very basic Namespace UUID
-     * here.
+     * Namespace Identification Descriptor. Add the namespace UUID here.
      */
     ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
     ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
-    stl_be_p(&ns_descrs->uuid.v, nsid);
+    memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDT_UUID_LEN);
 
     return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
                     DMA_DIRECTION_FROM_DEVICE, req);

From patchwork Thu Nov 26 23:45:54 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934933
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 05/12] hw/block/nvme: support namespace types
Date: Fri, 27 Nov 2020 00:45:54 +0100
Message-Id: <20201126234601.689714-6-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Klaus Jensen,
    Max Reitz, Keith Busch, Stefan Hajnoczi, Klaus Jensen

From: Klaus Jensen

Implement support for TP 4056 ("Namespace Types"). This adds the 'iocs'
(I/O Command Set) device parameter to the nvme-ns device.

Signed-off-by: Klaus Jensen
---
 docs/specs/nvme.txt   |   3 +
 hw/block/nvme-ns.h    |  12 ++-
 hw/block/nvme.h       |   3 +
 include/block/nvme.h  |  52 ++++++++---
 block/nvme.c          |   4 +-
 hw/block/nvme-ns.c    |  20 +++-
 hw/block/nvme.c       | 209 +++++++++++++++++++++++++++++++++++-------
 hw/block/trace-events |   6 +-
 8 files changed, 256 insertions(+), 53 deletions(-)

diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
index 56d393884e7a..619bd9ce4378 100644
--- a/docs/specs/nvme.txt
+++ b/docs/specs/nvme.txt
@@ -3,6 +3,9 @@ NVM Express Controller
 
 The nvme device (-device nvme) emulates an NVM Express Controller.
 
+  `iocs`; The "I/O Command Set" associated with the namespace. E.g. 0x0 for the
+     NVM Command Set (the default), or 0x2 for the Zoned Namespace Command Set.
+
 Reference Specifications
 ------------------------
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 1f8c9c0a92ad..3b095423cf52 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    uint8_t  iocs;
     QemuUUID uuid;
 
     uint16_t mssrl;
@@ -33,7 +34,8 @@ typedef struct NvmeNamespace {
     BlockConf blkconf;
     int32_t bootindex;
     int64_t size;
-    NvmeIdNs id_ns;
+    uint8_t iocs;
+    void *id_ns[NVME_IOCS_MAX];
 
     NvmeNamespaceParams params;
 
@@ -53,7 +55,7 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns)
 
 static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns)
 {
-    NvmeIdNs *id_ns = &ns->id_ns;
+    NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM];
     return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)];
 }
 
@@ -68,6 +70,12 @@ static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
     return ns->size >> nvme_ns_lbads(ns);
 }
 
+static inline uint64_t nvme_ns_nsze(NvmeNamespace *ns)
+{
+    NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM];
+    return le64_to_cpu(id_ns->nsze);
+}
+
 /* convert an LBA to the equivalent in bytes */
 static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba)
 {
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 079807272ae7..b1616ba79733 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -115,6 +115,7 @@ typedef struct NvmeFeatureVal {
     };
     uint32_t async_config;
     uint32_t vwc;
+    uint32_t iocsci;
 } NvmeFeatureVal;
 
 typedef struct NvmeCtrl {
@@ -143,6 +144,7 @@ typedef struct NvmeCtrl {
     uint64_t timestamp_set_qemu_clock_ms;   /* QEMU clock time */
     uint64_t starttime_ms;
     uint16_t temperature;
+    uint64_t iocscs[512];
 
     HostMemoryBackend *pmrdev;
 
@@ -158,6 +160,7 @@ typedef struct NvmeCtrl {
     NvmeSQueue admin_sq;
     NvmeCQueue admin_cq;
     NvmeIdCtrl id_ctrl;
+    void *id_ctrl_iocss[NVME_IOCS_MAX];
     NvmeFeatureVal features;
 } NvmeCtrl;
 
diff --git a/include/block/nvme.h b/include/block/nvme.h
index ffc65dc25f90..53c051d52c53 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -88,6 +88,7 @@ enum NvmeCapMask {
 
 enum NvmeCapCss {
     NVME_CAP_CSS_NVM        = 1 << 0,
+    NVME_CAP_CSS_CSI        = 1 << 6,
     NVME_CAP_CSS_ADMIN_ONLY = 1 << 7,
 };
 
@@ -121,6 +122,7 @@ enum NvmeCcMask {
 
 enum NvmeCcCss {
     NVME_CC_CSS_NVM        = 0x0,
+    NVME_CC_CSS_ALL        = 0x6,
     NVME_CC_CSS_ADMIN_ONLY = 0x7,
 };
 
@@ -392,6 +394,11 @@ enum NvmePmrmscMask {
 #define NVME_PMRMSC_SET_CBA(pmrmsc, val) \
     (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT)
 
+enum NvmeCommandSet {
+    NVME_IOCS_NVM = 0x0,
+    NVME_IOCS_MAX = 0x1,
+};
+
 enum NvmeSglDescriptorType {
     NVME_SGL_DESCR_TYPE_DATA_BLOCK = 0x0,
     NVME_SGL_DESCR_TYPE_BIT_BUCKET = 0x1,
@@ -539,8 +546,13 @@ typedef struct QEMU_PACKED NvmeIdentify {
     uint64_t    rsvd2[2];
     uint64_t    prp1;
     uint64_t    prp2;
-    uint32_t    cns;
-    uint32_t    rsvd11[5];
+    uint8_t     cns;
+    uint8_t     rsvd3;
+    uint16_t    cntid;
+    uint16_t    nvmsetid;
+    uint8_t     rsvd4;
+    uint8_t     csi;
+    uint32_t    rsvd11[4];
 } NvmeIdentify;
 
 typedef struct QEMU_PACKED NvmeRwCmd {
@@ -661,8 +673,15 @@ typedef struct QEMU_PACKED NvmeAerResult {
 } NvmeAerResult;
 
 typedef struct QEMU_PACKED NvmeCqe {
-    uint32_t    result;
-    uint32_t    rsvd;
+    union {
+        struct {
+            uint32_t dw0;
+            uint32_t dw1;
+        };
+
+        uint64_t qw0;
+    };
+
     uint16_t    sq_head;
     uint16_t    sq_id;
     uint16_t    cid;
@@ -711,6 +730,10 @@ enum NvmeStatusCodes {
     NVME_FEAT_NOT_CHANGEABLE    = 0x010e,
     NVME_FEAT_NOT_NS_SPEC       = 0x010f,
     NVME_FW_REQ_SUSYSTEM_RESET  = 0x0110,
+    NVME_IOCS_NOT_SUPPORTED     = 0x0129,
+    NVME_IOCS_NOT_ENABLED       = 0x012a,
+    NVME_IOCS_COMB_REJECTED     = 0x012b,
+    NVME_INVALID_IOCS           = 0x012c,
     NVME_CONFLICTING_ATTRS      = 0x0180,
     NVME_INVALID_PROT_INFO      = 0x0181,
     NVME_WRITE_TO_RO            = 0x0182,
@@ -821,10 +844,14 @@ typedef struct QEMU_PACKED NvmePSD {
 #define NVME_IDENTIFY_DATA_SIZE 4096
 
 enum {
-    NVME_ID_CNS_NS             = 0x0,
-    NVME_ID_CNS_CTRL           = 0x1,
-    NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
-    NVME_ID_CNS_NS_DESCR_LIST  = 0x3,
+    NVME_ID_CNS_NS                  = 0x00,
+    NVME_ID_CNS_CTRL                = 0x01,
+    NVME_ID_CNS_NS_ACTIVE_LIST      = 0x02,
+    NVME_ID_CNS_NS_DESCR_LIST       = 0x03,
+    NVME_ID_CNS_NS_IOCS             = 0x05,
+    NVME_ID_CNS_CTRL_IOCS           = 0x06,
+    NVME_ID_CNS_NS_ACTIVE_LIST_IOCS = 0x07,
+    NVME_ID_CNS_IOCS                = 0x1c,
 };
 
 typedef struct QEMU_PACKED NvmeIdCtrl {
@@ -980,6 +1007,7 @@ enum NvmeFeatureIds {
     NVME_WRITE_ATOMICITY            = 0xa,
     NVME_ASYNCHRONOUS_EVENT_CONF    = 0xb,
     NVME_TIMESTAMP                  = 0xe,
+    NVME_COMMAND_SET_PROFILE        = 0x19,
     NVME_SOFTWARE_PROGRESS_MARKER   = 0x80,
     NVME_FID_MAX                    = 0x100,
 };
@@ -1028,7 +1056,7 @@ typedef struct QEMU_PACKED NvmeLBAF {
 
 #define NVME_NSID_BROADCAST 0xffffffff
 
-typedef struct QEMU_PACKED NvmeIdNs {
+typedef struct QEMU_PACKED NvmeIdNsNvm {
     uint64_t    nsze;
     uint64_t    ncap;
     uint64_t    nuse;
@@ -1064,7 +1092,7 @@ typedef struct QEMU_PACKED NvmeIdNs {
     NvmeLBAF    lbaf[16];
     uint8_t     rsvd192[192];
     uint8_t     vs[3712];
-} NvmeIdNs;
+} NvmeIdNsNvm;
 
 typedef struct QEMU_PACKED NvmeIdNsDescr {
     uint8_t nidt;
@@ -1076,12 +1104,14 @@ enum {
     NVME_NIDT_EUI64_LEN = 8,
     NVME_NIDT_NGUID_LEN = 16,
     NVME_NIDT_UUID_LEN  = 16,
+    NVME_NIDT_CSI_LEN   = 1,
 };
 
 enum NvmeNsIdentifierType {
     NVME_NIDT_EUI64 = 0x1,
     NVME_NIDT_NGUID = 0x2,
     NVME_NIDT_UUID  = 0x3,
+    NVME_NIDT_CSI   = 0x4,
 };
 
 /*Deallocate Logical Block Features*/
@@ -1136,7 +1166,7 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096);
-    QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
     QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096);
diff --git a/block/nvme.c b/block/nvme.c
index 739a0a700cb8..4048ac10a0f4 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -337,7 +337,7 @@ static inline int nvme_translate_error(const NvmeCqe *c)
 {
     uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF;
     if (status) {
-        trace_nvme_error(le32_to_cpu(c->result),
+        trace_nvme_error(le32_to_cpu(c->dw0),
                          le16_to_cpu(c->sq_head),
                          le16_to_cpu(c->sq_id),
                          le16_to_cpu(c->cid),
@@ -507,7 +507,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
     BDRVNVMeState *s = bs->opaque;
     union {
         NvmeIdCtrl ctrl;
-        NvmeIdNs ns;
+        NvmeIdNsNvm ns;
     } *id;
     NvmeLBAF *lbaf;
     uint16_t oncs;
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 505f6fb0a654..7d70095439b6 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -33,11 +33,16 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
 {
     BlockDriverInfo bdi;
-    NvmeIdNs *id_ns = &ns->id_ns;
-    int lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
+    NvmeIdNsNvm *id_ns;
+    int lba_index;
     int npdg;
 
-    ns->id_ns.dlfeat = 0x9;
+    id_ns = ns->id_ns[NVME_IOCS_NVM] = g_new0(NvmeIdNsNvm, 1);
+
+    ns->iocs = ns->params.iocs;
+    lba_index = NVME_ID_NS_FLBAS_INDEX(id_ns->flbas);
+
+    id_ns->dlfeat = 0x9;
 
     id_ns->lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);
 
@@ -104,6 +109,14 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
         return -1;
     }
 
+    switch (ns->params.iocs) {
+    case NVME_IOCS_NVM:
+        break;
+    default:
+        error_setg(errp, "unsupported iocs");
+        return -1;
+    }
+
     return 0;
 }
 
@@ -155,6 +168,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+    DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, NVME_IOCS_NVM),
     DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid),
     DEFINE_PROP_UINT16("mssrl", NvmeNamespace, params.mssrl, 128),
     DEFINE_PROP_UINT32("mcl", NvmeNamespace, params.mcl, 128),
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 4f732c13c7e9..5df7c9598b13 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -100,6 +100,7 @@ static const bool nvme_feature_support[NVME_FID_MAX] = {
     [NVME_WRITE_ATOMICITY]          = true,
     [NVME_ASYNCHRONOUS_EVENT_CONF]  = true,
     [NVME_TIMESTAMP]                = true,
+    [NVME_COMMAND_SET_PROFILE]      = true,
 };
 
@@ -109,6 +110,7 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
     [NVME_NUMBER_OF_QUEUES]         = NVME_FEAT_CAP_CHANGE,
     [NVME_ASYNCHRONOUS_EVENT_CONF]  = NVME_FEAT_CAP_CHANGE,
     [NVME_TIMESTAMP]                = NVME_FEAT_CAP_CHANGE,
+    [NVME_COMMAND_SET_PROFILE]      = NVME_FEAT_CAP_CHANGE,
 };
 
 static void nvme_process_sq(void *opaque);
@@ -810,7 +812,7 @@ static void nvme_process_aers(void *opaque)
 
         req = n->aer_reqs[n->outstanding_aers];
 
-        result = (NvmeAerResult *) &req->cqe.result;
+        result = (NvmeAerResult *) &req->cqe.dw0;
         result->event_type = event->result.event_type;
         result->event_info = event->result.event_info;
         result->log_page = event->result.log_page;
@@ -870,7 +872,7 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len)
 static inline uint16_t nvme_check_bounds(NvmeNamespace *ns, uint64_t slba,
                                          uint32_t nlb)
 {
-    uint64_t nsze = le64_to_cpu(ns->id_ns.nsze);
+    uint64_t nsze = nvme_ns_nsze(ns);
 
     if (unlikely(UINT64_MAX - slba < nlb || slba + nlb > nsze)) {
         return NVME_LBA_RANGE | NVME_DNR;
@@ -1044,7 +1046,8 @@ static void nvme_copy_in_complete(NvmeRequest *req)
 
     status = nvme_check_bounds(ns, sdlba, ctx->nlb);
     if (status) {
-        trace_pci_nvme_err_invalid_lba_range(sdlba, ctx->nlb, ns->id_ns.nsze);
+        trace_pci_nvme_err_invalid_lba_range(sdlba, ctx->nlb,
+                                             nvme_ns_nsze(ns));
         req->status = status;
 
         g_free(ctx->bounce);
@@ -1183,7 +1186,7 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req)
 
             if (nvme_check_bounds(ns, slba, nlb)) {
                 trace_pci_nvme_err_invalid_lba_range(slba, nlb,
-                                                     ns->id_ns.nsze);
+                                                     nvme_ns_nsze(ns));
                 continue;
             }
 
@@ -1222,6 +1225,7 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req)
 static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeNamespace *ns = req->ns;
+    NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM];
     NvmeCopyCmd *copy = (NvmeCopyCmd *)&req->cmd;
     g_autofree NvmeCopySourceRange *range = NULL;
 
@@ -1242,7 +1246,7 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    if (nr > ns->id_ns.msrc + 1) {
+    if (nr > id_ns->msrc + 1) {
         return NVME_CMD_SIZE_LIMIT | NVME_DNR;
     }
 
@@ -1256,14 +1260,14 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
     for (i = 0; i < nr; i++) {
         uint32_t _nlb = le16_to_cpu(range[i].nlb) + 1;
-        if (_nlb > le16_to_cpu(ns->id_ns.mssrl)) {
+        if (_nlb > le16_to_cpu(id_ns->mssrl)) {
             return NVME_CMD_SIZE_LIMIT | NVME_DNR;
         }
 
         nlb += _nlb;
     }
 
-    if (nlb > le32_to_cpu(ns->id_ns.mcl)) {
+    if (nlb > le32_to_cpu(id_ns->mcl)) {
         return NVME_CMD_SIZE_LIMIT | NVME_DNR;
     }
 
@@ -1275,7 +1279,7 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
 
     status = nvme_check_bounds(ns, slba, nlb);
     if (status) {
-        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_pci_nvme_err_invalid_lba_range(slba, nlb, id_ns->nsze);
         goto free_bounce;
     }
 
@@ -1339,6 +1343,7 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
+    NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM];
     BlockBackend *blk = ns->blkconf.blk;
     uint64_t slba = le64_to_cpu(rw->slba);
     uint32_t nlb = le16_to_cpu(rw->nlb) + 1;
@@ -1358,7 +1363,7 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest *req)
 
     status = nvme_check_bounds(ns, slba, nlb);
     if (status) {
-        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_pci_nvme_err_invalid_lba_range(slba, nlb, id_ns->nsze);
         return status;
     }
 
@@ -1415,7 +1420,7 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
 
     status = nvme_check_bounds(ns, slba, nlb);
     if (status) {
-        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_pci_nvme_err_invalid_lba_range(slba, nlb, nvme_ns_nsze(ns));
         goto invalid;
     }
 
@@ -1474,7 +1479,7 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 
     status = nvme_check_bounds(ns, slba, nlb);
     if (status) {
-        trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+        trace_pci_nvme_err_invalid_lba_range(slba, nlb, nvme_ns_nsze(ns));
        goto invalid;
     }
 
@@ -1816,6 +1821,7 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
     };
 
     uint32_t trans_len;
+    uint8_t csi = le32_to_cpu(req->cmd.cdw14) >> 24;
 
     if (off >= sizeof(NvmeEffectsLog)) {
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1829,6 +1835,21 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
         nvme_effects_nvm(&effects);
         break;
 
+    case NVME_CC_CSS_ALL:
+        if (!(n->iocscs[n->features.iocsci] & (1 << csi))) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        switch (csi) {
+        case NVME_IOCS_NVM:
+            nvme_effects_nvm(&effects);
+            break;
+        default:
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        break;
+
     default:
         return NVME_INTERNAL_DEV_ERROR | NVME_DNR;
     }
@@ -1997,39 +2018,94 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
-static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ctrl(NvmeCtrl *n, uint8_t cns, uint8_t csi,
+                                   NvmeRequest *req)
 {
+    NvmeIdCtrl empty = { 0 };
+    NvmeIdCtrl *id_ctrl = &empty;
+
     trace_pci_nvme_identify_ctrl();
 
-    return nvme_dma(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl),
+    switch (cns) {
+    case NVME_ID_CNS_CTRL:
+        id_ctrl = &n->id_ctrl;
+
+        break;
+
+    case NVME_ID_CNS_CTRL_IOCS:
+        if (!(n->iocscs[n->features.iocsci] & (1 << csi))) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+
+        assert(csi < NVME_IOCS_MAX);
+
+        if (n->id_ctrl_iocss[csi]) {
+            id_ctrl = n->id_ctrl_iocss[csi];
+        }
+
+        break;
+
+    default:
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return nvme_dma(n, (uint8_t *)id_ctrl, sizeof(*id_ctrl),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, uint8_t cns, uint8_t csi,
+                                 NvmeRequest *req)
 {
+    NvmeIdNsNvm empty = { 0 };
+    void *id_ns = &empty;
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    NvmeIdNs *id_ns, inactive = { 0 };
     uint32_t nsid = le32_to_cpu(c->nsid);
 
-    trace_pci_nvme_identify_ns(nsid);
+    trace_pci_nvme_identify_ns(nsid, csi);
 
     if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
     ns = nvme_ns(n, nsid);
-    if (unlikely(!ns)) {
-        id_ns = &inactive;
-    } else {
-        id_ns = &ns->id_ns;
+    if (ns) {
+        switch (cns) {
+        case NVME_ID_CNS_NS:
+            id_ns = ns->id_ns[NVME_IOCS_NVM];
+            if (!id_ns) {
+                return NVME_INVALID_IOCS | NVME_DNR;
+            }
+
+            break;
+
+        case NVME_ID_CNS_NS_IOCS:
+            if (csi == NVME_IOCS_NVM) {
+                break;
+            }
+
+            if (csi >= NVME_IOCS_MAX) {
+                return NVME_INVALID_FIELD | NVME_DNR;
+            }
+
+            id_ns = ns->id_ns[csi];
+            if (!id_ns) {
+                return NVME_INVALID_FIELD | NVME_DNR;
+            }
+
+            break;
+
+        default:
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
     }
 
-    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
+    return nvme_dma(n, (uint8_t *)id_ns, NVME_IDENTIFY_DATA_SIZE,
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_nslist(NvmeCtrl *n, uint8_t cns, uint8_t csi,
+                                     NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
     static const int data_len = NVME_IDENTIFY_DATA_SIZE;
@@ -2038,7 +2114,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
     uint16_t ret;
     int j = 0;
 
-    trace_pci_nvme_identify_nslist(min_nsid);
+    trace_pci_nvme_identify_nslist(min_nsid, csi);
 
     /*
      * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values
@@ -2050,11 +2126,21 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
+    if (cns == NVME_ID_CNS_NS_ACTIVE_LIST_IOCS && !csi) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
     list = g_malloc0(data_len);
     for (int i = 1; i <= n->num_namespaces; i++) {
-        if (i <= min_nsid || !nvme_ns(n, i)) {
+        NvmeNamespace *ns = nvme_ns(n, i);
+        if (i <= min_nsid || !ns) {
             continue;
         }
+
+        if (cns == NVME_ID_CNS_NS_ACTIVE_LIST_IOCS && csi && csi != ns->iocs) {
+            continue;
+        }
+
         list[j++] = cpu_to_le32(i);
         if (j == data_len / sizeof(uint32_t)) {
             break;
@@ -2078,6 +2164,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
             NvmeIdNsDescr hdr;
             uint8_t v[16];
         } uuid;
+
+        struct {
+            NvmeIdNsDescr hdr;
+            uint8_t v;
+        } iocs;
     };
 
     struct data *ns_descrs = (struct data *)list;
@@ -2104,25 +2195,45 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
     ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
     memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDT_UUID_LEN);
 
+    ns_descrs->iocs.hdr.nidt = NVME_NIDT_CSI;
+    ns_descrs->iocs.hdr.nidl = NVME_NIDT_CSI_LEN;
+    stb_p(&ns_descrs->iocs.v, ns->iocs);
+
     return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_iocs(NvmeCtrl *n, uint16_t cntid,
+                                   NvmeRequest *req)
+{
+    return nvme_dma(n, (uint8_t *) n->iocscs, sizeof(n->iocscs),
+                    DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
 
+    trace_pci_nvme_identify(nvme_cid(req), le32_to_cpu(c->nsid),
+                            le16_to_cpu(c->cntid), c->cns, c->csi,
+                            le16_to_cpu(c->nvmsetid));
+
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
-        return nvme_identify_ns(n, req);
+    case NVME_ID_CNS_NS_IOCS:
+        return nvme_identify_ns(n, c->cns, c->csi, req);
     case NVME_ID_CNS_CTRL:
-        return nvme_identify_ctrl(n, req);
+    case NVME_ID_CNS_CTRL_IOCS:
+        return nvme_identify_ctrl(n, c->cns, c->csi, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist(n, req);
+    case NVME_ID_CNS_NS_ACTIVE_LIST_IOCS:
+        return nvme_identify_nslist(n, c->cns, c->csi, req);
     case NVME_ID_CNS_NS_DESCR_LIST:
         return nvme_identify_ns_descr_list(n, req);
+    case NVME_ID_CNS_IOCS:
+        return nvme_identify_iocs(n, c->cntid, req);
     default:
-        trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
+        trace_pci_nvme_err_invalid_identify_cns(c->cns);
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 }
@@ -2131,7 +2242,7 @@ static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req)
 {
     uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff;
 
-    req->cqe.result = 1;
+    req->cqe.dw0 = 1;
     if (nvme_check_sqid(n, sqid)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
@@ -2310,6 +2421,9 @@ defaults:
             result |= NVME_INTVC_NOCOALESCING;
         }
 
+        break;
+    case NVME_COMMAND_SET_PROFILE:
+        result = cpu_to_le32(n->features.iocsci & 0x1ff);
         break;
     default:
         result = nvme_feature_default[fid];
@@ -2317,7 +2431,8 @@ defaults:
     }
 
 out:
-    req->cqe.result = cpu_to_le32(result);
+    req->cqe.dw0 = cpu_to_le32(result);
+
     return NVME_SUCCESS;
 }
 
@@ -2417,7 +2532,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
                 continue;
             }
 
-            if (NVME_ID_NS_NSFEAT_DULBE(ns->id_ns.nsfeat)) {
+            if (ns->id_ns[NVME_IOCS_NVM]) {
                 ns->features.err_rec = dw11;
             }
         }
@@ -2463,14 +2578,34 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
                                     ((dw11 >> 16) & 0xFFFF) + 1,
                                     n->params.max_ioqpairs,
                                     n->params.max_ioqpairs);
-        req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) |
-                                      ((n->params.max_ioqpairs - 1) << 16));
+        req->cqe.dw0 = cpu_to_le32((n->params.max_ioqpairs - 1) |
+                                   ((n->params.max_ioqpairs - 1) << 16));
         break;
     case NVME_ASYNCHRONOUS_EVENT_CONF:
         n->features.async_config = dw11;
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, req);
+    case NVME_COMMAND_SET_PROFILE:
+        if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ALL) {
+            uint16_t iocsci = dw11 & 0x1ff;
+            uint64_t iocsc = n->iocscs[iocsci];
+
+            for (int i = 1; i <= n->num_namespaces; i++) {
+                ns = nvme_ns(n, i);
+                if (!ns) {
+                    continue;
+                }
+
+                if (!(iocsc & (1 << ns->iocs))) {
+                    return NVME_IOCS_COMB_REJECTED | NVME_DNR;
+                }
+            }
+
+            n->features.iocsci = iocsci;
+        }
+
+        break;
     default:
         return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
     }
@@ -3152,6 +3287,8 @@ static void nvme_init_state(NvmeCtrl *n)
     n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING;
     n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
     n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1);
+    n->iocscs[0] = 1 << NVME_IOCS_NVM;
+    n->features.iocsci = 0;
 }
 
 int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
@@ -3377,6 +3514,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_CQR(n->bar.cap, 1);
     NVME_CAP_SET_TO(n->bar.cap, 0xf);
     NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM);
+    NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_CSI);
     NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY);
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
     NVME_CAP_SET_CMBS(n->bar.cap, n->params.cmb_size_mb ? 1 : 0);
@@ -3438,6 +3576,11 @@ static void nvme_exit(PCIDevice *pci_dev)
     if (n->pmrdev) {
         host_memory_backend_set_mapped(n->pmrdev, false);
     }
+
+    for (int i = 0; i < NVME_IOCS_MAX; i++) {
+        g_free(n->id_ctrl_iocss[i]);
+    }
+
     msix_uninit_exclusive_bar(pci_dev);
 }
 
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 35ea40c49169..1f1aef719301 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -58,10 +58,12 @@ pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize,
 pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
 pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
 pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
+pci_nvme_identify(uint16_t cid, uint32_t nsid, uint16_t cntid, uint8_t cns, uint8_t csi, uint16_t nvmsetid) "cid %"PRIu16" nsid %"PRIu32" cntid 0x%"PRIx16" cns 0x%"PRIx8" csi 0x%"PRIx8" nvmsetid %"PRIu16""
 pci_nvme_identify_ctrl(void) "identify controller"
-pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
-pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_ns(uint32_t ns, uint8_t csi) "nsid %"PRIu32" csi 0x%"PRIx8""
+pci_nvme_identify_nslist(uint32_t ns, uint8_t csi) "nsid %"PRIu32" csi 0x%"PRIx8""
 pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_io_cmd_set(uint16_t cid) "cid %"PRIu16""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
 pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""

From patchwork Thu Nov 26 23:45:55 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934937
qemu-devel@archiver.kernel.org; Thu, 26 Nov 2020 18:59:38 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:55928) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kiQxz-0007SO-GX; Thu, 26 Nov 2020 18:46:23 -0500 Received: from out3-smtp.messagingengine.com ([66.111.4.27]:34617) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kiQxq-0003uM-Ny; Thu, 26 Nov 2020 18:46:23 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id E4F525C021C; Thu, 26 Nov 2020 18:46:13 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Thu, 26 Nov 2020 18:46:13 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=xBNEy9TY+OjnW BspJwQp2WIA7m5CiTOGu7hT46Et+zk=; b=lkJ6pQGjETHGfP2YaBBXXfP6e3st+ y74b/Gxo4KPKz8pg7xE1GqKzqbgWeQ5PMxGIHPB6pNY56OseQ5pUph2PCgWmadAh t86fU9WFqZJpfjurddCtcD7EtnbaNRFFqhSM/Tb3VYEjkDFr9cFYdIe6AVHyLPcx lZiHIMIQ56adFVi0L/BshEkKWTHS8ybhnEEiu2f+NiDylFXdeW0zrotGfEaUJlVa gjnLVzRHPqUPozzaS71DN+NQ+SzRPy/0aW3L0C4rtfe0yIA/2LOcNf74QrzWnsED DU+wlFzz9BpUK7fIlY3NW346AKIewIzWqPbenvwExHXJ1RZzxZbYqYYgQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm1; bh=xBNEy9TY+OjnWBspJwQp2WIA7m5CiTOGu7hT46Et+zk=; b=eom7yWsy oGJi5KBStniaGfbyIykeEB73vi01J1F7ph9wukXjuhmSzma2dOmoeteHggEW9t+Q JueRGb0Ihf44ijt5bfLY9mVGC9zpfeFiO+yHPEZ2UA4Ij1mRoay5lyFZKmy6NoYr M2HV9fVAlKW55mZj0rq0N0sAIbEsoCFwmSKJdRPvURh/Mu5sqVTTxwmN0qRH3Mvw vzDY4PEJgqkJf3EgX4hUwdiRR8ao/2aM9K+XZe3y4vnVl2OJiZsmKFvifz7l8zbm lCtD4JK7dEqovvu8TkBsbnnkpN0JzreTkWqJaLwwrFwZ40EtgwJIvNkmKVF9hAEx 
eTyoRBXWV1pdlQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrudehfedgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhephffvufffkffojghfggfgsedtkeertdertddtnecuhfhrohhmpefmlhgruhhs ucflvghnshgvnhcuoehithhssehirhhrvghlvghvrghnthdrughkqeenucggtffrrghtth gvrhhnpeeuleetgeeiuefhgfekfefgveejiefgteekiedtgfdtieefhfdthfefueffvefg keenucfkphepkedtrdduieejrdelkedrudeltdenucevlhhushhtvghrufhiiigvpedtne curfgrrhgrmhepmhgrihhlfhhrohhmpehithhssehirhhrvghlvghvrghnthdrughk X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 8447D3064AB0; Thu, 26 Nov 2020 18:46:12 -0500 (EST) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v5 06/12] hw/block/nvme: add basic read/write for zoned namespaces Date: Fri, 27 Nov 2020 00:45:55 +0100 Message-Id: <20201126234601.689714-7-its@irrelevant.dk> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk> References: <20201126234601.689714-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=66.111.4.27; envelope-from=its@irrelevant.dk; helo=out3-smtp.messagingengine.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Stefan Hajnoczi , Klaus Jensen Errors-To: 
From: Klaus Jensen

This adds basic read and write for zoned namespaces. A zoned namespace is
created by setting the iocs namespace parameter to 0x2 and specifying the
zns.zcap parameter (zone capacity) in number of logical blocks per zone.

If a zone size (zns.zsze) is not specified, the namespace device will set
the zone size to the next power of two and fit in as many zones as possible
on the underlying namespace blockdev. This behavior is not required by the
specification, but it ensures that the device can be initialized by the
Linux kernel nvme driver, which requires a power-of-two zone size.

Signed-off-by: Klaus Jensen
---
 docs/specs/nvme.txt   |   8 +
 hw/block/nvme-ns.h    |  78 ++++++++
 include/block/nvme.h  |  60 +++++-
 hw/block/nvme-ns.c    |  86 +++++++++
 hw/block/nvme.c       | 415 ++++++++++++++++++++++++++++++++++++++++--
 hw/block/trace-events |   8 +
 6 files changed, 635 insertions(+), 20 deletions(-)

diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
index 619bd9ce4378..80cb34406255 100644
--- a/docs/specs/nvme.txt
+++ b/docs/specs/nvme.txt
@@ -6,6 +6,14 @@ The nvme device (-device nvme) emulates an NVM Express Controller.
   `iocs`; The "I/O Command Set" associated with the namespace. E.g. 0x0 for
      the NVM Command Set (the default), or 0x2 for the Zoned Namespace
      Command Set.

+  `zns.zcap`; If `iocs` is 0x2, this specifies the zone capacity. It is
+     specified in units of logical blocks.
+
+  `zns.zsze`; If `iocs` is 0x2, this specifies the zone size. It is specified
+     in units of logical blocks. If not specified, the value depends on
+     zns.zcap; if the zone capacity is a power of two, the zone size will be
+     set to that, otherwise it will default to the next power of two.
+ Reference Specifications ------------------------ diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 3b095423cf52..e373d62c5873 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -27,8 +27,19 @@ typedef struct NvmeNamespaceParams { uint16_t mssrl; uint32_t mcl; uint8_t msrc; + + struct { + uint64_t zcap; + uint64_t zsze; + } zns; } NvmeNamespaceParams; +typedef struct NvmeZone { + NvmeZoneDescriptor *zd; + + uint64_t wp_staging; +} NvmeZone; + typedef struct NvmeNamespace { DeviceState parent_obj; BlockConf blkconf; @@ -42,8 +53,20 @@ typedef struct NvmeNamespace { struct { uint32_t err_rec; } features; + + struct { + int num_zones; + + NvmeZone *zones; + NvmeZoneDescriptor *zd; + } zns; } NvmeNamespace; +static inline bool nvme_ns_zoned(NvmeNamespace *ns) +{ + return ns->iocs == NVME_IOCS_ZONED; +} + static inline uint32_t nvme_nsid(NvmeNamespace *ns) { if (ns) { @@ -59,11 +82,23 @@ static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)]; } +static inline NvmeLBAFE *nvme_ns_lbafe(NvmeNamespace *ns) +{ + NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM]; + NvmeIdNsZns *id_ns_zns = ns->id_ns[NVME_IOCS_ZONED]; + return &id_ns_zns->lbafe[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)]; +} + static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns) { return nvme_ns_lbaf(ns)->ds; } +static inline uint64_t nvme_ns_zsze(NvmeNamespace *ns) +{ + return nvme_ns_lbafe(ns)->zsze; +} + /* calculate the number of LBAs that the namespace can accomodate */ static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns) { @@ -82,8 +117,51 @@ static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba) return lba << nvme_ns_lbads(ns); } +static inline int nvme_ns_zone_idx(NvmeNamespace *ns, uint64_t lba) +{ + return lba / nvme_ns_zsze(ns); +} + +static inline NvmeZone *nvme_ns_zone(NvmeNamespace *ns, uint64_t lba) +{ + int idx = nvme_ns_zone_idx(ns, lba); + if (unlikely(idx >= ns->zns.num_zones)) { + return NULL; + } + + return 
&ns->zns.zones[idx]; +} + +static inline NvmeZoneState nvme_zs(NvmeZone *zone) +{ + return (zone->zd->zs >> 4) & 0xf; +} + +static inline void nvme_zs_set(NvmeZone *zone, NvmeZoneState zs) +{ + zone->zd->zs = zs << 4; +} + +static inline uint64_t nvme_zslba(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->zslba); +} + +static inline uint64_t nvme_zcap(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->zcap); +} + +static inline uint64_t nvme_wp(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->wp); +} + typedef struct NvmeCtrl NvmeCtrl; +const char *nvme_zs_str(NvmeZone *zone); +const char *nvme_zs_to_str(NvmeZoneState zs); + int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_flush(NvmeNamespace *ns); diff --git a/include/block/nvme.h b/include/block/nvme.h index 53c051d52c53..6a5616bb9304 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -395,8 +395,9 @@ enum NvmePmrmscMask { (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT) enum NvmeCommandSet { - NVME_IOCS_NVM = 0x0, - NVME_IOCS_MAX = 0x1, + NVME_IOCS_NVM = 0x0, + NVME_IOCS_ZONED = 0x2, + NVME_IOCS_MAX = 0x3, }; enum NvmeSglDescriptorType { @@ -738,6 +739,12 @@ enum NvmeStatusCodes { NVME_INVALID_PROT_INFO = 0x0181, NVME_WRITE_TO_RO = 0x0182, NVME_CMD_SIZE_LIMIT = 0x0183, + NVME_ZONE_BOUNDARY_ERROR = 0x01b8, + NVME_ZONE_IS_FULL = 0x01b9, + NVME_ZONE_IS_READ_ONLY = 0x01ba, + NVME_ZONE_IS_OFFLINE = 0x01bb, + NVME_ZONE_INVALID_WRITE = 0x01bc, + NVME_INVALID_ZONE_STATE_TRANSITION = 0x01bf, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, NVME_E2E_GUARD_ERROR = 0x0282, @@ -814,6 +821,31 @@ enum { NVME_EFFECTS_UUID_SEL = 1 << 19, }; +typedef enum NvmeZoneType { + NVME_ZT_SEQ = 0x2, +} NvmeZoneType; + +typedef enum NvmeZoneState { + NVME_ZS_ZSE = 0x1, + NVME_ZS_ZSIO = 0x2, + NVME_ZS_ZSEO = 0x3, + NVME_ZS_ZSC = 0x4, + NVME_ZS_ZSRO = 0xd, + NVME_ZS_ZSF = 0xe, + NVME_ZS_ZSO = 0xf, +} NvmeZoneState; + +typedef struct 
QEMU_PACKED NvmeZoneDescriptor { + uint8_t zt; + uint8_t zs; + uint8_t za; + uint8_t rsvd3[5]; + uint64_t zcap; + uint64_t zslba; + uint64_t wp; + uint8_t rsvd32[32]; +} NvmeZoneDescriptor; + enum NvmeSmartWarn { NVME_SMART_SPARE = 1 << 0, NVME_SMART_TEMPERATURE = 1 << 1, @@ -827,6 +859,7 @@ enum NvmeLogIdentifier { NVME_LOG_SMART_INFO = 0x02, NVME_LOG_FW_SLOT_INFO = 0x03, NVME_LOG_EFFECTS = 0x05, + NVME_LOG_CHANGED_ZONE_LIST = 0xbf, }; typedef struct QEMU_PACKED NvmePSD { @@ -1146,9 +1179,27 @@ enum NvmeIdNsDps { DPS_FIRST_EIGHT = 8, }; +typedef struct QEMU_PACKED NvmeLBAFE { + uint64_t zsze; + uint8_t zdes; + uint8_t rsvd9[7]; +} NvmeLBAFE; + +typedef struct QEMU_PACKED NvmeIdNsZns { + uint16_t zoc; + uint16_t ozcs; + uint32_t mar; + uint32_t mor; + uint32_t rrl; + uint32_t frl; + uint8_t rsvd20[2796]; + NvmeLBAFE lbafe[16]; + uint8_t rsvd3072[768]; + uint8_t vs[256]; +} NvmeIdNsZns; + static inline void _nvme_check_size(void) { - QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16); @@ -1167,8 +1218,11 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZns) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescriptor) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); } #endif diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 7d70095439b6..1f3d0644ba42 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -30,6 +30,67 @@ #define MIN_DISCARD_GRANULARITY (4 * KiB) +const char *nvme_zs_str(NvmeZone *zone) +{ + return nvme_zs_to_str(nvme_zs(zone)); +} + +const char *nvme_zs_to_str(NvmeZoneState zs) +{ + switch 
(zs) { + case NVME_ZS_ZSE: return "ZSE"; + case NVME_ZS_ZSIO: return "ZSIO"; + case NVME_ZS_ZSEO: return "ZSEO"; + case NVME_ZS_ZSC: return "ZSC"; + case NVME_ZS_ZSRO: return "ZSRO"; + case NVME_ZS_ZSF: return "ZSF"; + case NVME_ZS_ZSO: return "ZSO"; + } + + return "UNKNOWN"; +} + +static void nvme_ns_zns_init_zones(NvmeNamespace *ns) +{ + NvmeZone *zone; + NvmeZoneDescriptor *zd; + uint64_t zslba, zsze = nvme_ns_zsze(ns); + + for (int i = 0; i < ns->zns.num_zones; i++) { + zslba = i * zsze; + + zone = &ns->zns.zones[i]; + zone->zd = &ns->zns.zd[i]; + zone->wp_staging = zslba; + + zd = zone->zd; + zd->zt = NVME_ZT_SEQ; + zd->zcap = cpu_to_le64(ns->params.zns.zcap); + zd->wp = zd->zslba = cpu_to_le64(zslba); + + nvme_zs_set(zone, NVME_ZS_ZSE); + } +} + +static void nvme_ns_init_zoned(NvmeNamespace *ns) +{ + NvmeIdNsNvm *id_ns = ns->id_ns[NVME_IOCS_NVM]; + NvmeIdNsZns *id_ns_zns = ns->id_ns[NVME_IOCS_ZONED]; + + for (int i = 0; i <= id_ns->nlbaf; i++) { + id_ns_zns->lbafe[i].zsze = ns->params.zns.zsze ? 
+ cpu_to_le64(ns->params.zns.zsze) : + cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + } + + ns->zns.num_zones = nvme_ns_nlbas(ns) / nvme_ns_zsze(ns); + ns->zns.zones = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZone)); + ns->zns.zd = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZoneDescriptor)); + + id_ns_zns->mar = 0xffffffff; + id_ns_zns->mor = 0xffffffff; +} + static int nvme_ns_init(NvmeNamespace *ns, Error **errp) { BlockDriverInfo bdi; @@ -48,6 +109,11 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp) id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns)); + if (nvme_ns_zoned(ns)) { + ns->id_ns[NVME_IOCS_ZONED] = g_new0(NvmeIdNsZns, 1); + nvme_ns_init_zoned(ns); + } + /* no thin provisioning */ id_ns->ncap = id_ns->nsze; id_ns->nuse = id_ns->ncap; @@ -112,6 +178,20 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) switch (ns->params.iocs) { case NVME_IOCS_NVM: break; + + case NVME_IOCS_ZONED: + if (!ns->params.zns.zcap) { + error_setg(errp, "zns.zcap must be specified"); + return -1; + } + + if (ns->params.zns.zsze && ns->params.zns.zsze < ns->params.zns.zcap) { + error_setg(errp, "zns.zsze cannot be less than zns.zcap"); + return -1; + } + + break; + default: error_setg(errp, "unsupported iocs"); return -1; @@ -134,6 +214,10 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) return -1; } + if (nvme_ns_zoned(ns)) { + nvme_ns_zns_init_zones(ns); + } + if (nvme_register_namespace(n, ns, errp)) { return -1; } @@ -173,6 +257,8 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT16("mssrl", NvmeNamespace, params.mssrl, 128), DEFINE_PROP_UINT32("mcl", NvmeNamespace, params.mcl, 128), DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 255), + DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0), + DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 5df7c9598b13..60a467d5df62 100644 --- a/hw/block/nvme.c +++ 
b/hw/block/nvme.c @@ -858,6 +858,90 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type) } } +static uint16_t nvme_check_zone_readable(NvmeZone *zone) +{ + if (nvme_zs(zone) == NVME_ZS_ZSO) { + trace_pci_nvme_err_zone_is_offline(nvme_zslba(zone)); + return NVME_ZONE_IS_OFFLINE | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, + uint32_t nlb, NvmeZone *zone) +{ + uint64_t zslba = nvme_zslba(zone); + uint64_t zsze = nvme_ns_zsze(ns); + uint16_t status; + + status = nvme_check_zone_readable(zone); + if (status) { + return status; + } + + if ((slba + nlb) > (zslba + zsze)) { + trace_pci_nvme_err_zone_boundary(slba, nlb, zsze); + return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_check_zone_writable(NvmeZone *zone) +{ + NvmeZoneState zs = nvme_zs(zone); + uint64_t zslba = nvme_zslba(zone); + + switch (zs) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + case NVME_ZS_ZSC: + return NVME_SUCCESS; + case NVME_ZS_ZSRO: + trace_pci_nvme_err_zone_is_read_only(zslba); + return NVME_ZONE_IS_READ_ONLY | NVME_DNR; + case NVME_ZS_ZSF: + trace_pci_nvme_err_zone_is_full(zslba); + return NVME_ZONE_IS_FULL; + case NVME_ZS_ZSO: + trace_pci_nvme_err_zone_is_offline(zslba); + return NVME_ZONE_IS_OFFLINE | NVME_DNR; + } + + trace_pci_nvme_err_invalid_zone_state(zslba, nvme_zs_to_str(zs), zs); + return NVME_INTERNAL_DEV_ERROR | NVME_DNR; +} + +static uint16_t nvme_check_zone_write(uint64_t slba, uint32_t nlb, + NvmeZone *zone) +{ + uint64_t zslba, wp, zcap; + uint16_t status; + + zslba = nvme_zslba(zone); + wp = zone->wp_staging; + zcap = nvme_zcap(zone); + + status = nvme_check_zone_writable(zone); + if (status) { + return status; + } + + if ((wp - zslba) + nlb > zcap) { + trace_pci_nvme_err_zone_boundary(slba, nlb, zcap); + return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; + } + + if (slba != wp) { + trace_pci_nvme_err_zone_invalid_write(slba, wp); + 
return NVME_ZONE_INVALID_WRITE; + } + + return NVME_SUCCESS; +} + static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len) { uint8_t mdts = n->params.mdts; @@ -924,8 +1008,125 @@ static uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba, return NVME_SUCCESS; } -static void nvme_aio_err(NvmeRequest *req, int ret) +static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, + NvmeZoneState to) { + NvmeZoneState from = nvme_zs(zone); + + trace_pci_nvme_zrm_transition(ns->params.nsid, nvme_zslba(zone), + nvme_zs_to_str(from), from, + nvme_zs_to_str(to), to); + + if (from == to) { + return NVME_SUCCESS; + } + + switch (from) { + case NVME_ZS_ZSE: + break; + + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + switch (to) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSO: + case NVME_ZS_ZSEO: + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSC: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSC: + switch (to) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSO: + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSRO: + switch (to) { + case NVME_ZS_ZSO: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSF: + switch (to) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSO: + case NVME_ZS_ZSRO: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSO: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + nvme_zs_set(zone, to); + return NVME_SUCCESS; +} + +static uint16_t __nvme_zns_advance_wp(NvmeNamespace *ns, NvmeZone *zone, + uint32_t nlb) +{ + uint64_t wp = nvme_wp(zone); + + trace_pci_nvme_zns_advance_wp(nvme_nsid(ns), nvme_zslba(zone), wp, nlb); + + wp += nlb; + zone->zd->wp = cpu_to_le64(wp); + if (wp == nvme_zslba(zone) + nvme_zcap(zone)) { + 
uint16_t status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSF); + if (status) { + return status; + } + } + + return NVME_SUCCESS; +} + +static void nvme_zns_advance_wp(NvmeRequest *req) +{ + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + uint64_t slba = le64_to_cpu(rw->slba); + uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; + NvmeZone *zone = nvme_ns_zone(req->ns, slba); + uint16_t status; + + status = __nvme_zns_advance_wp(req->ns, zone, nlb); + if (status) { + req->status = status; + } +} + +static void nvme_aio_err(NvmeRequest *req, int ret, NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status = NVME_SUCCESS; Error *local_err = NULL; @@ -948,6 +1149,17 @@ static void nvme_aio_err(NvmeRequest *req, int ret) error_setg_errno(&local_err, -ret, "aio failed"); error_report_err(local_err); + if (zone) { + /* + * Transition the zone to read-only on write fault and offline + * on unrecovered read or internal dev error. + */ + NvmeZoneState zs = status == NVME_WRITE_FAULT ? + NVME_ZS_ZSRO : NVME_ZS_ZSO; + + nvme_zrm_transition(ns, zone, zs); + } + /* * Set the command status code to the first encountered error but allow a * subsequent Internal Device Error to trump it. 
@@ -963,6 +1175,7 @@ static void nvme_rw_cb(void *opaque, int ret) { NvmeRequest *req = opaque; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; BlockBackend *blk = ns->blkconf.blk; BlockAcctCookie *acct = &req->acct; @@ -970,25 +1183,53 @@ static void nvme_rw_cb(void *opaque, int ret) trace_pci_nvme_rw_cb(nvme_cid(req), blk_name(blk)); + if (nvme_ns_zoned(ns)) { + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + uint64_t slba = le64_to_cpu(rw->slba); + zone = nvme_ns_zone(ns, slba); + } + if (!ret) { block_acct_done(stats, acct); + + if (zone) { + switch (req->cmd.opcode) { + case NVME_CMD_WRITE: + case NVME_CMD_WRITE_ZEROES: + nvme_zns_advance_wp(req); + default: + break; + } + } } else { block_acct_failed(stats, acct); - nvme_aio_err(req, ret); + nvme_aio_err(req, ret, zone); } nvme_enqueue_req_completion(nvme_cq(req), req); } +struct nvme_discard_ctx { + NvmeRequest *req; + uint64_t slba; +}; + static void nvme_aio_discard_cb(void *opaque, int ret) { - NvmeRequest *req = opaque; + struct nvme_discard_ctx *ctx = opaque; + NvmeRequest *req = ctx->req; + NvmeNamespace *ns = req->ns; uintptr_t *discards = (uintptr_t *)&req->opaque; trace_pci_nvme_aio_discard_cb(nvme_cid(req)); if (ret) { - nvme_aio_err(req, ret); + NvmeZone *zone = NULL; + if (nvme_ns_zoned(ns)) { + zone = nvme_ns_zone(ns, ctx->slba); + } + + nvme_aio_err(req, ret, zone); } (*discards)--; @@ -1009,21 +1250,38 @@ struct nvme_copy_ctx { struct nvme_copy_in_ctx { NvmeRequest *req; QEMUIOVector iov; + uint64_t slba; }; static void nvme_copy_cb(void *opaque, int ret) { NvmeRequest *req = opaque; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; struct nvme_copy_ctx *ctx = req->opaque; trace_pci_nvme_copy_cb(nvme_cid(req)); + if (nvme_ns_zoned(ns)) { + NvmeCopyCmd *copy = (NvmeCopyCmd *)&req->cmd; + uint64_t sdlba = le64_to_cpu(copy->sdlba); + zone = nvme_ns_zone(ns, sdlba); + } + if (!ret) { block_acct_done(blk_get_stats(ns->blkconf.blk), &req->acct); + + if (zone) { + uint16_t status; + + status = 
__nvme_zns_advance_wp(ns, zone, ctx->nlb); + if (status) { + req->status = status; + } + } } else { block_acct_failed(blk_get_stats(ns->blkconf.blk), &req->acct); - nvme_aio_err(req, ret); + nvme_aio_err(req, ret, zone); } g_free(ctx->bounce); @@ -1048,14 +1306,32 @@ static void nvme_copy_in_complete(NvmeRequest *req) if (status) { trace_pci_nvme_err_invalid_lba_range(sdlba, ctx->nlb, nvme_ns_nsze(ns)); - req->status = status; + goto invalid; + } - g_free(ctx->bounce); - g_free(ctx); + if (nvme_ns_zoned(ns)) { + NvmeZone *zone = nvme_ns_zone(ns, sdlba); + assert(zone); - nvme_enqueue_req_completion(nvme_cq(req), req); + status = nvme_check_zone_write(sdlba, ctx->nlb, zone); + if (status) { + goto invalid; + } - return; + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + break; + default: + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSIO); + if (status) { + goto invalid; + } + + break; + } + + zone->wp_staging += ctx->nlb; } qemu_iovec_init(&req->iov, 1); @@ -1066,6 +1342,16 @@ static void nvme_copy_in_complete(NvmeRequest *req) req->aiocb = blk_aio_pwritev(ns->blkconf.blk, nvme_l2b(ns, sdlba), &req->iov, 0, nvme_copy_cb, req); + + return; + +invalid: + req->status = status; + + g_free(ctx->bounce); + g_free(ctx); + + nvme_enqueue_req_completion(nvme_cq(req), req); } static void nvme_aio_copy_in_cb(void *opaque, int ret) @@ -1073,17 +1359,22 @@ static void nvme_aio_copy_in_cb(void *opaque, int ret) struct nvme_copy_in_ctx *in_ctx = opaque; NvmeRequest *req = in_ctx->req; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; struct nvme_copy_ctx *ctx = req->opaque; - qemu_iovec_destroy(&in_ctx->iov); - g_free(in_ctx); - trace_pci_nvme_aio_copy_in_cb(nvme_cid(req)); if (ret) { - nvme_aio_err(req, ret); + if (nvme_ns_zoned(ns)) { + zone = nvme_ns_zone(ns, in_ctx->slba); + } + + nvme_aio_err(req, ret, zone); } + qemu_iovec_destroy(&in_ctx->iov); + g_free(in_ctx); + ctx->copies--; if (ctx->copies) { @@ -1114,6 +1405,7 @@ static void 
nvme_compare_cb(void *opaque, int ret) { NvmeRequest *req = opaque; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; struct nvme_compare_ctx *ctx = req->opaque; g_autofree uint8_t *buf = NULL; uint16_t status; @@ -1123,8 +1415,13 @@ static void nvme_compare_cb(void *opaque, int ret) if (!ret) { block_acct_done(blk_get_stats(ns->blkconf.blk), &req->acct); } else { + if (nvme_ns_zoned(ns)) { + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + zone = nvme_ns_zone(ns, le64_to_cpu(rw->slba)); + } + block_acct_failed(blk_get_stats(ns->blkconf.blk), &req->acct); - nvme_aio_err(req, ret); + nvme_aio_err(req, ret, zone); goto out; } @@ -1198,11 +1495,16 @@ static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req) while (len) { size_t bytes = MIN(BDRV_REQUEST_MAX_BYTES, len); + struct nvme_discard_ctx *ctx; + + ctx = g_new0(struct nvme_discard_ctx, 1); + ctx->req = req; + ctx->slba = slba; (*discards)++; blk_aio_pdiscard(ns->blkconf.blk, offset, bytes, - nvme_aio_discard_cb, req); + nvme_aio_discard_cb, ctx); offset += bytes; len -= bytes; @@ -1289,6 +1591,16 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req) goto free_bounce; } } + + if (nvme_ns_zoned(ns)) { + NvmeZone *zone = nvme_ns_zone(ns, slba); + assert(zone); + + status = nvme_check_zone_read(ns, slba, nlb, zone); + if (status) { + goto free_bounce; + } + } } block_acct_start(blk_get_stats(ns->blkconf.blk), &req->acct, @@ -1313,6 +1625,7 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req) struct nvme_copy_in_ctx *in_ctx = g_new(struct nvme_copy_in_ctx, 1); in_ctx->req = req; + in_ctx->slba = slba; qemu_iovec_init(&in_ctx->iov, 1); qemu_iovec_add(&in_ctx->iov, bouncep, len); @@ -1374,6 +1687,17 @@ static uint16_t nvme_compare(NvmeCtrl *n, NvmeRequest *req) } } + if (nvme_ns_zoned(ns)) { + NvmeZone *zone = nvme_ns_zone(ns, slba); + assert(zone); + + status = nvme_check_zone_read(ns, slba, nlb, zone); + if (status) { + return status; + } + } + + bounce = g_malloc(len); ctx = g_new(struct nvme_compare_ctx, 
1); @@ -1424,6 +1748,16 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req) goto invalid; } + if (nvme_ns_zoned(ns)) { + NvmeZone *zone = nvme_ns_zone(ns, slba); + assert(zone); + + status = nvme_check_zone_read(ns, slba, nlb, zone); + if (status) { + goto invalid; + } + } + if (NVME_ERR_REC_DULBE(ns->features.err_rec)) { status = nvme_check_dulbe(ns, slba, nlb); if (status) { @@ -1483,6 +1817,31 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req) goto invalid; } + if (nvme_ns_zoned(ns)) { + NvmeZone *zone = nvme_ns_zone(ns, slba); + assert(zone); + + status = nvme_check_zone_write(slba, nlb, zone); + if (status) { + goto invalid; + } + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + break; + default: + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSIO); + if (status) { + goto invalid; + } + + break; + } + + zone->wp_staging += nlb; + } + data_offset = nvme_l2b(ns, slba); if (!wrz) { @@ -1841,6 +2200,7 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, } switch (csi) { + case NVME_IOCS_ZONED: case NVME_IOCS_NVM: nvme_effects_nvm(&effects); break; @@ -2716,6 +3076,23 @@ static void nvme_ctrl_shutdown(NvmeCtrl *n) } nvme_ns_flush(ns); + + if (nvme_ns_zoned(ns)) { + for (int i = 0; i < ns->zns.num_zones; i++) { + NvmeZone *zone = &ns->zns.zones[i]; + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + + /* fallthrough */ + + default: + break; + } + } + } } } @@ -3287,7 +3664,8 @@ static void nvme_init_state(NvmeCtrl *n) n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING; n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1); - n->iocscs[0] = 1 << NVME_IOCS_NVM; + n->iocscs[0] = (1 << NVME_IOCS_NVM) | (1 << NVME_IOCS_ZONED); + n->iocscs[1] = 1 << NVME_IOCS_NVM; n->features.iocsci = 0; } @@ -3456,6 +3834,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) 
uint8_t *pci_conf = pci_dev->config; char *subnqn; + n->id_ctrl_iocss[NVME_IOCS_NVM] = g_new0(NvmeIdCtrl, 1); + n->id_ctrl_iocss[NVME_IOCS_ZONED] = g_new0(NvmeIdCtrl, 1); + id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID)); id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID)); strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' '); diff --git a/hw/block/trace-events b/hw/block/trace-events index 1f1aef719301..8b4533f99000 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -82,6 +82,8 @@ pci_nvme_enqueue_event_noqueue(int queued) "queued %d" pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8"" pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs" pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16"" +pci_nvme_zrm_transition(uint32_t nsid, uint64_t zslba, const char *s_from, uint8_t from, const char *s_to, uint8_t to) "nsid %"PRIu32" zslba 0x%"PRIx64" from '%s' (%"PRIu8") to '%s' (%"PRIu8")" +pci_nvme_zns_advance_wp(uint32_t nsid, uint64_t zslba, uint64_t wp_orig, uint32_t nlb) "nsid 0x%"PRIx32" zslba 0x%"PRIx64" wp_orig 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64"" pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16"" @@ -107,6 +109,11 @@ pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_cfs(void) "controller fatal status" pci_nvme_err_aio(uint16_t cid, const char *errname, uint16_t status) "cid %"PRIu16" err '%s' status 0x%"PRIx16"" pci_nvme_err_copy_invalid_format(uint8_t format) "format 0x%"PRIx8"" +pci_nvme_err_zone_is_full(uint64_t zslba) "zslba 0x%"PRIx64"" +pci_nvme_err_zone_is_read_only(uint64_t zslba) "zslba 0x%"PRIx64"" +pci_nvme_err_zone_is_offline(uint64_t zslba) "zslba 0x%"PRIx64"" 
+pci_nvme_err_zone_invalid_write(uint64_t slba, uint64_t wp) "lba 0x%"PRIx64" wp 0x%"PRIx64""
+pci_nvme_err_zone_boundary(uint64_t slba, uint32_t nlb, uint64_t zcap) "lba 0x%"PRIx64" nlb %"PRIu32" zcap 0x%"PRIx64""
 pci_nvme_err_invalid_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8""
 pci_nvme_err_invalid_num_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8""
 pci_nvme_err_invalid_sgl_excess_length(uint16_t cid) "cid %"PRIu16""
@@ -133,6 +140,7 @@ pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx1
 pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
 pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
 pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
+pci_nvme_err_invalid_zone_state(uint64_t zslba, const char *zs_str, uint8_t zs) "zslba 0x%"PRIx64" zs '%s' (%"PRIu8")"
 pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
 pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
 pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"

From patchwork Thu Nov 26 23:45:56 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934929
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 07/12] hw/block/nvme: add the zone management receive command
Date: Fri, 27 Nov 2020 00:45:56 +0100
Message-Id: <20201126234601.689714-8-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Klaus Jensen, Max Reitz,
 Keith Busch, Stefan Hajnoczi, Klaus Jensen

Add the Zone Management Receive command.

Signed-off-by: Klaus Jensen
---
 hw/block/nvme-ns.h    |   8 +++
 hw/block/nvme.h       |   1 +
 include/block/nvme.h  |  46 +++++++++++++
 hw/block/nvme-ns.c    |   8 +++
 hw/block/nvme.c       | 150 ++++++++++++++++++++++++++++++++++++++++++
 hw/block/trace-events |   1 +
 6 files changed, 214 insertions(+)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index e373d62c5873..6370ef1a162b 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -31,11 +31,13 @@ typedef struct NvmeNamespaceParams {
     struct {
         uint64_t zcap;
         uint64_t zsze;
+        uint8_t  zdes;
     } zns;
 } NvmeNamespaceParams;
 
 typedef struct NvmeZone {
     NvmeZoneDescriptor *zd;
+    uint8_t            *zde;
 
     uint64_t wp_staging;
 } NvmeZone;
@@ -59,6 +61,7 @@ typedef struct NvmeNamespace {
 
         NvmeZone *zones;
         NvmeZoneDescriptor *zd;
+        uint8_t *zde;
     } zns;
 } NvmeNamespace;
 
@@ -99,6 +102,11 @@ static inline uint64_t nvme_ns_zsze(NvmeNamespace *ns)
     return nvme_ns_lbafe(ns)->zsze;
 }
 
+static inline size_t nvme_ns_zdes_bytes(NvmeNamespace *ns)
+{
+    return ns->params.zns.zdes << 6;
+}
+
 /* calculate the number of LBAs that the namespace can accommodate */
 static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns)
 {
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index b1616ba79733..97f9f543c9dd
100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -63,6 +63,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc) case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES"; case NVME_CMD_DSM: return "NVME_NVM_CMD_DSM"; case NVME_CMD_COPY: return "NVME_NVM_CMD_COPY"; + case NVME_CMD_ZONE_MGMT_RECV: return "NVME_ZONED_CMD_ZONE_MGMT_RECV"; default: return "NVME_NVM_CMD_UNKNOWN"; } } diff --git a/include/block/nvme.h b/include/block/nvme.h index 6a5616bb9304..e000e79bb12b 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -485,6 +485,7 @@ enum NvmeIoCommands { NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, NVME_CMD_COPY = 0x19, + NVME_CMD_ZONE_MGMT_RECV = 0x7a, }; typedef struct QEMU_PACKED NvmeDeleteQ { @@ -597,6 +598,44 @@ enum { NVME_RW_PRINFO_PRCHK_REF = 1 << 10, }; +typedef struct QEMU_PACKED NvmeZoneMgmtRecvCmd { + uint8_t opcode; + uint8_t flags; + uint16_t cid; + uint32_t nsid; + uint8_t rsvd8[16]; + NvmeCmdDptr dptr; + uint64_t slba; + uint32_t numdw; + uint8_t zra; + uint8_t zrasf; + uint8_t pr; + uint8_t rsvd55[9]; +} NvmeZoneMgmtRecvCmd; + +enum { + NVME_ZMR_REPORT = 0x0, + NVME_ZMR_EXTENDED_REPORT = 0x1, + + NVME_ZMR_PARTIAL = 0x1, +}; + +enum { + NVME_ZMR_LIST_ALL = 0x0, + NVME_ZMR_LIST_ZSE = 0x1, + NVME_ZMR_LIST_ZSIO = 0x2, + NVME_ZMR_LIST_ZSEO = 0x3, + NVME_ZMR_LIST_ZSC = 0x4, + NVME_ZMR_LIST_ZSF = 0x5, + NVME_ZMR_LIST_ZSRO = 0x6, + NVME_ZMR_LIST_ZSO = 0x7, +}; + +typedef struct QEMU_PACKED NvmeZoneReportHeader { + uint64_t num_zones; + uint8_t rsvd[56]; +} NvmeZoneReportHeader; + typedef struct QEMU_PACKED NvmeDsmCmd { uint8_t opcode; uint8_t flags; @@ -846,6 +885,12 @@ typedef struct QEMU_PACKED NvmeZoneDescriptor { uint8_t rsvd32[32]; } NvmeZoneDescriptor; +#define NVME_ZA_ZDEV (1 << 7) + +#define NVME_ZA_SET(za, attrs) ((za) |= (attrs)) +#define NVME_ZA_CLEAR(za, attrs) ((za) &= ~(attrs)) +#define NVME_ZA_CLEAR_ALL(za) ((za) = 0x0) + enum NvmeSmartWarn { NVME_SMART_SPARE = 1 << 0, NVME_SMART_TEMPERATURE = 1 << 1, @@ 
-1212,6 +1257,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeRwCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeCopyCmd) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneMgmtRecvCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRangeType) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 1f3d0644ba42..f2e8ee80b606 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -61,6 +61,9 @@ static void nvme_ns_zns_init_zones(NvmeNamespace *ns) zone = &ns->zns.zones[i]; zone->zd = &ns->zns.zd[i]; + if (ns->params.zns.zdes) { + zone->zde = &ns->zns.zde[i]; + } zone->wp_staging = zslba; zd = zone->zd; @@ -81,11 +84,15 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns) id_ns_zns->lbafe[i].zsze = ns->params.zns.zsze ? cpu_to_le64(ns->params.zns.zsze) : cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + id_ns_zns->lbafe[i].zdes = ns->params.zns.zdes; } ns->zns.num_zones = nvme_ns_nlbas(ns) / nvme_ns_zsze(ns); ns->zns.zones = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZone)); ns->zns.zd = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZoneDescriptor)); + if (ns->params.zns.zdes) { + ns->zns.zde = g_malloc0_n(ns->zns.num_zones, nvme_ns_zdes_bytes(ns)); + } id_ns_zns->mar = 0xffffffff; id_ns_zns->mor = 0xffffffff; @@ -259,6 +266,7 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 255), DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0), DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0), + DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 60a467d5df62..8dc6b565a4a0 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1012,6 +1012,7 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, NvmeZoneState to) { NvmeZoneState from = 
nvme_zs(zone); + NvmeZoneDescriptor *zd = zone->zd; trace_pci_nvme_zrm_transition(ns->params.nsid, nvme_zslba(zone), nvme_zs_to_str(from), from, @@ -1030,6 +1031,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, switch (to) { case NVME_ZS_ZSE: case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zd->za); + + /* fallthrough */ + case NVME_ZS_ZSEO: case NVME_ZS_ZSF: case NVME_ZS_ZSRO: @@ -1046,6 +1051,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, switch (to) { case NVME_ZS_ZSE: case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zd->za); + + /* fallthrough */ + case NVME_ZS_ZSF: case NVME_ZS_ZSRO: case NVME_ZS_ZSIO: @@ -1061,6 +1070,7 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSRO: switch (to) { case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zd->za); break; default: @@ -1073,6 +1083,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, switch (to) { case NVME_ZS_ZSE: case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zd->za); + + /* fallthrough */ + case NVME_ZS_ZSRO: break; @@ -1446,6 +1460,123 @@ out: nvme_enqueue_req_completion(nvme_cq(req), req); } +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeZoneMgmtRecvCmd *recv = (NvmeZoneMgmtRecvCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + NvmeZone *zone; + uint8_t zra = recv->zra; + uint8_t zrasf = recv->zrasf; + uint8_t pr = recv->pr & 0x1; + uint64_t slba = le64_to_cpu(recv->slba); + size_t len = (le32_to_cpu(recv->numdw) + 1) << 2; + int num_zones = 0, zidx = 0, zidx_begin, i; + uint16_t zes, status; + uint8_t *buf, *bufp, zs_list; + + if (!nvme_ns_zoned(ns)) { + return NVME_INVALID_OPCODE | NVME_DNR; + } + + trace_pci_nvme_zone_mgmt_recv(nvme_cid(req), nvme_nsid(ns), slba, len, + zra, zrasf, pr); + + if (!(len && nvme_ns_zone(ns, slba))) { + return NVME_SUCCESS; + } + + status = nvme_check_mdts(n, len); + if (status) { + return status; + } + + switch (zrasf) { + case NVME_ZMR_LIST_ALL: + zs_list = 0; + break; + 
+ case NVME_ZMR_LIST_ZSE: + zs_list = NVME_ZS_ZSE; + break; + + case NVME_ZMR_LIST_ZSIO: + zs_list = NVME_ZS_ZSIO; + break; + + case NVME_ZMR_LIST_ZSEO: + zs_list = NVME_ZS_ZSEO; + break; + + case NVME_ZMR_LIST_ZSC: + zs_list = NVME_ZS_ZSC; + break; + + case NVME_ZMR_LIST_ZSF: + zs_list = NVME_ZS_ZSF; + break; + + case NVME_ZMR_LIST_ZSRO: + zs_list = NVME_ZS_ZSRO; + break; + + case NVME_ZMR_LIST_ZSO: + zs_list = NVME_ZS_ZSO; + break; + default: + return NVME_INVALID_FIELD | NVME_DNR; + } + + zidx_begin = zidx = slba / nvme_ns_zsze(ns); + zes = sizeof(NvmeZoneDescriptor); + if (zra == NVME_ZMR_EXTENDED_REPORT) { + zes += nvme_ns_zdes_bytes(ns); + } + + buf = bufp = g_malloc0(len); + bufp += sizeof(NvmeZoneReportHeader); + + while ((bufp + zes) - buf <= len && zidx < ns->zns.num_zones) { + zone = &ns->zns.zones[zidx++]; + + if (zs_list && zs_list != nvme_zs(zone)) { + continue; + } + + num_zones++; + + memcpy(bufp, zone->zd, sizeof(NvmeZoneDescriptor)); + + if (zra == NVME_ZMR_EXTENDED_REPORT) { + memcpy(bufp + sizeof(NvmeZoneDescriptor), zone->zde, + nvme_ns_zdes_bytes(ns)); + } + + bufp += zes; + } + + if (!(pr & NVME_ZMR_PARTIAL)) { + if (!zs_list) { + num_zones = ns->zns.num_zones - zidx_begin; + } else { + num_zones = 0; + for (i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + if (zs_list == nvme_zs(zone)) { + num_zones++; + } + } + } + } + + stq_le_p(buf, (uint64_t)num_zones); + + status = nvme_dma(n, buf, len, DMA_DIRECTION_FROM_DEVICE, req); + g_free(buf); + + return status; +} + static uint16_t nvme_dsm(NvmeCtrl *n, NvmeRequest *req) { NvmeNamespace *ns = req->ns; @@ -1907,6 +2038,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) return nvme_dsm(n, req); case NVME_CMD_COPY: return nvme_copy(n, req); + case NVME_CMD_ZONE_MGMT_RECV: + return nvme_zone_mgmt_recv(n, req); default: trace_pci_nvme_err_invalid_opc(req->cmd.opcode); return NVME_INVALID_OPCODE | NVME_DNR; @@ -2158,6 +2291,11 @@ static void 
nvme_effects_nvm(NvmeEffectsLog *effects) effects->iocs[NVME_CMD_COPY] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC; } +static void nvme_effects_zoned(NvmeEffectsLog *effects) +{ + effects->iocs[NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP; +} + static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, NvmeRequest *req) { @@ -2201,6 +2339,10 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, switch (csi) { case NVME_IOCS_ZONED: + nvme_effects_zoned(&effects); + + /* fallthrough */ + case NVME_IOCS_NVM: nvme_effects_nvm(&effects); break; @@ -3088,6 +3230,14 @@ static void nvme_ctrl_shutdown(NvmeCtrl *n) /* fallthrough */ + case NVME_ZS_ZSC: + if (nvme_wp(zone) == nvme_zslba(zone) && + !(zone->zd->za & NVME_ZA_ZDEV)) { + nvme_zrm_transition(ns, zone, NVME_ZS_ZSE); + } + + /* fallthrough */ + default: break; } diff --git a/hw/block/trace-events b/hw/block/trace-events index 8b4533f99000..429b4849d2dc 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -54,6 +54,7 @@ pci_nvme_compare(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid pci_nvme_compare_cb(uint16_t cid) "cid %"PRIu16"" pci_nvme_aio_discard_cb(uint16_t cid) "cid %"PRIu16"" pci_nvme_aio_copy_in_cb(uint16_t cid) "cid %"PRIu16"" +pci_nvme_zone_mgmt_recv(uint16_t cid, uint32_t nsid, uint64_t slba, uint64_t len, uint8_t zra, uint8_t zfeat, uint8_t zflags) "cid %"PRIu16" nsid %"PRIu32" slba 0x%"PRIx64" len %"PRIu64" zra 0x%"PRIx8" zrasf 0x%"PRIx8" pr 0x%"PRIx8"" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" pci_nvme_del_sq(uint16_t qid) "deleting submission 
queue sqid=%"PRIu16""

From patchwork Thu Nov 26 23:45:57 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934945
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 08/12] hw/block/nvme: add the zone management send command
Date: Fri, 27 Nov 2020 00:45:57 +0100
Message-Id: <20201126234601.689714-9-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Klaus Jensen, Max Reitz,
 Keith Busch, Stefan Hajnoczi, Klaus Jensen

Add the Zone Management Send command.

The spec specifies that "All logical blocks in a zone *shall* be marked as
deallocated when [the zone is reset]". Since the device guarantees that
deallocated blocks read back as 0x00, we have to issue a pwrite_zeroes; we
cannot be sure that a discard actually does anything.
Typically, this will be converted to an unmap/discard operation, but if the underlying blockdev doesn't support that a reset can be pretty expensive. Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 1 + include/block/nvme.h | 30 ++++ hw/block/nvme.c | 383 ++++++++++++++++++++++++++++++++++++++++++ hw/block/trace-events | 12 ++ 4 files changed, 426 insertions(+) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 97f9f543c9dd..0cf3b303e34e 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -63,6 +63,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc) case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES"; case NVME_CMD_DSM: return "NVME_NVM_CMD_DSM"; case NVME_CMD_COPY: return "NVME_NVM_CMD_COPY"; + case NVME_CMD_ZONE_MGMT_SEND: return "NVME_ZONED_CMD_ZONE_MGMT_SEND"; case NVME_CMD_ZONE_MGMT_RECV: return "NVME_ZONED_CMD_ZONE_MGMT_RECV"; default: return "NVME_NVM_CMD_UNKNOWN"; } diff --git a/include/block/nvme.h b/include/block/nvme.h index e000e79bb12b..4c2b6fbb799a 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -485,6 +485,7 @@ enum NvmeIoCommands { NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, NVME_CMD_COPY = 0x19, + NVME_CMD_ZONE_MGMT_SEND = 0x79, NVME_CMD_ZONE_MGMT_RECV = 0x7a, }; @@ -598,6 +599,34 @@ enum { NVME_RW_PRINFO_PRCHK_REF = 1 << 10, }; +typedef struct QEMU_PACKED NvmeZoneMgmtSendCmd { + uint8_t opcode; + uint8_t flags; + uint16_t cid; + uint32_t nsid; + uint32_t rsvd8[4]; + NvmeCmdDptr dptr; + uint64_t slba; + uint32_t rsvd48; + uint8_t zsa; + uint8_t select_all; + uint8_t rsvd54[2]; + uint32_t rsvd56[2]; +} NvmeZoneMgmtSendCmd; + +enum { + NVME_ZMS_SELECT_ALL = 0x1, +}; + +enum { + NVME_ZMS_CLOSE = 0x1, + NVME_ZMS_FINISH = 0x2, + NVME_ZMS_OPEN = 0x3, + NVME_ZMS_RESET = 0x4, + NVME_ZMS_OFFLINE = 0x5, + NVME_ZMS_SET_ZDE = 0x10, +}; + typedef struct QEMU_PACKED NvmeZoneMgmtRecvCmd { uint8_t opcode; uint8_t flags; @@ -1257,6 +1286,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeRwCmd) 
!= 64); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeCopyCmd) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneMgmtSendCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeZoneMgmtRecvCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRangeType) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 8dc6b565a4a0..f0f4d72266bf 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1008,6 +1008,12 @@ static uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba, return NVME_SUCCESS; } +static inline void nvme_zone_reset_wp(NvmeZone *zone) +{ + zone->zd->wp = zone->zd->zslba; + zone->wp_staging = nvme_zslba(zone); +} + static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, NvmeZoneState to) { @@ -1030,6 +1036,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSEO: switch (to) { case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + case NVME_ZS_ZSO: NVME_ZA_CLEAR_ALL(zd->za); @@ -1050,6 +1060,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSC: switch (to) { case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + case NVME_ZS_ZSO: NVME_ZA_CLEAR_ALL(zd->za); @@ -1082,6 +1096,10 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: switch (to) { case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + case NVME_ZS_ZSO: NVME_ZA_CLEAR_ALL(zd->za); @@ -1460,6 +1478,367 @@ out: nvme_enqueue_req_completion(nvme_cq(req), req); } +struct nvme_zone_reset_ctx { + NvmeRequest *req; + NvmeZone *zone; +}; + +static void nvme_aio_zone_reset_cb(void *opaque, int ret) +{ + struct nvme_zone_reset_ctx *ctx = opaque; + NvmeRequest *req = ctx->req; + NvmeZone *zone = ctx->zone; + uintptr_t *resets = (uintptr_t *)&req->opaque; + + g_free(ctx); + + trace_pci_nvme_aio_zone_reset_cb(nvme_cid(req), nvme_zslba(zone)); + + if (!ret) { + 
nvme_zrm_transition(req->ns, zone, NVME_ZS_ZSE); + } else { + nvme_aio_err(req, ret, zone); + } + + (*resets)--; + + if (*resets) { + return; + } + + nvme_enqueue_req_completion(nvme_cq(req), req); +} + +static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_close(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSC: + return NVME_SUCCESS; + + case NVME_ZS_ZSE: + /* + * The state machine in nvme_zrm_transition allows zones to transition + * from ZSE to ZSC. That transition is only valid if done as part of Set + * Zone Descriptor Extension, so do an early check here. + */ + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + + default: + break; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + if (status) { + return status; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_finish(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_finish(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + if (nvme_zs(zone) == NVME_ZS_ZSF) { + return NVME_SUCCESS; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSF); + if (status) { + return status; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_open(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_open(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + if (nvme_zs(zone) == NVME_ZS_ZSEO) { + return NVME_SUCCESS; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSEO); + if (status) { + return status; + } + + return NVME_SUCCESS; +} + +static void __nvme_zone_mgmt_send_reset(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint64_t zslba =
nvme_zslba(zone);
+    uint64_t zsze = nvme_ns_zsze(ns);
+    uintptr_t *resets = (uintptr_t *)&req->opaque;
+    struct nvme_zone_reset_ctx *ctx = g_new(struct nvme_zone_reset_ctx, 1);
+
+    trace_pci_nvme_zone_mgmt_send_reset(nvme_cid(req), nvme_nsid(ns),
+                                        nvme_zslba(zone), nvme_zs_str(zone));
+
+    /*
+     * The zone reset callback needs to know the zone that is being reset in
+     * order to transition the zone.
+     */
+    ctx->req = req;
+    ctx->zone = zone;
+
+    (*resets)++;
+
+    blk_aio_pwrite_zeroes(ns->blkconf.blk, nvme_l2b(ns, zslba),
+                          nvme_l2b(ns, zsze), BDRV_REQ_MAY_UNMAP,
+                          nvme_aio_zone_reset_cb, ctx);
+}
+
+static uint16_t nvme_zone_mgmt_send_reset(NvmeCtrl *n, NvmeRequest *req,
+                                          NvmeZone *zone)
+{
+    uintptr_t *resets = (uintptr_t *)&req->opaque;
+
+    *resets = 1;
+
+    __nvme_zone_mgmt_send_reset(n, req, zone);
+
+    (*resets)--;
+
+    return *resets ? NVME_NO_COMPLETE : req->status;
+}
+
+static uint16_t nvme_zone_mgmt_send_offline(NvmeCtrl *n, NvmeRequest *req,
+                                            NvmeZone *zone)
+{
+    NvmeNamespace *ns = req->ns;
+
+    trace_pci_nvme_zone_mgmt_send_offline(nvme_cid(req), nvme_nsid(ns),
+                                          nvme_zslba(zone), nvme_zs_str(zone));
+
+    switch (nvme_zs(zone)) {
+    case NVME_ZS_ZSRO:
+        nvme_zrm_transition(ns, zone, NVME_ZS_ZSO);
+
+        /* fallthrough */
+
+    case NVME_ZS_ZSO:
+        return NVME_SUCCESS;
+
+    default:
+        break;
+    }
+
+    return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+}
+
+static uint16_t nvme_zone_mgmt_send_set_zde(NvmeCtrl *n, NvmeRequest *req,
+                                            NvmeZone *zone)
+{
+    NvmeNamespace *ns = req->ns;
+    uint16_t status;
+
+    trace_pci_nvme_zone_mgmt_send_set_zde(nvme_cid(req), nvme_nsid(ns),
+                                          nvme_zslba(zone), nvme_zs_str(zone));
+
+    if (nvme_zs(zone) != NVME_ZS_ZSE) {
+        trace_pci_nvme_err_invalid_zone_state(nvme_zslba(zone),
+                                              nvme_zs_str(zone),
+                                              nvme_zs(zone));
+        return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+    }
+
+    status = nvme_check_mdts(n, nvme_ns_zdes_bytes(ns));
+    if (status) {
+        return status;
+    }
+
+    status = nvme_dma(n, zone->zde, nvme_ns_zdes_bytes(ns),
+                      DMA_DIRECTION_TO_DEVICE, req);
+    if (status) {
+        return status;
+    }
+
+    status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC);
+    if (status) {
+        return status;
+    }
+
+    NVME_ZA_SET(zone->zd->za, NVME_ZA_ZDEV);
+
+    return NVME_SUCCESS;
+}
+
+static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeNamespace *ns,
+                                        uint8_t zsa, NvmeRequest *req)
+{
+    NvmeZone *zone;
+    uintptr_t *resets = (uintptr_t *)&req->opaque;
+    uint16_t status = NVME_SUCCESS;
+
+    trace_pci_nvme_zone_mgmt_send_all(nvme_cid(req), nvme_nsid(ns), zsa);
+
+    switch (zsa) {
+    case NVME_ZMS_SET_ZDE:
+        return NVME_INVALID_FIELD | NVME_DNR;
+
+    case NVME_ZMS_CLOSE:
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            switch (nvme_zs(zone)) {
+            case NVME_ZS_ZSIO:
+            case NVME_ZS_ZSEO:
+                status = nvme_zone_mgmt_send_close(n, req, zone);
+                if (status) {
+                    return status;
+                }
+
+            default:
+                continue;
+            }
+        }
+
+        break;
+
+    case NVME_ZMS_FINISH:
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            switch (nvme_zs(zone)) {
+            case NVME_ZS_ZSIO:
+            case NVME_ZS_ZSEO:
+            case NVME_ZS_ZSC:
+                status = nvme_zone_mgmt_send_finish(n, req, zone);
+                if (status) {
+                    return status;
+                }
+
+            default:
+                continue;
+            }
+        }
+
+        break;
+
+    case NVME_ZMS_OPEN:
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            if (nvme_zs(zone) == NVME_ZS_ZSC) {
+                status = nvme_zone_mgmt_send_open(n, req, zone);
+                if (status) {
+                    return status;
+                }
+            }
+        }
+
+        break;
+
+    case NVME_ZMS_RESET:
+        *resets = 1;
+
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            switch (nvme_zs(zone)) {
+            case NVME_ZS_ZSIO:
+            case NVME_ZS_ZSEO:
+            case NVME_ZS_ZSC:
+            case NVME_ZS_ZSF:
+                __nvme_zone_mgmt_send_reset(n, req, zone);
+            default:
+                continue;
+            }
+        }
+
+        (*resets)--;
+
+        return *resets ? NVME_NO_COMPLETE : req->status;
+
+    case NVME_ZMS_OFFLINE:
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            if (nvme_zs(zone) == NVME_ZS_ZSRO) {
+                status = nvme_zone_mgmt_send_offline(n, req, zone);
+                if (status) {
+                    return status;
+                }
+            }
+        }
+
+        break;
+    }
+
+    return status;
+}
+
+static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeZoneMgmtSendCmd *send = (NvmeZoneMgmtSendCmd *)&req->cmd;
+    NvmeNamespace *ns = req->ns;
+    NvmeZone *zone;
+    uint8_t zsa = send->zsa;
+    uint8_t select_all = send->select_all & 0x1;
+    uint64_t zslba = le64_to_cpu(send->slba);
+
+    if (!nvme_ns_zoned(ns)) {
+        return NVME_INVALID_OPCODE | NVME_DNR;
+    }
+
+    trace_pci_nvme_zone_mgmt_send(nvme_cid(req), ns->params.nsid, zslba, zsa,
+                                  select_all);
+
+    if (select_all) {
+        return nvme_zone_mgmt_send_all(n, ns, zsa, req);
+    }
+
+    zone = nvme_ns_zone(ns, zslba);
+    if (!zone) {
+        trace_pci_nvme_err_invalid_zone(zslba);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    if (zslba != nvme_zslba(zone)) {
+        trace_pci_nvme_err_invalid_zslba(zslba);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    switch (zsa) {
+    case NVME_ZMS_CLOSE:
+        return nvme_zone_mgmt_send_close(n, req, zone);
+    case NVME_ZMS_FINISH:
+        return nvme_zone_mgmt_send_finish(n, req, zone);
+    case NVME_ZMS_OPEN:
+        return nvme_zone_mgmt_send_open(n, req, zone);
+    case NVME_ZMS_RESET:
+        return nvme_zone_mgmt_send_reset(n, req, zone);
+    case NVME_ZMS_OFFLINE:
+        return nvme_zone_mgmt_send_offline(n, req, zone);
+    case NVME_ZMS_SET_ZDE:
+        return nvme_zone_mgmt_send_set_zde(n, req, zone);
+    }
+
+    return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeZoneMgmtRecvCmd *recv = (NvmeZoneMgmtRecvCmd *)&req->cmd;
@@ -2038,6 +2417,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
         return nvme_dsm(n, req);
     case NVME_CMD_COPY:
         return nvme_copy(n, req);
+    case NVME_CMD_ZONE_MGMT_SEND:
+        return nvme_zone_mgmt_send(n, req);
     case NVME_CMD_ZONE_MGMT_RECV:
         return nvme_zone_mgmt_recv(n, req);
     default:
@@ -2294,6 +2675,8 @@ static void nvme_effects_nvm(NvmeEffectsLog *effects)
 static void nvme_effects_zoned(NvmeEffectsLog *effects)
 {
     effects->iocs[NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP;
+    effects->iocs[NVME_CMD_ZONE_MGMT_SEND] = NVME_EFFECTS_CSUPP |
+                                             NVME_EFFECTS_LBCC;
 }

 static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 429b4849d2dc..f62dfda279cd 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -54,6 +54,16 @@ pci_nvme_compare(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid
 pci_nvme_compare_cb(uint16_t cid) "cid %"PRIu16""
 pci_nvme_aio_discard_cb(uint16_t cid) "cid %"PRIu16""
 pci_nvme_aio_copy_in_cb(uint16_t cid) "cid %"PRIu16""
+pci_nvme_aio_zone_reset_cb(uint16_t cid, uint64_t zslba) "cid %"PRIu16" zslba 0x%"PRIx64""
+pci_nvme_zone_mgmt_send(uint16_t cid, uint32_t nsid, uint64_t zslba, uint8_t zsa, uint8_t select_all) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zsa 0x%"PRIx8" select_all 0x%"PRIx8""
+pci_nvme_zone_mgmt_send_all(uint16_t cid, uint32_t nsid, uint8_t za) "cid %"PRIu16" nsid %"PRIu32" za 0x%"PRIx8""
+pci_nvme_zone_mgmt_send_close(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
+pci_nvme_zone_mgmt_send_finish(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
+pci_nvme_zone_mgmt_send_open(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
+pci_nvme_zone_mgmt_send_reset(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
+pci_nvme_zone_mgmt_send_reset_cb(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32""
+pci_nvme_zone_mgmt_send_offline(uint16_t cid, uint32_t nsid,
uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
+pci_nvme_zone_mgmt_send_set_zde(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\""
 pci_nvme_zone_mgmt_recv(uint16_t cid, uint32_t nsid, uint64_t slba, uint64_t len, uint8_t zra, uint8_t zfeat, uint8_t zflags) "cid %"PRIu16" nsid %"PRIu32" slba 0x%"PRIx64" len %"PRIu64" zra 0x%"PRIx8" zrasf 0x%"PRIx8" pr 0x%"PRIx8""
 pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16""
 pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d"
@@ -142,6 +152,8 @@ pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
 pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
 pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16""
 pci_nvme_err_invalid_zone_state(uint64_t zslba, const char *zs_str, uint8_t zs) "zslba 0x%"PRIx64" zs '%s' (%"PRIu8")"
+pci_nvme_err_invalid_zone(uint64_t lba) "lba 0x%"PRIx64""
+pci_nvme_err_invalid_zslba(uint64_t lba) "lba 0x%"PRIx64""
 pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues"
 pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues"
 pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null"

From patchwork Thu Nov 26 23:45:58 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934941
X-Spam-Checker-Version: SpamAssassin 3.4.0
(2014-02-07)
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 09/12] hw/block/nvme: add the zone append command
Date: Fri, 27 Nov 2020 00:45:58 +0100
Message-Id: <20201126234601.689714-10-its@irrelevant.dk>
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Klaus Jensen, Max Reitz,
 Keith Busch, Stefan Hajnoczi, Klaus Jensen

From: Klaus Jensen

Add the Zone Append command.
Signed-off-by: Klaus Jensen
---
 hw/block/nvme.h       |  5 ++++
 include/block/nvme.h  |  7 ++++++
 hw/block/nvme.c       | 53 +++++++++++++++++++++++++++++++++++++++++++
 hw/block/trace-events |  1 +
 4 files changed, 66 insertions(+)

diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 0cf3b303e34e..65d3070dec8c 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -16,6 +16,10 @@ typedef struct NvmeParams {
     uint32_t aer_max_queued;
     uint8_t  mdts;
     bool     use_intel_id;
+
+    struct {
+        uint8_t zasl;
+    } zns;
 } NvmeParams;

 typedef struct NvmeAsyncEvent {
@@ -65,6 +69,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc)
     case NVME_CMD_COPY:             return "NVME_NVM_CMD_COPY";
     case NVME_CMD_ZONE_MGMT_SEND:   return "NVME_ZONED_CMD_ZONE_MGMT_SEND";
     case NVME_CMD_ZONE_MGMT_RECV:   return "NVME_ZONED_CMD_ZONE_MGMT_RECV";
+    case NVME_CMD_ZONE_APPEND:      return "NVME_ZONED_CMD_ZONE_APPEND";
     default:                        return "NVME_NVM_CMD_UNKNOWN";
     }
 }
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 4c2b6fbb799a..9ea7dfc40cc6 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -487,6 +487,7 @@ enum NvmeIoCommands {
     NVME_CMD_COPY               = 0x19,
     NVME_CMD_ZONE_MGMT_SEND     = 0x79,
     NVME_CMD_ZONE_MGMT_RECV     = 0x7a,
+    NVME_CMD_ZONE_APPEND        = 0x7d,
 };

 typedef struct QEMU_PACKED NvmeDeleteQ {
@@ -1059,6 +1060,11 @@ enum NvmeIdCtrlLpa {
     NVME_LPA_EXTENDED = 1 << 2,
 };

+typedef struct QEMU_PACKED NvmeIdCtrlZns {
+    uint8_t zasl;
+    uint8_t rsvd1[4095];
+} NvmeIdCtrlZns;
+
 #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf)
 #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf)
 #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf)
@@ -1293,6 +1299,7 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZns) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZns) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index f0f4d72266bf..3c2b255294d3 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -953,6 +953,21 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len)
     return NVME_SUCCESS;
 }

+static inline uint16_t nvme_check_zasl(NvmeCtrl *n, size_t len)
+{
+    uint8_t zasl = n->params.zns.zasl;
+
+    if (!zasl) {
+        return nvme_check_mdts(n, len);
+    }
+
+    if (len > n->page_size << zasl) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    return NVME_SUCCESS;
+}
+
 static inline uint16_t nvme_check_bounds(NvmeNamespace *ns, uint64_t slba,
                                          uint32_t nlb)
 {
@@ -1169,6 +1184,7 @@ static void nvme_aio_err(NvmeRequest *req, int ret, NvmeZone *zone)
     case NVME_CMD_FLUSH:
     case NVME_CMD_WRITE:
     case NVME_CMD_WRITE_ZEROES:
+    case NVME_CMD_ZONE_APPEND:
         status = NVME_WRITE_FAULT;
         break;
     default:
@@ -1228,6 +1244,7 @@ static void nvme_rw_cb(void *opaque, int ret)
         switch (req->cmd.opcode) {
         case NVME_CMD_WRITE:
         case NVME_CMD_WRITE_ZEROES:
+        case NVME_CMD_ZONE_APPEND:
             nvme_zns_advance_wp(req);
         default:
             break;
@@ -2308,8 +2325,13 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
     uint64_t data_offset;
     BlockBackend *blk = ns->blkconf.blk;
     bool wrz = rw->opcode == NVME_CMD_WRITE_ZEROES;
+    bool append = rw->opcode == NVME_CMD_ZONE_APPEND;
     uint16_t status;

+    if (append && !nvme_ns_zoned(ns)) {
+        return NVME_INVALID_OPCODE | NVME_DNR;
+    }
+
     trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
                          nvme_nsid(ns), nlb, data_size, slba);

@@ -2331,6 +2353,24 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
         NvmeZone *zone = nvme_ns_zone(ns, slba);
         assert(zone);

+        if (append) {
+            uint64_t wp = zone->wp_staging;
+
+            if (slba != nvme_zslba(zone)) {
+                trace_pci_nvme_err_invalid_zslba(slba);
+                return NVME_INVALID_FIELD | NVME_DNR;
+            }
+
+            status = nvme_check_zasl(n, data_size);
+            if (status) {
+                trace_pci_nvme_err_zasl(nvme_cid(req), data_size);
+                goto invalid;
+            }
+
+            slba = wp;
+            rw->slba = req->cqe.qw0 = cpu_to_le64(wp);
+        }
+
         status = nvme_check_zone_write(slba, nlb, zone);
         if (status) {
             goto invalid;
@@ -2408,6 +2448,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
         return nvme_flush(n, req);
     case NVME_CMD_WRITE_ZEROES:
     case NVME_CMD_WRITE:
+    case NVME_CMD_ZONE_APPEND:
         return nvme_write(n, req);
     case NVME_CMD_READ:
         return nvme_read(n, req);
@@ -2677,6 +2718,8 @@ static void nvme_effects_zoned(NvmeEffectsLog *effects)
     effects->iocs[NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP;
     effects->iocs[NVME_CMD_ZONE_MGMT_SEND] = NVME_EFFECTS_CSUPP |
                                              NVME_EFFECTS_LBCC;
+    effects->iocs[NVME_CMD_ZONE_APPEND] = NVME_EFFECTS_CSUPP |
+                                          NVME_EFFECTS_LBCC;
 }

 static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off,
@@ -4169,6 +4212,11 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp)
         return;
     }

+    if (params->zns.zasl && params->zns.zasl > params->mdts) {
+        error_setg(errp, "zns.zasl must be less than or equal to mdts");
+        return;
+    }
+
     if (n->pmrdev) {
         if (host_memory_backend_is_mapped(n->pmrdev)) {
             error_setg(errp, "can't use already busy memdev: %s",
@@ -4364,12 +4412,16 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
 static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 {
     NvmeIdCtrl *id = &n->id_ctrl;
+    NvmeIdCtrlZns *id_zns;
     uint8_t *pci_conf = pci_dev->config;
     char *subnqn;

     n->id_ctrl_iocss[NVME_IOCS_NVM] = g_new0(NvmeIdCtrl, 1);
     n->id_ctrl_iocss[NVME_IOCS_ZONED] = g_new0(NvmeIdCtrl, 1);

+    id_zns = n->id_ctrl_iocss[NVME_IOCS_ZONED];
+    id_zns->zasl = n->params.zns.zasl;
+
     id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
     id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
     strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' ');
@@ -4511,6 +4563,7 @@ static Property nvme_props[] = {
     DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
     DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7),
     DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
+    DEFINE_PROP_UINT8("zns.zasl", NvmeCtrl, params.zns.zasl, 0),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/trace-events b/hw/block/trace-events
index f62dfda279cd..221dc1af36c9 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -114,6 +114,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"

 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
+pci_nvme_err_zasl(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
 pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8""
 pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64""

From patchwork Thu Nov 26 23:45:59 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934943
Authentication-Results: mail.kernel.org;
 spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 10/12] hw/block/nvme: track and enforce zone resources
Date: Fri, 27 Nov 2020 00:45:59 +0100
Message-Id: <20201126234601.689714-11-its@irrelevant.dk>
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Klaus Jensen, Max Reitz,
 Keith Busch, Stefan Hajnoczi, Klaus Jensen

From: Klaus Jensen

Track number of open/active resources.

Signed-off-by: Klaus Jensen
---
 docs/specs/nvme.txt  |  6 ++++
 hw/block/nvme-ns.h   |  7 +++++
 include/block/nvme.h |  2 ++
 hw/block/nvme-ns.c   | 17 ++++++++++--
 hw/block/nvme.c      | 65 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
index 80cb34406255..03bb4d9516b4 100644
--- a/docs/specs/nvme.txt
+++ b/docs/specs/nvme.txt
@@ -14,6 +14,12 @@ The nvme device (-device nvme) emulates an NVM Express Controller.
      zns.zcap; if the zone capacity is a power of two, the zone size will be
      set to that, otherwise it will default to the next power of two.

+  `zns.mar`; Specifies the number of active resources available. This is a 0s
+     based value.
+
+  `zns.mor`; Specifies the number of open resources available. This is a 0s
+     based value.
+
 Reference Specifications
 ------------------------

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 6370ef1a162b..20be2a7c882f 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -32,6 +32,8 @@ typedef struct NvmeNamespaceParams {
         uint64_t zcap;
         uint64_t zsze;
         uint8_t  zdes;
+        uint32_t mar;
+        uint32_t mor;
     } zns;
 } NvmeNamespaceParams;

@@ -62,6 +64,11 @@ typedef struct NvmeNamespace {
         NvmeZone           *zones;
         NvmeZoneDescriptor *zd;
         uint8_t            *zde;
+
+        struct {
+            uint32_t open;
+            uint32_t active;
+        } resources;
     } zns;
 } NvmeNamespace;

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 9ea7dfc40cc6..4038761f3650 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -813,6 +813,8 @@ enum NvmeStatusCodes {
     NVME_ZONE_IS_READ_ONLY      = 0x01ba,
     NVME_ZONE_IS_OFFLINE        = 0x01bb,
     NVME_ZONE_INVALID_WRITE     = 0x01bc,
+    NVME_TOO_MANY_ACTIVE_ZONES  = 0x01bd,
+    NVME_TOO_MANY_OPEN_ZONES    = 0x01be,
     NVME_INVALID_ZONE_STATE_TRANSITION = 0x01bf,
     NVME_WRITE_FAULT            = 0x0280,
     NVME_UNRECOVERED_READ       = 0x0281,
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index f2e8ee80b606..3cbc62556175 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -94,8 +94,13 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
         ns->zns.zde = g_malloc0_n(ns->zns.num_zones, nvme_ns_zdes_bytes(ns));
     }

-    id_ns_zns->mar = 0xffffffff;
-    id_ns_zns->mor = 0xffffffff;
+    id_ns_zns->mar = cpu_to_le32(ns->params.zns.mar);
+    id_ns_zns->mor = cpu_to_le32(ns->params.zns.mor);
+
+    ns->zns.resources.active = ns->params.zns.mar != 0xffffffff ?
+        ns->params.zns.mar + 1 : ns->zns.num_zones;
+    ns->zns.resources.open = ns->params.zns.mor != 0xffffffff ?
+        ns->params.zns.mor + 1 : ns->zns.num_zones;
 }

 static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
@@ -197,6 +202,12 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
             return -1;
         }

+        if (ns->params.zns.mor > ns->params.zns.mar) {
+            error_setg(errp, "maximum open resources (zns.mor) must be less "
+                       "than or equal to maximum active resources (zns.mar)");
+            return -1;
+        }
+
         break;

     default:
@@ -267,6 +278,8 @@ static Property nvme_ns_props[] = {
     DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0),
     DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0),
     DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0),
+    DEFINE_PROP_UINT32("zns.mar", NvmeNamespace, params.zns.mar, 0xffffffff),
+    DEFINE_PROP_UINT32("zns.mor", NvmeNamespace, params.zns.mor, 0xffffffff),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 3c2b255294d3..bc1446aeab9d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1045,6 +1045,40 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,

     switch (from) {
     case NVME_ZS_ZSE:
+        switch (to) {
+        case NVME_ZS_ZSF:
+        case NVME_ZS_ZSRO:
+        case NVME_ZS_ZSO:
+            break;
+
+        case NVME_ZS_ZSC:
+            if (!ns->zns.resources.active) {
+                return NVME_TOO_MANY_ACTIVE_ZONES;
+            }
+
+            ns->zns.resources.active--;
+
+            break;
+
+        case NVME_ZS_ZSIO:
+        case NVME_ZS_ZSEO:
+            if (!ns->zns.resources.active) {
+                return NVME_TOO_MANY_ACTIVE_ZONES;
+            }
+
+            if (!ns->zns.resources.open) {
+                return NVME_TOO_MANY_OPEN_ZONES;
+            }
+
+            ns->zns.resources.active--;
+            ns->zns.resources.open--;
+
+            break;
+
+        default:
+            return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR;
+        }
+
         break;

     case NVME_ZS_ZSIO:
@@ -1063,7 +1097,13 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
         case NVME_ZS_ZSEO:
         case NVME_ZS_ZSF:
         case NVME_ZS_ZSRO:
+            ns->zns.resources.active++;
+
+            /* fallthrough */
+
         case NVME_ZS_ZSC:
+            ns->zns.resources.open++;
+
             break;

         default:
@@ -1086,8 +1126,18 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,

         case NVME_ZS_ZSF:
         case NVME_ZS_ZSRO:
+            ns->zns.resources.active++;
+
+            break;
+
         case NVME_ZS_ZSIO:
         case NVME_ZS_ZSEO:
+            if (!ns->zns.resources.open) {
+                return NVME_TOO_MANY_OPEN_ZONES;
+            }
+
+            ns->zns.resources.open--;
+
             break;

         default:
@@ -1707,6 +1757,7 @@ static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeNamespace *ns,
 {
     NvmeZone *zone;
     uintptr_t *resets = (uintptr_t *)&req->opaque;
+    int count;
     uint16_t status = NVME_SUCCESS;

     trace_pci_nvme_zone_mgmt_send_all(nvme_cid(req), nvme_nsid(ns), zsa);
@@ -1755,6 +1806,20 @@ static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeNamespace *ns,
         break;

     case NVME_ZMS_OPEN:
+        count = 0;
+
+        for (int i = 0; i < ns->zns.num_zones; i++) {
+            zone = &ns->zns.zones[i];
+
+            if (nvme_zs(zone) == NVME_ZS_ZSC) {
+                count++;
+            }
+        }
+
+        if (count > ns->zns.resources.open) {
+            return NVME_TOO_MANY_OPEN_ZONES;
+        }
+
         for (int i = 0; i < ns->zns.num_zones; i++) {
             zone = &ns->zns.zones[i];

From patchwork Thu Nov 26 23:46:00 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934939
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mail.kernel.org (Postfix) with ESMTPS id
11122207D8 for ; Fri, 27 Nov 2020 00:00:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 11122207D8 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38370 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kiRBL-0000Qi-HH for qemu-devel@archiver.kernel.org; Thu, 26 Nov 2020 19:00:11 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:55976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kiQy1-0007VT-Dh; Thu, 26 Nov 2020 18:46:25 -0500 Received: from out3-smtp.messagingengine.com ([66.111.4.27]:39797) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kiQxx-0003wK-UJ; Thu, 26 Nov 2020 18:46:25 -0500 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id DE2E35C0208; Thu, 26 Nov 2020 18:46:20 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Thu, 26 Nov 2020 18:46:20 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=1r6PsEZsJ75jd 7IVSUxM7XHx9gwDInEmWggXzcQ+Upk=; b=nKEuyV4+Oa/M31UR4OZ7dU9w2Mwmy mjaQEwGLTW6mZi+weuN6/FI4OrO9wJZAG30JX6+nmPI+Q61e1bbFz6V5U67TDtd5 5cT0xvGWbGFPHsXCgSIfaQpg/xod7/GNGDq4Cm95Re6Bx1+PUCqzdt2V2uEKigc3 LmHEVTgtR63hHusMpajYgEdtH8GDUnhNzZ0DvICeXd2OaT+SwHeaTvrG3VuJ8Oh+ 1sQpRrWTnsF8sG0ycMTQQGPuposgS2SO6B0EKiVFhiYCJ6TMxp5j8gosFLwIAyj+ Et1Ctp7sa5ZMsUtyfoNPyfBLZxDNSBExfXYItqwzyWe/1O1JB8/aMH1RA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from 
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH v5 11/12] hw/block/nvme: allow open to close zone transitions by controller
Date: Fri, 27 Nov 2020 00:46:00 +0100
Message-Id: <20201126234601.689714-12-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Max Reitz, Keith Busch, Stefan Hajnoczi, Klaus Jensen

Allow the controller to release open resources by transitioning
implicitly and explicitly opened zones to closed. This is done using a
naive "least recently opened" strategy.

Signed-off-by: Klaus Jensen
---
 hw/block/nvme-ns.h    |  5 ++++
 hw/block/nvme-ns.c    |  3 +++
 hw/block/nvme.c       | 57 ++++++++++++++++++++++++++++++++++++++++---
 hw/block/trace-events |  1 +
 4 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 20be2a7c882f..05a79a214605 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -42,6 +42,8 @@ typedef struct NvmeZone {
     uint8_t *zde;

     uint64_t wp_staging;
+
+    QTAILQ_ENTRY(NvmeZone) lru_entry;
 } NvmeZone;

 typedef struct NvmeNamespace {
@@ -68,6 +70,9 @@ typedef struct NvmeNamespace {
         struct {
             uint32_t open;
             uint32_t active;
+
+            QTAILQ_HEAD(, NvmeZone) lru_open;
+            QTAILQ_HEAD(, NvmeZone) lru_active;
         } resources;
     } zns;
 } NvmeNamespace;

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 3cbc62556175..cd0f075dd281 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -101,6 +101,9 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
         ns->params.zns.mar + 1 : ns->zns.num_zones;
     ns->zns.resources.open = ns->params.zns.mor != 0xffffffff ?
         ns->params.zns.mor + 1 : ns->zns.num_zones;
+
+    QTAILQ_INIT(&ns->zns.resources.lru_open);
+    QTAILQ_INIT(&ns->zns.resources.lru_active);
 }

 static int nvme_ns_init(NvmeNamespace *ns, Error **errp)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index bc1446aeab9d..e62efd7cf0c4 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1029,11 +1029,47 @@ static inline void nvme_zone_reset_wp(NvmeZone *zone)
     zone->wp_staging = nvme_zslba(zone);
 }

+static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
+                                    NvmeZoneState to);
+
+static uint16_t nvme_zrm_release_open(NvmeNamespace *ns)
+{
+    NvmeZone *candidate;
+    NvmeZoneState zs;
+    uint16_t status;
+
+    trace_pci_nvme_zrm_release_open(ns->params.nsid);
+
+    QTAILQ_FOREACH(candidate, &ns->zns.resources.lru_open, lru_entry) {
+        zs = nvme_zs(candidate);
+
+        /* skip explicitly opened zones */
+        if (zs == NVME_ZS_ZSEO) {
+            continue;
+        }
+
+        /* skip zones that have in-flight writes */
+        if (candidate->wp_staging != nvme_wp(candidate)) {
+            continue;
+        }
+
+        status = nvme_zrm_transition(ns, candidate, NVME_ZS_ZSC);
+        if (status) {
+            return status;
+        }
+
+        return NVME_SUCCESS;
+    }
+
+    return NVME_TOO_MANY_OPEN_ZONES;
+}
+
 static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
                                     NvmeZoneState to)
 {
     NvmeZoneState from = nvme_zs(zone);
     NvmeZoneDescriptor *zd = zone->zd;
+    uint16_t status;

     trace_pci_nvme_zrm_transition(ns->params.nsid, nvme_zslba(zone),
                                   nvme_zs_to_str(from), from,
@@ -1057,6 +1093,7 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
         }

         ns->zns.resources.active--;
+        QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, lru_entry);

         break;

@@ -1067,11 +1104,15 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
         }

         if (!ns->zns.resources.open) {
-            return NVME_TOO_MANY_OPEN_ZONES;
+            status = nvme_zrm_release_open(ns);
+            if (status) {
+                return status;
+            }
         }

         ns->zns.resources.active--;
         ns->zns.resources.open--;
+        QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_open, zone, lru_entry);

         break;

@@ -1098,11 +1139,15 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
     case NVME_ZS_ZSF:
     case NVME_ZS_ZSRO:
         ns->zns.resources.active++;
+        ns->zns.resources.open++;
+        QTAILQ_REMOVE(&ns->zns.resources.lru_open, zone, lru_entry);

-        /* fallthrough */
+        break;

     case NVME_ZS_ZSC:
         ns->zns.resources.open++;
+        QTAILQ_REMOVE(&ns->zns.resources.lru_open, zone, lru_entry);
+        QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, lru_entry);

         break;

@@ -1127,16 +1172,22 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone,
     case NVME_ZS_ZSF:
     case NVME_ZS_ZSRO:
         ns->zns.resources.active++;
+        QTAILQ_REMOVE(&ns->zns.resources.lru_active, zone, lru_entry);

         break;

     case NVME_ZS_ZSIO:
     case NVME_ZS_ZSEO:
         if (!ns->zns.resources.open) {
-            return NVME_TOO_MANY_OPEN_ZONES;
+            status = nvme_zrm_release_open(ns);
+            if (status) {
+                return status;
+            }
         }

         ns->zns.resources.open--;
+        QTAILQ_REMOVE(&ns->zns.resources.lru_active, zone, lru_entry);
+        QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_open, zone, lru_entry);

         break;

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 221dc1af36c9..31482bfba1fe 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -94,6 +94,7 @@ pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
 pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
 pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
 pci_nvme_zrm_transition(uint32_t nsid, uint64_t zslba, const char *s_from, uint8_t from, const char *s_to, uint8_t to) "nsid %"PRIu32" zslba 0x%"PRIx64" from '%s' (%"PRIu8") to '%s' (%"PRIu8")"
+pci_nvme_zrm_release_open(uint32_t nsid) "nsid %"PRIu32""
 pci_nvme_zns_advance_wp(uint32_t nsid, uint64_t zslba, uint64_t wp_orig, uint32_t nlb) "nsid 0x%"PRIx32" zslba 0x%"PRIx64" wp_orig 0x%"PRIx64" nlb %"PRIu32""
 pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""

From patchwork Thu Nov 26 23:46:01 2020
X-Patchwork-Submitter: Klaus Jensen
X-Patchwork-Id: 11934935
From: Klaus Jensen
To: qemu-devel@nongnu.org
Subject: [PATCH RFC v5 12/12] hw/block/nvme: add persistence for zone info
Date: Fri, 27 Nov 2020 00:46:01 +0100
Message-Id: <20201126234601.689714-13-its@irrelevant.dk>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201126234601.689714-1-its@irrelevant.dk>
References: <20201126234601.689714-1-its@irrelevant.dk>
Cc: Kevin Wolf, Fam Zheng, qemu-block@nongnu.org, Max Reitz, Keith Busch, Stefan Hajnoczi, Klaus Jensen

Signed-off-by: Klaus Jensen
---
 docs/specs/nvme.txt   |  15 +++
 hw/block/nvme-ns.h    |  16 ++++
 hw/block/nvme-ns.c    | 212 +++++++++++++++++++++++++++++++++++++++++-
 hw/block/nvme.c       |  87 +++++++++++++++++
 hw/block/trace-events |   2 +
 5 files changed, 331 insertions(+), 1 deletion(-)

diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt
index 03bb4d9516b4..05d81c88ad4e 100644
--- a/docs/specs/nvme.txt
+++ b/docs/specs/nvme.txt
@@ -20,6 +20,21 @@ The nvme device (-device nvme) emulates an NVM Express Controller.
   `zns.mor`; Specifies the number of open resources available. This is a 0s
      based value.

+  `zns.pstate`; This parameter specifies another blockdev to be used for
+     storing zone state persistently.
+
+       -drive id=zns-pstate,file=zns-pstate.img,format=raw
+       -device nvme-ns,zns.pstate=zns-pstate,...
+
+     To reset (or initialize) state, the blockdev image should be of zero size:
+
+       qemu-img create -f raw zns-pstate.img 0
+
+     The image will be initialized with a file format header and truncated to
+     the required size. If the pstate given is of non-zero size, it will be
+     assumed to already contain zone state information and the header will be
+     checked.
+
 Reference Specifications
 ------------------------

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 05a79a214605..5cb4c1da59ce 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -19,6 +19,15 @@
 #define NVME_NS(obj) \
     OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)

+#define NVME_ZONE_PSTATE_MAGIC ((0x00 << 24) | ('S' << 16) | ('N' << 8) | 'Z')
+#define NVME_ZONE_PSTATE_V1 1
+
+typedef struct NvmeZonePStateHeader {
+    uint32_t magic;
+    uint32_t version;
+    uint8_t  rsvd8[4088];
+} QEMU_PACKED NvmeZonePStateHeader;
+
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     uint8_t  iocs;
@@ -74,6 +83,8 @@ typedef struct NvmeNamespace {
             QTAILQ_HEAD(, NvmeZone) lru_open;
             QTAILQ_HEAD(, NvmeZone) lru_active;
         } resources;
+
+        BlockBackend *pstate;
     } zns;
 } NvmeNamespace;

@@ -186,4 +197,9 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
 void nvme_ns_drain(NvmeNamespace *ns);
 void nvme_ns_flush(NvmeNamespace *ns);

+static inline void _nvme_ns_check_size(void)
+{
+    QEMU_BUILD_BUG_ON(sizeof(NvmeZonePStateHeader) != 4096);
+}
+
 #endif /* NVME_NS_H */

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index cd0f075dd281..4f311dd704c0 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -50,6 +50,31 @@ const char *nvme_zs_to_str(NvmeZoneState zs)
     return "UNKNOWN";
 }

+static int nvme_blk_truncate(BlockBackend *blk, size_t len, Error **errp)
+{
+    int ret;
+    uint64_t perm, shared_perm;
+
+    blk_get_perm(blk, &perm, &shared_perm);
+
+    ret = blk_set_perm(blk, perm | BLK_PERM_RESIZE, shared_perm, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = blk_truncate(blk, len, false, PREALLOC_MODE_OFF, 0, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = blk_set_perm(blk, perm, shared_perm, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return 0;
+}
+
 static void nvme_ns_zns_init_zones(NvmeNamespace *ns)
 {
     NvmeZone *zone;
@@ -153,6 +178,176 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
     return 0;
 }

+static int nvme_ns_zns_restore_zone_state(NvmeNamespace *ns, Error **errp)
+{
+    for (int i = 0; i < ns->zns.num_zones; i++) {
+        NvmeZone *zone = &ns->zns.zones[i];
+        zone->zd = &ns->zns.zd[i];
+        if (ns->params.zns.zdes) {
+            zone->zde = &ns->zns.zde[i];
+        }
+
+        switch (nvme_zs(zone)) {
+        case NVME_ZS_ZSE:
+        case NVME_ZS_ZSF:
+        case NVME_ZS_ZSRO:
+        case NVME_ZS_ZSO:
+            break;
+
+        case NVME_ZS_ZSC:
+            if (nvme_wp(zone) == nvme_zslba(zone) &&
+                !(zone->zd->za & NVME_ZA_ZDEV)) {
+                nvme_zs_set(zone, NVME_ZS_ZSE);
+                break;
+            }
+
+            if (ns->zns.resources.active) {
+                ns->zns.resources.active--;
+                QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone,
+                                   lru_entry);
+                break;
+            }
+
+            /* fallthrough */
+
+        case NVME_ZS_ZSIO:
+        case NVME_ZS_ZSEO:
+            zone->zd->wp = zone->zd->zslba;
+            nvme_zs_set(zone, NVME_ZS_ZSF);
+            break;
+
+        default:
+            error_setg(errp, "invalid zone state");
+            return -1;
+        }
+
+        zone->wp_staging = nvme_wp(zone);
+    }
+
+    return 0;
+}
+
+static int nvme_ns_zns_init_pstate(NvmeNamespace *ns, Error **errp)
+{
+    BlockBackend *blk = ns->zns.pstate;
+    NvmeZonePStateHeader header;
+    size_t zd_len, zde_len;
+    int ret;
+
+    zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor);
+    zde_len = ns->zns.num_zones * nvme_ns_zdes_bytes(ns);
+
+    ret = nvme_blk_truncate(blk, zd_len + zde_len + sizeof(header), errp);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not truncate zone pstate");
+        return ret;
+    }
+
+    nvme_ns_zns_init_zones(ns);
+
+    ret = blk_pwrite(blk, 0, ns->zns.zd, zd_len, 0);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not write zone descriptors to "
+                         "zone pstate");
+        return ret;
+    }
+
+    header = (NvmeZonePStateHeader) {
+        .magic = cpu_to_le32(NVME_ZONE_PSTATE_MAGIC),
+        .version = cpu_to_le32(NVME_ZONE_PSTATE_V1),
+    };
+
+    ret = blk_pwrite(blk, zd_len + zde_len, &header, sizeof(header), 0);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not write zone pstate header");
+        return ret;
+    }
+
+    return 0;
+}
+
+static int nvme_ns_zns_load_pstate(NvmeNamespace *ns, size_t len, Error **errp)
+{
+    BlockBackend *blk = ns->zns.pstate;
+    NvmeZonePStateHeader header;
+    size_t zd_len, zde_len;
+    int ret;
+
+    ret = blk_pread(blk, len - sizeof(header), &header, sizeof(header));
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not read zone pstate header");
+        return ret;
+    }
+
+    if (le32_to_cpu(header.magic) != NVME_ZONE_PSTATE_MAGIC) {
+        error_setg(errp, "invalid zone pstate header");
+        return -1;
+    } else if (le32_to_cpu(header.version) > NVME_ZONE_PSTATE_V1) {
+        error_setg(errp, "unsupported zone pstate version");
+        return -1;
+    }
+
+    zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor);
+    zde_len = ns->zns.num_zones * nvme_ns_zdes_bytes(ns);
+
+    ret = blk_pread(blk, 0, ns->zns.zd, zd_len);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not read zone descriptors from "
+                         "zone pstate");
+        return ret;
+    }
+
+    if (zde_len) {
+        ret = blk_pread(blk, zd_len, ns->zns.zde, zde_len);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "could not read zone descriptor "
+                             "extensions from zone pstate");
+            return ret;
+        }
+    }
+
+    if (nvme_ns_zns_restore_zone_state(ns, errp)) {
+        return -1;
+    }
+
+    ret = blk_pwrite(blk, 0, ns->zns.zd, zd_len, 0);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "could not write zone descriptors to "
+                         "zone pstate");
+        return ret;
+    }
+
+    return 0;
+}
+
+static int nvme_ns_zns_setup_pstate(NvmeNamespace *ns, Error **errp)
+{
+    BlockBackend *blk = ns->zns.pstate;
+    uint64_t perm, shared_perm;
+    ssize_t len;
+    int ret;
+
+    perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
+    shared_perm = BLK_PERM_ALL;
+
+    ret = blk_set_perm(blk, perm, shared_perm, errp);
+    if (ret) {
+        return ret;
+    }
+
+    len = blk_getlength(blk);
+    if (len < 0) {
+        error_setg_errno(errp, -len, "could not determine zone pstate size");
+        return len;
+    }
+
+    if (!len) {
+        return nvme_ns_zns_init_pstate(ns, errp);
+    }
+
+    return nvme_ns_zns_load_pstate(ns, len, errp);
+}
+
 static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
     if (!blkconf_blocksizes(&ns->blkconf, errp)) {
@@ -236,7 +431,13 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
     }

     if (nvme_ns_zoned(ns)) {
-        nvme_ns_zns_init_zones(ns);
+        if (ns->zns.pstate) {
+            if (nvme_ns_zns_setup_pstate(ns, errp)) {
+                return -1;
+            }
+        } else {
+            nvme_ns_zns_init_zones(ns);
+        }
     }

     if (nvme_register_namespace(n, ns, errp)) {
@@ -249,11 +450,19 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 void nvme_ns_drain(NvmeNamespace *ns)
 {
     blk_drain(ns->blkconf.blk);
+
+    if (ns->zns.pstate) {
+        blk_drain(ns->zns.pstate);
+    }
 }

 void nvme_ns_flush(NvmeNamespace *ns)
 {
     blk_flush(ns->blkconf.blk);
+
+    if (ns->zns.pstate) {
+        blk_flush(ns->zns.pstate);
+    }
 }

 static void nvme_ns_realize(DeviceState *dev, Error **errp)
@@ -283,6 +492,7 @@ static Property nvme_ns_props[] = {
     DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0),
     DEFINE_PROP_UINT32("zns.mar", NvmeNamespace, params.zns.mar, 0xffffffff),
     DEFINE_PROP_UINT32("zns.mor", NvmeNamespace, params.zns.mor, 0xffffffff),
+    DEFINE_PROP_DRIVE("zns.pstate", NvmeNamespace, zns.pstate),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index e62efd7cf0c4..04ad9f20223d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1023,6 +1023,46 @@ static uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba,
     return NVME_SUCCESS;
 }

+static int nvme_zns_commit_zone(NvmeNamespace *ns, NvmeZone *zone)
+{
+    uint64_t zslba;
+    int64_t offset;
+
+    if (!ns->zns.pstate) {
+        return 0;
+    }
+
+    trace_pci_nvme_zns_commit_zone(nvme_nsid(ns), nvme_zslba(zone));
+
+    zslba = nvme_zslba(zone);
+    offset = nvme_ns_zone_idx(ns, zslba) * sizeof(NvmeZoneDescriptor);
+
+    return blk_pwrite(ns->zns.pstate, offset, zone->zd,
+                      sizeof(NvmeZoneDescriptor), 0);
+}
+
+static int nvme_zns_commit_zde(NvmeNamespace *ns, NvmeZone *zone)
+{
+    uint64_t zslba;
+    int zidx;
+    size_t zd_len, zdes_bytes;
+    int64_t offset;
+
+    if (!ns->zns.pstate) {
+        return 0;
+    }
+
+    trace_pci_nvme_zns_commit_zde(nvme_nsid(ns), nvme_zslba(zone));
+
+    zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor);
+    zslba = nvme_zslba(zone);
+    zidx = nvme_ns_zone_idx(ns, zslba);
+    zdes_bytes = nvme_ns_zdes_bytes(ns);
+    offset = zd_len + zidx * zdes_bytes;
+
+    return blk_pwrite(ns->zns.pstate, offset, zone->zde, zdes_bytes, 0);
+}
+
 static inline void nvme_zone_reset_wp(NvmeZone *zone)
 {
     zone->zd->wp = zone->zd->zslba;
@@ -1058,6 +1098,10 @@ static uint16_t nvme_zrm_release_open(NvmeNamespace *ns)
             return status;
         }

+        if (nvme_zns_commit_zone(ns, candidate) < 0) {
+            return NVME_INTERNAL_DEV_ERROR;
+        }
+
         return NVME_SUCCESS;
     }

@@ -1252,6 +1296,10 @@ static uint16_t __nvme_zns_advance_wp(NvmeNamespace *ns, NvmeZone *zone,
         if (status) {
             return status;
         }
+
+        if (nvme_zns_commit_zone(ns, zone) < 0) {
+            return NVME_INTERNAL_DEV_ERROR;
+        }
     }

     return NVME_SUCCESS;
@@ -1307,6 +1355,10 @@ static void nvme_aio_err(NvmeRequest *req, int ret, NvmeZone *zone)
             NVME_ZS_ZSRO : NVME_ZS_ZSO;

         nvme_zrm_transition(ns, zone, zs);
+
+        if (nvme_zns_commit_zone(req->ns, zone) < 0) {
+            req->status = NVME_INTERNAL_DEV_ERROR;
+        }
     }

     /*
@@ -1618,6 +1670,10 @@ static void nvme_aio_zone_reset_cb(void *opaque, int ret)
         nvme_aio_err(req, ret, zone);
     }

+    if (nvme_zns_commit_zone(req->ns, zone) < 0) {
+        req->status = NVME_INTERNAL_DEV_ERROR;
+    }
+
     (*resets)--;

     if (*resets) {
@@ -1657,6 +1713,10 @@ static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req,
         return status;
     }

+    if (nvme_zns_commit_zone(ns, zone) < 0) {
+        return NVME_INTERNAL_DEV_ERROR;
+    }
+
     return NVME_SUCCESS;
 }

@@ -1678,6 +1738,10 @@ static uint16_t nvme_zone_mgmt_send_finish(NvmeCtrl *n, NvmeRequest *req,
         return status;
     }

+    if (nvme_zns_commit_zone(ns, zone) < 0) {
+        return NVME_INTERNAL_DEV_ERROR;
+    }
+
     return NVME_SUCCESS;
 }

@@ -1699,6 +1763,10 @@ static uint16_t nvme_zone_mgmt_send_open(NvmeCtrl *n, NvmeRequest *req,
         return status;
     }

+    if (nvme_zns_commit_zone(ns, zone) < 0) {
+        return NVME_INTERNAL_DEV_ERROR;
+    }
+
     return NVME_SUCCESS;
 }

@@ -1754,6 +1822,10 @@ static uint16_t nvme_zone_mgmt_send_offline(NvmeCtrl *n, NvmeRequest *req,
     case NVME_ZS_ZSRO:
         nvme_zrm_transition(ns, zone, NVME_ZS_ZSO);

+        if (nvme_zns_commit_zone(ns, zone) < 0) {
+            return NVME_INTERNAL_DEV_ERROR;
+        }
+
         /* fallthrough */

     case NVME_ZS_ZSO:
@@ -1793,6 +1865,10 @@ static uint16_t nvme_zone_mgmt_send_set_zde(NvmeCtrl *n, NvmeRequest *req,
         return status;
     }

+    if (nvme_zns_commit_zde(ns, zone) < 0) {
+        return NVME_INTERNAL_DEV_ERROR;
+    }
+
     status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC);
     if (status) {
         return status;
@@ -1800,6 +1876,10 @@ static uint16_t nvme_zone_mgmt_send_set_zde(NvmeCtrl *n, NvmeRequest *req,

     NVME_ZA_SET(zone->zd->za, NVME_ZA_ZDEV);

+    if (nvme_zns_commit_zone(ns, zone) < 0) {
+        return NVME_INTERNAL_DEV_ERROR;
+    }
+
     return NVME_SUCCESS;
 }

@@ -2502,6 +2582,11 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
             goto invalid;
         }

+        if (nvme_zns_commit_zone(ns, zone) < 0) {
+            status = NVME_INTERNAL_DEV_ERROR;
+            goto invalid;
+        }
+
         break;
     }

@@ -3778,6 +3863,8 @@ static void nvme_ctrl_shutdown(NvmeCtrl *n)
                 nvme_zrm_transition(ns, zone, NVME_ZS_ZSE);
             }

+            nvme_zns_commit_zone(ns, zone);
+
             /* fallthrough */

         default:

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 31482bfba1fe..aa5491c398b9 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -96,6 +96,8 @@ pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16""
 pci_nvme_zrm_transition(uint32_t nsid, uint64_t zslba, const char *s_from, uint8_t from, const char *s_to, uint8_t to) "nsid %"PRIu32" zslba 0x%"PRIx64" from '%s' (%"PRIu8") to '%s' (%"PRIu8")"
 pci_nvme_zrm_release_open(uint32_t nsid) "nsid %"PRIu32""
 pci_nvme_zns_advance_wp(uint32_t nsid, uint64_t zslba, uint64_t wp_orig, uint32_t nlb) "nsid 0x%"PRIx32" zslba 0x%"PRIx64" wp_orig 0x%"PRIx64" nlb %"PRIu32""
+pci_nvme_zns_commit_zone(uint32_t nsid, uint64_t zslba) "nsid 0x%"PRIx32" zslba 0x%"PRIx64""
+pci_nvme_zns_commit_zde(uint32_t nsid, uint64_t zslba) "nsid 0x%"PRIx32" zslba 0x%"PRIx64""
 pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64""
 pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64""
 pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16""