From patchwork Fri Oct 30 02:32:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868309 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 565016A2 for ; Fri, 30 Oct 2020 02:37:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CAF1C206B5 for ; Fri, 30 Oct 2020 02:37:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="V91rBy5z" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CAF1C206B5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:59774 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKI7-00043M-OX for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:37:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41124) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDp-0005f2-91; Thu, 29 Oct 2020 22:32:57 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10984) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDk-0006cy-Ln; Thu, 29 Oct 2020 22:32:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025172; x=1635561172; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sJfeCQHJ4b6kZ7bQQ3zkoS9KoX1KA99USQyP+uPAQOw=; b=V91rBy5zW0TfdFobMzfBgfvtJlF7Qcn9/0oXhq4WAaZaad8xlCUhhxTF SEMBMyo0/L2rrVzqcfeJBy3w8+fCD/34wr1cN61H45CjxW3y98B3X3JH3 hQUnrPvX4A85f9DnRM99/KU3QMAeFa+JrnID4nf+Y+OqEzqIBftYpYZCI xMQyqzCsFnSN+73OcQUGDsaU46ZuED18TfzjYFZfUwmxuLyYdQOvgM+g1 FDMd3q7c7WweNxBLWJTdJKHU8+ZMCY1YYG1yVQ7P6QpaEEGPPvYyo4u5M Yw1owleiIlitq7jqT/CUJ5q2zXMYOQVKRBKyV5VSSgu/Wdi/4/9YTfYXA A==; IronPort-SDR: fhKMsVZX5edwhJokBs7u+21t4dWjzv6EpOR0uHC1vixWs+ddZ1TAAdUfsuyIgaEbWejl2umBle IW8qt/qCyMjxWN8UETP7UtYLcupDgxB2BnDujZdzm7ey8diXQoxgTYzHBIgpazArohzhH1uLNT YJrUaXsGDcQOFquKLNmCHygNTqHEoQQoniu9/6h3yBj2QtovbBKhxBW4dlzXxdNYpm61VvrZSx uROqw9DIWAWY9akmbFIxHpp3uhbcB3zB/t7v72N0eZVyrkWDFON5IEu4N+/SEuxQC1P3ep07oA Czc= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748051" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:32:50 +0800 IronPort-SDR: nTD7ZzpNzf6EajYeXFt/asRvgZXlYOWjXzmKWpWXVpMttrBasMnCjAlIqmFekYUGFIPhkIuTgV 4EeXkjw8+1Cmn9S/Ix7tjvFHSYNNHTfr8aKQ5ZdZYOJaXfC1B+Ul5oDxA/J376wQmxv+KW/5Zl LoQLb36DzRbwrGSOJ4bfRIUwArNQDeVR0sTAl8J8Pqo35iBFNbqau1znyG8PmmkvJ6pkOi5Bdw 9Kg4DMMbV6aKH043olduDmvvhvSJHc3nVIH3DchNwzwKFgj7Bdn6G6S72rFCKSAWcyPpoe3rpr tz8ARZcCsQ54Ar5+h/rWJQ+9 Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:17:55 -0700 IronPort-SDR: zLveBajxHZpUiOO261ZfWfJlClHLWKyJrbRgm/HfcflCfHSRn6/y4Oqgv4TvA0KDDGnHDhmdsL yQGOV9tIkPRlev++LaYQBkMUbHrTd6sqdFvLXqKqwj6a8NF4yxQXMlk10ue6Z1m1tWGG1Np2l/ jh76CSXDSH6q00E9ORhoFppxb81ZOzaLN6UcRQmtmgAlfrjv+FoIvmvCMX48iKGEtHIVkAAihx 0ppsWfzJ7gMEUpxd8L7jSccPEkgc62l7AIHI+viQMZdCZJhY1ozvNk999H2UrKZSWrIWZohflm 6MA= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:32:49 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 01/11] hw/block/nvme: Add Commands Supported and Effects log Date: Fri, 30 Oct 2020 11:32:32 +0900 Message-Id: <20201030023242.5204-2-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. All incoming admin and i/o commands are now only processed if their corresponding support bits are set in this log. This provides an easy way to control what commands to support and what not to depending on set CC.CSS. Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel --- hw/block/nvme-ns.h | 1 + hw/block/nvme.c | 96 +++++++++++++++++++++++++++++++++++++++---- hw/block/trace-events | 1 + include/block/nvme.h | 19 +++++++++ 4 files changed, 108 insertions(+), 9 deletions(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 83734f4606..ea8c2f785d 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -29,6 +29,7 @@ typedef struct NvmeNamespace { int32_t bootindex; int64_t size; NvmeIdNs id_ns; + const uint32_t *iocs; NvmeNamespaceParams params; } NvmeNamespace; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 9d30ca69dc..9fed061a9d 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -111,6 +111,28 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { [NVME_TIMESTAMP] = NVME_FEAT_CAP_CHANGE, }; +static const uint32_t nvme_cse_acs[256] = { + [NVME_ADM_CMD_DELETE_SQ] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_CREATE_SQ] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_DELETE_CQ] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_CREATE_CQ] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_IDENTIFY] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_ABORT] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFF_CSUPP, + [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP, +}; + +static const uint32_t nvme_cse_iocs_none[256]; + +static const uint32_t nvme_cse_iocs_nvm[256] = { + [NVME_CMD_FLUSH] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_WRITE] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_READ] = NVME_CMD_EFF_CSUPP, +}; + static void nvme_process_sq(void *opaque); static uint16_t nvme_cid(NvmeRequest *req) @@ -1032,10 +1054,6 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode)); - if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) { - return NVME_INVALID_OPCODE | NVME_DNR; - } - if (!nvme_nsid_valid(n, nsid)) { return NVME_INVALID_NSID | NVME_DNR; } @@ -1045,6 +1063,11 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } + if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) { + trace_pci_nvme_err_invalid_opc(req->cmd.opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + switch (req->cmd.opcode) { case NVME_CMD_FLUSH: return nvme_flush(n, req); @@ -1054,8 +1077,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_READ: return nvme_rw(n, req); default: - trace_pci_nvme_err_invalid_opc(req->cmd.opcode); - return NVME_INVALID_OPCODE | NVME_DNR; + assert(false); } } @@ -1291,6 +1313,37 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, + uint64_t off, NvmeRequest *req) +{ + NvmeEffectsLog log = {}; + const uint32_t *src_iocs = NULL; + uint32_t trans_len; + + if (off >= sizeof(log)) { + trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(log)); + return NVME_INVALID_FIELD | NVME_DNR; + } + + switch (NVME_CC_CSS(n->bar.cc)) { + case NVME_CC_CSS_NVM: + src_iocs = nvme_cse_iocs_nvm; + case NVME_CC_CSS_ADMIN_ONLY: + break; + } + + memcpy(log.acs, nvme_cse_acs, sizeof(nvme_cse_acs)); + + if (src_iocs) { + memcpy(log.iocs, src_iocs, sizeof(log.iocs)); + } + + trans_len = MIN(sizeof(log) - off, buf_len); + + return nvme_dma(n, ((uint8_t *)&log) + off, trans_len, + DMA_DIRECTION_FROM_DEVICE, req); +} + static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) { NvmeCmd *cmd = &req->cmd; @@ -1334,6 +1387,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) return nvme_smart_info(n, rae, len, off, req); case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); + case NVME_LOG_CMD_EFFECTS: + return nvme_cmd_effects(n, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -1920,6 +1975,11 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_admin_cmd(nvme_cid(req), nvme_sqid(req), req->cmd.opcode, nvme_adm_opc_str(req->cmd.opcode)); + if (!(nvme_cse_acs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) { + trace_pci_nvme_err_invalid_admin_opc(req->cmd.opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + switch (req->cmd.opcode) { case NVME_ADM_CMD_DELETE_SQ: return nvme_del_sq(n, req); @@ -1942,8 +2002,7 @@ static uint16_t nvme_admin_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_ADM_CMD_ASYNC_EV_REQ: return nvme_aer(n, req); default: - trace_pci_nvme_err_invalid_admin_opc(req->cmd.opcode); - return NVME_INVALID_OPCODE | NVME_DNR; + assert(false); } } @@ -2031,6 +2090,23 @@ static void nvme_clear_ctrl(NvmeCtrl *n) n->bar.cc = 0; } +static void nvme_select_ns_iocs(NvmeCtrl *n) +{ + NvmeNamespace *ns; + int i; + + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + ns->iocs = nvme_cse_iocs_none; + if (NVME_CC_CSS(n->bar.cc) != NVME_CC_CSS_ADMIN_ONLY) { + ns->iocs = nvme_cse_iocs_nvm; + } + } +} + static int nvme_start_ctrl(NvmeCtrl *n) { uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12; @@ -2129,6 +2205,8 @@ static int nvme_start_ctrl(NvmeCtrl *n) QTAILQ_INIT(&n->aer_queue); + nvme_select_ns_iocs(n); + return 0; } @@ -2737,7 +2815,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) id->acl = 3; id->aerl = n->params.aerl; id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO; - id->lpa = NVME_LPA_NS_SMART | NVME_LPA_EXTENDED; + id->lpa = NVME_LPA_NS_SMART | NVME_LPA_CSE | NVME_LPA_EXTENDED; /* recommended default value (~70 C) */ id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING); diff --git a/hw/block/trace-events b/hw/block/trace-events index fac5995d94..3ea6c29926 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -104,6 +104,7 @@ pci_nvme_err_invalid_prp(void) "invalid PRP" pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" +pci_nvme_err_invalid_log_page_offset(uint64_t ofs, uint64_t size) "must be <= %"PRIu64", got %"PRIu64"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16"" diff --git a/include/block/nvme.h b/include/block/nvme.h index 6de2d5aa75..4779495b7d 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -744,10 +744,27 @@ enum NvmeSmartWarn { NVME_SMART_FAILED_VOLATILE_MEDIA = 1 << 4, }; +typedef struct NvmeEffectsLog { + uint32_t acs[256]; + uint32_t iocs[256]; + uint8_t resv[2048]; +} NvmeEffectsLog; + +enum { + NVME_CMD_EFF_CSUPP = 1 << 0, + NVME_CMD_EFF_LBCC = 1 << 1, + NVME_CMD_EFF_NCC = 1 << 2, + NVME_CMD_EFF_NIC = 1 << 3, + NVME_CMD_EFF_CCC = 1 << 4, + NVME_CMD_EFF_CSE_MASK = 3 << 16, + NVME_CMD_EFF_UUID_SEL = 1 << 19, +}; + enum NvmeLogIdentifier { NVME_LOG_ERROR_INFO = 0x01, NVME_LOG_SMART_INFO = 0x02, NVME_LOG_FW_SLOT_INFO = 0x03, + NVME_LOG_CMD_EFFECTS = 0x05, }; typedef struct QEMU_PACKED NvmePSD { @@ -860,6 +877,7 @@ enum NvmeIdCtrlFrmw { enum NvmeIdCtrlLpa { NVME_LPA_NS_SMART = 1 << 0, + NVME_LPA_CSE = 1 << 1, NVME_LPA_EXTENDED = 1 << 2, }; @@ -1059,6 +1077,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); + QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); From patchwork Fri Oct 30 02:32:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868303 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F53361C for ; Fri, 30 Oct 2020 02:35:21 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B0C0020739 for ; Fri, 30 Oct 2020 02:35:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="rcaC8qU5" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B0C0020739 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:51770 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKG7-0000l6-KS for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:35:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41118) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDo-0005dv-N1; Thu, 29 Oct 2020 22:32:56 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10987) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDm-0006dF-M3; Thu, 29 Oct 2020 22:32:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025174; x=1635561174; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vaHsQrbzTnR4ns1fW2MjJ4Z6RM4+LbcIywqdOEdAqqg=; b=rcaC8qU5rgQoKc8QQjlEbljarTtcDTrji/LRiz5ZUdbFXDN7MjrvtwU9 zdrQ1/b5unUzbPPN4SryKpXERRi8YZ3VWBymRTDNjK7FyOPQSFmhZxuFh dzkakkJZVzfhJmLAu+9vAA4KycX63g4cKsIjb57plB2ASAXVyWUYtKnK5 Ro/xqnSX2JMopHajAXJW2FroFPOW6CSWw8Lth00+6rPXvatK2erclLZqi uhvmI+Ds7iEclOF+r5Ib+q+DD0mlsQZidfUKPvXPxWP/O99nf12Zu+v+d IeyHoWNbMQuGxxcPGDjTgOQpvd9w2HFCi8Cvk2Ndv1ks9eVU3GE1xVjBN A==; IronPort-SDR: yPX86ATQes2vRuBX6W884rG5r64OJB03Q4gcx0HlyzMYa5u9dSJHoGhoHIZBI+H34W2vdB2Evj kmWNwnWvWh3Tor2sDR16aSpdhbgpTu49LG1I4BddSjdLNMQFL/90gy5tmbZ5bpeW8yD6NNX07z /RyxbPgzkA/KeyxDqxtg4cF/ahcJIO3PcGe3KRPju8T1TzWne8TIoGv/skahJI9rVXnGX/Tczw IEEW509z2mFqKaxoJSpFxGTpXN1XntgfO8pMfD8k2I8qMZX3p4/giCZCnoIOyB5/TNzPFO923S lrA= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748054" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:32:52 +0800 IronPort-SDR: IBRKlva+jUYvwyLJplQp6CwAPBjSIVm0cARVmrAFw9Ku9ffcOgGc5TDFeHkmRGriIWv6kN9bT5 N4qW8N0ufyjkoUJoDNomxx2iEQGq1RTTU7NPplwO2RDEnXvFu2hj2eUtRs+2e04fLSUPFTWUoq 7IhibtVM6dcsiH0f0rAihBmS+i9o/DM/e0VijBTL2KzqjnXv3PQ5zXET3wfjEqxfuoMHXq8uDP TCXcm2yHrDOApFqEmWds76DMKkXYSTLeAeQVjAsGBxbQLGwoGgAAuyoizacx4LSScCBoJa2zGv 4AsFRMQlTThcd7wfuYwVLkJn Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:17:57 -0700 IronPort-SDR: Y8dCoky6bjV++GDOR8wphz+MIGlJmNVp3In5uMbrrNnJkxWJVss9wy7uWjwyY+LCu6C/lsWEKn X59fK9Z9z8oPZIjr7hl4+bBASPLCsOIsjOXlIB/FpHnQsdiVLJhv4U0hstK8s3bPXMy8rIRFZL 5H5PR+7iLoYcC4xn4NEQssj3g5f6VMG90yPZXqW1+XnJUU4sC81Q4SszJZIg6jLHF3SBUn1Yqz TBIfgic5irOtXSwp2oHKrqPvAj0gzXkkP0vSKsoOrbFvZB1cukMr9sFJlm92z0luWT64613V1y WWE= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:32:52 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 02/11] hw/block/nvme: Generate namespace UUIDs Date: Fri, 30 Oct 2020 11:32:33 +0900 Message-Id: <20201030023242.5204-3-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" In NVMe 1.4, a namespace must report an ID descriptor of UUID type if it doesn't support EUI64 or NGUID. Add a new namespace property, "uuid", that provides the user the option to either specify the UUID explicitly or have a UUID generated automatically every time a namespace is initialized. Suggested-by: Klaus Jensen Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen Reviewed-by: Keith Busch Reviewed-by: Niklas Cassel --- hw/block/nvme-ns.c | 1 + hw/block/nvme-ns.h | 1 + hw/block/nvme.c | 9 +++++---- 3 files changed, 7 insertions(+), 4 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index b69cdaf27e..de735eb9f3 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -129,6 +129,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp) static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), + DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index ea8c2f785d..a38071884a 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -21,6 +21,7 @@ typedef struct NvmeNamespaceParams { uint32_t nsid; + QemuUUID uuid; } NvmeNamespaceParams; typedef struct NvmeNamespace { diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 9fed061a9d..1ec8ccb3e6 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1572,6 +1572,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) { + NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; uint32_t nsid = le32_to_cpu(c->nsid); uint8_t list[NVME_IDENTIFY_DATA_SIZE]; @@ -1591,7 +1592,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } - if (unlikely(!nvme_ns(n, nsid))) { + ns = nvme_ns(n, nsid); + if (unlikely(!ns)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1600,12 +1602,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) /* * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data * structure, a Namespace UUID (nidt = 0x3) must be reported in the - * Namespace Identification Descriptor. Add a very basic Namespace UUID - * here. + * Namespace Identification Descriptor. Add the namespace UUID here. */ ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID; ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN; - stl_be_p(&ns_descrs->uuid.v, nsid); + memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDT_UUID_LEN); return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE, DMA_DIRECTION_FROM_DEVICE, req); From patchwork Fri Oct 30 02:32:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868311 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 822E86A2 for ; Fri, 30 Oct 2020 02:37:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 283AD206B5 for ; Fri, 30 Oct 2020 02:37:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="Iqte+LRv" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 283AD206B5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:59966 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKID-00047k-SQ for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:37:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41134) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDr-0005iO-0j; Thu, 29 Oct 2020 22:32:59 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10980) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDo-0006bi-N5; Thu, 29 Oct 2020 22:32:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025176; x=1635561176; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3Tn85ElkxCk8/vnAe2n/Qwxg6E+GzDZiN+lwwkagU1o=; b=Iqte+LRv2Tfi5aIG4AwiqLIdAEr9j6BztVKEzzLddeNlZ/6KR87UXEQk vjqLc/pqXWqoJK2ZoloJeKpERH2G3RI50mFhy7CMbmuaA0v4MAZSjr63o Lt0IRfW7TzXuH/uOIkOSo6XKzjJlBmxfePsUVzMwab64i25CjgH+PJuJP WXyFuSblEPnKFG1Bbnz+aQR6DEhn5T65s0GpHsLtJxaxJTAmGo38YJCxE 5CzPFOTFjaCxdEjllXGES49YQBsAiGhSNlfMsWrPpDIevjsE7PLtllNSP DuZh2cdnXvEmCsaWwQV6JqAqQZJhbb0IMSWC/ZGs2CcEhtvHHfxImDR75 g==; IronPort-SDR: rCDzD1GnvzHPrfumH5Uq32+5RJB7Th5XFRXJ9hXcyZtTCEQBx1dpD2RIgZp3JGUCklkZNKFqZz 7FkN6tp0S7/Oc78RPg1C13in/m15N+LzPjlL7hfSS+0zTtGy87kibVWMZ/jdK2aQWO8R/R5y8x pIw6b6Lj+MYl2AngbGU3CZAjEsKdTqs4xF3J2nQLOt6Yy/7xQBhPuUiyjk7sjzrkpkogVzMzh/ G4ZSvRNWwsKisIWBL+sutsQoYznwfm82rsMMlgC0buUn4v290nCytTcov4dxoFZQkUd5yTZgTj qYo= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748062" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:32:54 +0800 IronPort-SDR: h33n1UKrBygRz6UxHlGdnyY2zUrssGUQjE0xmWhCDw4UagXAWuXJBdYAH2/FPUtbSgfVO+u7ck i8tPxUyf1kFlgBJvMldhjUq6k3ekPDLQUDHJmejVTDoOyTvUsld/fvsHwdm95Lulf1H/5odkFx E9GBOU+ZeQVqK7TN3OjB93WAUyHiCJELxCD3VVXMf3fHL9SogJ5XFJK0eizIxdx2A4t2e3OdAc EQKd70tkKg5RXOMac2VQCzL7BOFbaerI305NwOmNHYha2PNI1fFiv3LPWyd3WEbq7SGED1awTy CwM2LwWt6Mpp15orMz4E3m2l Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:17:59 -0700 IronPort-SDR: u2u6bu9W8JsAG5ATL3QlCcKHG0ymvOGinX6Lk8J5xF0zH/FCZBpTeB+kQm1TjfOwhQR4T28BBh fUx3WwJ4t1rd5zak1SOTnxcu0pbh04nfYwEY4PAmV3MYTpjfnXrP1T1kdYB1nKrPxSWxpDfd4+ /5EEt6uA8uC317uCyXFiLV7hGKq1iybQiOy+zlZn5VIuOA+vconsrXC28NaUBn2nmtfQv//Tcm xd03QL74YPYTUcIlB//WUe83Zxb0jFPUSCruT4LocWfEFpGc1ZDV0ksngFmh55V+8GbClk0WNi ec0= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:32:54 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 03/11] hw/block/nvme: Separate read and write handlers Date: Fri, 30 Oct 2020 11:32:34 +0900 Message-Id: <20201030023242.5204-4-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" With ZNS support in place, the majority of code in nvme_rw() has become read- or write-specific. Move these parts to two separate handlers, nvme_read() and nvme_write() to make the code more readable and to remove multiple is_write checks that so far existed in the i/o path. This is a refactoring patch, no change in functionality. Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel Acked-by: Klaus Jensen --- hw/block/nvme.c | 91 ++++++++++++++++++++++++++++++------------- hw/block/trace-events | 3 +- 2 files changed, 67 insertions(+), 27 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1ec8ccb3e6..3023774484 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -963,6 +963,54 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req) return NVME_NO_COMPLETE; } +static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + uint64_t slba = le64_to_cpu(rw->slba); + uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; + uint64_t data_size = nvme_l2b(ns, nlb); + uint64_t data_offset; + BlockBackend *blk = ns->blkconf.blk; + uint16_t status; + + trace_pci_nvme_read(nvme_cid(req), nvme_nsid(ns), nlb, data_size, slba); + + status = nvme_check_mdts(n, data_size); + if (status) { + trace_pci_nvme_err_mdts(nvme_cid(req), data_size); + goto invalid; + } + + status = nvme_check_bounds(n, ns, slba, nlb); + if (status) { + trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze); + goto invalid; + } + + status = nvme_map_dptr(n, data_size, req); + if (status) { + goto invalid; + } + + data_offset = nvme_l2b(ns, slba); + + block_acct_start(blk_get_stats(blk), &req->acct, data_size, + BLOCK_ACCT_READ); + if (req->qsg.sg) { + req->aiocb = dma_blk_read(blk, &req->qsg, data_offset, + BDRV_SECTOR_SIZE, nvme_rw_cb, req); + } else { + req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0, + nvme_rw_cb, req); + } + return NVME_NO_COMPLETE; + +invalid: + block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_READ); + return status | NVME_DNR; +} + static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; @@ -988,22 +1036,19 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) return NVME_NO_COMPLETE; } -static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; - uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t slba = le64_to_cpu(rw->slba); - + uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t data_size = nvme_l2b(ns, nlb); - uint64_t data_offset = nvme_l2b(ns, slba); - enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ? - BLOCK_ACCT_WRITE : BLOCK_ACCT_READ; + uint64_t data_offset; BlockBackend *blk = ns->blkconf.blk; uint16_t status; - trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode), - nvme_nsid(ns), nlb, data_size, slba); + trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode), + nvme_nsid(ns), nlb, data_size, slba); status = nvme_check_mdts(n, data_size); if (status) { @@ -1022,29 +1067,22 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) goto invalid; } - block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct); + data_offset = nvme_l2b(ns, slba); + + block_acct_start(blk_get_stats(blk), &req->acct, data_size, + BLOCK_ACCT_WRITE); if (req->qsg.sg) { - if (acct == BLOCK_ACCT_WRITE) { - req->aiocb = dma_blk_write(blk, &req->qsg, data_offset, - BDRV_SECTOR_SIZE, nvme_rw_cb, req); - } else { - req->aiocb = dma_blk_read(blk, &req->qsg, data_offset, - BDRV_SECTOR_SIZE, nvme_rw_cb, req); - } + req->aiocb = dma_blk_write(blk, &req->qsg, data_offset, + BDRV_SECTOR_SIZE, nvme_rw_cb, req); } else { - if (acct == BLOCK_ACCT_WRITE) { - req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0, - nvme_rw_cb, req); - } else { - req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0, - nvme_rw_cb, req); - } + req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0, + nvme_rw_cb, req); } return NVME_NO_COMPLETE; invalid: - block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct); - return status; + block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_WRITE); + return status | NVME_DNR; } static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) @@ -1074,8 +1112,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_WRITE_ZEROES: return nvme_write_zeroes(n, req); case NVME_CMD_WRITE: + return nvme_write(n, req); case NVME_CMD_READ: - return nvme_rw(n, req); + return nvme_read(n, req); default: assert(false); } diff --git a/hw/block/trace-events b/hw/block/trace-events index 3ea6c29926..d81b1891d8 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -40,7 +40,8 @@ pci_nvme_map_prp(uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, pci_nvme_map_sgl(uint16_t cid, uint8_t typ, uint64_t len) "cid %"PRIu16" type 0x%"PRIx8" len %"PRIu64"" pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" -pci_nvme_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" +pci_nvme_read(uint16_t cid, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" +pci_nvme_write(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" From patchwork Fri Oct 30 02:32:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868299 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4434A61C for ; Fri, 30 Oct 2020 02:34:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E258E20739 for ; Fri, 30 Oct 2020 02:34:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="kklmOFf/" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E258E20739 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:48916 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKFD-00081X-T9 for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:34:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41172) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDt-0005pt-Qh; Thu, 29 Oct 2020 22:33:01 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10998) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDr-0006dn-7u; Thu, 29 Oct 2020 22:33:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025178; x=1635561178; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8ARzR47N/qoaJcuPun/pbphvgstWibX4rHpfNfh048o=; b=kklmOFf/IiqSzl4EtiTXKZSgiIg2aavIC1Da9S8c4zZL/f37Qd9VjIzB k/kI1TtZDAZkLzc/ROI7FkZR5kwEJ9nbCIFMwV+txr6klH2yKluKjOhYg W70jLSa5Me4I1OrBMT/3ehmFG1zejBWRioI70xf8pFyzZFwYi/gbb7pPb ulGIqAv/OJLFc09k46IzVYw/G4ZmDxmvxNocqBsjbm0Zbk6AminDGqCP8 qN0/uHh3/bmC45ehb7wtgTpQHjIC8BesfApMlqhg9+GQXJ1bOT+uUUhQV OCjHmqUARijWuTgeONj0w9RdHLxH77D4FBdt6I1HCUY3CspmiTHpCH0lW A==; IronPort-SDR: OOFIxDUeB2m07c1opadwsDBGQHoqs+f6Vo9PM91Ed+fQtWIhpHhuCNE8JeFlzoCZiNdrVtINxj DJcr/B4AzNxIQoTbHtEZerPILOjTYEWMC/Vs5ZaZwJuWtxfODXEyEUyfvZDpC8gsTCIC1Fggh2 tXubWuFcCIxGLBClUTCrAbXb4husaw/Thyy+JZ58L+MZdPYaELlQmMYzD1QqRtc+bkwupH3/B1 cHt0guZIynG+ZimXRYHhs3BsQ7qGjjOXNH0/nXLEsyA9jHzJUt+DISoI3ZvVLISD+Q7AEAJN53 szc= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748069" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:32:57 +0800 IronPort-SDR: ngMg7lPAbctA9MWiGKBqjU2Ol8TbxTO159YVS7BXAC1VZUeMsUjRWNGMwHv3DO3z/s//imYnAJ q8HwUno/ZaDEqDGb8n6vyZklp+c+NYmKBisU5rWS1PoBcMZi1x0+W3Kjw7wGJ07f5ebTTdD8su X864T9Qc9SxqfdjVN2tgRpDH6mnCeouB2SEAkZzWXcvgba8tTkzr1eO/VIgjj5PouLdQD103DA xUoG1gorPBMiYtoLtsaZJfvIo1Za0wYHNWieVEqc3C++hPnvtQWvZ6VfeKkv4LPSGJhR3YqdGD YVt7CRZo7Bue8X1UzGG2z03Y Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:01 -0700 IronPort-SDR: pig0vMkZbOh2jfFgVylg+DbvWke9vFC/ctMSWMtXxyrMZfPbGEKXm3O4oGV8TAo9AUad5DgY8H TiKxNEAk9oqnKyt5f5QNlpazOjuER0fx+8YoKW51gJq8jdsolDt23l/UG01NY+y4RDH635WPuT BdCiCyVZOEu8fgTtBS5+1gh4A1rCQeNub5vOYRvaSzo1g7eX+zH5Mgqnk+2uGCHhGpWOu63ydf ERaBth95FJ1WOyXsWOvPSTrdoSfLuxsqtvEhI/tJ8ODLW+rC7u3v8wY8CHeNc0Kt8m/dQ1CWcy GmM= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:32:56 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 04/11] hw/block/nvme: Merge nvme_write_zeroes() with nvme_write() Date: Fri, 30 Oct 2020 11:32:35 +0900 Message-Id: <20201030023242.5204-5-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" nvme_write() now handles WRITE, WRITE ZEROES and ZONE_APPEND. Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel Acked-by: Klaus Jensen --- hw/block/nvme.c | 72 +++++++++++++++++-------------------------- hw/block/trace-events | 1 - 2 files changed, 28 insertions(+), 45 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 3023774484..2a33540542 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1011,32 +1011,7 @@ invalid: return status | NVME_DNR; } -static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) -{ - NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; - NvmeNamespace *ns = req->ns; - uint64_t slba = le64_to_cpu(rw->slba); - uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; - uint64_t offset = nvme_l2b(ns, slba); - uint32_t count = nvme_l2b(ns, nlb); - uint16_t status; - - trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb); - - status = nvme_check_bounds(n, ns, slba, nlb); - if (status) { - trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze); - return status; - } - - block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0, - BLOCK_ACCT_WRITE); - req->aiocb = blk_aio_pwrite_zeroes(req->ns->blkconf.blk, offset, count, - BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req); - return NVME_NO_COMPLETE; -} - -static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool wrz) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; @@ -1050,10 +1025,12 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode), nvme_nsid(ns), nlb, data_size, slba); - status = nvme_check_mdts(n, data_size); - if (status) { - trace_pci_nvme_err_mdts(nvme_cid(req), data_size); - goto invalid; + if (!wrz) { + status = nvme_check_mdts(n, data_size); + if (status) { + trace_pci_nvme_err_mdts(nvme_cid(req), data_size); + goto invalid; + } } status = nvme_check_bounds(n, ns, slba, nlb); @@ -1062,21 +1039,28 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req) goto invalid; } - status = nvme_map_dptr(n, data_size, req); - if (status) { - goto invalid; - } - data_offset = nvme_l2b(ns, slba); - block_acct_start(blk_get_stats(blk), &req->acct, data_size, - BLOCK_ACCT_WRITE); - if (req->qsg.sg) { - req->aiocb = dma_blk_write(blk, &req->qsg, data_offset, - BDRV_SECTOR_SIZE, nvme_rw_cb, req); + if (!wrz) { + status = nvme_map_dptr(n, data_size, req); + if (status) { + goto invalid; + } + + block_acct_start(blk_get_stats(blk), &req->acct, data_size, + BLOCK_ACCT_WRITE); + if (req->qsg.sg) { + req->aiocb = dma_blk_write(blk, &req->qsg, data_offset, + BDRV_SECTOR_SIZE, nvme_rw_cb, req); + } else { + req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0, + nvme_rw_cb, req); + } } else { - req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0, - nvme_rw_cb, req); + block_acct_start(blk_get_stats(blk), &req->acct, 0, BLOCK_ACCT_WRITE); + req->aiocb = blk_aio_pwrite_zeroes(blk, data_offset, data_size, + BDRV_REQ_MAY_UNMAP, nvme_rw_cb, + req); } return NVME_NO_COMPLETE; @@ -1110,9 +1094,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_FLUSH: return nvme_flush(n, req); case NVME_CMD_WRITE_ZEROES: - return nvme_write_zeroes(n, req); + return nvme_write(n, req, true); case NVME_CMD_WRITE: - return nvme_write(n, req); + return nvme_write(n, req, false); case NVME_CMD_READ: return nvme_read(n, req); default: diff --git a/hw/block/trace-events b/hw/block/trace-events index d81b1891d8..658633177d 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -43,7 +43,6 @@ pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opna pci_nvme_read(uint16_t cid, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" pci_nvme_write(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" -pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" From patchwork Fri Oct 30 02:32:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868317 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 800111592 for ; Fri, 30 Oct 2020 02:39:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E18732076E for ; Fri, 30 Oct 2020 02:39:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="WO/rXDZf" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E18732076E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:39284 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKKV-0007Cs-RM for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:39:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41222) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDw-0005xm-KE; Thu, 29 Oct 2020 22:33:04 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10980) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDs-0006bi-VD; Thu, 29 Oct 2020 22:33:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025180; x=1635561180; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Oa5NG34I6PDC+YzEDLOvzph69s009+tEhRCxUswYRD4=; b=WO/rXDZfU9bh3SVxCarmSPvc3S/SJtPpaMMvH5F0MGlBXbO/Mga7vi79 pn82G4kCfGn9mH5i9jwshu2Mu9a4GC1rMZgUUBm8JjTbCZ20AxcZq3Pmi 3sgH35tFNUlDrXZEoq0mIwed3PagIZ0m03kfyClAv4d7cUbM0JCNLcwid Q25h7Gap4cT4QMlJRj7lCfjxmvIl4ewIqdG6Jwg2pTL8prsgaxdKCkObA LZslNEnUqOLnQsrg76B0r3N9/om/tU0ZHxf4Q5+I07pXVu+GwfbFbzTkS t36BMwazMcxHjL6pAryEgmZ2Fboe+zAJzEIncE3ZONNZ49fDDgtlee/Nh g==; IronPort-SDR: rY1Reao06LHoLi32MskAhrR/XTdp/N4hvcEjiFVbPTtqW0ap50eGOYrD1FlfIr/kuZK0K/rCh1 S33niPqRQc2xNiHIWN8iXs35YbW8b69qmsUu+VzAbjgQAdhZnTFe4SChgUXL3muwSgpXryGNH+ LFWcgSFUpyE/6TJLgIp9zuAuBe2wQP5kqBr6GzpyW+8J4nfTi7LNUYx6T7N1WBngJ6hLYnG+d5 xY/b8HSGePSeRnHdrb/W/idzTXv7VylY2YEJCtm5C3YxPLEA1UfP4tJhjwOyivIq6tHllmNfa6 4rU= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748077" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:32:59 +0800 IronPort-SDR: BV5QXzYowmKIim/26EfS+pFAga5KQ9hf33/Z+YhfXhOBtiFNLau0wi9Uhznre946A/VPwTsZ5i NYtxe4HlKs+0nVrd3gykoq+6qu/pikfelbhBWT6khd+YJaqnsYviXpViQ1OHQh1kb4F3jNpWN3 +kRpFh8JaM7LO4mncKA/ke9C88RjOWCzYmHET83Ut5qnPC3/ypU6/9eK4CRVa+yjDZYaP2wrP2 6r1LHg/uvbOWy7HuTBLNG+XBVf4nEvnYf8GlLhaZPtnqbimi1oBCFvAgppJi58VKShkFjmGeml A1QIOzInKaYOtNz3UWYBYkAp Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:03 -0700 IronPort-SDR: Ltufx4PQUVn/yAAkGn8FLRi89nEict0SxnIkPnyfxEOfq8/9Cl1cLSPjPZz+t680SgpqGsM3RN wwbLeTAdZDGB+tlhk9DznLm9Z5wDiQd/QPcG3J4ZRSwTR6Lqlf1Ktd4TDRhYKV6nGM9exrI1NT /SglbjrF4TobmyakABJ8vxcmgGTD82EhQvbRytuskI6GLbA79gjRSmyaEm8rAbdXD5wjKvTHoo vOYepErJhZMLIsn7XTjtSZqc4fdiKbkqgQhzxAbN1OtEkRnKEnykR/ow5uNll2IqT9+561Dbwy Tes= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:32:58 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 05/11] hw/block/nvme: Add support for Namespace Types Date: Fri, 30 Oct 2020 11:32:36 +0900 Message-Id: <20201030023242.5204-6-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Niklas Cassel Define the structures and constants required to implement Namespace Types support. Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel Signed-off-by: Dmitry Fomichev Reviewed-by: Keith Busch --- hw/block/nvme-ns.c | 2 + hw/block/nvme-ns.h | 1 + hw/block/nvme.c | 188 +++++++++++++++++++++++++++++++++++------- hw/block/trace-events | 7 ++ include/block/nvme.h | 66 +++++++++++---- 5 files changed, 219 insertions(+), 45 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index de735eb9f3..c0362426cc 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -41,6 +41,8 @@ static void nvme_ns_init(NvmeNamespace *ns) id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns)); + ns->csi = NVME_CSI_NVM; + /* no thin provisioning */ id_ns->ncap = id_ns->nsze; id_ns->nuse = id_ns->ncap; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index a38071884a..d795e44bab 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -31,6 +31,7 @@ typedef struct NvmeNamespace { int64_t size; NvmeIdNs id_ns; const uint32_t *iocs; + uint8_t csi; NvmeNamespaceParams params; } NvmeNamespace; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 2a33540542..d9e9fd264c 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1336,7 +1336,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, +static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len, uint64_t off, NvmeRequest *req) { NvmeEffectsLog log = {}; @@ -1351,8 +1351,15 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, switch (NVME_CC_CSS(n->bar.cc)) { case NVME_CC_CSS_NVM: src_iocs = nvme_cse_iocs_nvm; + /* fall through */ case NVME_CC_CSS_ADMIN_ONLY: break; + case NVME_CC_CSS_CSI: + switch (csi) { + case NVME_CSI_NVM: + src_iocs = nvme_cse_iocs_nvm; + break; + } } memcpy(log.acs, nvme_cse_acs, sizeof(nvme_cse_acs)); @@ -1378,6 +1385,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) uint8_t lid = dw10 & 0xff; uint8_t lsp = (dw10 >> 8) & 0xf; uint8_t rae = (dw10 >> 15) & 0x1; + uint8_t csi = le32_to_cpu(cmd->cdw14) >> 24; uint32_t numdl, numdu; uint64_t off, lpol, lpou; size_t len; @@ -1411,7 +1419,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); case NVME_LOG_CMD_EFFECTS: - return nvme_cmd_effects(n, len, off, req); + return nvme_cmd_effects(n, csi, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -1524,6 +1532,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req) return NVME_SUCCESS; } +static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req) +{ + uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {}; + + return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); +} + static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) { trace_pci_nvme_identify_ctrl(); @@ -1532,11 +1547,23 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + + trace_pci_nvme_identify_ctrl_csi(c->csi); + + if (c->csi == NVME_CSI_NVM) { + return nvme_rpt_empty_id_struct(n, req); + } + + return NVME_INVALID_FIELD | NVME_DNR; +} + static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; - NvmeIdNs *id_ns, inactive = { 0 }; uint32_t nsid = le32_to_cpu(c->nsid); trace_pci_nvme_identify_ns(nsid); @@ -1547,23 +1574,46 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) ns = nvme_ns(n, nsid); if (unlikely(!ns)) { - id_ns = &inactive; - } else { - id_ns = &ns->id_ns; + return nvme_rpt_empty_id_struct(n, req); } - return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs), + return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeNamespace *ns; + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + uint32_t nsid = le32_to_cpu(c->nsid); + + trace_pci_nvme_identify_ns_csi(nsid, c->csi); + + if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) { + return NVME_INVALID_NSID | NVME_DNR; + } + + ns = nvme_ns(n, nsid); + if (unlikely(!ns)) { + return nvme_rpt_empty_id_struct(n, req); + } + + if (c->csi == NVME_CSI_NVM) { + return nvme_rpt_empty_id_struct(n, req); + } + + return NVME_INVALID_FIELD | NVME_DNR; +} + static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) { + NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; - static const int data_len = NVME_IDENTIFY_DATA_SIZE; uint32_t min_nsid = le32_to_cpu(c->nsid); - uint32_t *list; - uint16_t ret; - int j = 0; + uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {}; + static const int data_len = sizeof(list); + uint32_t *list_ptr = (uint32_t *)list; + int i, j = 0; trace_pci_nvme_identify_nslist(min_nsid); @@ -1577,20 +1627,61 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } - list = g_malloc0(data_len); - for (int i = 1; i <= n->num_namespaces; i++) { - if (i <= min_nsid || !nvme_ns(n, i)) { + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { continue; } - list[j++] = cpu_to_le32(i); + if (ns->params.nsid <= min_nsid) { + continue; + } + list_ptr[j++] = cpu_to_le32(ns->params.nsid); if (j == data_len / sizeof(uint32_t)) { break; } } - ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE, - req); - g_free(list); - return ret; + + return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); +} + +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeNamespace *ns; + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + uint32_t min_nsid = le32_to_cpu(c->nsid); + uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {}; + static const int data_len = sizeof(list); + uint32_t *list_ptr = (uint32_t *)list; + int i, j = 0; + + trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi); + + /* + * Same as in nvme_identify_nslist(), 0xffffffff/0xfffffffe are invalid. + */ + if (min_nsid >= NVME_NSID_BROADCAST - 1) { + return NVME_INVALID_NSID | NVME_DNR; + } + + if (c->csi != NVME_CSI_NVM) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + if (ns->params.nsid <= min_nsid) { + continue; + } + list_ptr[j++] = cpu_to_le32(ns->params.nsid); + if (j == data_len / sizeof(uint32_t)) { + break; + } + } + + return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) @@ -1598,13 +1689,17 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; uint32_t nsid = le32_to_cpu(c->nsid); - uint8_t list[NVME_IDENTIFY_DATA_SIZE]; + uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {}; struct data { struct { NvmeIdNsDescr hdr; - uint8_t v[16]; + uint8_t v[NVME_NIDL_UUID]; } uuid; + struct { + NvmeIdNsDescr hdr; + uint8_t v; + } csi; }; struct data *ns_descrs = (struct data *)list; @@ -1620,19 +1715,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } - memset(list, 0x0, sizeof(list)); - /* * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data * structure, a Namespace UUID (nidt = 0x3) must be reported in the * Namespace Identification Descriptor. Add the namespace UUID here. */ ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID; - ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN; - memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDT_UUID_LEN); + ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID; + memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDL_UUID); - return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE, - DMA_DIRECTION_FROM_DEVICE, req); + ns_descrs->csi.hdr.nidt = NVME_NIDT_CSI; + ns_descrs->csi.hdr.nidl = NVME_NIDL_CSI; + ns_descrs->csi.v = ns->csi; + + return nvme_dma(n, list, sizeof(list), DMA_DIRECTION_FROM_DEVICE, req); +} + +static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req) +{ + uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {}; + static const int data_len = sizeof(list); + + trace_pci_nvme_identify_cmd_set(); + + NVME_SET_CSI(*list, NVME_CSI_NVM); + return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) @@ -1642,12 +1749,20 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) switch (le32_to_cpu(c->cns)) { case NVME_ID_CNS_NS: return nvme_identify_ns(n, req); + case NVME_ID_CNS_CS_NS: + return nvme_identify_ns_csi(n, req); case NVME_ID_CNS_CTRL: return nvme_identify_ctrl(n, req); + case NVME_ID_CNS_CS_CTRL: + return nvme_identify_ctrl_csi(n, req); case NVME_ID_CNS_NS_ACTIVE_LIST: return nvme_identify_nslist(n, req); + case NVME_ID_CNS_CS_NS_ACTIVE_LIST: + return nvme_identify_nslist_csi(n, req); case NVME_ID_CNS_NS_DESCR_LIST: return nvme_identify_ns_descr_list(n, req); + case NVME_ID_CNS_IO_COMMAND_SET: + return nvme_identify_cmd_set(n, req); default: trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns)); return NVME_INVALID_FIELD | NVME_DNR; @@ -1828,7 +1943,9 @@ defaults: if (iv == n->admin_cq.vector) { result |= NVME_INTVC_NOCOALESCING; } - + break; + case NVME_COMMAND_SET_PROFILE: + result = 0; break; default: result = nvme_feature_default[fid]; @@ -1969,6 +2086,12 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) break; case NVME_TIMESTAMP: return nvme_set_feature_timestamp(n, req); + case NVME_COMMAND_SET_PROFILE: + if (dw11 & 0x1ff) { + trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff); + return NVME_CMD_SET_CMB_REJECTED | NVME_DNR; + } + break; default: return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR; } @@ -2125,8 +2248,12 @@ static void nvme_select_ns_iocs(NvmeCtrl *n) continue; } ns->iocs = nvme_cse_iocs_none; - if (NVME_CC_CSS(n->bar.cc) != NVME_CC_CSS_ADMIN_ONLY) { - ns->iocs = nvme_cse_iocs_nvm; + switch (ns->csi) { + case NVME_CSI_NVM: + if (NVME_CC_CSS(n->bar.cc) != NVME_CC_CSS_ADMIN_ONLY) { + ns->iocs = nvme_cse_iocs_nvm; + } + break; } } } @@ -2868,6 +2995,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) NVME_CAP_SET_CQR(n->bar.cap, 1); NVME_CAP_SET_TO(n->bar.cap, 0xf); NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_NVM); + NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_CSI_SUPP); NVME_CAP_SET_CSS(n->bar.cap, NVME_CAP_CSS_ADMIN_ONLY); NVME_CAP_SET_MPSMAX(n->bar.cap, 4); diff --git a/hw/block/trace-events b/hw/block/trace-events index 658633177d..051e115f47 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -48,8 +48,12 @@ pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16"" pci_nvme_identify_ctrl(void) "identify controller" +pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8"" pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_ns_csi(uint32_t ns, uint8_t csi) "nsid=%"PRIu32", csi=0x%"PRIx8"" pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8"" +pci_nvme_identify_cmd_set(void) "identify i/o command set" pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32"" pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64"" pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32"" @@ -105,6 +109,8 @@ pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" pci_nvme_err_invalid_log_page_offset(uint64_t ofs, uint64_t size) "must be <= %"PRIu64", got %"PRIu64"" +pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled" +pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16"" @@ -161,6 +167,7 @@ pci_nvme_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for pci_nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring" pci_nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring" pci_nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring" +pci_nvme_ub_unknown_css_value(void) "unknown value in cc.css field" # xen-block.c xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u" diff --git a/include/block/nvme.h b/include/block/nvme.h index 4779495b7d..bcf695c646 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -84,6 +84,7 @@ enum NvmeCapMask { enum NvmeCapCss { NVME_CAP_CSS_NVM = 1 << 0, + NVME_CAP_CSS_CSI_SUPP = 1 << 6, NVME_CAP_CSS_ADMIN_ONLY = 1 << 7, }; @@ -117,9 +118,25 @@ enum NvmeCcMask { enum NvmeCcCss { NVME_CC_CSS_NVM = 0x0, + NVME_CC_CSS_CSI = 0x6, NVME_CC_CSS_ADMIN_ONLY = 0x7, }; +#define NVME_SET_CC_EN(cc, val) \ + (cc |= (uint32_t)((val) & CC_EN_MASK) << CC_EN_SHIFT) +#define NVME_SET_CC_CSS(cc, val) \ + (cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT) +#define NVME_SET_CC_MPS(cc, val) \ + (cc |= (uint32_t)((val) & CC_MPS_MASK) << CC_MPS_SHIFT) +#define NVME_SET_CC_AMS(cc, val) \ + (cc |= (uint32_t)((val) & CC_AMS_MASK) << CC_AMS_SHIFT) +#define NVME_SET_CC_SHN(cc, val) \ + (cc |= (uint32_t)((val) & CC_SHN_MASK) << CC_SHN_SHIFT) +#define NVME_SET_CC_IOSQES(cc, val) \ + (cc |= (uint32_t)((val) & CC_IOSQES_MASK) << CC_IOSQES_SHIFT) +#define NVME_SET_CC_IOCQES(cc, val) \ + (cc |= (uint32_t)((val) & CC_IOCQES_MASK) << CC_IOCQES_SHIFT) + enum NvmeCstsShift { CSTS_RDY_SHIFT = 0, CSTS_CFS_SHIFT = 1, @@ -534,8 +551,13 @@ typedef struct QEMU_PACKED NvmeIdentify { uint64_t rsvd2[2]; uint64_t prp1; uint64_t prp2; - uint32_t cns; - uint32_t rsvd11[5]; + uint8_t cns; + uint8_t rsvd10; + uint16_t ctrlid; + uint16_t nvmsetid; + uint8_t rsvd11; + uint8_t csi; + uint32_t rsvd12[4]; } NvmeIdentify; typedef struct QEMU_PACKED NvmeRwCmd { @@ -655,6 +677,7 @@ enum NvmeStatusCodes { NVME_MD_SGL_LEN_INVALID = 0x0010, NVME_SGL_DESCR_TYPE_INVALID = 0x0011, NVME_INVALID_USE_OF_CMB = 0x0012, + NVME_CMD_SET_CMB_REJECTED = 0x002b, NVME_LBA_RANGE = 0x0080, NVME_CAP_EXCEEDED = 0x0081, NVME_NS_NOT_READY = 0x0082, @@ -781,11 +804,15 @@ typedef struct QEMU_PACKED NvmePSD { #define NVME_IDENTIFY_DATA_SIZE 4096 -enum { - NVME_ID_CNS_NS = 0x0, - NVME_ID_CNS_CTRL = 0x1, - NVME_ID_CNS_NS_ACTIVE_LIST = 0x2, - NVME_ID_CNS_NS_DESCR_LIST = 0x3, +enum NvmeIdCns { + NVME_ID_CNS_NS = 0x00, + NVME_ID_CNS_CTRL = 0x01, + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, + NVME_ID_CNS_NS_DESCR_LIST = 0x03, + NVME_ID_CNS_CS_NS = 0x05, + NVME_ID_CNS_CS_CTRL = 0x06, + NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, + NVME_ID_CNS_IO_COMMAND_SET = 0x1c, }; typedef struct QEMU_PACKED NvmeIdCtrl { @@ -933,6 +960,7 @@ enum NvmeFeatureIds { NVME_WRITE_ATOMICITY = 0xa, NVME_ASYNCHRONOUS_EVENT_CONF = 0xb, NVME_TIMESTAMP = 0xe, + NVME_COMMAND_SET_PROFILE = 0x19, NVME_SOFTWARE_PROGRESS_MARKER = 0x80, NVME_FID_MAX = 0x100, }; @@ -1017,18 +1045,26 @@ typedef struct QEMU_PACKED NvmeIdNsDescr { uint8_t rsvd2[2]; } NvmeIdNsDescr; -enum { - NVME_NIDT_EUI64_LEN = 8, - NVME_NIDT_NGUID_LEN = 16, - NVME_NIDT_UUID_LEN = 16, +enum NvmeNsIdentifierLength { + NVME_NIDL_EUI64 = 8, + NVME_NIDL_NGUID = 16, + NVME_NIDL_UUID = 16, + NVME_NIDL_CSI = 1, }; enum NvmeNsIdentifierType { - NVME_NIDT_EUI64 = 0x1, - NVME_NIDT_NGUID = 0x2, - NVME_NIDT_UUID = 0x3, + NVME_NIDT_EUI64 = 0x01, + NVME_NIDT_NGUID = 0x02, + NVME_NIDT_UUID = 0x03, + NVME_NIDT_CSI = 0x04, }; +enum NvmeCsi { + NVME_CSI_NVM = 0x00, +}; + +#define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi))) + /*Deallocate Logical Block Features*/ #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat) ((dlfeat) & 0x10) #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat) ((dlfeat) & 0x08) @@ -1079,8 +1115,8 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); - QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); } #endif From patchwork Fri Oct 30 02:32:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868321 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2F6B26A2 for ; Fri, 30 Oct 2020 02:41:29 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B48A52076E for ; Fri, 30 Oct 2020 02:41:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="JfqwQDgl" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B48A52076E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:43934 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKM3-0000kf-O2 for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:41:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41230) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDx-00060p-LZ; Thu, 29 Oct 2020 22:33:05 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10998) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDv-0006dn-4y; Thu, 29 Oct 2020 22:33:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025182; x=1635561182; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=d314ib4hspwnmoqEeUQZZ7DJ78AQXSVepnrOba+JKFQ=; b=JfqwQDglxUdIj9rje7e6pmHpGYMuaYcdJFw32FBpL6BiMSfDKdvm3xB1 sgVWOK0OCya75c6sBE1AOg5iOtCLlcxDGlCtV//6Vw1PrVVh95ltDJyOS HTQCUq5b6gfmQOg/1Rt93CsNYfDSWwnGrrTFB5WARzE+gfHuOusCRU96J NqizK9a0441a+ayUf6qeRlv9LEklNyJA75CeREMvfg9u40OL1ec5/FnBF LP/NQ5bp2WNNa8y01QZIm8LR9J+ETDrgSWltucQNdDg3L4sSRYhRswYZo Oh1qNgvLSmpa4RLW8A9hph0u1jfqLPjK0xgmMk6WMKbVQtS+EoZnotjaa A==; IronPort-SDR: 84X6mZ9+HzDIrMGZ4D4jMEEJJ4UaeyIYaNu9YqnvCJ3qBVN5iHz1axb+nbkJFiciMTplz++HwA mWlZC+fejMAuDJMhZurWBjxhkR1JZXwdbGSqdMOcNEEScBYN6EDEMyEDFvDCzccbeZ+0SIYMWS Ab/GcS7XY36wtywJE1DUrRGDeUND5Lp18y+Rg7C5hJGs7SCgKN/DSFe6shNoJKR8ovSgDbwLE5 KFKTOjsihiuko3REOVO+E+OjgXb9T9QMpMQ9yV2db/OY4qmB1GLsGEtTHT4BLMrP95cSfVe1wt /z8= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748081" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:01 +0800 IronPort-SDR: bhNGC/PS/xfaJc/5v45xp5p61uiUhsE3YnMEN46HpWSj+aFmvxrzoWje258tG22MvwghPmFT1g aSTqEixbZ0lKbrHamrrIsRshODz76zhv1xMyt5TTmHRBsjeT/PmyiQgfa1l/PEWfICUQXyhPtK sNulnw6QpAbv91j29sMXq65ccZe1sKjsNIj5HCYZR8MOlAipGH6ywGv7XKZBXdzCmAxx5IjcGX 36Jl2fwikJ9L2oEegd0p1rbnd3tFyhGjYsYBSDXagC4F3d2Ox9PuTFHBW+t6Q1ROFuwxesPDdA 83IpU1m9V8qKHaxFoTeg+fsJ Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:06 -0700 IronPort-SDR: M6c6EHgPNdyABrgbuB6eREJm2ZIJzijR0C4rvzOxffNVyiCs9T3SGiHDKMZGCGsRMDSe62Qak9 Xm8ldDhljI/HUYxn9u0Ef4mic+WIJCKBkK+vAMNARnFdHiE2dqMvajasmf/86Crj830Av4Fw0I En8im+pFlEofGqzz1PRTzGVIJJgfUIl5FQnJSvx3wqq8vedr5nVBIoY3wgE2DBTov8lPmf1ckt o4GysTyPdly91TbWsKZFYJuje3FNB4RJNwClgt9uzBt02Rwp5cY5EwoPWk0oAWmF1ZnStzsOva siE= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:01 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 06/11] hw/block/nvme: Support allocated CNS command variants Date: Fri, 30 Oct 2020 11:32:37 +0900 Message-Id: <20201030023242.5204-7-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Niklas Cassel Many CNS commands have "allocated" command variants. These include a namespace as long as it is allocated, that is a namespace is included regardless if it is active (attached) or not. While these commands are optional (they are mandatory for controllers supporting the namespace attachment command), our QEMU implementation is more complete by actually providing support for these CNS values. However, since our QEMU model currently does not support the namespace attachment command, these new allocated CNS commands will return the same result as the active CNS command variants. In NVMe, a namespace is active if it exists and is attached to the controller. Add a new Boolean namespace flag, "attached", to provide the most basic namespace attachment support. The default value for this new flag is true. Also, implement the logic in the new CNS values to include/exclude namespaces based on this new property. The only thing missing is hooking up the actual Namespace Attachment command opcode, which will allow a user to toggle the "attached" flag per namespace. The reason for not hooking up this command completely is because the NVMe specification requires the namespace management command to be supported if the namespace attachment command is supported. Signed-off-by: Niklas Cassel Signed-off-by: Dmitry Fomichev Reviewed-by: Keith Busch --- hw/block/nvme-ns.c | 1 + hw/block/nvme-ns.h | 1 + hw/block/nvme.c | 68 ++++++++++++++++++++++++++++++++++++-------- include/block/nvme.h | 20 +++++++------ 4 files changed, 70 insertions(+), 20 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index c0362426cc..e191ef9be0 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -42,6 +42,7 @@ static void nvme_ns_init(NvmeNamespace *ns) id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns)); ns->csi = NVME_CSI_NVM; + ns->attached = true; /* no thin provisioning */ id_ns->ncap = id_ns->nsze; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index d795e44bab..2d9cd29d07 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -31,6 +31,7 @@ typedef struct NvmeNamespace { int64_t size; NvmeIdNs id_ns; const uint32_t *iocs; + bool attached; uint8_t csi; NvmeNamespaceParams params; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index d9e9fd264c..7b3eafad03 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1084,6 +1084,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) if (unlikely(!req->ns)) { return NVME_INVALID_FIELD | NVME_DNR; } + if (!req->ns->attached) { + return NVME_INVALID_FIELD | NVME_DNR; + } if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) { trace_pci_nvme_err_invalid_opc(req->cmd.opcode); @@ -1245,6 +1248,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, uint32_t trans_len; NvmeNamespace *ns; time_t current_ms; + int i; if (off >= sizeof(smart)) { return NVME_INVALID_FIELD | NVME_DNR; @@ -1255,15 +1259,18 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, if (!ns) { return NVME_INVALID_NSID | NVME_DNR; } - nvme_set_blk_stats(ns, &stats); + if (ns->attached) { + nvme_set_blk_stats(ns, &stats); + } } else { - int i; - for (i = 1; i <= n->num_namespaces; i++) { ns = nvme_ns(n, i); if (!ns) { continue; } + if (!ns->attached) { + continue; + } nvme_set_blk_stats(ns, &stats); } } @@ -1560,7 +1567,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } -static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1577,11 +1585,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) return nvme_rpt_empty_id_struct(n, req); } + if (only_active && !ns->attached) { + return nvme_rpt_empty_id_struct(n, req); + } + return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1598,6 +1611,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) return nvme_rpt_empty_id_struct(n, req); } + if (only_active && !ns->attached) { + return nvme_rpt_empty_id_struct(n, req); + } + if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, req); } @@ -1605,7 +1622,8 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } -static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1635,6 +1653,9 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) if (ns->params.nsid <= min_nsid) { continue; } + if (only_active && !ns->attached) { + continue; + } list_ptr[j++] = cpu_to_le32(ns->params.nsid); if (j == data_len / sizeof(uint32_t)) { break; @@ -1644,7 +1665,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1675,6 +1697,9 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) if (ns->params.nsid <= min_nsid) { continue; } + if (only_active && !ns->attached) { + continue; + } list_ptr[j++] = cpu_to_le32(ns->params.nsid); if (j == data_len / sizeof(uint32_t)) { break; @@ -1748,17 +1773,25 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) switch (le32_to_cpu(c->cns)) { case NVME_ID_CNS_NS: - return nvme_identify_ns(n, req); + return nvme_identify_ns(n, req, true); case NVME_ID_CNS_CS_NS: - return nvme_identify_ns_csi(n, req); + return nvme_identify_ns_csi(n, req, true); + case NVME_ID_CNS_NS_PRESENT: + return nvme_identify_ns(n, req, false); + case NVME_ID_CNS_CS_NS_PRESENT: + return nvme_identify_ns_csi(n, req, false); case NVME_ID_CNS_CTRL: return nvme_identify_ctrl(n, req); case NVME_ID_CNS_CS_CTRL: return nvme_identify_ctrl_csi(n, req); case NVME_ID_CNS_NS_ACTIVE_LIST: - return nvme_identify_nslist(n, req); + return nvme_identify_nslist(n, req, true); case NVME_ID_CNS_CS_NS_ACTIVE_LIST: - return nvme_identify_nslist_csi(n, req); + return nvme_identify_nslist_csi(n, req, true); + case NVME_ID_CNS_NS_PRESENT_LIST: + return nvme_identify_nslist(n, req, false); + case NVME_ID_CNS_CS_NS_PRESENT_LIST: + return nvme_identify_nslist_csi(n, req, false); case NVME_ID_CNS_NS_DESCR_LIST: return nvme_identify_ns_descr_list(n, req); case NVME_ID_CNS_IO_COMMAND_SET: @@ -1831,6 +1864,7 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) { + NvmeNamespace *ns; NvmeCmd *cmd = &req->cmd; uint32_t dw10 = le32_to_cpu(cmd->cdw10); uint32_t dw11 = le32_to_cpu(cmd->cdw11); @@ -1862,7 +1896,11 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } - if (!nvme_ns(n, nsid)) { + ns = nvme_ns(n, nsid); + if (!ns) { + return NVME_INVALID_FIELD | NVME_DNR; + } + if (!ns->attached) { return NVME_INVALID_FIELD | NVME_DNR; } } @@ -2004,6 +2042,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) if (unlikely(!ns)) { return NVME_INVALID_FIELD | NVME_DNR; } + if (!ns->attached) { + return NVME_INVALID_FIELD | NVME_DNR; + } } } else if (nsid && nsid != NVME_NSID_BROADCAST) { if (!nvme_nsid_valid(n, nsid)) { @@ -2051,6 +2092,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) if (!ns) { continue; } + if (!ns->attached) { + continue; + } if (!(dw11 & 0x1) && blk_enable_write_cache(ns->blkconf.blk)) { blk_flush(ns->blkconf.blk); diff --git a/include/block/nvme.h b/include/block/nvme.h index bcf695c646..3653b4aefc 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -805,14 +805,18 @@ typedef struct QEMU_PACKED NvmePSD { #define NVME_IDENTIFY_DATA_SIZE 4096 enum NvmeIdCns { - NVME_ID_CNS_NS = 0x00, - NVME_ID_CNS_CTRL = 0x01, - NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, - NVME_ID_CNS_NS_DESCR_LIST = 0x03, - NVME_ID_CNS_CS_NS = 0x05, - NVME_ID_CNS_CS_CTRL = 0x06, - NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, - NVME_ID_CNS_IO_COMMAND_SET = 0x1c, + NVME_ID_CNS_NS = 0x00, + NVME_ID_CNS_CTRL = 0x01, + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, + NVME_ID_CNS_NS_DESCR_LIST = 0x03, + NVME_ID_CNS_CS_NS = 0x05, + NVME_ID_CNS_CS_CTRL = 0x06, + NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, + NVME_ID_CNS_NS_PRESENT_LIST = 0x10, + NVME_ID_CNS_NS_PRESENT = 0x11, + NVME_ID_CNS_CS_NS_PRESENT_LIST = 0x1a, + NVME_ID_CNS_CS_NS_PRESENT = 0x1b, + NVME_ID_CNS_IO_COMMAND_SET = 0x1c, }; typedef struct QEMU_PACKED NvmeIdCtrl { From patchwork Fri Oct 30 02:32:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868305 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D82E61C for ; Fri, 30 Oct 2020 02:36:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EC4F120739 for ; Fri, 30 Oct 2020 02:36:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="GYj+hdJ0" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC4F120739 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:57640 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKHS-0003CI-Pu for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:36:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41268) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE2-0006CJ-El; Thu, 29 Oct 2020 22:33:10 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10980) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDx-0006bi-LA; Thu, 29 Oct 2020 22:33:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025185; x=1635561185; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BEC5MqIBZllbG4GdXksNrI/GzE16zo2NDNlpfssRlB4=; b=GYj+hdJ020WLp5ljHMqpmZSefEpfamZe/VkNzk1/KLnzPi2LBy7RI2RT lWpVJvwb9l47Enk0gh3s+U6CWVLuO4dhDMlIORhYW4NUiXleWOCJ0zue1 g3upWB0wI8Iz6jTWnn0UY1J9CvoF9VA5JgiIllPHdesccn6gOGbWURp0j pifuwAxAWjYkn60y90VbnU3DHhXoCmAqqf6clobvdnfKGpviqa5/TvMG9 rniUxxjp/D3ivh4iH0FnF6T/p3DdydQQiKcKN7Q4EC/lgMfaND6rQAT5O WN0Or0j8MyC79cUfvcrU9xeHbxu424ZOZ0xss/iebbmjs5ySS0MDCC9WZ g==; IronPort-SDR: QyZba68Xb8KGXxz/vdip0ZphFLvEOFJyXMfU+4OrmoiUVMtQ6MmcADBFJJY/hF3tyH2mkocjPN S2hEVI5apc3RVten0cEv+vriw2X1kJKnV1Eg4M+w8R9ZrCZXbzOwesTxqXz8aa+MF5Hf0N8+hq Dt2zi6nUa9/7Ua6WKvxcIuAWj74t5HfojX25aqgs3ZkfLJAqQL29rsww4NbXzdz9zvX34snq7D NTvvqDVu3SvBa2xvKIf8aK5RVMnT3RVD+vdfhCgUDbeUXRH+IdDtJQHXaqUUFAlKd/uOaTFdOI Co4= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748084" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:03 +0800 IronPort-SDR: esplyxTfwucJqABax7LUlZGOZTxfZPInkep7JQpPTzPMAqyEU8oeWdU5IsWT1gfKuNEnEPoYFg wFShM00PUYZvgYFGq6SLAsR1bIBFwsZjEspV2kARGAKNe0E++ZngPguNxoGVTQyhj/5UEDlP4G 9+hTFPvesORpiQ6s98xYgqLvjZdHJDyumj3y6to9ATPicMUHTmKlg8u1kiToz4uRBVr+h+nbvL oFNNYA5GGGf7qwW8bUE/RfupvxR9WpC3Xyw0C6y4FXtFYlil0oadi856x3kmPCGjna8ICFpCNS 8RHrdng9a5PeOoiJh7xhY/fR Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:08 -0700 IronPort-SDR: E4nHlLGmVYR807jJRyi68UAFIjKkkXTbFjdzQcQqkhBOnmVKg+Zs6affpXCVSOm9y0wc3xnc18 4oGf/Fob/+ahXa9gCi/3lcxcxwHrenM/HD4F4uq/6hHxCcB7sYa4ycHobwJBbqhJ2CpnsXsEwC 8spMFvMyazZm6UfFnQUdBWfTHZl4TNtvLXwsdJZbc4yFm3+DfuxKYUTOaS587AO/N+zXY6BC7o w6KORoufVpxzYAC9RNOMhkt9rddSMmGdIMbKUaUuOepFbY4LVtr3c1kC2rvRICp1DZlqdBKAaA 09w= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:03 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 07/11] hw/block/nvme: Support Zoned Namespace Command Set Date: Fri, 30 Oct 2020 11:32:38 +0900 Message-Id: <20201030023242.5204-8-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. Define trace events where needed in newly introduced code. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. Subsequent commits in this series add ZDE support and checks for active and open zone limits. Signed-off-by: Niklas Cassel Signed-off-by: Hans Holmberg Signed-off-by: Ajay Joshi Signed-off-by: Chaitanya Kulkarni Signed-off-by: Matias Bjorling Signed-off-by: Aravind Ramesh Signed-off-by: Shin'ichiro Kawasaki Signed-off-by: Adam Manzanares Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel --- block/nvme.c | 2 +- hw/block/nvme-ns.c | 173 ++++++++ hw/block/nvme-ns.h | 54 +++ hw/block/nvme.c | 977 +++++++++++++++++++++++++++++++++++++++++- hw/block/nvme.h | 8 + hw/block/trace-events | 18 +- include/block/nvme.h | 113 ++++- 7 files changed, 1322 insertions(+), 23 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index 05485fdd11..7a513c9a17 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -333,7 +333,7 @@ static inline int nvme_translate_error(const NvmeCqe *c) { uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF; if (status) { - trace_nvme_error(le32_to_cpu(c->result), + trace_nvme_error(le32_to_cpu(c->result32), le16_to_cpu(c->sq_head), le16_to_cpu(c->sq_id), le16_to_cpu(c->cid), diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index e191ef9be0..e6db7f7d3b 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -25,6 +25,7 @@ #include "hw/qdev-properties.h" #include "hw/qdev-core.h" +#include "trace.h" #include "nvme.h" #include "nvme-ns.h" @@ -77,6 +78,151 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) return 0; } +static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) +{ + uint64_t zone_size, zone_cap; + uint32_t nz, lbasz = ns->blkconf.logical_block_size; + + if (ns->params.zone_size_bs) { + zone_size = ns->params.zone_size_bs; + } else { + zone_size = NVME_DEFAULT_ZONE_SIZE; + } + if (ns->params.zone_cap_bs) { + zone_cap = ns->params.zone_cap_bs; + } else { + zone_cap = zone_size; + } + if (zone_cap > zone_size) { + error_setg(errp, "zone capacity %luB exceeds zone size %luB", + zone_cap, zone_size); + return -1; + } + if (zone_size < lbasz) { + error_setg(errp, "zone size %luB too small, must be at least %uB", + zone_size, lbasz); + return -1; + } + if (zone_cap < lbasz) { + error_setg(errp, "zone capacity %luB too small, must be at least %uB", + zone_cap, lbasz); + return -1; + } + ns->zone_size = zone_size / lbasz; + ns->zone_capacity = zone_cap / lbasz; + + nz = DIV_ROUND_UP(ns->size / lbasz, ns->zone_size); + ns->num_zones = nz; + ns->zone_array_size = sizeof(NvmeZone) * nz; + ns->zone_size_log2 = 0; + if (is_power_of_2(ns->zone_size)) { + ns->zone_size_log2 = 63 - clz64(ns->zone_size); + } + + return 0; +} + +static void nvme_init_zone_state(NvmeNamespace *ns) +{ + uint64_t start = 0, zone_size = ns->zone_size; + uint64_t capacity = ns->num_zones * zone_size; + NvmeZone *zone; + int i; + + ns->zone_array = g_malloc0(ns->zone_array_size); + + QTAILQ_INIT(&ns->exp_open_zones); + QTAILQ_INIT(&ns->imp_open_zones); + QTAILQ_INIT(&ns->closed_zones); + QTAILQ_INIT(&ns->full_zones); + + zone = ns->zone_array; + for (i = 0; i < ns->num_zones; i++, zone++) { + if (start + zone_size > capacity) { + zone_size = capacity - start; + } + zone->d.zt = NVME_ZONE_TYPE_SEQ_WRITE; + nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY); + zone->d.za = 0; + zone->d.zcap = ns->zone_capacity; + zone->d.zslba = start; + zone->d.wp = start; + zone->w_ptr = start; + start += zone_size; + } +} + +static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, + Error **errp) +{ + NvmeIdNsZoned *id_ns_z; + + if (nvme_calc_zone_geometry(ns, errp) != 0) { + return -1; + } + + nvme_init_zone_state(ns); + + id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned)); + + /* MAR/MOR are zeroes-based, 0xffffffff means no limit */ + id_ns_z->mar = 0xffffffff; + id_ns_z->mor = 0xffffffff; + id_ns_z->zoc = 0; + id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00; + + id_ns_z->lbafe[lba_index].zsze = cpu_to_le64(ns->zone_size); + id_ns_z->lbafe[lba_index].zdes = 0; + + ns->csi = NVME_CSI_ZONED; + ns->id_ns.nsze = cpu_to_le64(ns->zone_size * ns->num_zones); + ns->id_ns.ncap = cpu_to_le64(ns->zone_capacity * ns->num_zones); + ns->id_ns.nuse = ns->id_ns.ncap; + + ns->id_ns_zoned = id_ns_z; + + return 0; +} + +static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone) +{ + uint8_t state; + + zone->w_ptr = zone->d.wp; + state = nvme_get_zone_state(zone); + if (zone->d.wp != zone->d.zslba) { + if (state != NVME_ZONE_STATE_CLOSED) { + trace_pci_nvme_clear_ns_close(state, zone->d.zslba); + nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED); + } + QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry); + } else { + trace_pci_nvme_clear_ns_reset(state, zone->d.zslba); + nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY); + } +} + +/* + * Close all the zones that are currently open. + */ +static void nvme_zoned_ns_shutdown(NvmeNamespace *ns) +{ + NvmeZone *zone, *next; + + QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) { + QTAILQ_REMOVE(&ns->closed_zones, zone, entry); + nvme_clear_zone(ns, zone); + } + QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) { + QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry); + nvme_clear_zone(ns, zone); + } + QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) { + QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry); + nvme_clear_zone(ns, zone); + } +} + static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) { if (!ns->blkconf.blk) { @@ -98,6 +244,12 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) } nvme_ns_init(ns); + if (ns->params.zoned) { + if (nvme_zoned_init_ns(n, ns, 0, errp) != 0) { + return -1; + } + } + if (nvme_register_namespace(n, ns, errp)) { return -1; } @@ -115,6 +267,21 @@ void nvme_ns_flush(NvmeNamespace *ns) blk_flush(ns->blkconf.blk); } +void nvme_ns_shutdown(NvmeNamespace *ns) +{ + if (ns->params.zoned) { + nvme_zoned_ns_shutdown(ns); + } +} + +void nvme_ns_cleanup(NvmeNamespace *ns) +{ + if (ns->params.zoned) { + g_free(ns->id_ns_zoned); + g_free(ns->zone_array); + } +} + static void nvme_ns_realize(DeviceState *dev, Error **errp) { NvmeNamespace *ns = NVME_NS(dev); @@ -133,6 +300,12 @@ static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid), + DEFINE_PROP_BOOL("zoned", NvmeNamespace, params.zoned, false), + DEFINE_PROP_SIZE("zoned.zsze", NvmeNamespace, params.zone_size_bs, + NVME_DEFAULT_ZONE_SIZE), + DEFINE_PROP_SIZE("zoned.zcap", NvmeNamespace, params.zone_cap_bs, 0), + DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace, + params.cross_zone_read, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 2d9cd29d07..d2631ff5a3 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -19,9 +19,20 @@ #define NVME_NS(obj) \ OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS) +typedef struct NvmeZone { + NvmeZoneDescr d; + uint64_t w_ptr; + QTAILQ_ENTRY(NvmeZone) entry; +} NvmeZone; + typedef struct NvmeNamespaceParams { uint32_t nsid; QemuUUID uuid; + + bool zoned; + bool cross_zone_read; + uint64_t zone_size_bs; + uint64_t zone_cap_bs; } NvmeNamespaceParams; typedef struct NvmeNamespace { @@ -34,6 +45,18 @@ typedef struct NvmeNamespace { bool attached; uint8_t csi; + NvmeIdNsZoned *id_ns_zoned; + NvmeZone *zone_array; + QTAILQ_HEAD(, NvmeZone) exp_open_zones; + QTAILQ_HEAD(, NvmeZone) imp_open_zones; + QTAILQ_HEAD(, NvmeZone) closed_zones; + QTAILQ_HEAD(, NvmeZone) full_zones; + uint32_t num_zones; + uint64_t zone_size; + uint64_t zone_capacity; + uint64_t zone_array_size; + uint32_t zone_size_log2; + NvmeNamespaceParams params; } NvmeNamespace; @@ -71,8 +94,39 @@ static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba) typedef struct NvmeCtrl NvmeCtrl; +static inline uint8_t nvme_get_zone_state(NvmeZone *zone) +{ + return zone->d.zs >> 4; +} + +static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state) +{ + zone->d.zs = state << 4; +} + +static inline uint64_t nvme_zone_rd_boundary(NvmeNamespace *ns, NvmeZone *zone) +{ + return zone->d.zslba + ns->zone_size; +} + +static inline uint64_t nvme_zone_wr_boundary(NvmeZone *zone) +{ + return zone->d.zslba + zone->d.zcap; +} + +static inline bool nvme_wp_is_valid(NvmeZone *zone) +{ + uint8_t st = nvme_get_zone_state(zone); + + return st != NVME_ZONE_STATE_FULL && + st != NVME_ZONE_STATE_READ_ONLY && + st != NVME_ZONE_STATE_OFFLINE; +} + int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_flush(NvmeNamespace *ns); +void nvme_ns_shutdown(NvmeNamespace *ns); +void nvme_ns_cleanup(NvmeNamespace *ns); #endif /* NVME_NS_H */ diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 7b3eafad03..0326e1611c 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -133,6 +133,16 @@ static const uint32_t nvme_cse_iocs_nvm[256] = { [NVME_CMD_READ] = NVME_CMD_EFF_CSUPP, }; +static const uint32_t nvme_cse_iocs_zoned[256] = { + [NVME_CMD_FLUSH] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_WRITE] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_READ] = NVME_CMD_EFF_CSUPP, + [NVME_CMD_ZONE_APPEND] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC, + [NVME_CMD_ZONE_MGMT_SEND] = NVME_CMD_EFF_CSUPP, + [NVME_CMD_ZONE_MGMT_RECV] = NVME_CMD_EFF_CSUPP, +}; + static void nvme_process_sq(void *opaque); static uint16_t nvme_cid(NvmeRequest *req) @@ -149,6 +159,46 @@ static uint16_t nvme_sqid(NvmeRequest *req) return le16_to_cpu(req->sq->sqid); } +static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + if (QTAILQ_IN_USE(zone, entry)) { + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry); + break; + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry); + break; + case NVME_ZONE_STATE_CLOSED: + QTAILQ_REMOVE(&ns->closed_zones, zone, entry); + break; + case NVME_ZONE_STATE_FULL: + QTAILQ_REMOVE(&ns->full_zones, zone, entry); + } + } + + nvme_set_zone_state(zone, state); + + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + QTAILQ_INSERT_TAIL(&ns->exp_open_zones, zone, entry); + break; + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + QTAILQ_INSERT_TAIL(&ns->imp_open_zones, zone, entry); + break; + case NVME_ZONE_STATE_CLOSED: + QTAILQ_INSERT_TAIL(&ns->closed_zones, zone, entry); + break; + case NVME_ZONE_STATE_FULL: + QTAILQ_INSERT_TAIL(&ns->full_zones, zone, entry); + case NVME_ZONE_STATE_READ_ONLY: + break; + default: + zone->d.za = 0; + } +} + static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr) { hwaddr low = n->ctrl_mem.addr; @@ -841,7 +891,7 @@ static void nvme_process_aers(void *opaque) req = n->aer_reqs[n->outstanding_aers]; - result = (NvmeAerResult *) &req->cqe.result; + result = (NvmeAerResult *) &req->cqe.result32; result->event_type = event->result.event_type; result->event_info = event->result.event_info; result->log_page = event->result.log_page; @@ -910,6 +960,318 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, return NVME_SUCCESS; } +static void nvme_fill_read_data(NvmeRequest *req, uint64_t offset, + uint32_t max_len) +{ + QEMUSGList *qsg = &req->qsg; + QEMUIOVector *iov = &req->iov; + ScatterGatherEntry *entry; + uint32_t len, ent_len; + + if (qsg->nsg > 0) { + entry = qsg->sg; + len = qsg->size; + if (max_len) { + len = MIN(len, max_len); + } + for (; len > 0; len -= ent_len) { + ent_len = MIN(len, entry->len); + if (offset > ent_len) { + offset -= ent_len; + } else if (offset != 0) { + dma_memory_set(qsg->as, entry->base + offset, + 0, ent_len - offset); + offset = 0; + } else { + dma_memory_set(qsg->as, entry->base, 0, ent_len); + } + entry++; + } + } else if (iov->iov) { + len = iov_size(iov->iov, iov->niov); + if (max_len) { + len = MIN(len, max_len); + } + qemu_iovec_memset(iov, offset, 0, len - offset); + } +} + +static inline uint32_t nvme_zone_idx(NvmeNamespace *ns, uint64_t slba) +{ + return ns->zone_size_log2 > 0 ? slba >> ns->zone_size_log2 : + slba / ns->zone_size; +} + +static inline NvmeZone *nvme_get_zone_by_slba(NvmeNamespace *ns, uint64_t slba) +{ + uint32_t zone_idx = nvme_zone_idx(ns, slba); + + assert(zone_idx < ns->num_zones); + return &ns->zone_array[zone_idx]; +} + +static uint16_t nvme_zone_state_ok_to_write(NvmeZone *zone) +{ + uint16_t status; + + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + status = NVME_SUCCESS; + break; + case NVME_ZONE_STATE_FULL: + status = NVME_ZONE_FULL; + break; + case NVME_ZONE_STATE_OFFLINE: + status = NVME_ZONE_OFFLINE; + break; + case NVME_ZONE_STATE_READ_ONLY: + status = NVME_ZONE_READ_ONLY; + break; + default: + assert(false); + } + + return status; +} + +static uint16_t nvme_check_zone_write(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint64_t slba, + uint32_t nlb, bool append) +{ + uint16_t status; + + if (unlikely((slba + nlb) > nvme_zone_wr_boundary(zone))) { + status = NVME_ZONE_BOUNDARY_ERROR; + } else { + status = nvme_zone_state_ok_to_write(zone); + } + + if (status != NVME_SUCCESS) { + trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status); + } else { + assert(nvme_wp_is_valid(zone)); + if (append) { + if (unlikely(slba != zone->d.zslba)) { + trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba); + status = NVME_ZONE_INVALID_WRITE; + } + if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) { + trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl); + status = NVME_INVALID_FIELD; + } + } else if (unlikely(slba != zone->w_ptr)) { + trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba, + zone->w_ptr); + status = NVME_ZONE_INVALID_WRITE; + } + } + + return status; +} + +static uint16_t nvme_zone_state_ok_to_read(NvmeZone *zone) +{ + uint16_t status; + + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_FULL: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_READ_ONLY: + status = NVME_SUCCESS; + break; + case NVME_ZONE_STATE_OFFLINE: + status = NVME_ZONE_OFFLINE | NVME_DNR; + break; + default: + assert(false); + } + + return status; +} + +typedef struct NvmeReadFillCtx { + uint64_t pre_rd_fill_slba; + uint64_t read_slba; + uint64_t post_rd_fill_slba; + + uint32_t pre_rd_fill_nlb; + uint32_t read_nlb; + uint32_t post_rd_fill_nlb; +} NvmeReadFillCtx; + +static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, + uint32_t nlb, NvmeReadFillCtx *rfc) +{ + NvmeZone *zone = nvme_get_zone_by_slba(ns, slba); + NvmeZone *next_zone; + uint64_t bndry = nvme_zone_rd_boundary(ns, zone); + uint64_t end = slba + nlb, wp1, wp2; + uint16_t status; + + rfc->read_slba = slba; + rfc->read_nlb = nlb; + + status = nvme_zone_state_ok_to_read(zone); + if (status != NVME_SUCCESS) { + ; + } else if (likely(end <= bndry)) { + if (end > zone->w_ptr) { + wp1 = zone->w_ptr; + if (slba >= wp1) { + /* No i/o necessary, just fill */ + rfc->pre_rd_fill_slba = slba; + rfc->pre_rd_fill_nlb = nlb; + rfc->read_nlb = 0; + } else { + rfc->read_nlb = wp1 - slba; + rfc->post_rd_fill_slba = wp1; + rfc->post_rd_fill_nlb = nlb - rfc->read_nlb; + } + } + } else if (!ns->params.cross_zone_read) { + status = NVME_ZONE_BOUNDARY_ERROR; + } else { + /* + * Read across zone boundary - look at the next zone. + * Earlier bounds checks ensure that the current zone + * is not the last one. + */ + next_zone = zone + 1; + status = nvme_zone_state_ok_to_read(next_zone); + if (status != NVME_SUCCESS) { + ; + } else if (end > nvme_zone_rd_boundary(ns, next_zone)) { + /* + * As zone size is much larger than a typical maximum + * i/o size in real hardware, only allow the i/o range + * to span no more than one pair of zones. + */ + status = NVME_ZONE_BOUNDARY_ERROR; + } else { + wp1 = zone->w_ptr; + wp2 = next_zone->w_ptr; + if (wp2 == bndry) { + if (slba >= wp1) { + /* Again, no i/o necessary, just fill */ + rfc->pre_rd_fill_slba = slba; + rfc->pre_rd_fill_nlb = nlb; + rfc->read_nlb = 0; + } else { + rfc->read_nlb = wp1 - slba; + rfc->post_rd_fill_slba = wp1; + rfc->post_rd_fill_nlb = nlb - rfc->read_nlb; + } + } else if (slba < wp1) { + if (end > wp2) { + if (wp1 == bndry) { + rfc->post_rd_fill_slba = wp2; + rfc->post_rd_fill_nlb = end - wp2; + rfc->read_nlb = wp2 - slba; + } else { + rfc->pre_rd_fill_slba = wp2; + rfc->pre_rd_fill_nlb = end - wp2; + rfc->read_nlb = wp2 - slba; + rfc->post_rd_fill_slba = wp1; + rfc->post_rd_fill_nlb = bndry - wp1; + } + } else { + rfc->post_rd_fill_slba = wp1; + rfc->post_rd_fill_nlb = bndry - wp1; + } + } else { + if (end > wp2) { + rfc->pre_rd_fill_slba = slba; + rfc->pre_rd_fill_nlb = end - slba; + rfc->read_slba = bndry; + rfc->read_nlb = wp2 - bndry; + } else { + rfc->read_slba = bndry; + rfc->read_nlb = end - bndry; + rfc->post_rd_fill_slba = slba; + rfc->post_rd_fill_nlb = bndry - slba; + } + } + } + } + + return status; +} + +static bool nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req, + bool failed) +{ + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + NvmeZone *zone; + uint64_t slba, start_wp = req->cqe.result64; + uint32_t nlb; + + if (rw->opcode != NVME_CMD_WRITE && + rw->opcode != NVME_CMD_ZONE_APPEND && + rw->opcode != NVME_CMD_WRITE_ZEROES) { + return false; + } + + slba = le64_to_cpu(rw->slba); + nlb = le16_to_cpu(rw->nlb) + 1; + zone = nvme_get_zone_by_slba(ns, slba); + + if (!failed && zone->w_ptr < start_wp + nlb) { + /* + * A preceding queued write to the zone has failed, + * now this write is not at the WP, fail it too. + */ + failed = true; + } + + if (failed) { + req->cqe.result64 = 0; + } else if (zone->w_ptr == nvme_zone_wr_boundary(zone)) { + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_EMPTY: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL); + /* fall through */ + case NVME_ZONE_STATE_FULL: + break; + default: + assert(false); + } + zone->d.wp = zone->w_ptr; + } else { + zone->d.wp += nlb; + } + + return failed; +} + +static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone, + uint32_t nlb) +{ + uint64_t result = zone->w_ptr; + uint8_t zs; + + zone->w_ptr += nlb; + + if (zone->w_ptr < nvme_zone_wr_boundary(zone)) { + zs = nvme_get_zone_state(zone); + switch (zs) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_CLOSED: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN); + } + } + + return result; +} + static void nvme_rw_cb(void *opaque, int ret) { NvmeRequest *req = opaque; @@ -924,10 +1286,26 @@ static void nvme_rw_cb(void *opaque, int ret) trace_pci_nvme_rw_cb(nvme_cid(req), blk_name(blk)); if (!ret) { - block_acct_done(stats, acct); + if (ns->params.zoned) { + if (nvme_finalize_zoned_write(ns, req, false)) { + ret = EIO; + block_acct_failed(stats, acct); + req->status = NVME_ZONE_INVALID_WRITE; + } else if (req->fill_len) { + nvme_fill_read_data(req, req->fill_off, req->fill_len); + req->fill_len = 0; + } + } + if (!ret) { + block_acct_done(stats, acct); + } } else { uint16_t status; + if (ns->params.zoned) { + nvme_finalize_zoned_write(ns, req, true); + } + block_acct_failed(stats, acct); switch (req->cmd.opcode) { @@ -970,7 +1348,9 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req) uint64_t slba = le64_to_cpu(rw->slba); uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t data_size = nvme_l2b(ns, nlb); - uint64_t data_offset; + uint64_t data_offset, fill_off; + uint32_t fill_len; + NvmeReadFillCtx rfc = {}; BlockBackend *blk = ns->blkconf.blk; uint16_t status; @@ -988,11 +1368,40 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req) goto invalid; } + if (ns->params.zoned) { + status = nvme_check_zone_read(ns, slba, nlb, &rfc); + if (status != NVME_SUCCESS) { + trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status); + goto invalid; + } + } + status = nvme_map_dptr(n, data_size, req); if (status) { goto invalid; } + if (ns->params.zoned) { + if (rfc.pre_rd_fill_nlb) { + fill_off = nvme_l2b(ns, rfc.pre_rd_fill_slba - slba); + fill_len = nvme_l2b(ns, rfc.pre_rd_fill_nlb); + nvme_fill_read_data(req, fill_off, fill_len); + } + if (!rfc.read_nlb) { + /* No backend I/O necessary, only needed to fill the buffer */ + req->status = NVME_SUCCESS; + return NVME_SUCCESS; + } + if (rfc.post_rd_fill_nlb) { + req->fill_off = nvme_l2b(ns, rfc.post_rd_fill_slba - slba); + req->fill_len = nvme_l2b(ns, rfc.post_rd_fill_nlb); + } else { + req->fill_len = 0; + } + slba = rfc.read_slba; + data_size = nvme_l2b(ns, rfc.read_nlb); + } + data_offset = nvme_l2b(ns, slba); block_acct_start(blk_get_stats(blk), &req->acct, data_size, @@ -1011,7 +1420,7 @@ invalid: return status | NVME_DNR; } -static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool wrz) +static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool append, bool wrz) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; @@ -1019,6 +1428,7 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool wrz) uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; uint64_t data_size = nvme_l2b(ns, nlb); uint64_t data_offset; + NvmeZone *zone; BlockBackend *blk = ns->blkconf.blk; uint16_t status; @@ -1039,6 +1449,25 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool wrz) goto invalid; } + if (ns->params.zoned) { + zone = nvme_get_zone_by_slba(ns, slba); + + status = nvme_check_zone_write(n, ns, zone, slba, nlb, append); + if (status != NVME_SUCCESS) { + goto invalid; + } + + if (append) { + slba = zone->w_ptr; + } + + req->cqe.result64 = nvme_advance_zone_wp(ns, zone, nlb); + } else if (append) { + trace_pci_nvme_err_invalid_opc(rw->opcode); + status = NVME_INVALID_OPCODE; + goto invalid; + } + data_offset = nvme_l2b(ns, slba); if (!wrz) { @@ -1069,6 +1498,433 @@ invalid: return status | NVME_DNR; } +static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c, + uint64_t *slba, uint32_t *zone_idx) +{ + uint32_t dw10 = le32_to_cpu(c->cdw10); + uint32_t dw11 = le32_to_cpu(c->cdw11); + + if (!ns->params.zoned) { + trace_pci_nvme_err_invalid_opc(c->opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + + *slba = ((uint64_t)dw11) << 32 | dw10; + if (unlikely(*slba >= ns->id_ns.nsze)) { + trace_pci_nvme_err_invalid_lba_range(*slba, 0, ns->id_ns.nsze); + *slba = 0; + return NVME_LBA_RANGE | NVME_DNR; + } + + *zone_idx = nvme_zone_idx(ns, *slba); + assert(*zone_idx < ns->num_zones); + + return NVME_SUCCESS; +} + +typedef uint16_t (*op_handler_t)(NvmeNamespace *, NvmeZone *, + uint8_t); + +enum NvmeZoneProcessingMask { + NVME_PROC_CURRENT_ZONE = 0, + NVME_PROC_IMP_OPEN_ZONES = 1 << 0, + NVME_PROC_EXP_OPEN_ZONES = 1 << 1, + NVME_PROC_CLOSED_ZONES = 1 << 2, + NVME_PROC_READ_ONLY_ZONES = 1 << 3, + NVME_PROC_FULL_ZONES = 1 << 4, +}; + +static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); + /* fall through */ + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + /* fall through */ + case NVME_ZONE_STATE_CLOSED: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_EMPTY: + zone->w_ptr = nvme_zone_wr_boundary(zone); + zone->d.wp = zone->w_ptr; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL); + /* fall through */ + case NVME_ZONE_STATE_FULL: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static uint16_t nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_FULL: + zone->w_ptr = zone->d.zslba; + zone->d.wp = zone->w_ptr; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EMPTY); + /* fall through */ + case NVME_ZONE_STATE_EMPTY: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static uint16_t nvme_offline_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_READ_ONLY: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_OFFLINE); + /* fall through */ + case NVME_ZONE_STATE_OFFLINE: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static uint16_t nvme_bulk_proc_zone(NvmeNamespace *ns, NvmeZone *zone, + enum NvmeZoneProcessingMask proc_mask, + op_handler_t op_hndlr) +{ + uint16_t status = NVME_SUCCESS; + uint8_t zs = nvme_get_zone_state(zone); + bool proc_zone = false; + + switch (zs) { + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + proc_zone = proc_mask & NVME_PROC_IMP_OPEN_ZONES; + break; + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + proc_zone = proc_mask & NVME_PROC_EXP_OPEN_ZONES; + break; + case NVME_ZONE_STATE_CLOSED: + proc_zone = proc_mask & NVME_PROC_CLOSED_ZONES; + break; + case NVME_ZONE_STATE_READ_ONLY: + proc_zone = proc_mask & NVME_PROC_READ_ONLY_ZONES; + break; + case NVME_ZONE_STATE_FULL: + proc_zone = proc_mask & NVME_PROC_FULL_ZONES; + } + + if (proc_zone) { + status = op_hndlr(ns, zone, zs); + } + + return status; +} + +static uint16_t nvme_do_zone_op(NvmeNamespace *ns, NvmeZone *zone, + enum NvmeZoneProcessingMask proc_mask, + op_handler_t op_hndlr) +{ + NvmeZone *next; + uint16_t status = NVME_SUCCESS; + int i; + + if (!proc_mask) { + status = op_hndlr(ns, zone, nvme_get_zone_state(zone)); + } else { + if (proc_mask & NVME_PROC_CLOSED_ZONES) { + QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) { + status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr); + if (status != NVME_SUCCESS) { + goto out; + } + } + } + if (proc_mask & NVME_PROC_IMP_OPEN_ZONES) { + QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) { + status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr); + if (status != NVME_SUCCESS) { + goto out; + } + } + } + if (proc_mask & NVME_PROC_EXP_OPEN_ZONES) { + QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) { + status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr); + if (status != NVME_SUCCESS) { + goto out; + } + } + } + if (proc_mask & NVME_PROC_FULL_ZONES) { + QTAILQ_FOREACH_SAFE(zone, &ns->full_zones, entry, next) { + status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr); + if (status != NVME_SUCCESS) { + goto out; + } + } + } + + if (proc_mask & NVME_PROC_READ_ONLY_ZONES) { + for (i = 0; i < ns->num_zones; i++, zone++) { + status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr); + if (status != NVME_SUCCESS) { + goto out; + } + } + } + } + +out: + return status; +} + +static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + NvmeZone *zone; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint64_t slba = 0; + uint32_t zone_idx = 0; + uint16_t status; + uint8_t action; + bool all; + enum NvmeZoneProcessingMask proc_mask = NVME_PROC_CURRENT_ZONE; + + action = dw13 & 0xff; + all = dw13 & 0x100; + + req->status = NVME_SUCCESS; + + if (!all) { + status = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx); + if (status) { + return status; + } + } + + zone = &ns->zone_array[zone_idx]; + if (slba != zone->d.zslba) { + trace_pci_nvme_err_unaligned_zone_cmd(action, slba, zone->d.zslba); + return NVME_INVALID_FIELD | NVME_DNR; + } + + switch (action) { + + case NVME_ZONE_ACTION_OPEN: + if (all) { + proc_mask = NVME_PROC_CLOSED_ZONES; + } + trace_pci_nvme_open_zone(slba, zone_idx, all); + status = nvme_do_zone_op(ns, zone, proc_mask, nvme_open_zone); + break; + + case NVME_ZONE_ACTION_CLOSE: + if (all) { + proc_mask = NVME_PROC_IMP_OPEN_ZONES | NVME_PROC_EXP_OPEN_ZONES; + } + trace_pci_nvme_close_zone(slba, zone_idx, all); + status = nvme_do_zone_op(ns, zone, proc_mask, nvme_close_zone); + break; + + case NVME_ZONE_ACTION_FINISH: + if (all) { + proc_mask = NVME_PROC_IMP_OPEN_ZONES | NVME_PROC_EXP_OPEN_ZONES | + NVME_PROC_CLOSED_ZONES; + } + trace_pci_nvme_finish_zone(slba, zone_idx, all); + status = nvme_do_zone_op(ns, zone, proc_mask, nvme_finish_zone); + break; + + case NVME_ZONE_ACTION_RESET: + if (all) { + proc_mask = NVME_PROC_IMP_OPEN_ZONES | NVME_PROC_EXP_OPEN_ZONES | + NVME_PROC_CLOSED_ZONES | NVME_PROC_FULL_ZONES; + } + trace_pci_nvme_reset_zone(slba, zone_idx, all); + status = nvme_do_zone_op(ns, zone, proc_mask, nvme_reset_zone); + break; + + case NVME_ZONE_ACTION_OFFLINE: + if (all) { + proc_mask = NVME_PROC_READ_ONLY_ZONES; + } + trace_pci_nvme_offline_zone(slba, zone_idx, all); + status = nvme_do_zone_op(ns, zone, proc_mask, nvme_offline_zone); + break; + + case NVME_ZONE_ACTION_SET_ZD_EXT: + trace_pci_nvme_set_descriptor_extension(slba, zone_idx); + return NVME_INVALID_FIELD | NVME_DNR; + break; + + default: + trace_pci_nvme_err_invalid_mgmt_action(action); + status = NVME_INVALID_FIELD; + } + + if (status == NVME_ZONE_INVAL_TRANSITION) { + trace_pci_nvme_err_invalid_zone_state_transition(action, slba, + zone->d.za); + } + if (status) { + status |= NVME_DNR; + } + + return status; +} + +static bool nvme_zone_matches_filter(uint32_t zafs, NvmeZone *zl) +{ + int zs = nvme_get_zone_state(zl); + + switch (zafs) { + case NVME_ZONE_REPORT_ALL: + return true; + case NVME_ZONE_REPORT_EMPTY: + return zs == NVME_ZONE_STATE_EMPTY; + case NVME_ZONE_REPORT_IMPLICITLY_OPEN: + return zs == NVME_ZONE_STATE_IMPLICITLY_OPEN; + case NVME_ZONE_REPORT_EXPLICITLY_OPEN: + return zs == NVME_ZONE_STATE_EXPLICITLY_OPEN; + case NVME_ZONE_REPORT_CLOSED: + return zs == NVME_ZONE_STATE_CLOSED; + case NVME_ZONE_REPORT_FULL: + return zs == NVME_ZONE_STATE_FULL; + case NVME_ZONE_REPORT_READ_ONLY: + return zs == NVME_ZONE_STATE_READ_ONLY; + case NVME_ZONE_REPORT_OFFLINE: + return zs == NVME_ZONE_STATE_OFFLINE; + default: + return false; + } +} + +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + /* cdw12 is zero-based number of dwords to return. Convert to bytes */ + uint32_t len = (le32_to_cpu(cmd->cdw12) + 1) << 2; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint32_t zone_idx, zra, zrasf, partial; + uint64_t max_zones, nr_zones = 0; + uint16_t ret; + uint64_t slba; + NvmeZoneDescr *z; + NvmeZone *zs; + NvmeZoneReportHeader *header; + void *buf, *buf_p; + size_t zone_entry_sz; + + req->status = NVME_SUCCESS; + + ret = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx); + if (ret) { + return ret; + } + + if (len < sizeof(NvmeZoneReportHeader)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zra = dw13 & 0xff; + if (!(zra == NVME_ZONE_REPORT || zra == NVME_ZONE_REPORT_EXTENDED)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + if (zra == NVME_ZONE_REPORT_EXTENDED) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zrasf = (dw13 >> 8) & 0xff; + if (zrasf > NVME_ZONE_REPORT_OFFLINE) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + partial = (dw13 >> 16) & 0x01; + + zone_entry_sz = sizeof(NvmeZoneDescr); + + max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; + buf = g_malloc0(len); + + header = (NvmeZoneReportHeader *)buf; + buf_p = buf + sizeof(NvmeZoneReportHeader); + + while (zone_idx < ns->num_zones && nr_zones < max_zones) { + zs = &ns->zone_array[zone_idx]; + + if (!nvme_zone_matches_filter(zrasf, zs)) { + zone_idx++; + continue; + } + + z = (NvmeZoneDescr *)buf_p; + buf_p += sizeof(NvmeZoneDescr); + nr_zones++; + + z->zt = zs->d.zt; + z->zs = zs->d.zs; + z->zcap = cpu_to_le64(zs->d.zcap); + z->zslba = cpu_to_le64(zs->d.zslba); + z->za = zs->d.za; + + if (nvme_wp_is_valid(zs)) { + z->wp = cpu_to_le64(zs->d.wp); + } else { + z->wp = cpu_to_le64(~0ULL); + } + + zone_idx++; + } + + if (!partial) { + for (; zone_idx < ns->num_zones; zone_idx++) { + zs = &ns->zone_array[zone_idx]; + if (nvme_zone_matches_filter(zrasf, zs)) { + nr_zones++; + } + } + } + header->nr_zones = cpu_to_le64(nr_zones); + + ret = nvme_dma(n, (uint8_t *)buf, len, DMA_DIRECTION_FROM_DEVICE, req); + + g_free(buf); + + return ret; +} + static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) { uint32_t nsid = le32_to_cpu(req->cmd.nsid); @@ -1097,11 +1953,17 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_FLUSH: return nvme_flush(n, req); case NVME_CMD_WRITE_ZEROES: - return nvme_write(n, req, true); + return nvme_write(n, req, false, true); + case NVME_CMD_ZONE_APPEND: + return nvme_write(n, req, true, false); case NVME_CMD_WRITE: - return nvme_write(n, req, false); + return nvme_write(n, req, false, false); case NVME_CMD_READ: return nvme_read(n, req); + case NVME_CMD_ZONE_MGMT_SEND: + return nvme_zone_mgmt_send(n, req); + case NVME_CMD_ZONE_MGMT_RECV: + return nvme_zone_mgmt_recv(n, req); default: assert(false); } @@ -1366,6 +2228,9 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len, case NVME_CSI_NVM: src_iocs = nvme_cse_iocs_nvm; break; + case NVME_CSI_ZONED: + src_iocs = nvme_cse_iocs_zoned; + break; } } @@ -1546,6 +2411,16 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req) return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); } +static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns) +{ + switch (ns->csi) { + case NVME_CSI_NVM: + case NVME_CSI_ZONED: + return true; + } + return false; +} + static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) { trace_pci_nvme_identify_ctrl(); @@ -1557,11 +2432,16 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + NvmeIdCtrlZoned id = {}; trace_pci_nvme_identify_ctrl_csi(c->csi); if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, req); + } else if (c->csi == NVME_CSI_ZONED) { + id.zasl = n->zasl; + return nvme_dma(n, (uint8_t *)&id, sizeof(id), + DMA_DIRECTION_FROM_DEVICE, req); } return NVME_INVALID_FIELD | NVME_DNR; @@ -1589,8 +2469,12 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, return nvme_rpt_empty_id_struct(n, req); } - return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), - DMA_DIRECTION_FROM_DEVICE, req); + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { + return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), + DMA_DIRECTION_FROM_DEVICE, req); + } + + return NVME_INVALID_CMD_SET | NVME_DNR; } static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, @@ -1615,8 +2499,11 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, return nvme_rpt_empty_id_struct(n, req); } - if (c->csi == NVME_CSI_NVM) { + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { return nvme_rpt_empty_id_struct(n, req); + } else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) { + return nvme_dma(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned), + DMA_DIRECTION_FROM_DEVICE, req); } return NVME_INVALID_FIELD | NVME_DNR; @@ -1685,7 +2572,7 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, return NVME_INVALID_NSID | NVME_DNR; } - if (c->csi != NVME_CSI_NVM) { + if (c->csi != NVME_CSI_NVM && c->csi != NVME_CSI_ZONED) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1694,7 +2581,7 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, if (!ns) { continue; } - if (ns->params.nsid <= min_nsid) { + if (ns->params.nsid <= min_nsid || c->csi != ns->csi) { continue; } if (only_active && !ns->attached) { @@ -1764,6 +2651,8 @@ static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_identify_cmd_set(); NVME_SET_CSI(*list, NVME_CSI_NVM); + NVME_SET_CSI(*list, NVME_CSI_ZONED); + return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } @@ -1806,7 +2695,7 @@ static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req) { uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff; - req->cqe.result = 1; + req->cqe.result32 = 1; if (nvme_check_sqid(n, sqid)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1991,7 +2880,7 @@ defaults: } out: - req->cqe.result = cpu_to_le32(result); + req->cqe.result32 = cpu_to_le32(result); return NVME_SUCCESS; } @@ -2122,8 +3011,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) ((dw11 >> 16) & 0xFFFF) + 1, n->params.max_ioqpairs, n->params.max_ioqpairs); - req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16)); + req->cqe.result32 = cpu_to_le32((n->params.max_ioqpairs - 1) | + ((n->params.max_ioqpairs - 1) << 16)); break; case NVME_ASYNCHRONOUS_EVENT_CONF: n->features.async_config = dw11; @@ -2234,7 +3123,7 @@ static void nvme_process_sq(void *opaque) } } -static void nvme_clear_ctrl(NvmeCtrl *n) +static void nvme_clear_ctrl(NvmeCtrl *n, bool shutdown) { NvmeNamespace *ns; int i; @@ -2278,6 +3167,17 @@ static void nvme_clear_ctrl(NvmeCtrl *n) nvme_ns_flush(ns); } + if (shutdown) { + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + + nvme_ns_shutdown(ns); + } + } + n->bar.cc = 0; } @@ -2298,6 +3198,13 @@ static void nvme_select_ns_iocs(NvmeCtrl *n) ns->iocs = nvme_cse_iocs_nvm; } break; + case NVME_CSI_ZONED: + if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_CSI) { + ns->iocs = nvme_cse_iocs_zoned; + } else if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_NVM) { + ns->iocs = nvme_cse_iocs_nvm; + } + break; } } } @@ -2396,6 +3303,17 @@ static int nvme_start_ctrl(NvmeCtrl *n) nvme_init_sq(&n->admin_sq, n, n->bar.asq, 0, 0, NVME_AQA_ASQS(n->bar.aqa) + 1); + if (!n->params.zasl_bs) { + n->zasl = n->params.mdts; + } else { + if (n->params.zasl_bs < n->page_size) { + trace_pci_nvme_err_startfail_zasl_too_small(n->params.zasl_bs, + n->page_size); + return -1; + } + n->zasl = 31 - clz32(n->params.zasl_bs / n->page_size); + } + nvme_set_timestamp(n, 0ULL); QTAILQ_INIT(&n->aer_queue); @@ -2468,12 +3386,12 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data, } } else if (!NVME_CC_EN(data) && NVME_CC_EN(n->bar.cc)) { trace_pci_nvme_mmio_stopped(); - nvme_clear_ctrl(n); + nvme_clear_ctrl(n, false); n->bar.csts &= ~NVME_CSTS_READY; } if (NVME_CC_SHN(data) && !(NVME_CC_SHN(n->bar.cc))) { trace_pci_nvme_mmio_shutdown_set(); - nvme_clear_ctrl(n); + nvme_clear_ctrl(n, true); n->bar.cc = data; n->bar.csts |= NVME_CSTS_SHST_COMPLETE; } else if (!NVME_CC_SHN(data) && NVME_CC_SHN(n->bar.cc)) { @@ -2820,6 +3738,13 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) host_memory_backend_set_mapped(n->pmrdev, true); } + + if (n->params.zasl_bs) { + if (!is_power_of_2(n->params.zasl_bs)) { + error_setg(errp, "zone append size limit has to be a power of 2"); + return; + } + } } static void nvme_init_state(NvmeCtrl *n) @@ -3085,9 +4010,21 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) static void nvme_exit(PCIDevice *pci_dev) { NvmeCtrl *n = NVME(pci_dev); + NvmeNamespace *ns; + int i; - nvme_clear_ctrl(n); + nvme_clear_ctrl(n, true); + + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + + nvme_ns_cleanup(ns); + } g_free(n->namespaces); + g_free(n->cq); g_free(n->sq); g_free(n->aer_reqs); @@ -3115,6 +4052,8 @@ static Property nvme_props[] = { DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false), + DEFINE_PROP_SIZE32("zoned.append_size_limit", NvmeCtrl, params.zasl_bs, + NVME_DEFAULT_MAX_ZA_SIZE), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.h b/hw/block/nvme.h index e080a2318a..4cb0615128 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -6,6 +6,9 @@ #define NVME_MAX_NAMESPACES 256 +#define NVME_DEFAULT_ZONE_SIZE (128 * MiB) +#define NVME_DEFAULT_MAX_ZA_SIZE (128 * KiB) + typedef struct NvmeParams { char *serial; uint32_t num_queues; /* deprecated since 5.1 */ @@ -16,6 +19,7 @@ typedef struct NvmeParams { uint32_t aer_max_queued; uint8_t mdts; bool use_intel_id; + uint32_t zasl_bs; } NvmeParams; typedef struct NvmeAsyncEvent { @@ -28,6 +32,8 @@ typedef struct NvmeRequest { struct NvmeNamespace *ns; BlockAIOCB *aiocb; uint16_t status; + uint64_t fill_off; + uint32_t fill_len; NvmeCqe cqe; NvmeCmd cmd; BlockAcctCookie acct; @@ -147,6 +153,8 @@ typedef struct NvmeCtrl { QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue; int aer_queued; + uint8_t zasl; + NvmeNamespace namespace; NvmeNamespace *namespaces[NVME_MAX_NAMESPACES]; NvmeSQueue **sq; diff --git a/hw/block/trace-events b/hw/block/trace-events index 051e115f47..8f221ba73c 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -89,6 +89,14 @@ pci_nvme_mmio_start_success(void) "setting controller enable bit succeeded" pci_nvme_mmio_stopped(void) "cleared controller enable bit" pci_nvme_mmio_shutdown_set(void) "shutdown bit set" pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" +pci_nvme_open_zone(uint64_t slba, uint32_t zone_idx, int all) "open zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_close_zone(uint64_t slba, uint32_t zone_idx, int all) "close zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_reset_zone(uint64_t slba, uint32_t zone_idx, int all) "reset zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_offline_zone(uint64_t slba, uint32_t zone_idx, int all) "offline zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_set_descriptor_extension(uint64_t slba, uint32_t zone_idx) "set zone descriptor extension, slba=%"PRIu64", idx=%"PRIu32"" +pci_nvme_clear_ns_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state" +pci_nvme_clear_ns_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" @@ -109,7 +117,13 @@ pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" pci_nvme_err_invalid_log_page_offset(uint64_t ofs, uint64_t size) "must be <= %"PRIu64", got %"PRIu64"" -pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled" +pci_nvme_err_unaligned_zone_cmd(uint8_t action, uint64_t slba, uint64_t zslba) "unaligned zone op 0x%"PRIx32", got slba=%"PRIu64", zslba=%"PRIu64"" +pci_nvme_err_invalid_zone_state_transition(uint8_t action, uint64_t slba, uint8_t attrs) "action=0x%"PRIx8", slba=%"PRIu64", attrs=0x%"PRIx32"" +pci_nvme_err_write_not_at_wp(uint64_t slba, uint64_t zone, uint64_t wp) "writing at slba=%"PRIu64", zone=%"PRIu64", but wp=%"PRIu64"" +pci_nvme_err_append_not_at_start(uint64_t slba, uint64_t zone) "appending at slba=%"PRIu64", but zone=%"PRIu64"" +pci_nvme_err_zone_write_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" +pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" +pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8"" pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" @@ -143,7 +157,9 @@ pci_nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_ pci_nvme_err_startfail_css(uint8_t css) "nvme_start_ctrl failed because invalid command set selected:%u" pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero" pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero" +pci_nvme_err_startfail_zasl_too_small(uint32_t zasl, uint32_t pagesz) "nvme_start_ctrl failed because zone append size limit %"PRIu32" is too small, needs to be >= %"PRIu32"" pci_nvme_err_startfail(void) "setting controller enable bit failed" +pci_nvme_err_invalid_mgmt_action(int action) "action=0x%"PRIx8"" # Traces for undefined behavior pci_nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64"" diff --git a/include/block/nvme.h b/include/block/nvme.h index 3653b4aefc..ba8a45edf5 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -489,6 +489,9 @@ enum NvmeIoCommands { NVME_CMD_COMPARE = 0x05, NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, + NVME_CMD_ZONE_MGMT_SEND = 0x79, + NVME_CMD_ZONE_MGMT_RECV = 0x7a, + NVME_CMD_ZONE_APPEND = 0x7d, }; typedef struct QEMU_PACKED NvmeDeleteQ { @@ -649,8 +652,10 @@ typedef struct QEMU_PACKED NvmeAerResult { } NvmeAerResult; typedef struct QEMU_PACKED NvmeCqe { - uint32_t result; - uint32_t rsvd; + union { + uint64_t result64; + uint32_t result32; + }; uint16_t sq_head; uint16_t sq_id; uint16_t cid; @@ -678,6 +683,7 @@ enum NvmeStatusCodes { NVME_SGL_DESCR_TYPE_INVALID = 0x0011, NVME_INVALID_USE_OF_CMB = 0x0012, NVME_CMD_SET_CMB_REJECTED = 0x002b, + NVME_INVALID_CMD_SET = 0x002c, NVME_LBA_RANGE = 0x0080, NVME_CAP_EXCEEDED = 0x0081, NVME_NS_NOT_READY = 0x0082, @@ -702,6 +708,14 @@ enum NvmeStatusCodes { NVME_CONFLICTING_ATTRS = 0x0180, NVME_INVALID_PROT_INFO = 0x0181, NVME_WRITE_TO_RO = 0x0182, + NVME_ZONE_BOUNDARY_ERROR = 0x01b8, + NVME_ZONE_FULL = 0x01b9, + NVME_ZONE_READ_ONLY = 0x01ba, + NVME_ZONE_OFFLINE = 0x01bb, + NVME_ZONE_INVALID_WRITE = 0x01bc, + NVME_ZONE_TOO_MANY_ACTIVE = 0x01bd, + NVME_ZONE_TOO_MANY_OPEN = 0x01be, + NVME_ZONE_INVAL_TRANSITION = 0x01bf, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, NVME_E2E_GUARD_ERROR = 0x0282, @@ -886,6 +900,11 @@ typedef struct QEMU_PACKED NvmeIdCtrl { uint8_t vs[1024]; } NvmeIdCtrl; +typedef struct NvmeIdCtrlZoned { + uint8_t zasl; + uint8_t rsvd1[4095]; +} NvmeIdCtrlZoned; + enum NvmeIdCtrlOacs { NVME_OACS_SECURITY = 1 << 0, NVME_OACS_FORMAT = 1 << 1, @@ -1011,6 +1030,12 @@ typedef struct QEMU_PACKED NvmeLBAF { uint8_t rp; } NvmeLBAF; +typedef struct QEMU_PACKED NvmeLBAFE { + uint64_t zsze; + uint8_t zdes; + uint8_t rsvd9[7]; +} NvmeLBAFE; + #define NVME_NSID_BROADCAST 0xffffffff typedef struct QEMU_PACKED NvmeIdNs { @@ -1065,10 +1090,24 @@ enum NvmeNsIdentifierType { enum NvmeCsi { NVME_CSI_NVM = 0x00, + NVME_CSI_ZONED = 0x02, }; #define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi))) +typedef struct QEMU_PACKED NvmeIdNsZoned { + uint16_t zoc; + uint16_t ozcs; + uint32_t mar; + uint32_t mor; + uint32_t rrl; + uint32_t frl; + uint8_t rsvd20[2796]; + NvmeLBAFE lbafe[16]; + uint8_t rsvd3072[768]; + uint8_t vs[256]; +} NvmeIdNsZoned; + /*Deallocate Logical Block Features*/ #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat) ((dlfeat) & 0x10) #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat) ((dlfeat) & 0x08) @@ -1100,6 +1139,71 @@ enum NvmeIdNsDps { DPS_FIRST_EIGHT = 8, }; +enum NvmeZoneAttr { + NVME_ZA_FINISHED_BY_CTLR = 1 << 0, + NVME_ZA_FINISH_RECOMMENDED = 1 << 1, + NVME_ZA_RESET_RECOMMENDED = 1 << 2, + NVME_ZA_ZD_EXT_VALID = 1 << 7, +}; + +typedef struct QEMU_PACKED NvmeZoneReportHeader { + uint64_t nr_zones; + uint8_t rsvd[56]; +} NvmeZoneReportHeader; + +enum NvmeZoneReceiveAction { + NVME_ZONE_REPORT = 0, + NVME_ZONE_REPORT_EXTENDED = 1, +}; + +enum NvmeZoneReportType { + NVME_ZONE_REPORT_ALL = 0, + NVME_ZONE_REPORT_EMPTY = 1, + NVME_ZONE_REPORT_IMPLICITLY_OPEN = 2, + NVME_ZONE_REPORT_EXPLICITLY_OPEN = 3, + NVME_ZONE_REPORT_CLOSED = 4, + NVME_ZONE_REPORT_FULL = 5, + NVME_ZONE_REPORT_READ_ONLY = 6, + NVME_ZONE_REPORT_OFFLINE = 7, +}; + +enum NvmeZoneType { + NVME_ZONE_TYPE_RESERVED = 0x00, + NVME_ZONE_TYPE_SEQ_WRITE = 0x02, +}; + +enum NvmeZoneSendAction { + NVME_ZONE_ACTION_RSD = 0x00, + NVME_ZONE_ACTION_CLOSE = 0x01, + NVME_ZONE_ACTION_FINISH = 0x02, + NVME_ZONE_ACTION_OPEN = 0x03, + NVME_ZONE_ACTION_RESET = 0x04, + NVME_ZONE_ACTION_OFFLINE = 0x05, + NVME_ZONE_ACTION_SET_ZD_EXT = 0x10, +}; + +typedef struct QEMU_PACKED NvmeZoneDescr { + uint8_t zt; + uint8_t zs; + uint8_t za; + uint8_t rsvd3[5]; + uint64_t zcap; + uint64_t zslba; + uint64_t wp; + uint8_t rsvd32[32]; +} NvmeZoneDescr; + +enum NvmeZoneState { + NVME_ZONE_STATE_RESERVED = 0x00, + NVME_ZONE_STATE_EMPTY = 0x01, + NVME_ZONE_STATE_IMPLICITLY_OPEN = 0x02, + NVME_ZONE_STATE_EXPLICITLY_OPEN = 0x03, + NVME_ZONE_STATE_CLOSED = 0x04, + NVME_ZONE_STATE_READ_ONLY = 0x0D, + NVME_ZONE_STATE_FULL = 0x0E, + NVME_ZONE_STATE_OFFLINE = 0x0F, +}; + static inline void _nvme_check_size(void) { QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096); @@ -1119,8 +1223,13 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZoned) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescr) != 64); } #endif From patchwork Fri Oct 30 02:32:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868315 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6FAA514B7 for ; Fri, 30 Oct 2020 02:39:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C40812076E for ; Fri, 30 Oct 2020 02:39:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="qJ4glgUd" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C40812076E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:38362 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKKG-0006qJ-NX for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:39:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41274) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE3-0006Ed-B6; Thu, 29 Oct 2020 22:33:11 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:10998) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKDz-0006dn-N9; Thu, 29 Oct 2020 22:33:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025187; x=1635561187; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=f2jhFid1Eer6mVN1Fs9bkQWEt7kkLC8suyNdE1o+BzM=; b=qJ4glgUdQGErde79OjvwexmJbFXkHuE1nQhEYiNeaElv3KySO6wzBGw7 bCOgm/t60ODIL93ycuvYpFrgJw8sKTN481wlnuUzducl4bS5l5RjDoyhx UuvA9ARRsdUFcrGF4a1bM9zbo0jbAbU2BrvkbdoC4FLUh45/4gEjL2Xt5 zn9/VhmyTi5MDhM8cYYYG72g3Xv8QmYFPSQ3pSVd6I2WjizzgYcUMqkBy X8Q7EXYIQI4Q7NOJRgIv21R0g5WW58g41bZIuOesEWDAWCU+wMGL1L28U C8qz/1x1R3iPsNZAqkpUvjy+nR/y4xMTbAek7BdKxGbzDCsCjo5iuNaX2 g==; IronPort-SDR: I3n/DO+TFHOOqvwucmDNzlpq7lbuOBB6v984hgdd7kQ98MjCt9gmXiNmHMVgI5Pli4pqYqN7Dz UBxRp2ET2E3FpGk7uaiENM8xvWnUUZRN/yvhjrQg1Fym2MeqvLTMq2GYBDZYZIdoFjNP8o29gR qTmd6Quy6Kx2YG6Ya9E7RCzG2BfJx09ro6+RJRGhFzkDI4xg8MA7qX7BYvb9JMDecqJSdg7SV3 z+VdS5AfdrvcgnhIfN7hUvPJadnHFAOSvu3W/+4bGgB3DiFnBsx/GB9mCsVPtflF1aEz+9aSHO 7ks= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748089" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:06 +0800 IronPort-SDR: qllZtJ0XUDcwWjQ72OGZT9U+FLUQWhqHoTdArQCO4ZT3Dx6KVVqntYERjq2PUGiAByOSurZWr7 Lq/7G+ZOlqCfV0C/t7WepinmlLdvSIhrxKf1Q4bM4iYjiwPyEjQBRSkFx7LbqREoq7guoVQjHk RsqY+HEiq3z6lXqQ0GV8pbIeRTzMiXCu6Eg4ayi+XwqUviqwiV5NPtfOEuR6cHNmjXPu+EKF9y Hdf8zFt86UutNIlA+gJ8MxnfObxqkB0ZwA2n2XVsw4898AxEKxizOQ72/p5oit8ViyRSDKUNXI iaf2tIKCOTF7qp5c6GhZ0uVY Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:10 -0700 IronPort-SDR: oV4aGsJOaeuKtRHjQzWN9jtXcgT7dHUOKkVbxGVJITyCKrkzUHjRX8zpWSIBYxPGereu6PhPsg evecH0VL/Mi/i9Joq4TReFMtCBrFTzwxfRqriFYZW7cLduGvatdlf+QuhOrODdg/qD1it4RDeR KJvoHj3WgRek7IgG96jJK9g/ndE4hkUWwoDVZYqyOfl/lpxym588Plba58VQbSW2GUyAOiBi0+ 4DlMm0Jeu4Isr50tMHPrAOnDDalmsFvYFeL12Wsl9LLtayBDJNZE/t/xYBx+nzs/xpkIkuEMeU XLY= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:05 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 08/11] hw/block/nvme: Introduce max active and open zone limits Date: Fri, 30 Oct 2020 11:32:39 +0900 Message-Id: <20201030023242.5204-9-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Add two module properties, "zoned.max_active" and "zoned.max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel --- hw/block/nvme-ns.c | 30 +++++++++++++- hw/block/nvme-ns.h | 41 +++++++++++++++++++ hw/block/nvme.c | 94 +++++++++++++++++++++++++++++++++++++++++++ hw/block/trace-events | 2 + 4 files changed, 165 insertions(+), 2 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index e6db7f7d3b..2e45838c15 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -119,6 +119,20 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) ns->zone_size_log2 = 63 - clz64(ns->zone_size); } + /* Make sure that the values of all ZNS properties are sane */ + if (ns->params.max_open_zones > nz) { + error_setg(errp, + "max_open_zones value %u exceeds the number of zones %u", + ns->params.max_open_zones, nz); + return -1; + } + if (ns->params.max_active_zones > nz) { + error_setg(errp, + "max_active_zones value %u exceeds the number of zones %u", + ns->params.max_active_zones, nz); + return -1; + } + return 0; } @@ -166,8 +180,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned)); /* MAR/MOR are zeroes-based, 0xffffffff means no limit */ - id_ns_z->mar = 0xffffffff; - id_ns_z->mor = 0xffffffff; + id_ns_z->mar = cpu_to_le32(ns->params.max_active_zones - 1); + id_ns_z->mor = cpu_to_le32(ns->params.max_open_zones - 1); id_ns_z->zoc = 0; id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00; @@ -195,6 +209,7 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone) trace_pci_nvme_clear_ns_close(state, zone->d.zslba); nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED); } + nvme_aor_inc_active(ns); QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry); } else { trace_pci_nvme_clear_ns_reset(state, zone->d.zslba); @@ -211,16 +226,23 @@ static void nvme_zoned_ns_shutdown(NvmeNamespace *ns) QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) { QTAILQ_REMOVE(&ns->closed_zones, zone, entry); + nvme_aor_dec_active(ns); nvme_clear_zone(ns, zone); } QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) { QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry); + nvme_aor_dec_open(ns); + nvme_aor_dec_active(ns); nvme_clear_zone(ns, zone); } QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) { QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry); + nvme_aor_dec_open(ns); + nvme_aor_dec_active(ns); nvme_clear_zone(ns, zone); } + + assert(ns->nr_open_zones == 0); } static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) @@ -306,6 +328,10 @@ static Property nvme_ns_props[] = { DEFINE_PROP_SIZE("zoned.zcap", NvmeNamespace, params.zone_cap_bs, 0), DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace, params.cross_zone_read, false), + DEFINE_PROP_UINT32("zoned.max_active", NvmeNamespace, + params.max_active_zones, 0), + DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace, + params.max_open_zones, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index d2631ff5a3..421bab0a57 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -33,6 +33,8 @@ typedef struct NvmeNamespaceParams { bool cross_zone_read; uint64_t zone_size_bs; uint64_t zone_cap_bs; + uint32_t max_active_zones; + uint32_t max_open_zones; } NvmeNamespaceParams; typedef struct NvmeNamespace { @@ -56,6 +58,8 @@ typedef struct NvmeNamespace { uint64_t zone_capacity; uint64_t zone_array_size; uint32_t zone_size_log2; + int32_t nr_open_zones; + int32_t nr_active_zones; NvmeNamespaceParams params; } NvmeNamespace; @@ -123,6 +127,43 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone) st != NVME_ZONE_STATE_OFFLINE; } +static inline void nvme_aor_inc_open(NvmeNamespace *ns) +{ + assert(ns->nr_open_zones >= 0); + if (ns->params.max_open_zones) { + ns->nr_open_zones++; + assert(ns->nr_open_zones <= ns->params.max_open_zones); + } +} + +static inline void nvme_aor_dec_open(NvmeNamespace *ns) +{ + if (ns->params.max_open_zones) { + assert(ns->nr_open_zones > 0); + ns->nr_open_zones--; + } + assert(ns->nr_open_zones >= 0); +} + +static inline void nvme_aor_inc_active(NvmeNamespace *ns) +{ + assert(ns->nr_active_zones >= 0); + if (ns->params.max_active_zones) { + ns->nr_active_zones++; + assert(ns->nr_active_zones <= ns->params.max_active_zones); + } +} + +static inline void nvme_aor_dec_active(NvmeNamespace *ns) +{ + if (ns->params.max_active_zones) { + assert(ns->nr_active_zones > 0); + ns->nr_active_zones--; + assert(ns->nr_active_zones >= ns->nr_open_zones); + } + assert(ns->nr_active_zones >= 0); +} + int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_flush(NvmeNamespace *ns); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 0326e1611c..485ac6fc40 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -199,6 +199,26 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone, } } +/* + * Check if we can open a zone without exceeding open/active limits. + * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5). + */ +static int nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn) +{ + if (ns->params.max_active_zones != 0 && + ns->nr_active_zones + act > ns->params.max_active_zones) { + trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones); + return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR; + } + if (ns->params.max_open_zones != 0 && + ns->nr_open_zones + opn > ns->params.max_open_zones) { + trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones); + return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR; + } + + return NVME_SUCCESS; +} + static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr) { hwaddr low = n->ctrl_mem.addr; @@ -1203,6 +1223,41 @@ static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba, return status; } +static void nvme_auto_transition_zone(NvmeNamespace *ns, bool implicit, + bool adding_active) +{ + NvmeZone *zone; + + if (implicit && ns->params.max_open_zones && + ns->nr_open_zones == ns->params.max_open_zones) { + zone = QTAILQ_FIRST(&ns->imp_open_zones); + if (zone) { + /* + * Automatically close this implicitly open zone. + */ + QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry); + nvme_aor_dec_open(ns); + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + } + } +} + +static uint16_t nvme_auto_open_zone(NvmeNamespace *ns, NvmeZone *zone) +{ + uint16_t status = NVME_SUCCESS; + uint8_t zs = nvme_get_zone_state(zone); + + if (zs == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(ns, true, true); + status = nvme_aor_check(ns, 1, 1); + } else if (zs == NVME_ZONE_STATE_CLOSED) { + nvme_auto_transition_zone(ns, true, false); + status = nvme_aor_check(ns, 0, 1); + } + + return status; +} + static bool nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req, bool failed) { @@ -1235,7 +1290,11 @@ static bool nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req, switch (nvme_get_zone_state(zone)) { case NVME_ZONE_STATE_IMPLICITLY_OPEN: case NVME_ZONE_STATE_EXPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL); /* fall through */ @@ -1264,7 +1323,10 @@ static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone, zs = nvme_get_zone_state(zone); switch (zs) { case NVME_ZONE_STATE_EMPTY: + nvme_aor_inc_active(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_inc_open(ns); nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN); } } @@ -1457,6 +1519,11 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool append, bool wrz) goto invalid; } + status = nvme_auto_open_zone(ns, zone); + if (status != NVME_SUCCESS) { + goto invalid; + } + if (append) { slba = zone->w_ptr; } @@ -1537,9 +1604,27 @@ enum NvmeZoneProcessingMask { static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone, uint8_t state) { + uint16_t status; + switch (state) { case NVME_ZONE_STATE_EMPTY: + nvme_auto_transition_zone(ns, false, true); + status = nvme_aor_check(ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + status = nvme_aor_check(ns, 0, 1); + if (status != NVME_SUCCESS) { + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_aor_dec_active(ns); + } + return status; + } + nvme_aor_inc_open(ns); + /* fall through */ case NVME_ZONE_STATE_IMPLICITLY_OPEN: nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); /* fall through */ @@ -1556,6 +1641,7 @@ static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); /* fall through */ case NVME_ZONE_STATE_CLOSED: @@ -1571,7 +1657,11 @@ static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: zone->w_ptr = nvme_zone_wr_boundary(zone); zone->d.wp = zone->w_ptr; @@ -1590,7 +1680,11 @@ static uint16_t nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_FULL: zone->w_ptr = zone->d.zslba; zone->d.wp = zone->w_ptr; diff --git a/hw/block/trace-events b/hw/block/trace-events index 8f221ba73c..91d2d1a1c2 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -124,6 +124,8 @@ pci_nvme_err_append_not_at_start(uint64_t slba, uint64_t zone) "appending at slb pci_nvme_err_zone_write_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8"" +pci_nvme_err_insuff_active_res(uint32_t max_active) "max_active=%"PRIu32" zone limit exceeded" +pci_nvme_err_insuff_open_res(uint32_t max_open) "max_open=%"PRIu32" zone limit exceeded" pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" From patchwork Fri Oct 30 02:32:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868319 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2C48514B7 for ; Fri, 30 Oct 2020 02:41:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B33E82076E for ; Fri, 30 Oct 2020 02:41:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="rm+gWREC" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B33E82076E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:42590 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKLh-0000C6-Hb for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:41:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41288) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE5-0006Jv-6U; Thu, 29 Oct 2020 22:33:13 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:11025) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE2-0006fl-OR; Thu, 29 Oct 2020 22:33:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025190; x=1635561190; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WXNlQd/GVLRYxWcrT19NyYWyuzGW4veAlpgOKLUM+ks=; b=rm+gWRECegPAZRAbtecTVtLYMUKKcsiO7KqCXFVtPBKC+Eb4m18a+8XN oSHbQM+Hs1Xq8nby4WZrzgIkPFWGx98MgZHX2tYl38TRZvirJD3drzN7B HM4I9D0+/b2G87ewatn7RvpxrvG++d91MkbrJIFcEGvt9IFMKnTRvh73z bNf8on8Y8oGhfb6uirRKMthScdA59eEsGbs1p+oWSADzriQEO9rw3kk4c vl5ppzrCLWoRKyUx1dfpkApbIvL8i51sEj929x3pP4CK6aVsr7/MZ5XcF lJD8OepSVkb4adfeZ3IIqQojoIylH88atMqPVWmCzqCm4opNy5jVH46Sf w==; IronPort-SDR: HEQXAp6YEqUNbtKfvnaU+mIh0BmOqbkxlkozsQ0lLTPTEV135GK5n0qdrVRWdMbSvtMoQtUzWT FZ4/F33rIK/y+ttnd915UWgfIDHA/C8irX0RxYd84AR/wqCD2rkCXNtMvXvUZ0OnVledYY0lrV 5iAwsBEkS6GwofZsu/QwP1wbNWRxHauIXcpzzc90KqmMZ1coRb5gczhDSt+mQ7Be9MPSDA120f YZZCNo+G6yOlT2lIfW4lOXPpbm5x5lg2KtDcfffF1TH8CXh95ezUtM8JQZ6zio4dORwWa/lP4n bBY= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748094" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:08 +0800 IronPort-SDR: HIH1LVRJmTnP4Z56vgEgQkQ9yZqhe0cP6upaGnq1ZV3iulmH/yfcWP0mxKDHdkk/9lbIejLsN1 H5n52aqD3iVxXu0+/g7S+7aXlp5U+USb6axXgKdl7Ex7cePz20D7thHXZMS0O7rgzrjfxY4r+2 xHqSqdZXaY0bgJ9j85V+qU6wAXLmS/fNa4jg0O2Bt5R4B2nGrwisHsHIUsMH25EkwxulKUMFIv SjkoWWyqpzDwbp9uru49eYxmw/jF3Dw9u9bMmMXe39uPx3k5FdYyAVaPV3FyEyl/r34Hd3+eJJ ky5y4ANfuVupSyID0LkfYZVB Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:13 -0700 IronPort-SDR: cVyL1jBjZbd8wEW4nH+VvAJtFhYbsY+qLkvuhVUVZwTMPxBX7X/C3/210PCHNnb6P17zcHRYfB tWyKTMYHHo5qU3OqooyEgGUV1eWEKUT8Ti2HsKI6sgbjGsRDx90VtIK4V/CY3lnwK+ISROWbq2 jUaqwzoniQyt7W0z6bLMtE1XgNO1EXT9S7fO/UYLuBJ0jGvsn34BVflOmcdBpnpWJMoSBEdvyP IicADl/XrLZS2PJ50jEC9FXSy66+ychhsBWzBO1T1W0janIZyEupz9pD94bSwT/lLVzuD7O1PO 23c= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:07 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 09/11] hw/block/nvme: Support Zone Descriptor Extensions Date: Fri, 30 Oct 2020 11:32:40 +0900 Message-Id: <20201030023242.5204-10-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zoned.descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen Reviewed-by: Niklas Cassel --- hw/block/nvme-ns.c | 25 +++++++++++++++++++-- hw/block/nvme-ns.h | 8 +++++++ hw/block/nvme.c | 51 +++++++++++++++++++++++++++++++++++++++++-- hw/block/trace-events | 2 ++ 4 files changed, 82 insertions(+), 4 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 2e45838c15..85dc73cf06 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -133,6 +133,18 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) return -1; } + if (ns->params.zd_extension_size) { + if (ns->params.zd_extension_size & 0x3f) { + error_setg(errp, + "zone descriptor extension size must be a multiple of 64B"); + return -1; + } + if ((ns->params.zd_extension_size >> 6) > 0xff) { + error_setg(errp, "zone descriptor extension size is too large"); + return -1; + } + } + return 0; } @@ -144,6 +156,10 @@ static void nvme_init_zone_state(NvmeNamespace *ns) int i; ns->zone_array = g_malloc0(ns->zone_array_size); + if (ns->params.zd_extension_size) { + ns->zd_extensions = g_malloc0(ns->params.zd_extension_size * + ns->num_zones); + } QTAILQ_INIT(&ns->exp_open_zones); QTAILQ_INIT(&ns->imp_open_zones); @@ -186,7 +202,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00; id_ns_z->lbafe[lba_index].zsze = cpu_to_le64(ns->zone_size); - id_ns_z->lbafe[lba_index].zdes = 0; + id_ns_z->lbafe[lba_index].zdes = + ns->params.zd_extension_size >> 6; /* Units of 64B */ ns->csi = NVME_CSI_ZONED; ns->id_ns.nsze = cpu_to_le64(ns->zone_size * ns->num_zones); @@ -204,7 +221,8 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone) zone->w_ptr = zone->d.wp; state = nvme_get_zone_state(zone); - if (zone->d.wp != zone->d.zslba) { + if (zone->d.wp != zone->d.zslba || + (zone->d.za & NVME_ZA_ZD_EXT_VALID)) { if (state != NVME_ZONE_STATE_CLOSED) { trace_pci_nvme_clear_ns_close(state, zone->d.zslba); nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED); @@ -301,6 +319,7 @@ void nvme_ns_cleanup(NvmeNamespace *ns) if (ns->params.zoned) { g_free(ns->id_ns_zoned); g_free(ns->zone_array); + g_free(ns->zd_extensions); } } @@ -332,6 +351,8 @@ static Property nvme_ns_props[] = { params.max_active_zones, 0), DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace, params.max_open_zones, 0), + DEFINE_PROP_UINT32("zoned.descr_ext_size", NvmeNamespace, + params.zd_extension_size, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 421bab0a57..50a6a0e1ac 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -35,6 +35,7 @@ typedef struct NvmeNamespaceParams { uint64_t zone_cap_bs; uint32_t max_active_zones; uint32_t max_open_zones; + uint32_t zd_extension_size; } NvmeNamespaceParams; typedef struct NvmeNamespace { @@ -58,6 +59,7 @@ typedef struct NvmeNamespace { uint64_t zone_capacity; uint64_t zone_array_size; uint32_t zone_size_log2; + uint8_t *zd_extensions; int32_t nr_open_zones; int32_t nr_active_zones; @@ -127,6 +129,12 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone) st != NVME_ZONE_STATE_OFFLINE; } +static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns, + uint32_t zone_idx) +{ + return &ns->zd_extensions[zone_idx * ns->params.zd_extension_size]; +} + static inline void nvme_aor_inc_open(NvmeNamespace *ns) { assert(ns->nr_open_zones >= 0); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 485ac6fc40..339becd3e2 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1711,6 +1711,26 @@ static uint16_t nvme_offline_zone(NvmeNamespace *ns, NvmeZone *zone, return NVME_ZONE_INVAL_TRANSITION; } +static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone) +{ + uint16_t status; + uint8_t state = nvme_get_zone_state(zone); + + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(ns, false, true); + status = nvme_aor_check(ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(ns); + zone->d.za |= NVME_ZA_ZD_EXT_VALID; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + static uint16_t nvme_bulk_proc_zone(NvmeNamespace *ns, NvmeZone *zone, enum NvmeZoneProcessingMask proc_mask, op_handler_t op_hndlr) @@ -1806,6 +1826,7 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) NvmeCmd *cmd = (NvmeCmd *)&req->cmd; NvmeNamespace *ns = req->ns; NvmeZone *zone; + uint8_t *zd_ext; uint32_t dw13 = le32_to_cpu(cmd->cdw13); uint64_t slba = 0; uint32_t zone_idx = 0; @@ -1878,7 +1899,22 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) case NVME_ZONE_ACTION_SET_ZD_EXT: trace_pci_nvme_set_descriptor_extension(slba, zone_idx); - return NVME_INVALID_FIELD | NVME_DNR; + if (all || !ns->params.zd_extension_size) { + return NVME_INVALID_FIELD | NVME_DNR; + } + zd_ext = nvme_get_zd_extension(ns, zone_idx); + status = nvme_dma(n, zd_ext, ns->params.zd_extension_size, + DMA_DIRECTION_TO_DEVICE, req); + if (status) { + trace_pci_nvme_err_zd_extension_map_error(zone_idx); + return status; + } + + status = nvme_set_zd_ext(ns, zone); + if (status == NVME_SUCCESS) { + trace_pci_nvme_zd_extension_set(zone_idx); + return status; + } break; default: @@ -1956,7 +1992,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } - if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zra == NVME_ZONE_REPORT_EXTENDED && !ns->params.zd_extension_size) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1968,6 +2004,9 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) partial = (dw13 >> 16) & 0x01; zone_entry_sz = sizeof(NvmeZoneDescr); + if (zra == NVME_ZONE_REPORT_EXTENDED) { + zone_entry_sz += ns->params.zd_extension_size; + } max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; buf = g_malloc0(len); @@ -1999,6 +2038,14 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) z->wp = cpu_to_le64(~0ULL); } + if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zs->d.za & NVME_ZA_ZD_EXT_VALID) { + memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx), + ns->params.zd_extension_size); + } + buf_p += ns->params.zd_extension_size; + } + zone_idx++; } diff --git a/hw/block/trace-events b/hw/block/trace-events index 91d2d1a1c2..12764098a2 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -95,6 +95,7 @@ pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish zone, sl pci_nvme_reset_zone(uint64_t slba, uint32_t zone_idx, int all) "reset zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" pci_nvme_offline_zone(uint64_t slba, uint32_t zone_idx, int all) "offline zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" pci_nvme_set_descriptor_extension(uint64_t slba, uint32_t zone_idx) "set zone descriptor extension, slba=%"PRIu64", idx=%"PRIu32"" +pci_nvme_zd_extension_set(uint32_t zone_idx) "set descriptor extension for zone_idx=%"PRIu32"" pci_nvme_clear_ns_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state" pci_nvme_clear_ns_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state" @@ -126,6 +127,7 @@ pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slb pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8"" pci_nvme_err_insuff_active_res(uint32_t max_active) "max_active=%"PRIu32" zone limit exceeded" pci_nvme_err_insuff_open_res(uint32_t max_open) "max_open=%"PRIu32" zone limit exceeded" +pci_nvme_err_zd_extension_map_error(uint32_t zone_idx) "can't map descriptor extension for zone_idx=%"PRIu32"" pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" From patchwork Fri Oct 30 02:32:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868313 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1CE3961C for ; Fri, 30 Oct 2020 02:38:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B70D020791 for ; Fri, 30 Oct 2020 02:38:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="nQ7uLYDL" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B70D020791 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:60986 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKIj-0004Zt-DM for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:38:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41302) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE7-0006Op-5c; Thu, 29 Oct 2020 22:33:15 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:11040) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE4-0006g5-Sy; Thu, 29 Oct 2020 22:33:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025192; x=1635561192; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pvyeyZ/LZaGRVXzLHJy4W4Vfxrqg+cDrcTWs9R/ZrM8=; b=nQ7uLYDLgMt24AP37YJ0/+tpcSz80l7TsPT1Ear8s6mxaupVKrAOhdr/ wRK9DwFXP/5MLjpSTXgxBfuKg+8wotvi5wL/+r3xIScIwujqP2x6QBjim uEGwgoRWHEP042Y8k3NUGtOt3l63Lly2P6dGjzOgntD1wcwGHMKpRQ8lN 2TKmMECbizKE+gmlxJk9i6f5CT9kFbM0LStkDiyZxUZqKYaxqu6gOMRDu fOP7l6PGksQQHSZpfDdP+O6nBZF3wfyV0HImBk1FTHAa1Ji2FiwqNuQqk jUJ7ioabhsrLPbICCxVnmvcA1m9UUnXxT25KzDpoqN23spfPZ8xI5yLLU g==; IronPort-SDR: /HLAp32p9kW30bYa6ay8m8WOvBoPDz2hOVX6yXjuJ5RsWleug7TDM1RclGsOQ43/CZ8drzSDJo mxpS7k+yBW2DNIP1vdEMma3nxn9B9vWADluRWDra930p1aKWMd8TvdCtTniBuvwNhPtC7EQV4E 2hNL4G/PkaAdWbgKjqlMLfDXW4d8tC79TNYcp297bB3yLRwPN6dk0aN8AGL5Ua2UHVXPqeVfGg 8/cfjHn2jUuEzxFDkdUx6sgawOyltmmjgQLxVBgHhnhymkOkVrInGmB16QKm90/o3MhOvymbSV o1A= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748100" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:10 +0800 IronPort-SDR: KBgxBoBtgExyz6ZrsYCf3DkeHOjBNMDZz3p/ZJ0eYaqGtXknuK/I5POWRw+Yl7eKyxWTLCDCDO RMBOdUbgbuOKd0H6sMDAuwV3jLB+PPHBu8ElwyGKaxHbCjl/Ftw/q6NHImYVj9CdCblkgA2RRX 6zLyZ5OklW0hU38OxtCyTEJm9FReE7D/O6VVahbNvubnkCoFPDi4s98hUS0I45UHBK85iA7K8u ojBeGUq1Ej2eKfl0ZZDA1buDcxCgdmhhiq5BcC8B1+nAoHrw+RSTUF141PtIgVfexs75xIXc3M yyoOIv8sWLmkU5BTf7fWi63R Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:15 -0700 IronPort-SDR: 5Mzp/rLsVYt2EpRPBjgxd0ncK7EMqG6GY4uWt2U0qXEBouHUxl/51p/hqTg6tD5SW0y5JATHsV qPLkkf7Dr0mYer0Z5+N8b0a58dVZOHf2ypW0BRV9PKMPksDqRexw+F3Q8kyMZ1liS/WcRE0+Fx E6LG2CMPYIVX6D3W8pTX3PlXe6DZDUWXfzQiPy0iNlautNaNDqWPRx7UKSIjz5q/JjdpH7+aTC NO8+kCZ5TiSD6U2yFmEzDw6cWl5FRzbIkm0C/EafNjyMkVboexOOv0WluQWZ21GrYeh9TpKQut EFk= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:10 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 10/11] hw/block/nvme: Add injection of Offline/Read-Only zones Date: Fri, 30 Oct 2020 11:32:41 +0900 Message-Id: <20201030023242.5204-11-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" ZNS specification defines two zone conditions for the zones that no longer can function properly, possibly because of flash wear or other internal fault. It is useful to be able to "inject" a small number of such zones for testing purposes. This commit defines two optional device properties, "offline_zones" and "rdonly_zones". Users can assign non-zero values to these variables to specify the number of zones to be initialized as Offline or Read-Only. The actual number of injected zones may be smaller than the requested amount - Read-Only and Offline counts are expected to be much smaller than the total number of zones on a drive. Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel --- hw/block/nvme-ns.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ hw/block/nvme-ns.h | 2 ++ 2 files changed, 54 insertions(+) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 85dc73cf06..5e4a6705cd 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -21,6 +21,7 @@ #include "sysemu/sysemu.h" #include "sysemu/block-backend.h" #include "qapi/error.h" +#include "crypto/random.h" #include "hw/qdev-properties.h" #include "hw/qdev-core.h" @@ -145,6 +146,20 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) } } + if (ns->params.max_open_zones < nz) { + if (ns->params.nr_offline_zones > nz - ns->params.max_open_zones) { + error_setg(errp, "offline_zones value %u is too large", + ns->params.nr_offline_zones); + return -1; + } + if (ns->params.nr_rdonly_zones > + nz - ns->params.max_open_zones - ns->params.nr_offline_zones) { + error_setg(errp, "rdonly_zones value %u is too large", + ns->params.nr_rdonly_zones); + return -1; + } + } + return 0; } @@ -153,7 +168,9 @@ static void nvme_init_zone_state(NvmeNamespace *ns) uint64_t start = 0, zone_size = ns->zone_size; uint64_t capacity = ns->num_zones * zone_size; NvmeZone *zone; + uint32_t rnd; int i; + uint16_t zs; ns->zone_array = g_malloc0(ns->zone_array_size); if (ns->params.zd_extension_size) { @@ -180,6 +197,37 @@ static void nvme_init_zone_state(NvmeNamespace *ns) zone->w_ptr = start; start += zone_size; } + + /* If required, make some zones Offline or Read Only */ + + for (i = 0; i < ns->params.nr_offline_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), NULL); + rnd %= ns->num_zones; + } while (rnd < ns->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_OFFLINE); + } else { + i--; + } + } + + for (i = 0; i < ns->params.nr_rdonly_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), NULL); + rnd %= ns->num_zones; + } while (rnd < ns->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE && + zs != NVME_ZONE_STATE_READ_ONLY) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_READ_ONLY); + } else { + i--; + } + } } static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, @@ -353,6 +401,10 @@ static Property nvme_ns_props[] = { params.max_open_zones, 0), DEFINE_PROP_UINT32("zoned.descr_ext_size", NvmeNamespace, params.zd_extension_size, 0), + DEFINE_PROP_UINT32("zoned.offline_zones", NvmeNamespace, + params.nr_offline_zones, 0), + DEFINE_PROP_UINT32("zoned.rdonly_zones", NvmeNamespace, + params.nr_rdonly_zones, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 50a6a0e1ac..b30478e5d7 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -36,6 +36,8 @@ typedef struct NvmeNamespaceParams { uint32_t max_active_zones; uint32_t max_open_zones; uint32_t zd_extension_size; + uint32_t nr_offline_zones; + uint32_t nr_rdonly_zones; } NvmeNamespaceParams; typedef struct NvmeNamespace { From patchwork Fri Oct 30 02:32:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11868323 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F4F46A2 for ; Fri, 30 Oct 2020 02:43:08 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 33EAE206BE for ; Fri, 30 Oct 2020 02:43:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="jxsiTJt0" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 33EAE206BE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:47624 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kYKNf-0002Gm-5P for patchwork-qemu-devel@patchwork.kernel.org; Thu, 29 Oct 2020 22:43:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41312) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE8-0006TD-Vh; Thu, 29 Oct 2020 22:33:17 -0400 Received: from esa3.hgst.iphmx.com ([216.71.153.141]:11025) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kYKE6-0006fl-Gy; Thu, 29 Oct 2020 22:33:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1604025194; x=1635561194; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5nTTmQuD/HiZqHuuk3VyWsaPe9oSQQMKS3teO6XgKiY=; b=jxsiTJt0p/wbL705kojZOcQRUbahRmhV+PhByDh8FA21VLm5wV5LjUvW TpSa99GTc1npBKedAI4rsHCJXGsDfypoNCuRbqv+mm5WCDmmiLs+cuJse /OuoiyI6cyS8BVnTQXnYVG29MnQS4jP4gETF2KrVwUA0lCo9KtuYyT1pY Mjm2o61z9pU2DLsDXzRB8FOPu3YN/hQxIZpNSx4TYAYaM6xdxS+/NXZjv l7qTtL0X/G6lbrk+wLcyemE0s1tFGolF/JOIZWLsCziFASV6NuJ83HZ8/ Po/+iBKw5Qu27NS9s6107z2z662AmnqWtwH8NxpIWOSQYq1E22hHYLxpa A==; IronPort-SDR: 8rJKwJjXxdOevMSSaM8gGdvZzTT+McJ9/T7byov6hOjoh6g04sp9OXZy1OboRp7ZRnjZd4l9lZ ZS0tK1Y6TUcfMvuwKE7FeQDKM2VSDDqlviUmIyqVsxcvx0ek2/HwFtkVIzOc7wMM6y2et0QmzG 1kV0rnRJxZJZFjJefKfQNUMvdmU0ySHFtyDti5A1kMgGcwhmg1RPBP4wQ6wPwa5RWjNiS5Ux4x /gLyHIWSUKWpEIvNS3I5JPZWfqKzOizhvPK2DuGIaYy2lA3sPsgtu32ZF8+GhkFeR38xWKpuBP grA= X-IronPort-AV: E=Sophos;i="5.77,431,1596470400"; d="scan'208";a="155748103" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 30 Oct 2020 10:33:12 +0800 IronPort-SDR: M7wNJ2uRU5cokfO9bYf5eLPoD3E7wg+Mtlb37ik8SSsCqPKnztsiSkGsCxSLrTH92c1/qIbB/8 9OP+JZBjV6o3miCScupey1fYEXcrpmanLnylfKsfiWIpmCiUgMa3ECvRI6TBI+8g2D14VPrTg3 eM7HLdOt8LhAhRRGCCsi5jrYDy8Q8VBeRYSTX/UhxUkk+ibAlHeda+q5WzLHJrvHr+pmPm0xPU /JwceYsiHau8R11qReEOn6yu4jCyEYGibveFNqFd/QumaELgv6OGskFcr2UWcuBiWB9CDOrqnY CoJLvTt5aCREY5ky1u4/keCW Received: from uls-op-cesaip01.wdc.com ([10.248.3.36]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2020 19:18:17 -0700 IronPort-SDR: U82bYWB5Ck8sOnw7gQl/3UkiYBYHDwMhX+CT4NVfAR6KE8jCl8thnXHwRuDL7w0CIIUVj73WWz FJdoqiVtl2BWpI9we8E2jfilrzrkO5VUpZUoFZmW80OKA2uEc0bg3Mw3r/I7/zxci8OChBAVz2 I9rFn1s4DlIMOPKdOrCb7p884T969J4QMHA8j6xZ5WCM2CG5ZRoxp1PsBe9N6fXO+f8EtBWe8G ffu+xEMjc1HBZlVAFk4PQR+f9W1bDh7kae4Vk3tVuJzR/nxW1xtnnla8P8hJLuAq8xfI/4IJdm Ltg= WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip01.wdc.com with ESMTP; 29 Oct 2020 19:33:12 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v8 11/11] hw/block/nvme: Document zoned parameters in usage text Date: Fri, 30 Oct 2020 11:32:42 +0900 Message-Id: <20201030023242.5204-12-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20201030023242.5204-1-dmitry.fomichev@wdc.com> References: <20201030023242.5204-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=216.71.153.141; envelope-from=prvs=56530b5a8=dmitry.fomichev@wdc.com; helo=esa3.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/29 22:32:49 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Added brief descriptions of the new device properties that are now available to users to configure features of Zoned Namespace Command Set in the emulator. This patch is for documentation only, no functionality change. Signed-off-by: Dmitry Fomichev Reviewed-by: Niklas Cassel --- hw/block/nvme.c | 47 ++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 339becd3e2..10f5c752ba 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -9,7 +9,7 @@ */ /** - * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e + * Reference Specs: http://www.nvmexpress.org, 1.4, 1.3, 1.2, 1.1, 1.0e * * https://nvmexpress.org/developers/nvme-specification/ */ @@ -22,8 +22,9 @@ * [pmrdev=,] \ * max_ioqpairs=, \ * aerl=, aer_max_queued=, \ - * mdts= - * -device nvme-ns,drive=,bus=bus_name,nsid= + * mdts=,zoned.append_size_limit= \ + * -device nvme-ns,drive=,bus=,nsid=,\ + * zoned= * * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at * offset 0 in BAR2 and supports only WDS, RDS and SQS for now. @@ -41,14 +42,50 @@ * ~~~~~~~~~~~~~~~~~~~~~~ * - `aerl` * The Asynchronous Event Request Limit (AERL). Indicates the maximum number - * of concurrently outstanding Asynchronous Event Request commands suppoert + * of concurrently outstanding Asynchronous Event Request commands support * by the controller. This is a 0's based value. * * - `aer_max_queued` * This is the maximum number of events that the device will enqueue for - * completion when there are no oustanding AERs. When the maximum number of + * completion when there are no outstanding AERs. When the maximum number of * enqueued events are reached, subsequent events will be dropped. * + * - `zoned.append_size_limit` + * The maximum I/O size in bytes that is allowed in Zone Append command. + * The default is 128KiB. Since internally this this value is maintained as + * ZASL = log2( / ), some values assigned + * to this property may be rounded down and result in a lower maximum ZA + * data size being in effect. By setting this property to 0, users can make + * ZASL to be equal to MDTS. This property only affects zoned namespaces. + * + * Setting `zoned` to true selects Zoned Command Set at the namespace. + * In this case, the following namespace properties are available to configure + * zoned operation: + * zoned.zsze= + * The number may be followed by K, M, G as in kilo-, mega- or giga-. + * + * zoned.zcap= + * The value 0 (default) forces zone capacity to be the same as zone + * size. The value of this property may not exceed zone size. + * + * zoned.descr_ext_size= + * This value needs to be specified in 64B units. If it is zero, + * namespace(s) will not support zone descriptor extensions. + * + * zoned.max_active= + * The default value means there is no limit to the number of + * concurrently active zones. + * + * zoned.max_open= + * The default value means there is no limit to the number of + * concurrently open zones. + * + * zoned.offline_zones= + * + * zoned.rdonly_zones= + * + * zoned.cross_zone_read= + * Setting this property to true enables Read Across Zone Boundaries. */ #include "qemu/osdep.h"