Message ID | 20201019021726.12048-5-dmitry.fomichev@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Series | hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set |
On Mon, Oct 19, 2020 at 11:17:19AM +0900, Dmitry Fomichev wrote:
> Add a new Boolean namespace property, "attached", to provide the most
> basic namespace attachment support. The default value for this new
> property is true. Also, implement the logic in the new CNS values to
> include/exclude namespaces based on this new property. The only thing
> missing is hooking up the actual Namespace Attachment command opcode,
> which will allow a user to toggle the "attached" flag per namespace.
>
> The reason for not hooking up this command completely is because the
> NVMe specification requires the namespace management command to be
> supported if the namespace attachment command is supported.

Huh, the spec does require that, and that seems like an odd requirement
since it prevents dynamic namespace attach states in a static namespace
setup. I'm not sure why the spec assumes those two things go together,
but it sure enough does!

The implementation looks fine.

Reviewed-by: Keith Busch <kbusch@kernel.org>
On Oct 19 11:17, Dmitry Fomichev wrote:

(snip)

> CAP.CSS (together with the I/O Command Set data structure) defines
> what command sets are supported by the controller.
>
> CC.CSS (together with Set Profile) can be set to enable a subset of
> the available command sets.
>
> Even if a user configures CC.CSS to e.g. Admin only, NVM namespaces
> will still be attached (and thus marked as active).
> Similarly, if a user configures CC.CSS to e.g. NVM, ZNS namespaces
> will still be attached (and thus marked as active).
>
> However, any operation from a disabled command set will result in an
> Invalid Command Opcode.

This part of the commit message seems irrelevant to the patch.

> Add a new Boolean namespace property, "attached", to provide the most
> basic namespace attachment support. The default value for this new
> property is true. Also, implement the logic in the new CNS values to
> include/exclude namespaces based on this new property. The only thing
> missing is hooking up the actual Namespace Attachment command opcode,
> which will allow a user to toggle the "attached" flag per namespace.

Without Namespace Attachment support, the sole purpose of this parameter
is to allow unusable namespace IDs to be reported. I have no problems
with adding support for the additional CNS values. They will return
identical responses, but I think that is good enough for now.

When it is not really needed, we should be wary of adding a parameter
that is really hard to get rid of again.

> The reason for not hooking up this command completely is because the
> NVMe specification requires the namespace management command to be
> supported if the namespace attachment command is supported.

There are many ways to support Namespace Management, and there are a lot
of quirks with each of them. Do we use a big blockdev and carve out
namespaces? Then, what are the semantics of an image resize operation?

Do we dynamically create blockdev devices - that sounds pretty nice,
but might have other quirks and the attachment is not really persistent.

I think at least the "attached" parameter should be x-prefixed, but
better, leave it out for now until we know how we want Namespace
Attachment and Management to be implemented.
> -----Original Message-----
> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Tuesday, October 20, 2020 4:21 AM
> To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
> Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen <k.jensen@samsung.com>;
> Kevin Wolf <kwolf@redhat.com>; Philippe Mathieu-Daudé <philmd@redhat.com>;
> Maxim Levitsky <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>;
> Niklas Cassel <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>;
> qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis
> <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com>
> Subject: Re: [PATCH v7 04/11] hw/block/nvme: Support allocated CNS
> command variants
>
> On Oct 19 11:17, Dmitry Fomichev wrote:
>
> (snip)
>
> > CAP.CSS (together with the I/O Command Set data structure) defines
> > what command sets are supported by the controller.
> >
> > CC.CSS (together with Set Profile) can be set to enable a subset of
> > the available command sets.
> >
> > Even if a user configures CC.CSS to e.g. Admin only, NVM namespaces
> > will still be attached (and thus marked as active).
> > Similarly, if a user configures CC.CSS to e.g. NVM, ZNS namespaces
> > will still be attached (and thus marked as active).
> >
> > However, any operation from a disabled command set will result in an
> > Invalid Command Opcode.
>
> This part of the commit message seems irrelevant to the patch.
>
> > Add a new Boolean namespace property, "attached", to provide the most
> > basic namespace attachment support. The default value for this new
> > property is true. Also, implement the logic in the new CNS values to
> > include/exclude namespaces based on this new property. The only thing
> > missing is hooking up the actual Namespace Attachment command opcode,
> > which will allow a user to toggle the "attached" flag per namespace.
>
> Without Namespace Attachment support, the sole purpose of this parameter
> is to allow unusable namespace IDs to be reported. I have no problems
> with adding support for the additional CNS values. They will return
> identical responses, but I think that is good enough for now.
>
> When it is not really needed, we should be wary of adding a parameter
> that is really hard to get rid of again.
>
> > The reason for not hooking up this command completely is because the
> > NVMe specification requires the namespace management command to be
> > supported if the namespace attachment command is supported.
>
> There are many ways to support Namespace Management, and there are a lot
> of quirks with each of them. Do we use a big blockdev and carve out
> namespaces? Then, what are the semantics of an image resize operation?
>
> Do we dynamically create blockdev devices - that sounds pretty nice,
> but might have other quirks and the attachment is not really persistent.
>
> I think at least the "attached" parameter should be x-prefixed, but
> better, leave it out for now until we know how we want Namespace
> Attachment and Management to be implemented.

I don't mind leaving this property out. I used it for testing the patch
and it could, in theory, be manipulated by an external process doing NS
Management, but, as you said, there is no certainty about how NS
Management will be implemented and any related CLI interface should
better be added as a part of this future work, not now.
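Taking a step back from the thread: the attachment state discussed above
and the command sets enabled via CC.CSS are two independent gates on the
I/O path. The condensed sketch below is not code from the patch; it
assumes the QEMU hw/block/nvme types, and nvme_do_io() is a hypothetical
stand-in for the real opcode dispatch. It shows how the new "attached"
check relates to the existing per-command-set opcode check in
nvme_io_cmd(), using the same status codes as the diff that follows:

```c
/*
 * Condensed view of nvme_io_cmd() after this patch (a sketch, not the
 * actual function): an unattached (inactive) namespace rejects every
 * I/O command, while a command set disabled via CC.CSS fails on a
 * per-opcode basis with Invalid Command Opcode.
 */
static uint16_t nvme_io_cmd_sketch(NvmeCtrl *n, NvmeRequest *req)
{
    /* Gate 1: attachment state, independent of CC.CSS. */
    if (!req->ns->params.attached) {
        return NVME_INVALID_FIELD | NVME_DNR;
    }

    /* Gate 2: is this opcode supported by the enabled command set? */
    if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
        return NVME_INVALID_OPCODE | NVME_DNR;
    }

    return nvme_do_io(n, req); /* hypothetical opcode dispatch */
}
```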
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index c0362426cc..974aea33f7 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -132,6 +132,7 @@ static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
     DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid),
+    DEFINE_PROP_BOOL("attached", NvmeNamespace, params.attached, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index d795e44bab..d6b2808b97 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@
 
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    bool     attached;
     QemuUUID uuid;
 } NvmeNamespaceParams;
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index ca0d0abf5c..93728e51b3 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1062,6 +1062,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
     if (unlikely(!req->ns)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
+    if (!req->ns->params.attached) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
 
     if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
         trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
@@ -1222,6 +1225,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     uint32_t trans_len;
     NvmeNamespace *ns;
     time_t current_ms;
+    int i;
 
     if (off >= sizeof(smart)) {
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1232,15 +1236,18 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
         if (!ns) {
             return NVME_INVALID_NSID | NVME_DNR;
         }
-        nvme_set_blk_stats(ns, &stats);
+        if (ns->params.attached) {
+            nvme_set_blk_stats(ns, &stats);
+        }
     } else {
-        int i;
-
         for (i = 1; i <= n->num_namespaces; i++) {
             ns = nvme_ns(n, i);
             if (!ns) {
                 continue;
             }
+            if (!ns->params.attached) {
+                continue;
+            }
             nvme_set_blk_stats(ns, &stats);
         }
     }
@@ -1531,7 +1538,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
     return NVME_INVALID_FIELD | NVME_DNR;
 }
 
-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req,
+                                 bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1548,11 +1556,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
         return nvme_rpt_empty_id_struct(n, req);
     }
 
+    if (only_active && !ns->params.attached) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
     return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
+                                     bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1569,6 +1582,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
         return nvme_rpt_empty_id_struct(n, req);
     }
 
+    if (only_active && !ns->params.attached) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
     if (c->csi == NVME_CSI_NVM) {
         return nvme_rpt_empty_id_struct(n, req);
     }
@@ -1576,7 +1593,8 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
     return NVME_INVALID_FIELD | NVME_DNR;
 }
 
-static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req,
+                                     bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1606,6 +1624,9 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
         if (ns->params.nsid < min_nsid) {
             continue;
         }
+        if (only_active && !ns->params.attached) {
+            continue;
+        }
         list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
@@ -1615,7 +1636,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
     return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req,
+                                         bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1639,6 +1661,9 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
         if (ns->params.nsid < min_nsid) {
             continue;
         }
+        if (only_active && !ns->params.attached) {
+            continue;
+        }
         list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
@@ -1712,17 +1737,25 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
 
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
-        return nvme_identify_ns(n, req);
+        return nvme_identify_ns(n, req, true);
     case NVME_ID_CNS_CS_NS:
-        return nvme_identify_ns_csi(n, req);
+        return nvme_identify_ns_csi(n, req, true);
+    case NVME_ID_CNS_NS_PRESENT:
+        return nvme_identify_ns(n, req, false);
+    case NVME_ID_CNS_CS_NS_PRESENT:
+        return nvme_identify_ns_csi(n, req, false);
     case NVME_ID_CNS_CTRL:
         return nvme_identify_ctrl(n, req);
     case NVME_ID_CNS_CS_CTRL:
         return nvme_identify_ctrl_csi(n, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist(n, req);
+        return nvme_identify_nslist(n, req, true);
     case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist_csi(n, req);
+        return nvme_identify_nslist_csi(n, req, true);
+    case NVME_ID_CNS_NS_PRESENT_LIST:
+        return nvme_identify_nslist(n, req, false);
+    case NVME_ID_CNS_CS_NS_PRESENT_LIST:
+        return nvme_identify_nslist_csi(n, req, false);
     case NVME_ID_CNS_NS_DESCR_LIST:
         return nvme_identify_ns_descr_list(n, req);
     case NVME_ID_CNS_IO_COMMAND_SET:
@@ -1795,6 +1828,7 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req)
 
 static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns;
     NvmeCmd *cmd = &req->cmd;
     uint32_t dw10 = le32_to_cpu(cmd->cdw10);
     uint32_t dw11 = le32_to_cpu(cmd->cdw11);
@@ -1826,7 +1860,11 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req)
             return NVME_INVALID_NSID | NVME_DNR;
         }
 
-        if (!nvme_ns(n, nsid)) {
+        ns = nvme_ns(n, nsid);
+        if (!ns) {
+            return NVME_INVALID_FIELD | NVME_DNR;
+        }
+        if (!ns->params.attached) {
             return NVME_INVALID_FIELD | NVME_DNR;
         }
     }
@@ -1968,6 +2006,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
             if (unlikely(!ns)) {
                 return NVME_INVALID_FIELD | NVME_DNR;
             }
+            if (!ns->params.attached) {
+                return NVME_INVALID_FIELD | NVME_DNR;
+            }
         }
     } else if (nsid && nsid != NVME_NSID_BROADCAST) {
         if (!nvme_nsid_valid(n, nsid)) {
@@ -2015,6 +2056,9 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
             if (!ns) {
                 continue;
             }
+            if (!ns->params.attached) {
+                continue;
+            }
 
             if (!(dw11 & 0x1) && blk_enable_write_cache(ns->blkconf.blk)) {
                 blk_flush(ns->blkconf.blk);
diff --git a/include/block/nvme.h b/include/block/nvme.h
index f5ac9143c4..27125c9d28 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -805,14 +805,18 @@ typedef struct QEMU_PACKED NvmePSD {
 #define NVME_IDENTIFY_DATA_SIZE 4096
 
 enum NvmeIdCns {
-    NVME_ID_CNS_NS             = 0x00,
-    NVME_ID_CNS_CTRL           = 0x01,
-    NVME_ID_CNS_NS_ACTIVE_LIST = 0x02,
-    NVME_ID_CNS_NS_DESCR_LIST  = 0x03,
-    NVME_ID_CNS_CS_NS          = 0x05,
-    NVME_ID_CNS_CS_CTRL        = 0x06,
-    NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07,
-    NVME_ID_CNS_IO_COMMAND_SET = 0x1c,
+    NVME_ID_CNS_NS                 = 0x00,
+    NVME_ID_CNS_CTRL               = 0x01,
+    NVME_ID_CNS_NS_ACTIVE_LIST     = 0x02,
+    NVME_ID_CNS_NS_DESCR_LIST      = 0x03,
+    NVME_ID_CNS_CS_NS              = 0x05,
+    NVME_ID_CNS_CS_CTRL            = 0x06,
+    NVME_ID_CNS_CS_NS_ACTIVE_LIST  = 0x07,
+    NVME_ID_CNS_NS_PRESENT_LIST    = 0x10,
+    NVME_ID_CNS_NS_PRESENT         = 0x11,
+    NVME_ID_CNS_CS_NS_PRESENT_LIST = 0x1a,
+    NVME_ID_CNS_CS_NS_PRESENT      = 0x1b,
+    NVME_ID_CNS_IO_COMMAND_SET     = 0x1c,
 };
 
 typedef struct QEMU_PACKED NvmeIdCtrl {
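As a reading aid for the Identify changes above: whether a CNS value
filters on the new "attached" flag reduces to a single predicate. The
helper below is not part of the patch; it merely restates the dispatch
table in nvme_identify(), assuming the enum values from
include/block/nvme.h shown in the diff:

```c
/*
 * Not from the patch: restates which CNS values nvme_identify() calls
 * with only_active=true.  "Active" variants exclude unattached
 * namespaces; "present" (allocated) variants report them as well.
 */
static bool nvme_cns_only_active(uint32_t cns)
{
    switch (cns) {
    case NVME_ID_CNS_NS:                  /* 0x00 */
    case NVME_ID_CNS_NS_ACTIVE_LIST:      /* 0x02 */
    case NVME_ID_CNS_CS_NS:               /* 0x05 */
    case NVME_ID_CNS_CS_NS_ACTIVE_LIST:   /* 0x07 */
        return true;
    case NVME_ID_CNS_NS_PRESENT_LIST:     /* 0x10 */
    case NVME_ID_CNS_NS_PRESENT:          /* 0x11 */
    case NVME_ID_CNS_CS_NS_PRESENT_LIST:  /* 0x1a */
    case NVME_ID_CNS_CS_NS_PRESENT:       /* 0x1b */
        return false;
    default:
        /* Controller-scoped and descriptor CNS values do not filter. */
        return false;
    }
}
```

With the property added in nvme-ns.c, an inactive namespace could be
configured for testing with something like
"-device nvme-ns,drive=ns1,nsid=1,attached=false" (the drive name is
illustrative); such a namespace should then appear in the present
(0x10/0x11) responses but not in the active ones.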