Message ID | 20240218-reuse-v5-1-e4fc1c19b5a9@daynix.com (mailing list archive) |
---|---|
State | New, archived |
Series | hw/pci: SR-IOV related fixes and improvements |
On Feb 18 13:56, Akihiko Odaki wrote:
> nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
> configurations to know the number of VFs being disabled due to SR-IOV
> configuration writes, but the logic was flawed and resulted in
> out-of-bound memory access.
>
> It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
> VFs, but it actually doesn't in the following cases:
> - PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
> - PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
> - VFs were only partially enabled because of realization failure.
>
> It is a responsibility of pcie_sriov to interpret SR-IOV configurations
> and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
> provides, to get the number of enabled VFs before and after SR-IOV
> configuration writes.
>
> Cc: qemu-stable@nongnu.org
> Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management command")
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>

Thanks Akihiko,

I'll pick this up for hw/nvme nvme-next as-is.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

> ---
>  hw/nvme/ctrl.c | 26 ++++++++------------------
>  1 file changed, 8 insertions(+), 18 deletions(-)
>
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index f026245d1e9e..7a56e7b79b4d 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -8466,36 +8466,26 @@ static void nvme_pci_reset(DeviceState *qdev)
>      nvme_ctrl_reset(n, NVME_RESET_FUNCTION);
>  }
>
> -static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address,
> -                                      uint32_t val, int len)
> +static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t old_num_vfs)
>  {
>      NvmeCtrl *n = NVME(dev);
>      NvmeSecCtrlEntry *sctrl;
> -    uint16_t sriov_cap = dev->exp.sriov_cap;
> -    uint32_t off = address - sriov_cap;
> -    int i, num_vfs;
> +    int i;
>
> -    if (!sriov_cap) {
> -        return;
> -    }
> -
> -    if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
> -        if (!(val & PCI_SRIOV_CTRL_VFE)) {
> -            num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
> -            for (i = 0; i < num_vfs; i++) {
> -                sctrl = &n->sec_ctrl_list.sec[i];
> -                nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
> -            }
> -        }
> +    for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) {
> +        sctrl = &n->sec_ctrl_list.sec[i];
> +        nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
>      }
>  }
>
>  static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
>                                    uint32_t val, int len)
>  {
> -    nvme_sriov_pre_write_ctrl(dev, address, val, len);
> +    uint16_t old_num_vfs = pcie_sriov_num_vfs(dev);
> +
>      pci_default_write_config(dev, address, val, len);
>      pcie_cap_flr_write_config(dev, address, val, len);
> +    nvme_sriov_post_write_config(dev, old_num_vfs);
>  }
>
>  static const VMStateDescription nvme_vmstate = {
>
> --
> 2.43.1
>
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f026245d1e9e..7a56e7b79b4d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8466,36 +8466,26 @@ static void nvme_pci_reset(DeviceState *qdev)
     nvme_ctrl_reset(n, NVME_RESET_FUNCTION);
 }
 
-static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address,
-                                      uint32_t val, int len)
+static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t old_num_vfs)
 {
     NvmeCtrl *n = NVME(dev);
     NvmeSecCtrlEntry *sctrl;
-    uint16_t sriov_cap = dev->exp.sriov_cap;
-    uint32_t off = address - sriov_cap;
-    int i, num_vfs;
+    int i;
 
-    if (!sriov_cap) {
-        return;
-    }
-
-    if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) {
-        if (!(val & PCI_SRIOV_CTRL_VFE)) {
-            num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
-            for (i = 0; i < num_vfs; i++) {
-                sctrl = &n->sec_ctrl_list.sec[i];
-                nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
-            }
-        }
+    for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) {
+        sctrl = &n->sec_ctrl_list.sec[i];
+        nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false);
     }
 }
 
 static void nvme_pci_write_config(PCIDevice *dev, uint32_t address,
                                   uint32_t val, int len)
 {
-    nvme_sriov_pre_write_ctrl(dev, address, val, len);
+    uint16_t old_num_vfs = pcie_sriov_num_vfs(dev);
+
     pci_default_write_config(dev, address, val, len);
     pcie_cap_flr_write_config(dev, address, val, len);
+    nvme_sriov_post_write_config(dev, old_num_vfs);
 }
 
 static const VMStateDescription nvme_vmstate = {
nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.

It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.

It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.

Cc: qemu-stable@nongnu.org
Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 hw/nvme/ctrl.c | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)
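
For readers who want to see the snapshot-then-reconcile pattern in isolation, here is a minimal standalone sketch. It is not QEMU code: MockDev, MAX_VFS and every mock_* and demo_* name are hypothetical stand-ins, and the mock marks VFs online when VF Enable is set only to keep the example short. mock_num_vfs() plays the role that pcie_sriov_num_vfs() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VFS 4 /* hypothetical size of the per-VF state array */

/* Hypothetical mock device; these are not QEMU's types or fields. */
typedef struct {
    uint16_t num_vf_reg;      /* raw NumVFs register, guest-writable at any time */
    bool     vfe;             /* VF Enable bit */
    uint16_t enabled_vfs;     /* VFs that are actually enabled right now */
    bool     online[MAX_VFS]; /* per-VF state that must be torn down on disable */
} MockDev;

/* Plays the role of pcie_sriov_num_vfs(): the authoritative enabled count. */
static uint16_t mock_num_vfs(const MockDev *d)
{
    return d->enabled_vfs;
}

/* Writing NumVFs only changes the register, never the enabled count. */
static void mock_write_num_vf(MockDev *d, uint16_t val)
{
    d->num_vf_reg = val;
}

/* In this mock, toggling VF Enable is the only way to change the count. */
static void mock_write_vfe(MockDev *d, bool enable)
{
    if (enable && !d->vfe) {
        d->enabled_vfs = d->num_vf_reg < MAX_VFS ? d->num_vf_reg : MAX_VFS;
        for (int i = 0; i < d->enabled_vfs; i++) {
            d->online[i] = true;
        }
    } else if (!enable) {
        d->enabled_vfs = 0;
    }
    d->vfe = enable;
}

/* Post-write hook: offline only the VFs in [new count, old count). */
static int mock_post_write(MockDev *d, uint16_t old_num_vfs)
{
    int offlined = 0;

    assert(old_num_vfs <= MAX_VFS);
    for (int i = mock_num_vfs(d); i < old_num_vfs; i++) {
        d->online[i] = false;
        offlined++;
    }
    return offlined;
}

/* Snapshot, write, then reconcile. Returns how many VFs were offlined. */
static int mock_disable_vfs(MockDev *d)
{
    uint16_t old_num_vfs = mock_num_vfs(d);

    mock_write_vfe(d, false);
    return mock_post_write(d, old_num_vfs);
}

/* Scenario: NumVFs is rewritten with a large value after VFE was set. */
static int demo_late_num_vf_write(void)
{
    MockDev d = {0};

    mock_write_num_vf(&d, 2);
    mock_write_vfe(&d, true);    /* 2 VFs enabled */
    mock_write_num_vf(&d, 1000); /* register no longer matches reality */
    return mock_disable_vfs(&d); /* loop bound comes from the old count */
}
```

Calling demo_late_num_vf_write() from a test harness returns 2, because the loop is bounded by the enabled count captured before the write. A loop bounded by the raw register value would instead have run 1000 iterations over a 4-element array, which is the kind of out-of-bound access the commit message describes.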